Data + AI Summit 2022 PPT Download

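The HD videos from this conference were shared a few days ago; if you need them, see "Data + AI Summit 2022 HD Video Download" for the download links. This post collects the PPTs from the conference for anyone who wants them.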

How to Download the HD PPTs

About 170 PPTs are currently available. To get them, follow the WeChat official account 过往记忆大数据 or Java与大数据架构.

Recommended Sessions

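Data + AI Summit 2022 had a large number of sessions, and not all of them will interest everyone, so I have picked out a dozen or so of the most substantive ones that I recommend watching: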

Sessions with Downloadable PPTs

A Modern Approach to Big Data for Finance

A Practitioner's Guide to Unity Catalog A Technical Deep Dive

AI Fueled Forecasting The Next Generation of Financial Planning

AI powered Assortment Planning Solution

ALaSpark Gousto Recipe for Building Scalable PySpark Pipelines

Accelerating the Pace of Autism Diagnosis with Machine Learning Models

Achieve Machine Learning Hyper Productivity with Transformers and Hugging Face

Administrator Best Practices and Tips for Future Proofing your Databricks Account

Advanced Migrations From Hive to SparkSQL

Adversarial Drifts, Model Monitoring, and Feedback Loops Building Human in the Loop Machine Learning Systems for Content Moderation

Agile Data Engineering Reliability and Continuous Delivery at Scale

Amgen’s Journey To Building a Global 360 View of its Customers with the Lakehouse

An Advanced S3 Connector for Spark to Hunt for Cyber Attacks

Apache Arrow Flight SQL High Performance, Simplicity, and Interoperability for Data Transfers

Apache Spark SQL Aggregate Improvement at Meta (Facebook)

Apache Spark on Kubernetes—Lessons Learned from Launching Millions of Spark Executors

Apache Spark AQE SkewedJoin Optimization and Practice in ByteDance

Applied Predictive Maintenance in Aviation Without Sensor Data

Auto Encoder Decoder Based Anomaly Detection with the Lakehouse Paradigm

Automate Your Delta Lake or Practical Insights on Building Distributed Data Mesh

Automating Model Lifecycle Orchestration with Jenkins

Automating Business Decisions Using Event Streams

Backfill Streaming Data Pipelines in Kappa Architecture

Best Practices of Maintaining High Quality Data

Big Data in the Age of Moneyball

Build an Enterprise Lakehouse for Free with Trino and Delta Lake

Building Enterprise Scale Data and Analytics Platforms at Amgen

Building Metadata and Lineage Driven Pipelines on Kubernetes

Building Production Ready Recommender Systems with Feature Stores

Building a Data Science as a Service platform in Azure with Databricks

Building a Lakehouse for Data Science at DoorDash

Building an Analytics Lakehouse at Grab

Building and Scaling Machine Learning Based Products in the World's Largest Brewery

Building Spatial Applications with Apache Spark and CARTO

Building an Operational Machine Learning Organization from Zero and Leveraging ML for Crypto Security

Case Study in Rearchitecting an On Premises Pipeline in the Cloud

Challenges in Time Series Forecasting

Chaos Engineering in the World of Large Scale Complex Data Flow

Cloud Native Geospatial Analytics at JLL

Cloud and Data Science Modernization of Veterans Affairs Financial Service Center with Azure Databricks

Computational Data Governance at Scale

Connecting the Dots with DataHub Lakehouse and Beyond

Coral and Transport Portable SQL and UDFs for the Interoperability of Spark and Other Engines

Correlation Over Causation Cracking the Relationship Between User Engagement and User Happiness

Customer centric Innovation to Scale Data AI Everywhere

Cutting the Edge in Fighting Cybercrime Reverse Engineering a Search Language to Cross Compile it to PySpark

DBA Perspective Optimizing Performance Table by Table

Data Boards A Collaborative and Interactive Space for Data Science

Data Centric Principles for AI Engineering

Data Lakehouse and Data Mesh Two Sides of the Same Coin

DataFusion and Arrow Supercharge Your Data Analytical Tool with a Rusty Query Engine

Databricks Meets Power BI

Databricks and Enterprise Observability with Overwatch

Deep Dive into Delta Lake

Deep Dive into the New Features of Apache Spark

Delta Lake Overview

Delta Sharing A New Paradigm for Secure Data Sharing and Data Collaboration on Lakehouse

Democratizing Metrics at Airbnb

Designing Better MLOps Systems

Destination Lakehouse All Your Data Analytics and AI on One Platform

Detecting Financial Crime Using an Azure Advanced Analytics Platform and MLOps Approach

Discover Data Lakehouse With End to End Lineage

Disrupting the Prescription Drug Market with AI and Data

Distributed Machine Learning at Lyft

Doubling the Capacity of the Data Platform Without Doubling the Cost

Elixir The Wickedly Awesome Batch and Stream Processing Language You Should Have in Your Toolbox

Embedding Privacy by Design Into Data Infrastructure Through Open Source Extensible Tooling

Enable Production ML with Databricks Feature Store

Enabling BI in a Lakehouse Environment

Enabling Learning on Confidential Data

Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal Verification

Evolution of Data Architectures and How to Build a Lakehouse

Fugue Tune Distributed Hybrid Hyperparameter Tuning

FutureMetrics Using Deep Learning to Create a Multivariate Time Series Forecasting Platform for Economic Strategic Planning

GIS Pipeline Acceleration with Apache Sedona

Git for Data Lakes How lakeFS Scales Data Versioning to Billions of Objects

Hassle Free Data Ingestion into the Lakehouse

How to Automate the Modernization and Migration of Your Data Warehousing Workloads to Databricks Lakehouse

How EPRI Uses Computer Vision to Mitigate Wildfire Risks for Electric Utilities

How Robinhood Built a Streaming Lakehouse to Bring Data Freshness from 24h to Less Than 15 Mins

How To Make Apache Spark on Kubernetes Run Reliably on Spot Instances

How To Use Databricks SQL for Analytics on Your Lakehouse

How socat and UNIX Pipes Can Help Data Integration

How the Largest County in the US is Transforming Hiring with a Modern Data Lakehouse

How to Build a Complete Security and Governance Solution Using Unity Catalog

How to Implement a Semantic Layer for Your Lakehouse

Implementing Data Governance 3.0 for the Lakehouse Era Community Led and Bottom Up

Implementing a Framework for Data Security and Policy at a Large Public Sector Agency

Implementing an End to End Demand Forecasting Solution Through Databricks and MLflow

Improving Apache Spark Structured Streaming Application Processing Time

Improving Interactive Querying Experience on Spark SQL

Improving patient care with Databricks

Ingesting data into Lakehouse with COPY INTO

Integrating Apache Superset into a B2B Platform Why and How

Introducing Zipline An Open Source Feature Engineering Platform

Learn to Efficiently Test ETL Pipelines

Lessons Learned from Deidentifying 700 Million Patient Notes

Low Code Machine Learning on Databricks with AutoML

MLOps at DoorDash

MLflow Pipelines Accelerating MLOps from Development to Production

Mapping Data Quality Concerns to Data Lake Zones

Meshing About with Databricks

Migrate and Modernize your Data Platform with Confluent and Databricks

Migrating Complex SAS Processes to Databricks Case Study

Monitoring and Quality Assurance of Complex ML Deployments via Assertions

Mosaic A Framework for Geospatial Analytics at Scale

Multimodal Deep Learning Applied to E commerce Big Data

Near Real Time Analytics with Event Streaming, Live Tables, and Delta Sharing

Obfuscating Sensitive Information from Spark UI and Logs

Open Source Powers the Modern Data Stack

Opening the Floodgates Enabling Fast Unmediated End User Access to Trillion Row Datasets with SQL Data Warehouses

Optimizing Speed and Scale of User Facing Analytics Using Apache Kafka and Pinot

Polars Blazingly Fast DataFrames in Rust and Python

Power to the SQL People Python UDFs in DBSQL

Powering Up the Business with a Lakehouse

Practical Data Governance in a Large Scale Databricks Environment

Predicting Repeat Admissions to Substance Abuse Treatment with Machine Learning

Presto On Spark A Unified SQL Experience

Privacy Preserving Machine Learning and Big Data Analytics Using Apache Spark

Productionizing Ethical Credit Scoring Systems with Delta Lake, Feature Store and MLFlow

Protecting Personally Identifiable Information (PII) PHI Data in Data Lake via Column Level Encryption

PySpark in Apache Spark 3.3 and Beyond

Radical Speed on the Lakehouse Photon Under the Hood

Real Time Search and Recommendation at Scale Using Embeddings and Hopsworks

Real Time Cost Reduction Monitoring and Alerting

Realize the Promise of Streaming with the Databricks Lakehouse Platform

Recent Parquet Improvements in Apache Spark

Rethinking Orchestration as Reconciliation Software Defined Assets in Dagster

Running a Low Cost, Versatile Data Management Ecosystem with Apache Spark at Core

Scalable XGBoost on GPU Clusters

Scaling AI Workloads with the Ray Ecosystem

Scaling Your Workloads with Databricks Serverless

Scaling Deep Learning on Databricks

Scaling ML at CashApp with Tecton

Scaling Privacy Practical Architectures and Experiences

Security Best Practices for Lakehouse

Self Serve Automated and Robust CDC pipeline using AWS DMS DynamoDB Streams and Databricks Delta

Serverless Kafka and Apache Spark in a Multi Cloud Data Lakehouse Architecture

Serving Near Real Time Features at Scale

Setting up On Shelf Availability Alerts at Scale with Databricks and Azure

Simplify Global DataOps and MLOps Using Okta's FIG Automation Library

Simplifying Migrations to Lakehouse—the Databricks Way

Smart Manufacturing Real time Process Optimization with Databricks

So Fresh and So Clean Learn How to Build Real Time Warehouses on Lakehouse

Sound Data Engineering in Rust From Bits to DataFrames

Spark Data Source V2 Performance Improvement Aggregate Push Down

Spark Inception Exploiting the Apache Spark REPL to Build Streaming Notebooks

Spline Central Data Lineage Tracking Not Only For Spark

State of the Art Natural Language Processing with Apache Spark NLP

Streaming ML Enrichment Framework Using Advanced Delta Table Features

Survey of Production ML Tech Stacks

Technical and Tactical Football Analysis Through Data

The Databricks Notebook Front Door of the Lakehouse

The Modern Metadata Platform What Why and How

The Road to a Robust Data Lake

The Semantics of Biology Vaccine and Drug Research with Knowledge Graphs and Logical Inferencing on Apache Spark

Time Series Forecasting with PyCaret

Tools for Assisted Apache Spark Version Migrations, From 2.1 to 3.2+

Towards Dynamic Microstructure The Role of Machine Learning in the Next Generation of Exchanges

Turning Big Biology Data into Insights on Disease The Power of Circulating Biomarkers

Turning Fan Data Into an Asset

UIMeta A 10X Faster Cloud Native Spark History Server

Unifying Data Science and Business

Vision AI Animal Health Industry Use Cases Using Databricks on Azure

What to Do When Your Job Goes OOM in the Night Flowcharts

X FIPE eXtended Feature Impact for Prediction Explanation

You Have BI Now What Activate Your Data

dbt Machine Learning What Makes a Great Baton Pass

dbt and Python Better Together