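The ultra-HD videos from this conference were shared a few days ago; if you need them, see "Data + AI Summit 2022 Ultra-HD Video Download" for the download links. This post collects the slide decks (PPTs) from the conference for anyone who needs them.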
How to download the HD slides
About 170 slide decks are currently available; to get them, follow the WeChat official account 过往记忆大数据 or Java与大数据架构.
Recommended sessions
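Data + AI Summit 2022 had a large number of sessions, and not all of them will interest everyone, so I have picked out a dozen or so of the most substantive ones that I recommend watching: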
Apache Spark SQL Aggregate Improvement at Meta (Facebook)
Recent Parquet Improvements in Apache Spark
Spark Data Source V2 Performance Improvement: Aggregate Push Down
Deep Dive into the New Features of Apache Spark 3.2 and 3.3
Managing Straggler Executors at Apache Spark 3.3
Apache Spark on Kubernetes—Lessons Learned from Launching Millions of Spark Executors
PySpark in Apache Spark 3.3 and Beyond
Delta Lake 2.0 Overview
Improving Interactive Querying Experience on Spark SQL
Moving from Apache Spark 2 to Apache Spark 3: Spark Version Upgrade at Scale in Pinterest
Radical Speed on the Lakehouses: Photon under the hood
Deep-Dive into Delta Lake
Presto 101: An Introduction to Open Source Presto
Apache Spark AQE SkewedJoin Optimization and Practice in ByteDance
Advanced Migrations: From Hive to SparkSQL
Presto On Spark: A Unified SQL Experience
Sessions with downloadable slides
Slides can be downloaded for 170 sessions this time.
A Modern Approach to Big Data for Finance
A Practitioner's Guide to Unity Catalog A Technical Deep Dive
AI Fueled Forecasting The Next Generation of Financial Planning
AI powered Assortment Planning Solution
ALaSpark Gousto Recipe for Building Scalable PySpark Pipelines
Accelerating the Pace of Autism Diagnosis with Machine Learning Models
Achieve Machine Learning Hyper Productivity with Transformers and Hugging Face
Administrator Best Practices and Tips for Future Proofing your Databricks Account
Advanced Migrations From Hive to SparkSQL
Adversarial Drifts, Model Monitoring, and Feedback Loops Building Human in the Loop Machine Learning Systems for Content Moderation
Agile Data Engineering Reliability and Continuous Delivery at Scale
Amgen’s Journey To Building a Global 360 View of its Customers with the Lakehouse
An Advanced S3 Connector for Spark to Hunt for Cyber Attacks
Apache Arrow Flight SQL High Performance, Simplicity, and Interoperability for Data Transfers
Apache Spark SQL Aggregate Improvement at Meta (Facebook)
Apache Spark on Kubernetes—Lessons Learned from Launching Millions of Spark Executors
Apache Spark AQE SkewedJoin Optimization and Practice in ByteDance
Applied Predictive Maintenance in Aviation Without Sensor Data
Auto Encoder Decoder Based Anomaly Detection with the Lakehouse Paradigm
Automate Your Delta Lake or Practical Insights on Building Distributed Data Mesh
Automating Model Lifecycle Orchestration with Jenkins
Automating Business Decisions Using Event Streams
Backfill Streaming Data Pipelines in Kappa Architecture
Best Practices of Maintaining High Quality Data
Big Data in the Age of Moneyball
Build an Enterprise Lakehouse for Free with Trino and Delta Lake
Building Enterprise Scale Data and Analytics Platforms at Amgen
Building Metadata and Lineage Driven Pipelines on Kubernetes
Building Production Ready Recommender Systems with Feature Stores
Building a Data Science as a Service platform in Azure with Databricks
Building a Lakehouse for Data Science at DoorDash
Building an Analytics Lakehouse at Grab
Building and Scaling Machine Learning Based Products in the World's Largest Brewery
Building Spatial Applications with Apache Spark and CARTO
Building an Operational Machine Learning Organization from Zero and Leveraging ML for Crypto Security
Case Study in Rearchitecting an On Premises Pipeline in the Cloud
Challenges in Time Series Forecasting
Chaos Engineering in the World of Large Scale Complex Data Flow
Cloud Native Geospatial Analytics at JLL
Cloud and Data Science Modernization of Veterans Affairs Financial Service Center with Azure Databricks
Computational Data Governance at Scale
Connecting the Dots with DataHub Lakehouse and Beyond
Coral and Transport Portable SQL and UDFs for the Interoperability of Spark and Other Engines
Correlation Over Causation Cracking the Relationship Between User Engagement and User Happiness
Customer centric Innovation to Scale Data AI Everywhere
Cutting the Edge in Fighting Cybercrime Reverse Engineering a Search Language to Cross Compile it to PySpark
DBA Perspective Optimizing Performance Table by Table
Data Boards A Collaborative and Interactive Space for Data Science
Data Centric Principles for AI Engineering
Data Lakehouse and Data Mesh Two Sides of the Same Coin
DataFusion and Arrow Supercharge Your Data Analytical Tool with a Rusty Query Engine
Databricks Meets Power BI
Databricks and Enterprise Observability with Overwatch
Deep Dive into Delta Lake
Deep Dive into the New Features of Apache Spark
Delta Lake Overview
Delta Sharing A New Paradigm for Secure Data Sharing and Data Collaboration on Lakehouse
Democratizing Metrics at Airbnb
Designing Better MLOps Systems
Destination Lakehouse All Your Data Analytics and AI on One Platform
Detecting Financial Crime Using an Azure Advanced Analytics Platform and MLOps Approach
Discover Data Lakehouse With End to End Lineage
Disrupting the Prescription Drug Market with AI and Data
Distributed Machine Learning at Lyft
Doubling the Capacity of the Data Platform Without Doubling the Cost
Elixir The Wickedly Awesome Batch and Stream Processing Language You Should Have in Your Toolbox
Embedding Privacy by Design Into Data Infrastructure Through Open Source Extensible Tooling
Enable Production ML with Databricks Feature Store
Enabling BI in a Lakehouse Environment
Enabling Learning on Confidential Data
Ensuring Correct Distributed Writes to Delta Lake in Rust with Formal Verification
Evolution of Data Architectures and How to Build a Lakehouse
Fugue Tune Distributed Hybrid Hyperparameter Tuning
FutureMetrics Using Deep Learning to Create a Multivariate Time Series Forecasting Platform for Economic Strategic Planning
GIS Pipeline Acceleration with Apache Sedona
Git for Data Lakes How lakeFS Scales Data Versioning to Billions of Objects
Hassle Free Data Ingestion into the Lakehouse
How to Automate the Modernization and Migration of Your Data Warehousing Workloads to Databricks Lakehouse
How EPRI Uses Computer Vision to Mitigate Wildfire Risks for Electric Utilities
How Robinhood Built a Streaming Lakehouse to Bring Data Freshness from 24h to Less Than 15 Mins
How To Make Apache Spark on Kubernetes Run Reliably on Spot Instances
How To Use Databricks SQL for Analytics on Your Lakehouse
How socat and UNIX Pipes Can Help Data Integration
How the Largest County in the US is Transforming Hiring with a Modern Data Lakehouse
How to Build a Complete Security and Governance Solution Using Unity Catalog
How to Implement a Semantic Layer for Your Lakehouse
Implementing Data Governance 3.0 for the Lakehouse Era Community Led and Bottom Up
Implementing a Framework for Data Security and Policy at a Large Public Sector Agency
Implementing an End to End Demand Forecasting Solution Through Databricks and MLflow
Improving Apache Spark Structured Streaming Application Processing Time
Improving Interactive Querying Experience on Spark SQL
Improving patient care with Databricks
Ingesting data into Lakehouse with COPY INTO
Integrating Apache Superset into a B2B Platform Why and How
Introducing Zipline An Open Source Feature Engineering Platform
Learn to Efficiently Test ETL Pipelines
Lessons Learned from Deidentifying 700 Million Patient Notes
Low Code Machine Learning on Databricks with AutoML
MLOps at DoorDash
MLflow Pipelines Accelerating MLOps from Development to Production
Mapping Data Quality Concerns to Data Lake Zones
Meshing About with Databricks
Migrate and Modernize your Data Platform with Confluent and Databricks
Migrating Complex SAS Processes to Databricks Case Study
Monitoring and Quality Assurance of Complex ML Deployments via Assertions
Mosaic A Framework for Geospatial Analytics at Scale
Multimodal Deep Learning Applied to E commerce Big Data
Near Real Time Analytics with Event Streaming, Live Tables, and Delta Sharing
Obfuscating Sensitive Information from Spark UI and Logs
Open Source Powers the Modern Data Stack
Opening the Floodgates Enabling Fast Unmediated End User Access to Trillion Row Datasets with SQL Data Warehouses
Optimizing Speed and Scale of User Facing Analytics Using Apache Kafka and Pinot
Polars Blazingly Fast DataFrames in Rust and Python
Power to the SQL People Python UDFs in DBSQL
Powering Up the Business with a Lakehouse
Practical Data Governance in a Large Scale Databricks Environment
Predicting Repeat Admissions to Substance Abuse Treatment with Machine Learning
Presto On Spark A Unified SQL Experience
Privacy Preserving Machine Learning and Big Data Analytics Using Apache Spark
Productionizing Ethical Credit Scoring Systems with Delta Lake, Feature Store and MLFlow
Protecting Personally Identifiable Information (PII) PHI Data in Data Lake via Column Level Encryption
PySpark in Apache Spark 3.3 and Beyond
Radical Speed on the Lakehouse Photon Under the Hood
Real Time Search and Recommendation at Scale Using Embeddings and Hopsworks
Real Time Cost Reduction Monitoring and Alerting
Realize the Promise of Streaming with the Databricks Lakehouse Platform
Recent Parquet Improvements in Apache Spark
Rethinking Orchestration as Reconciliation Software Defined Assets in Dagster
Running a Low Cost, Versatile Data Management Ecosystem with Apache Spark at Core
Scalable XGBoost on GPU Clusters
Scaling AI Workloads with the Ray Ecosystem
Scaling Your Workloads with Databricks Serverless
Scaling Deep Learning on Databricks
Scaling ML at CashApp with Tecton
Scaling Privacy Practical Architectures and Experiences
Security Best Practices for Lakehouse
Self Serve Automated and Robust CDC pipeline using AWS DMS DynamoDB Streams and Databricks Delta
Serverless Kafka and Apache Spark in a Multi Cloud Data Lakehouse Architecture
Serving Near Real Time Features at Scale
Setting up On Shelf Availability Alerts at Scale with Databricks and Azure
Simplify Global DataOps and MLOps Using Oktas FIG Automation Library
Simplifying Migrations to Lakehouse—the Databricks Way
Smart Manufacturing Real time Process Optimization with Databricks
So Fresh and So Clean Learn How to Build Real Time Warehouses on Lakehouse
Sound Data Engineering in Rust From Bits to DataFrames
Spark Data Source V2 Performance Improvement Aggregate Push Down
Spark Inception Exploiting the Apache Spark REPL to Build Streaming Notebooks
Spline Central Data Lineage Tracking Not Only For Spark
State of the Art Natural Language Processing with Apache Spark NLP
Streaming ML Enrichment Framework Using Advanced Delta Table Features
Survey of Production ML Tech Stacks
Technical and Tactical Football Analysis Through Data
The Databricks Notebook Front Door of the Lakehouse
The Modern Metadata Platform What Why and How
The Road to a Robust Data Lake 0
The Semantics of Biology Vaccine and Drug Research with Knowledge Graphs and Logical Inferencing on Apache Spark
Time Series Forecasting with PyCaret
Tools for Assisted Apache Spark Version Migrations, From 2.1 to 3.2+
Towards Dynamic Microstructure The Role of Machine Learning in the Next Generation of Exchanges
Turning Big Biology Data into Insights on Disease The Power of Circulating Biomarkers
Turning Fan Data Into an Asset
UIMeta A 10X Faster Cloud Native Spark History Server
Unifying Data Science and Business
Vision AI Animal Health Industry Use Cases Using Databricks on Azure
What to Do When Your Job Goes OOM in the Night Flowcharts
X FIPE eXtended Feature Impact for Prediction Explanation
You Have BI Now What Activate Your Data
dbt Machine Learning What Makes a Great Baton Pass
dbt and Python Better Together
Unless otherwise stated, all articles on this blog are original!
Copyright of original articles belongs to 过往记忆大数据 (过往记忆); reproduction without permission is prohibited.
Link to this article: 【Data + AI Summit 2022 PPT Download】(https://www.iteblog.com/archives/10189.html)