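The conference covered the following topics:

The roadmap for Apache Spark™, Delta Lake, MLflow, and Koalas
Best practices for managing the machine learning lifecycle
Tips for building reliable data pipelines at scale
The latest developments in popular deep learning and machine learning frameworks
Real-world AI use cases

How to download

Follow the WeChat official account 过往记忆大数据 or Java技术范 and reply with spark-9832 to get the slides.

Downloadable slides

Slides are available for the following talks: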

Data Science Across Data Sources with Apache Arrow
Portable Scalable Data Visualization Techniques for Apache Spark and Python Notebook-based Analytics
Native Support of Prometheus Monitoring in Apache Spark 3.0
Performant Streaming in Production: Preventing Common Pitfalls when Productionizing Streaming Jobs
Scaling Security Threat Detection with Apache Spark and Databricks
User Defined Aggregation in Apache Spark: A Love Story
Powering Interactive BI Analytics with Presto and Delta Lake
Using AI to Support Proliferating Merchant Changes
Tuning ML Models: Scaling, Workflows, and Architecture
Battling Model Decay with Deep Learning and Gamification
An Approach to Data Quality for Netflix Personalization Systems
High-Performance Analytics with Probabilistic Data Structures: the Power of HyperLogLog
Preventing Abuse Using Unsupervised Learning
Geospatial Analytics at Scale: Analyzing Human Movement Patterns During a Pandemic
Leveraging Apache Spark for Scalable Data Prep and Inference in Deep Learning
Filtering vs Enriching Data in Apache Spark
Scalable Acceleration of XGBoost Training on Apache Spark GPU Clusters
Deep Dive into GPU Support in Apache Spark 3.x
Sputnik: Airbnb’s Apache Spark Framework for Data Engineering
Patterns and Anti-Patterns for Memorializing Data Science Project Artifacts
Automated and Explainable Deep Learning for Clinical Language Understanding at Roche
Building Understanding Out of Incomplete and Biased Datasets using Machine Learning and Databricks
Encryption and Masking for Sensitive Apache Spark Analytics Addressing CCPA and Governance
Managing ADLS gen2 using Apache Spark
Using Apache Spark and Differential Privacy for Protecting the Privacy of the 2020 Census Respondents
The 2020 Census and Innovation in Surveys
scaling-data-and-ml-with-apache-spark-and-feast
The Apache Spark File Format Ecosystem
Building the Petcare Data Platform using Delta Lake and 'Kyte': Our Spark ETL Pipeline
A Production Quality Sketching Library for the Analysis of Big Data
Children Safety Retrieval (CENSER) System for Retrieval of Kidnapped Children from Brothels in India
Benchmark Tests and How-Tos of Convolutional Neural Network on HorovodRunner Enabled Apache Spark Clusters
Scalable AutoML for Time Series Forecasting using Ray
Using Machine Learning to Evolve Sports Entertainment
Using Bayesian Generative Models with Apache Spark to Solve Entity Resolution Problems (DeDup, Merging, Uniqueness) at Scale
Fine Tuning and Enhancing Performance of Apache Spark Jobs
All In - Migrating a Genomics Pipeline from BASH/Hive to Spark (Azure Databricks) - A Real World Case Study
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Lessons Learned from Modernizing USCIS Data Analytics Platform
On Improving Broadcast Joins in Apache Spark SQL
Using Databricks as an Analysis Platform
Is This Thing On? A Well State Model for the People
Advanced Natural Language Processing with Apache Spark NLP
Building a Streaming Microservice Architecture: with Apache Spark Structured Streaming and Friends
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Deploying Apache Spark Jobs on Kubernetes with Helm and Spark Operator
Resource-Efficient Deep Learning Model Selection on Apache Spark
Bring Satellite and Drone Imagery into your Data Science Workflows
Scoring at Scale: Generating Follow Recommendations for Over 690 Million LinkedIn Members
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
SparkCruise: Automatic Computation Reuse in Apache Spark
Chromatic Sparse Learning
Deploy and Serve Model from Azure Databricks onto Azure Machine Learning
Cloud-Native Apache Spark Scheduling with YuniKorn Scheduler
The Revolution Will be Streamed
Democratizing PySpark for Mobile Game Publishing
Ray: Enterprise-Grade, Distributed Python
Fugue: Unifying Spark and Non-Spark Ecosystems for Big Data Analytics
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Scaling Up AI Research to Production with PyTorch and MLFlow
Best Practices for Building Robust Data Platform with Apache Spark and Delta
Building a Pipeline for State-of-the-Art Natural Language Processing Using Hugging Face Tools
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks
Flash for Apache Spark Shuffle with Cosco
Building a Real-Time Feature Store at iFood
AutoML Toolkit – Deep Dive
Operationalize Apache Spark Analytics
End-to-End Deep Learning with Horovod on Apache Spark
Building Data Quality Audit Framework using Delta Lake at Cerner
Zipline - A Declarative Feature Engineering Framework
Automating Federal Aviation Administration’s (FAA) System Wide Information Management (SWIM) Data Ingestion and Analysis
Apache Spark NLP for Healthcare: Lessons Learned Building Real-World Healthcare AI Systems
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Productionizing Machine Learning Pipelines with Databricks and Azure ML
Advertising Fraud Detection at Scale at T-Mobile
AI-Assisted Feature Selection for Big Data Modeling
The Data Lake Engine Data Microservices in Spark using Apache Arrow Flight
Ibis: Seamless Transition Between Pandas and Apache Spark
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Power of Visualizing Embeddings
Deliver Dynamic Customer Journey Orchestration at Scale
Top Down Specialization Using Apache Spark
The Killer Feature Store: Orchestrating Spark ML Pipelines and MLflow for Production
Tackling Scaling Challenges of Apache Spark at LinkedIn
Scaling up Deep Learning by Scaling Down
Wood Log Inventory Estimation using Image Processing and Deep Learning Technique
Building Identity Graphs over Heterogeneous Data
Productionizing Machine Learning with Apache Spark, MLflow and ONNX from the ground to cloud using SQL Server
Generative Hyperloop Design: Managing Massively Scaled Simulations Focused on Quick-Insight Analytics and Demand Modelling
Efficiently Building Machine Learning Models for Predictive Maintenance in the Oil & Gas Industry with Databricks
Consolidate Your Technical Debt With Spark Data Sources - Tools and Techniques to Integrate Native Code
Running Emerging AI Applications on Big Data Platforms with Ray On Apache Spark
Best Practices for Engineering Production-Ready Software with Apache Spark
Automatic Forecasting using Prophet, Databricks, Delta Lake and MLflow
Composable Data Processing with Apache Spark
Accelerating Spark SQL Workloads to 50X Performance with Apache Arrow-Based FPGA Accelerators
Accelerating the ML Lifecycle with an Enterprise-Grade Feature Store
Faster Data Integration Pipeline Execution using Spark-Jobserver
Continuous Delivery of ML-Enabled Pipelines on Databricks using MLflow
Accelerating Apache Spark Shuffle for Data Analytics on the Cloud with Remote Persistent Memory Pools
Bucketing 2.0: Improve Spark SQL Performance by Removing Shuffle
How to Performance-Tune Apache Spark Applications in Large Clusters
Saving Energy in Homes with a Unified Approach to Data and AI
Productionizing Deep Reinforcement Learning with Spark and MLflow
SQL Performance Improvements at a Glance in Apache Spark 3.0
Pandas UDF and Python Type Hint in Apache Spark 3.0
Parallelization of Structured Streaming Jobs Using Delta Lake
Artificial Lawyers. Will Your Next Attorney be a Machine?
Adaptive Query Execution: Speeding Up Spark SQL at Runtime
How Azure and Databricks Enabled a Personalized Experience for Customers and Patients at CVS Health
Optimize the Large Scale Graph Applications by using Apache Spark with 4-5x Performance Improvements
Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
Running Apache Spark Jobs Using Kubernetes
Koalas: Making an Easy Transition from Pandas to Apache Spark
Vectorized Deep Learning Acceleration from Preprocessing to Inference and Training on Apache Spark in SK Telecom
Text Extraction from Product Images Using State-of-the-Art Deep Learning Techniques
Care and Feeding of Catalyst Optimizer
Apache Spark vs Apache Spark: An On-Prem Comparison of Databricks and Open-Source Spark
Enabling Physics and Empirical-Based Algorithms with Spark Using the Integration of MATLAB in Databricks
Democratizing Data
Evolution is Continuous, and so are Big Data and Streaming Pipelines
Geospatial Options in Apache Spark
Scaling Production Machine Learning Pipelines with Databricks
Taming the Search: A Practical Way of Enforcing GDPR and CCPA in Very Large Datasets with Apache Spark
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Productionizing Machine Learning with a Microservices Architecture
Productionalizing Models through CI/CD Design with MLflow
DataSource V2 and Cassandra – A Whole New World
Hyperspace: An Indexing Subsystem for Apache Spark
Data Driven Decisions at Scale
Deep Dive into the New Features of Apache Spark 3.0
Securing Apache Spark Applications at Facebook
Building a Feature Store around Dataframes and Apache Spark
Tracing the Breadcrumbs: Apache Spark Workload Diagnostics
Enabling Push Button Productization of AI Models
Everyday Probabilistic Data Structures for Humans
Deep Learning Enabled Price Action with Databricks and AWS
Clinical Suspecting at Scale Using PySpark
Using Apache Spark for Predicting Degrading and Failing Parts in Aviation
Operationalizing Big Data Pipelines At Scale
Columbia Migrates from Legacy Data Warehouse to an Open Data Platform with Delta Lake
How Adobe Does 2 Million Records Per Second Using Apache Spark!
Accelerating Data Processing in Spark SQL with Pandas UDFs
Building a Federated Data Directory Platform for Public Health
Translating Models to Medicine an Example of Managing Visual Communications
Delta from a Data Engineer's Perspective
Disrupting Risk Management through Emerging Technologies
Automated Testing For Protecting Data Pipelines from Undocumented Assumptions
Geosp.AI.tial: Applying Big Data and Machine Learning to Solve the World's Toughest Geospatial Intelligence Problems
Healthcare Claim Reimbursement using Apache Spark
From Idea to Model: Productionizing Data Pipelines with Apache Airflow
Willump: Optimizing Feature Computation in ML Inference
Real-Time Forecasting at Scale using Delta Lake and Delta Caching
Leveraging Apache Spark to Develop AI-Enabled Products and Services at Bosch
Continuous Delivery of Deep Transformer-Based NLP Models Using MLflow and AWS Sagemaker for Enterprise AI Scenarios
From Python to PySpark and Back Again – Unifying Single-host and Distributed Deep Learning with Maggy
Shparkley: Scaling Shapley with Apache Spark
Understanding and Improving Code Generation
Machine Learning Data Lineage with MLflow and Delta Lake
Memory Optimization and Reliable Metrics in ML Pipelines at Netflix
Operationalizing Machine Learning at Scale at Starbucks
Presto on Apache Spark: A Tale of Two Computation Engines
Generalized SEIR Model on Large Networks
Deep Learning at Scale with Apache Spark and Determined
How R Developers Can Build and Share Data and AI Applications that Scale with Databricks and RStudio Connect
Rapid Response to Hospital Operations using Data and AI during COVID-19

Unless otherwise stated, all articles on this blog are original.
Copyright of original articles belongs to 过往记忆大数据 (过往记忆). Reproduction without permission is prohibited.
Link to this article: 【Spark Summit North America 202006 HD slides download】(https://www.iteblog.com/archives/9832.html)