SAIS 2019 EUROPE
Schedule
Speakers
Events
Women in Unified Analytics
Industry Networking Events
Evening Activities
Sponsors
Job Board
Keynote Videos
SCHEDULE
TUESDAY, OCTOBER 15 – TRAINING
07:00
Registration
Registration
720 MINS
09:00
Training: Apache Spark Programming
Training
180 MINS
Room: D201
09:00
Training: Apache Spark Tuning and Best Practices
Training
180 MINS
Room: Elicium 1
09:00
Training: Building Data Pipelines for Apache Spark with Databricks Delta Lake
Training
180 MINS
Room: E102
09:00
Training: Data Science With Apache Spark 2.x
Training
180 MINS
Room: Elicium 2
09:00
Training: Half-Day Prep Course + Databricks Certification Exam
Training
180 MINS
Room: E103
09:00
Training: Machine Learning in Production: MLflow and Model Deployment
Training
180 MINS
Room: D203
09:00
Training: Understand and Apply Deep Learning with Keras, TensorFlow and Apache Spark
Training
180 MINS
Room: E107
12:00
Lunch
Lunch
60 MINS
13:00
Training: Apache Spark Programming
Training
240 MINS
Room: D201
13:00
Training: Apache Spark Tuning and Best Practices
Training
240 MINS
Room: Elicium 1
13:00
Training: Building Data Pipelines for Apache Spark with Databricks Delta Lake
Training
240 MINS
Room: E102
13:00
Training: Data Science With Apache Spark 2.x
Training
240 MINS
Room: Elicium 2
13:00
Training: Machine Learning in Production: MLflow and Model Deployment
Training
240 MINS
Room: D203
13:00
Training: Understand and Apply Deep Learning with Keras, TensorFlow and Apache Spark
Training
240 MINS
Room: E107
18:00
Ask Me Anything (AMA): Apache Spark Committers and Databricks Founders
Break
20 MINS
Room: Expo Hall
18:30
Amsterdam Data Science Meetup – Drinks & Data: Tech Talks in Data Science, AI and Machine Learning
Meetup
150 MINS
Room: Forum
WEDNESDAY, OCTOBER 16 – CONFERENCE
07:00
Registration
Registration
600 MINS
09:00
AlphaStar: Mastering the Real-Time Strategy Game StarCraft II
Oriol Vinyals (Google)
Keynote
90 MINS
Room: Auditorium
09:00
New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, and Koalas
Michael Armbrust (Databricks)
Brooke Wenig (Databricks)
Burak Yavuz (Databricks)
Keynote
90 MINS
Room: Auditorium
09:00
Saving Energy in Homes with a Unified Approach to Data and AI – EU
Dr. Stephen Galsworthy (Quby)
Keynote
90 MINS
Room: Auditorium
09:00
Unified Data Analytics: Helping Data Teams Solve the World’s Toughest Problems
Ali Ghodsi (Databricks)
Keynote
90 MINS
Room: Auditorium
10:00
Expo Hall Open
Meetup
480 MINS
Room: Expo Hall
10:30
Break
Break
30 MINS
10:35
Ask Me Anything (AMA): Delta Lake
Break
20 MINS
Room: Expo Hall
11:00
Accelerating Real Time Video Analytics on a Heterogenous CPU + FPGA Platform
Bhoomika Sharma (Megh Computing, Inc.)
AI
40 MINS
Room: Emerald
11:00
Apache Spark At Scale in the Cloud
Rose Toomey (Bloomberg)
Data Engineering
40 MINS
Room: Forum
11:00
Automating Loss Prevention Using NLP with FastAI on Azure Databricks: A Gentle Walk-Through with PetSmart
Mike Vedomske (PetSmart)
Data and ML Industry Use Case
40 MINS
Room: G106
11:00
Building Reliable Data Lakes at Scale with Delta Lake
Andreas Neumann (Databricks)
Tathagata Das (Databricks)
Mukul Murthy (Databricks)
Data Engineering
40 MINS
Room: G104
11:00
Continuous Deployment for Deep Learning
Nick Pentreath (IBM)
Data Science
40 MINS
Room: G102
11:00
Data Engineers: Stop Hand Coding and Start Accelerating Your Analytics Projects! SAIS EU
Michael Destein (Talend)
Sponsored Sessions
20 MINS
Room: E107
11:00
Deep Learning Pipelines for High Energy Physics using Apache Spark with Distributed Keras on Analytics Zoo
Luca Canali (CERN)
Data & ML Research
40 MINS
Room: E102
11:00
Extending Spark Graph for the Enterprise with Morpheus and Neo4j
Martin Junghanns (Neo4j)
Sören Reichardt (Neo4j)
Developer
40 MINS
Room: Elicium 2
11:00
Koalas: Making an Easy Transition from Pandas to Apache Spark
Tim Hunter (ABN AMRO)
Takuya Ueshin (Databricks)
Data Science
40 MINS
Room: Elicium 1
11:00
Near Real-Time Data Warehousing with Apache Spark and Delta Lake
Jasper Groot (Eventbrite)
Developer
40 MINS
Room: Auditorium
11:00
Vectorized R Execution in Apache Spark
Hyukjin Kwon (Databricks)
Developer
40 MINS
Room: E103
11:20
A Recommender Story: Improving Backend Data Quality While Reducing Costs
Jacques Pierre Francois Doux (Elsevier)
Sponsored Sessions
15 MINS
Room: E107
11:50
.NET for Apache Spark
Rahul Potharaju (Microsoft)
Terry Kim (Microsoft)
Developer
40 MINS
Room: Elicium 2
11:50
Apache Spark’s Built-in File Sources in Depth
Gengliang Wang (Databricks)
Developer
40 MINS
Room: Auditorium
11:50
Building Reliable Data Lakes at Scale with Delta Lake – cont
Andreas Neumann (Databricks)
Tathagata Das (Databricks)
Mukul Murthy (Databricks)
Tutorials
40 MINS
Room: G104
11:50
Commercial Analytics at Scale in Pharma: From Hackathon to MVP with Azure Databricks
Peter Webb (GlaxoSmithKline)
Data and ML Industry Use Case
40 MINS
Room: G106
11:50
Data Reproducibility, Audits, Immediate Rollbacks, and Other Applications of Time Travel with Delta Lake
Kyle Weller (Microsoft)
Data Engineering
40 MINS
Room: Forum
11:50
Fuel Your Apache Spark Analytics Using Intel Optane DC Persistent Memory
Qi Xie (Intel)
Sponsored Sessions
15 MINS
Room: E107
11:50
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Chenzhao Guo (Intel)
Carson Wang (Intel)
Data & ML Research
40 MINS
Room: E102
11:50
Internals of Speeding up PySpark with Arrow
Ruben Berenguel (Hybrid Theory)
Developer
40 MINS
Room: E103
11:50
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Alicia Frame (Neo4j)
Data Science
40 MINS
Room: G102
11:50
Using PySpark to Scale Markov Decision Problems for Policy Exploration
Justin Brandenburg (Databricks)
AI
40 MINS
Room: Emerald
11:50
Zipline—Airbnb’s Declarative Feature Engineering Framework
Varant Zanoyan (Airbnb)
Evgeny Shapiro (Airbnb)
Data Science
40 MINS
Room: Elicium 1
12:10
Migrating Hadoop Analytics to Spark in the Cloud Without Disruption
Clayton Doige (WANdisco)
Sponsored Sessions
15 MINS
Room: E107
12:30
Lunch
Lunch
60 MINS
13:40
AI Scalability for the Next Decade
Dave McDonnell (IBM)
Sponsored Sessions
20 MINS
Room: E107
13:40
Asynchronous Hyperparameter Optimization with Apache Spark
Jim Dowling (Logical Clocks AB)
Moritz Meister (Logical Clocks AB)
Data & ML Research
40 MINS
Room: E102
13:40
Building Data Intensive Analytic Application on Top of Delta Lakes
Ganesh Chand (Databricks)
Ravi Gawai (Databricks)
Data Engineering
40 MINS
Room: Forum
13:40
Deep Learning with DL4J on Apache Spark: Yeah it’s Cool, but are You Doing it the Right Way?
Guglielmo Iozzia (MSD)
Data Science
40 MINS
Room: G102
13:40
Downscaling: The Achilles heel of Autoscaling Apache Spark Clusters
Prakhar Jain (Qubole)
Venkata Krishnan Sowrirajan (LinkedIn)
Developer
40 MINS
Room: Elicium 2
13:40
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)
Jacek Laskowski (Development and training services)
Developer
40 MINS
Room: Auditorium
13:40
How to Automate Performance Tuning for Apache Spark
Jean-Yves Stephan (Data Mechanics)
Julien Dumazert (Data Mechanics)
Tutorials
40 MINS
Room: G104
13:40
Making Homes Efficient and Comfortable Using AI and IoT Data
Ellissa Verseput (Quby)
AI
40 MINS
Room: Emerald
13:40
Migrating Apache Spark ML Jobs to Spark + TensorFlow on Kubeflow
Holden Karau (Apple)
Data Science
40 MINS
Room: Elicium 1
13:40
Physical Plans in Spark SQL
David Vrba (Socialbakers a.s.)
Developer
40 MINS
Room: E103
13:40
Retrieving Visually-Similar Products for Shopping Recommendations using Spark and TensorFlow
Zhichao Zhong (Wehkamp)
Data and ML Industry Use Case
40 MINS
Room: G106
14:00
Building Resilience in Data Science Processes
Greg Willis (Dataiku)
Sponsored Sessions
15 MINS
Room: E107
14:30
Allies and Adversaries: Explaining Model Reasoning via Contrasting Proximal Prototypes
Deepak Pai (Adobe, Inc.)
Vijay Srivastava (Adobe, Inc.)
Data Science
40 MINS
Room: Elicium 1
14:30
Apache Spark AI Use Case in Telco: Network Quality Analysis and Prediction with Geospatial Visualization
Hongchan Roh (SK Telecom)
Dooyoung Hwang (SK Telecom)
AI
40 MINS
Room: Emerald
14:30
Extending Spark SQL 2.4 with New Data Sources (Live Coding Session)—continues
Jacek Laskowski (Development and training services)
Developer
40 MINS
Room: Auditorium
14:30
How to Tune and Optimize the Performance of Apache Spark Data Pipelines
Dave Goodhand (Unravel Data)
Sponsored Sessions
15 MINS
Room: E107
14:30
Koalas: Pandas on Apache Spark EU
Tim Hunter (ABN AMRO)
Brooke Wenig (Databricks)
Niall Turbitt (Databricks)
Tutorials
40 MINS
Room: G104
14:30
Lessons Learned Replatforming A Large Machine Learning Application To Apache Spark
Taylor Hess (Morningstar Inc)
Patrick Caldon (Morningstar Inc)
Data Science
40 MINS
Room: G102
14:30
Physical Plans in Spark SQL—continues
David Vrba (Socialbakers a.s.)
Developer
40 MINS
Room: E103
14:30
Predicting Banking Customer Needs with an Agile Approach to Analytics in the Cloud
Jakub Masek (MONETA Money Bank)
Milan Berka (DataSentics a.s.)
Data and ML Industry Use Case
40 MINS
Room: G106
14:30
Stream Processing: Choosing the Right Tool for the Job
Giselle van Dongen (Klarrio)
Data & ML Research
40 MINS
Room: E102
14:30
Streaming Analytics for Financial Enterprises
Bas Geerdink (Rabobank)
Developer
40 MINS
Room: Elicium 2
14:30
Using Apache Spark to Solve Sessionization Problem in Batch and Streaming
Bartosz Konieczny (Octo Technology)
Data Engineering
40 MINS
Room: Forum
14:45
Making Zeppelin and Apache Spark Enjoyable
Vitaly Khudobakhshov (JetBrains)
Sponsored Sessions
15 MINS
Room: E107
15:20
A Spark-Based Intelligent Assistant: Making Data Exploration in Natural Language Real
Georgia Koutrika (ATHENA RESEARCH CENTER)
AI
40 MINS
Room: Emerald
15:20
Assessing Graph Solutions for Apache Spark
Songting Chen (TigerGraph)
Developer
40 MINS
Room: Elicium 2
15:20
Briefing on the Modern ML Stack with R
Javier Luraschi (RStudio)
Data Science
40 MINS
Room: G102
15:20
Building a Knowledge Graph with Spark and NLP: How We Recommend Novel Drugs to our Scientists
Eliseo Papa (AstraZeneca)
Data and ML Industry Use Case
40 MINS
Room: G106
15:20
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Architect Things Right
Tathagata Das (Databricks)
Developer
40 MINS
Room: E103
15:20
Dynamic Partition Pruning in Apache Spark
Bogdan Ghit (Databricks)
Juliusz Sompolski (Databricks)
Data Engineering
40 MINS
Room: Forum
15:20
Encrypted Computation in Apache Spark
Kim Laine (Microsoft)
Data & ML Research
40 MINS
Room: E102
15:20
Getting Started Contributing to Apache Spark – From PR, CR, JIRA, and Beyond SAIS EU
Holden Karau (Apple)
Developer
40 MINS
Room: Auditorium
15:20
Koalas: Pandas on Apache Spark (continued)
Tim Hunter (ABN AMRO)
Brooke Wenig (Databricks)
Niall Turbitt (Databricks)
Tutorials
40 MINS
Room: G104
15:20
No REST till Production – Building and Deploying 9 Models to Production in 3 weeks
Charmee Patel (Syntasa)
Data Science
40 MINS
Room: Elicium 1
15:20
Spark Plus AI/ML on AWS
Danilo Nogueira Machado (Amazon Web Services)
Sponsored Sessions
15 MINS
Room: E107
15:35
Building a Modern FinTech Big Data Infrastructure
Omar Hommos (Adyen)
Rodel van Rooijen (Adyen)
Sponsored Sessions
15 MINS
Room: E107
16:00
Ask Me Anything (AMA): Koalas
Break
20 MINS
Room: Expo Hall
16:00
Break #2
Break
30 MINS
16:00
Financial Services Networking Event
Matthijs van Dorth (ABN AMRO)
Thimo ten Veen (ABN AMRO)
Jonathan Chin (ARM Insight)
Meetup
180 MINS
Room: D202
16:30
Accelerating Astronomical Discoveries with Apache Spark
Julien Peloton (CNRS)
Data & ML Research
40 MINS
Room: E102
16:30
Apache Spark Side of Funnels
Zoran Stipanicev (GetYourGuide)
Developer
40 MINS
Room: Auditorium
16:30
Building A Feature Factory
Daniel Tomes (Databricks)
Data Science
40 MINS
Room: G102
16:30
Cosmos DB Real-time Advanced Analytics Workshop
Srilakshmi Chintala (Microsoft)
Tutorials
40 MINS
Room: G104
16:30
Databricks Delta Lake and Its Benefits
Nitin Raj Soundararajan (Cognizant Worldwide Limited)
Nagaraj Sengodan (HCL Technologies)
Developer
40 MINS
Room: Elicium 2
16:30
Deep Anomaly Detection from Research to Production Leveraging Spark and TensorFlow
Davit Bzhalava (Swedbank)
Shaheer Mansoor (Swedbank)
AI
40 MINS
Room: Emerald
16:30
Driver Location Intelligence at Scale using Apache Spark, Delta Lake, and MLflow on Databricks
Sergio Ballesteros Solanas (TomTom)
Kia Eisinga (TomTom)
Data and ML Industry Use Case
40 MINS
Room: G106
16:30
Introduction to TensorFlow 2.0
Brad Miro (Google)
Data Science
40 MINS
Room: Elicium 1
16:30
Modern ETL Pipelines with Change Data Capture
Thiago Rigo (GetYourGuide)
David Mariassy (GetYourGuide)
Data Engineering
40 MINS
Room: Forum
16:30
Spark Operator—Deploy, Manage and Monitor Spark clusters on Kubernetes
Jiri Kremser (Red Hat, Inc.)
Developer
40 MINS
Room: E103
16:30
Unlock Value of Disparate and Complex Data Powered by Azure Databricks
Luke Pritchard (Avanade, Inc.)
Felix Moeller (Avanade, Inc.)
Sponsored Sessions
15 MINS
Room: E107
16:45
Power Your Delta Lake with Streaming Transactional Changes
Rupal Shah (StreamSets)
Sponsored Sessions
15 MINS
Room: E107
17:00
Expo Hall Open
Meetup
120 MINS
Room: Expo Hall
17:00
Healthcare and Life Sciences Networking Event
Meetup
120 MINS
Room: D203
17:00
Retail Networking Event
Meetup
120 MINS
Room: D201
17:20
Accelerating Apache Spark with Intel QuickAssist Technology
Qi Xie (Intel)
Data & ML Research
40 MINS
Room: E102
17:20
Application and Challenges of Streaming Analytics and Machine Learning on Multi-Variate Time Series Data for Smart Manufacturing
Pranav Prakash (Quartic.ai)
AI
40 MINS
Room: Emerald
17:20
Astronomical Data Processing on the LSST Scale with Apache Spark
Petar Zecevic (SV Group d.o.o.)
Mario Juric (University of Washington)
Developer
40 MINS
Room: E103
17:20
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Vicky Avison (Cox Automotive UK)
Alex Bush (KPMG Lighthouse)
Data Engineering
40 MINS
Room: Forum
17:20
Build and Deploy a Managed Machine Learning Project in 10 minutes
Scott Lutz (DataRobot)
Sponsored Sessions
15 MINS
Room: E107
17:20
Cosmos DB Real-time Advanced Analytics Workshop – continues
Srilakshmi Chintala (Microsoft)
Tutorials
40 MINS
Room: G104
17:20
Lessons Learned from Using Spark for Evaluating Road Detection at BMW Autonomous Driving
Gheorghe Pucea (BMW Group)
Jennifer Reinelt (BMW Group)
Data and ML Industry Use Case
40 MINS
Room: G106
17:20
Scalable Time Series Forecasting and Monitoring using Apache Spark and ElasticSearch at Adyen
Andreu Mora (Adyen)
Data Science
40 MINS
Room: G102
17:20
Seamless End-to-End Production Machine Learning with Seldon and MLflow
Adrián González Martín (Seldon Technologies)
Data Science
40 MINS
Room: Elicium 1
17:20
Spark SQL Bucketing at Facebook
Cheng Su (Facebook)
Developer
40 MINS
Room: Elicium 2
17:20
Updates from Project Hydrogen: Unifying State-of-the-Art AI and Big Data in Apache Spark EU
Xingbo Jiang (Databricks)
Developer
40 MINS
Room: Auditorium
17:35
Successful AI/ML Projects with End-to-End Cloud Data Engineering
Louis Polycarpou (Informatica)
Sponsored Sessions
15 MINS
Room: E107
18:00
Women in Unified Analytics Happy Hour
Meetup
90 MINS
Room: Topaz Lounge
19:30
Build. Unify. Celebrate. Attendee Party
Meetup
180 MINS
Room: Amstel Boat House
THURSDAY, OCTOBER 17 – CONFERENCE
08:00
Registration
Registration
420 MINS
09:00
Forecasting ‘What-if’ Scenarios in Retail Using ML-Powered Interactive Tools
Johan Vallin (Electrolux)
Keynote
90 MINS
Room: Auditorium
09:00
Reinventing Payments at HSBC with a Unified Platform for Data and AI in the Cloud
Alessio Basso (HSBC)
Keynote
90 MINS
Room: Auditorium
09:00
Scalable AI for Good
Mark Hamilton (Microsoft)
Christina Lee (Microsoft)
Keynote
90 MINS
Room: Auditorium
09:00
Simplifying Model Management with MLflow
Matei Zaharia (Databricks)
Corey Zumar (Databricks)
Keynote
90 MINS
Room: Auditorium
10:00
Expo Hall Open – Day 3
Meetup
480 MINS
Room: Expo Hall
10:30
Break #3
Break
30 MINS
10:35
Ask Me Anything (AMA): Delta Lake
Break
20 MINS
Room: Expo Hall
10:35
Ask Me Anything (AMA): Delta Lake cont
Break
20 MINS
Room: Expo Hall
11:00
AI-Powered Streaming Analytics for Real-Time Customer Experience
John Haddad (Informatica)
Vishwa Belur (Informatica)
Data Engineering
40 MINS
Room: E102
11:00
Building Reliable Data Lakes at Scale with Delta Lake
Andreas Neumann (Databricks)
Tathagata Das (Databricks)
Mukul Murthy (Databricks)
Data Engineering
40 MINS
Room: G104
11:00
Continuous Evaluation of Deployed Models in Production
Deepak Pai (Adobe, Inc.)
Vijay Srivastava (Adobe, Inc.)
Data Science
40 MINS
Room: Elicium 1
11:00
Creating an Omnichannel Banking Experience with Machine Learning on Azure Databricks
Petr Pluhacek (Ceska sporitelna)
Jakub Stech (DataSentics a.s.)
Data and ML Industry Use Case
40 MINS
Room: G106
11:00
Detecting Financial Fraud at Scale with Machine Learning
Elena Boiarskaia (H2O.ai)
Data Science
40 MINS
Room: G102
11:00
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spark Graph
Mats Rydberg (Neo4j)
Max Kießling (Neo4j)
Tutorials
40 MINS
Room: G104
11:00
How Data is Transforming the Dutch Media Industry
Maurits van der Goes (RTL Netherlands)
AI
40 MINS
Room: Emerald
11:00
Performance Troubleshooting Using Apache Spark Metrics
Luca Canali (CERN)
Developer
40 MINS
Room: Elicium 2
11:00
Petabytes, Exabytes, and Beyond: Managing Delta Lakes for Interactive Queries at Scale
Christopher Hoshino-Fish (Databricks)
Data Engineering
40 MINS
Room: Forum
11:00
The Internals of Stateful Stream Processing in Spark Structured Streaming
Jacek Laskowski (Development and training services)
Developer
40 MINS
Room: E103
11:00
The Parquet Format and Performance Optimization Opportunities
Boudewijn Braams (Databricks)
Developer
40 MINS
Room: Auditorium
11:00
Unified Approach to Interpret Machine Learning Model: SHAP + LIME
Layla Yang (Databricks)
Data Science
40 MINS
Room: E107
11:50
ACID ORC, Iceberg, and Delta Lake—An Overview of Table Formats for Large Scale Storage and Analytics
Michal Gancarski (Zalando SE)
Developer
40 MINS
Room: Auditorium
11:50
Applied Machine Learning for Ranking Products in an Ecommerce Setting
Arnoud de Munnik (Wehkamp)
Jerry Vos (Wehkamp)
Data Science
40 MINS
Room: G102
11:50
Augmenting Machine Learning with Databricks Labs AutoML Toolkit
Denny Lee (Databricks)
Mary Grace Moesta (Databricks)
Data Science
40 MINS
Room: E107
11:50
Building a Scalable Data Science Solution to Outperform Sales Execution in Traditional Trade Markets
Harish Kumar (RB)
Data and ML Industry Use Case
40 MINS
Room: G106
11:50
Data Warehousing with Spark Streaming at Zalando
Sebastian Herold (Zalando SE)
Data Engineering
40 MINS
Room: E103
11:50
Drug Discovery and Development Using AI
Vishnu Vettrivel (Wisecube AI)
AI
40 MINS
Room: Emerald
11:50
End-to-End Spark/TensorFlow/PyTorch Pipelines with Databricks Delta
Jim Dowling (Logical Clocks AB)
Kim Hammar (Logical Clocks AB)
Data Science
40 MINS
Room: Elicium 1
11:50
Graph Features in Spark 3.0: Integrating Graph Querying and Algorithms in Spark Graph—continues
Mats Rydberg (Neo4j)
Max Kießling (Neo4j)
Tutorials
40 MINS
Room: G104
11:50
Industrializing Machine Learning on an Enterprise Azure Platform with Databricks: Experiences and Feedbacks
Yannick Radji (Sodexo)
Developer
40 MINS
Room: Elicium 2
11:50
Powering Custom Apps at Facebook using Spark Script Transformation
Abdulrahman Alfozan (Facebook)
Data Engineering
40 MINS
Room: E102
11:50
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Itai Yaffe (Imply)
Data Engineering
40 MINS
Room: Forum
11:50
Women in Unified Data Analytics Lunch
Lunch
100 MINS
Room: D203
12:30
Lunch – Day 3
Lunch
60 MINS
13:30
Democratizing Machine Learning: Perspective from a scikit-learn Creator
Gaël Varoquaux (Inria)
Keynote
60 MINS
Room: Auditorium
13:30
Imaging the Unseen: Taking the First Picture of a Black Hole
Katie Bouman (California Institute of Technology)
Keynote
60 MINS
Room: Auditorium
14:40
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library
Miguel Martinez (NVIDIA)
Thomas Graves (NVIDIA)
Data Science
40 MINS
Room: G102
14:40
Apache Spark Core – Practical Optimization
Daniel Tomes (Databricks)
Developer
40 MINS
Room: Auditorium
14:40
CyberMLToolkit: Anomaly Detection as a Scalable Generic Service Over Apache Spark
Roy Levin (Microsoft)
AI
40 MINS
Room: Emerald
14:40
Data Democratization at Nubank
Andre Jasiskis (Nubank)
Rodrigo Ney (Nubank)
Data Engineering
40 MINS
Room: E103
14:40
From HelloWorld to Configurable and Reusable Apache Spark Applications in Scala – A Developer’s Journey
Oliver Tupran (Devoteam)
Developer
40 MINS
Room: Elicium 2
14:40
Managing the Complete Machine Learning Lifecycle with MLflow EU
Thunder Shiviah (Databricks)
Michael Shtelma (Databricks)
Tutorials
40 MINS
Room: G104
14:40
On-Prem Solution for the Selection of Wind Energy Models
Ana Maria Martinez Fernandez (Vestas)
Data Science
40 MINS
Room: Elicium 1
14:40
Performance Analysis of Apache Spark and Presto in Cloud Environments
Victor Cuevas-Vicenttin (Barcelona Super Computing)
Data Engineering
40 MINS
Room: E107
14:40
Reliable Performance at Scale with Apache Spark on Kubernetes
Will Manning (Palantir)
Matthew Cheah (Palantir)
Data Engineering
40 MINS
Room: Forum
14:40
Revolutionizing the Legal Industry with Spark, NLP and Azure Databricks at Clifford Chance
Mirko Bernardoni (Clifford Chance)
Michael Seddon (Clifford Chance)
Data and ML Industry Use Case
40 MINS
Room: G106
14:40
Tactical Data Science Tips: Python and Spark Together
Bill Chambers (Databricks)
Data Engineering
40 MINS
Room: E102
15:30
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS Library – continues
Miguel Martinez (NVIDIA)
Thomas Graves (NVIDIA)
Data Science
40 MINS
Room: G102
15:30
Apache Spark Core – Practical Optimization – continues
Daniel Tomes (Databricks)
Developer
40 MINS
Room: Auditorium
15:30
Building Reliable Data Lakes at Scale with Delta Lake – continued
Andreas Neumann (Databricks)
Tathagata Das (Databricks)
Mukul Murthy (Databricks)
Data Engineering
40 MINS
Room: Forum
15:30
High-Performance Advanced Analytics with Spark-Alchemy
Simeon Simeonov (Swoop)
Developer
40 MINS
Room: E107
15:30
Improving the Life of Data Scientists: Automating ML Lifecycle through MLflow
Subru Krishnan (Microsoft)
Avrilia Floratou (Microsoft)
AI
40 MINS
Room: Emerald
15:30
Managing the Complete Machine Learning Lifecycle with MLflow—continues
Thunder Shiviah (Databricks)
Michael Shtelma (Databricks)
Tutorials
40 MINS
Room: G104
15:30
Maps and Meaning: Graph-based Entity Resolution in Apache Spark & GraphX
Hendrik Frentrup (Systemati.co)
Data Engineering
40 MINS
Room: E102
15:30
Powering Asurion’s Connected Home Platform with Spark Structured Streaming, Delta Lake, and MLflow
Tomasz Magdanski (Asurion)
Shobhit Gupta (Asurion)
Data and ML Industry Use Case
40 MINS
Room: G106
15:30
Real-Time Fraud Detection at Scale—Integrating Real-Time Deep-Link Graph Analytics with Spark AI
Benyue (Emma) Liu (TigerGraph)
Data Science
40 MINS
Room: Elicium 1
15:30
Simplify and Scale Data Engineering Pipelines with Delta Lake
Amanda Moran (Databricks)
Data Engineering
40 MINS
Room: E103
15:30
Working with Complex Types in DataFrames: Optics to the Rescue
Alfonso Roa Redondo (Habla computing)
Developer
40 MINS
Room: Elicium 2
16:10
Break #4
Break
30 MINS
16:15
Ask Me Anything (AMA): MLflow
Break
20 MINS
Room: Expo Hall
16:40
Auto-Pilot for Apache Spark Using Machine Learning
Amogh Margoor (Qubole)
Urvashi Kohli (Qubole)
Mayur Bhosale (Qubole)
AI
40 MINS
Room: Emerald
16:40
Blue Pill/Red Pill: The Matrix of Thousands of Data Streams
Himanshu Gupta (Knoldus Inc.)
Developer
40 MINS
Room: Elicium 2
16:40
Building an AI-Powered Retail Experience with Delta Lake, Spark, and Databricks
Akhil Dhingra (Zalando)
Saurav Verma (Zalando)
Data and ML Industry Use Case
40 MINS
Room: G106
16:40
Building Reliable Data Lakes at Scale with Delta Lake cont
Andreas Neumann (Databricks)
Tathagata Das (Databricks)
Mukul Murthy (Databricks)
Tutorials
40 MINS
Room: G104
16:40
Deploying End-to-End Deep Learning Pipelines with ONNX
Nick Pentreath (IBM)
Data Science
40 MINS
Room: Elicium 1
16:40
Enabling Biobank-Scale Genomic Processing with Spark SQL
Karen Feng (Databricks)
Data Engineering
40 MINS
Room: E107
16:40
Implementing a Reliable Data Lake with Databricks Delta and the AWS Ecosystem
Denis Dubeau (Databricks)
Jordan Martz (Qlik)
Developer
40 MINS
Room: Auditorium
16:40
Improving Apache Spark Downscaling
Christopher Crosbie (Google)
Ben Sidhom (Google)
Data Engineering
40 MINS
Room: E102
16:40
Machine Learning at Scale with MLflow and Apache Spark
Chongguang Liu (Société Générale)
Data Science
40 MINS
Room: G102
16:40
MLflow and Azure Machine Learning—The Power Couple for ML Lifecycle Management
Nishant Thacker (Microsoft)
Data Engineering
40 MINS
Room: E103
16:40
Scaling Data Analytics Workloads on Databricks
Bogdan Ghit (Databricks)
Chris Stevens (Databricks)
Data Engineering
40 MINS
Room: Forum
17:00
Media and Entertainment Networking Event
Meetup
120 MINS
Room: D203
17:30
AI on Spark for Malware Analysis and Anomalous Threat Detection
Jakub Sanojca (Avast)
Joao Da Silva (Avast)
AI
40 MINS
Room: Emerald
17:30
Apache Spark for Cyber Security in an Enterprise Company
Josef Niedermeier (Hewlett Packard Enterprise)
Data and ML Industry Use Case
40 MINS
Room: G106
17:30
Automated Production Ready ML at Scale
Errol Koolmeister (H&M)
Keven Wang (H&M)
Data Science
40 MINS
Room: G102
17:30
Bridging the Gap Between Data Scientists and Software Engineers – Deploying Legacy Python Algorithms to Apache Spark with Minimum Pain
Lucas Partridge (GE Aviation Digital)
Peter Knight (GE Aviation)
Data Engineering
40 MINS
Room: Forum
17:30
Distributed Models Over Distributed Data with MLflow, PySpark, and Pandas
Thunder Shiviah (Databricks)
Data Science
40 MINS
Room: E107
17:30
Improving Apache Spark Downscaling—continues
Christopher Crosbie (Google)
Ben Sidhom (Google)
Data Engineering
40 MINS
Room: E102
17:30
Listening at the Cocktail Party with Deep Neural Networks and TensorFlow
Christian Grant (Demant)
Data Science
40 MINS
Room: Elicium 1
17:30
Optimizing Delta/Parquet Data Lakes for Apache Spark EU
Matthew Powers (Prognos)
Data Engineering
40 MINS
Room: E103
17:30
Refactoring Apache Spark to Allow Additional Cluster Managers
Kevin Doyle (IBM)
Developer
40 MINS
Room: Auditorium
17:30
Using Production Profiles to Guide Optimizations
Adam Barth (Facebook)
Developer
40 MINS
Room: Elicium 2