This is a collaborative post from Databricks and Microsoft Azure. We thank Rajeev Jain, Senior Product Marketing Manager at Microsoft, for his contributions.
Data + AI Summit 2022: Register now to join this in-person and virtual event June 27-30 and learn from the global data community.
Microsoft is a Platinum Sponsor of Data + AI Summit 2022, the world’s largest gathering of the data and analytics community. Join us for breakout sessions, customer keynotes, in-person networking, and more!
At Data + AI Summit, Databricks and Microsoft customers will take the stage across several sessions to share how they achieved business results using the Azure Databricks Lakehouse. Attendees will have the opportunity to hear from data leaders from Akamai and engineering and sales leaders from Microsoft.
The sessions below are a guide for everyone interested in Azure Databricks. They span a range of topics, from scaling business operations for enterprise-wide analytics to building a complete analytics and AI solution on the lakehouse architecture. If you have questions about Azure Databricks or service integrations, connect with Azure Databricks Solutions Architects at the Microsoft Azure booth on the Expo floor at Data + AI Summit.
Azure Databricks Customer Breakout Sessions
Pushing the limits of scale and performance for enterprise-wide analytics: A fire-side chat with Akamai | Hagai Attias, Senior Software Architect (Akamai) and Arindam Chatterjee, Principal General Manager (Microsoft) | 6/28 @ 4:00 PM PST
With the world’s most distributed compute platform — from cloud to edge — Akamai makes it easy for businesses to develop and run applications, while keeping experiences closer to users and threats farther away.
So when it was time to scale its legacy Hadoop-like infrastructure, which was reaching its capacity limits, while keeping its global operations running uninterrupted, Akamai partnered with Microsoft and Databricks to migrate to Azure Databricks.
How AT&T Data Science Team Solved an Insurmountable Big Data Challenge on Databricks with Two Different Approaches using Photon and RAPIDS Accelerator for Apache Spark | Hao Zhu, Senior Manager (NVIDIA) and Chris Vo, Principal Member of Tech Staff (AT&T) | 6/28 @ 4:45 PM PST
Data-driven personalization is an insurmountable challenge for AT&T’s data science team because of the size of the datasets and the complexity of the data engineering. These data preparation tasks often take several hours or days to complete, and some fail to complete at all, which hurts productivity. In this session, the AT&T Data Science team will talk about how RAPIDS Accelerator for Apache Spark and the Photon runtime on Databricks can be leveraged to process these extremely large datasets, resulting in improved content recommendation, classification, etc., while reducing infrastructure costs. The team will discuss the design of experiments on different Azure Databricks runtimes, first with NVIDIA T4 GPU instances and then with Databricks’ Photon runtime. The team will compare speedups and costs against the regular Databricks runtime Apache Spark environment.
Improving Apache Spark Structured Streaming Application Processing Time by Configurations, Code Optimizations, and Custom Data Source | Nir Dror, Principal Performance Engineer (Akamai) and Kineret Raviv, Principal Software Developer (Akamai) | 6/28 @ 5:30 PM PST
In this session, we’ll go over several use cases and describe how we improved our Spark Structured Streaming application’s micro-batch time from ~55 to ~30 seconds in several steps.
Our app processes ~700 MB/s of compressed data, has very strict KPIs, and uses several technologies and frameworks, including Spark 3.1, Kafka, Azure Blob Storage, AKS and Java 11.
We’ll share our work and experience in those fields, and go over a few tips for building better Spark Structured Streaming applications.
Azure Databricks Breakout Sessions
Your fastest path to Lakehouse and beyond | Nate Shea-han, Director Specialist, Global Black Belt Team (Microsoft) | 6/29 @ 11:30 AM PST
Azure Databricks is an easy, open, and collaborative service for data, analytics & AI use cases, enabled by Lakehouse architecture. Join this session to discover how you can get the most out of your Azure investments by combining the best of Azure Synapse Analytics, Azure Databricks and Power BI for building a complete analytics & AI solution based on lakehouse architecture.
Your AI strategy is only as robust as your data estate | A fire-side chat with Accenture, Avanade, and Microsoft | 6/28 @ 11:00 AM PST
Participants: Paul Barrett, CTO – MD (Accenture) | Tripti Sethi, North America Data and AI Lead (Avanade) | Lindsey Allen, General Manager – Azure Databricks and Applied AI (Microsoft)
We also invite you to visit the Microsoft booth on the Expo floor, where you’ll get to talk 1:1 with the Azure data engineering team about how to address your toughest analytics challenges with Azure.
Register now for this free virtual event and join the data and AI community. Learn how companies are successfully building their Lakehouse architecture with Azure Databricks to create a simple, open and collaborative data platform. Get started using Databricks with $200 in Azure credits and a free trial.