Company Blog

Azure Databricks Highlights Adoption of Delta Lake, MLflow, and Integration with Azure Machine Learning at Microsoft Ignite 2019

Share this post

At Microsoft Ignite 2019, thousands of attendees participated in hands-on workshops, breakout sessions, and theater presentations to learn how customers are achieving phenomenal results with Azure Databricks! It was an action-packed week of making new connections and learning about new innovation across data science, data engineering, and business analytics.

Azure Databricks -- Amazing growth in less than 2 years, including thousands of global Azure Databricks customers, millions of server-hours spinning up every day, 2 exabytes of data processed per month, and global coverage with 29 Regions available worldwide.

We shared the news that over 75% of data processed on Azure Databricks is in Delta Lake -- the new open source standard for data lakes. Hands-on labs and breakout sessions gave attendees an opportunity to see and experience Delta Lake on Azure Databricks first hand.

Delta Lake -- Open-source storage layer that brings ACID transactions to Apache SparkTM and big data workloads, including 75% of data processed in Azure Databricks.
Delta Lake’s openness and extensibility enable faster innovation and more effective use of data. Azure Databricks has seen amazing growth over the past two years. This rapid growth has created the need for new tools and capabilities such as the MLflow Model Registry and ML lifecycle management which help customers track, deploy, and update their ML models. Attendees learned three ways Azure Databricks works with Azure Synapse Analytics to bring analytics, business intelligence (BI), and data science together in one solution architecture.

Azure Databricks momentum and acceleration were highlighted in many sessions during the week of Ignite 2019. Below are just a few sessions to give you an idea of the breadth of customer use cases driving such phenomenal growth.

Azure Databricks Sessions at Ignite 2019

  • Delta Lake on Azure Databricks: Implementing a new open source standard for Data Lakes (THR2339) - Ajay Singh from Databricks shared how you can enhance your data lakes for greater value by adding new capabilities for transactions, version control and indexing using the latest open source innovations. Founded by the original creators of Apache Spark, Databricks has continued to innovate by launching a new open source project called Delta Lake designed to make existing data lakes more scalable and reliable. Built on top of infinitely scalable Azure Data Lake Storage and deeply integrated with Azure SQL Data Warehouse, Delta Lake on Azure Databricks makes data better prepared for analytic and machine learning workloads. Available today as part of Azure Databricks, see why large organizations are upgrading their data lake in-place with zero migration to Delta Lake!
  • Managing your ML lifecycle with Azure Databricks and Azure Machine Learning (BRK3245) - Premal Shah from Microsoft and Mike Cornell from Databricks shared how machine learning development has new complexities beyond software development. There are a myriad of tools and frameworks which make it hard to track experiments, reproduce results, and deploy machine learning models. Learn how you can accelerate and manage your end-to-end machine learning lifecycle on Azure Databricks using MLflow and Azure ML to reliably build, share, and deploy machine learning applications using Azure Databricks.
  • Hands-on with Azure Databricks Delta Lake (WRK2013) - In this lab, Kyle Weller, Premal Shah, Shiva Nimmagadda Venkata, and Santosh Perla from Microsoft guided attendees as they learned how to increase performance and reliability of their data engineering pipelines with Azure Databricks Delta Lake. Delta Lake provides optimized layouts to enable big data use cases, from batch and streaming ingests, fast interactive queries, to machine learning.
  • Maximizing your Azure Databricks deployment (BRK3043) - Yatharth Gupta from Microsoft shared insights and key steps to get the most out of your Azure Databricks resources. Whether you’re new to Spark or an Azure Databricks veteran, attend and learn the tips, tricks, and best practices for working with Azure Databricks. Based on experiences with real industry customers, these best practices cover deployments, management, security, networking, monitoring, machine learning, and more. See how the power of Azure Databricks’ managed Spark experience can benefit your analytics pipeline!
  • Azure Databricks and Azure Machine Learning better together (THR2186) - In this theater session, Premal Shah from Microsoft demonstrated the new integrations between Azure Databricks and Azure Machine Learning to realize comprehensive E2E machine learning scenarios for our customers.
  • Data reproducibility, audits, immediate rollbacks and other applications of time travel with Delta Lake in Azure Databricks (BRK3254) - Time travel is now possible with Azure Databricks Delta Lake! Kyle Weller from Microsoft uncovered how Delta Lake makes time travel possible and why it matters to you. Through presentation, notebooks, and code, he showcased several common applications and how they can improve your modern data engineering pipelines. Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark(TM). It provides snapshot isolation for concurrent read/writes. Enables efficient upserts, deletes, and immediate rollback capabilities. It allows background file optimization through compaction and z-order partitioning achieving up to 100x performance improvements. In this presentation, learn what challenges Delta Lake solves, how Delta Lake works under the hood, and applications of the new Delta Time Travel capability.

Get Started with Azure Databricks

The fun doesn’t stop there! Keep learning more:

Follow us on Twitter, LinkedIn, and Facebook for more Azure Databricks news, customer highlights, and new feature announcements.

Try Databricks for free
See all Company Blog posts