Hands-on Workshop: Delta in Production

On Demand

How to spend less time fixing errors and get a 50% faster time-to-insight with Delta Lake

At Databricks, we are a company strongly rooted in bringing R&D to production. The reliability and performance enhancements behind our Lakehouse vision are delivered to production workloads through three pillars: Delta Engine, MLflow, and Databricks Jobs.

Over 3,000 customers have deployed Delta-based workflows to production since the first use case, which involved processing petabytes of data in seconds, back in 2017. To this day, customers of all sizes continue to leverage Delta-based pipelines as the backbone of their data platforms.

In this workshop, we will cover the best practices companies follow when deploying Delta to production, including checkpoints, error alerting, and job retries. We will then discuss Delta performance enhancements such as data skipping, caching, and Z-ordering. Finally, we will touch on optimization and vacuum operations.
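Of the best practices above, job retries are easy to illustrate on their own. Databricks Jobs provides retries natively as a job setting; the sketch below is only a minimal, hypothetical illustration of the idea in plain Python, with `flaky_job` standing in for any transiently failing task:

```python
import time

def with_retries(job, max_attempts=3, base_delay=1.0):
    """Run `job`, retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the error for alerting
            time.sleep(base_delay * 2 ** (attempt - 1))

# Hypothetical flaky task: fails twice, then succeeds.
calls = {"n": 0}
def flaky_job():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = with_retries(flaky_job, max_attempts=5, base_delay=0.01)
```

Pairing retries like these with error alerting means transient failures heal themselves while persistent ones still get surfaced.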

In the hands-on portion of these sessions, you’ll ingest real-time streaming data, refine it, and serve it for downstream ML & BI use cases. You will also incorporate the best practices described above to ensure production-grade performance, visibility, and fault tolerance. This pipeline will serve as a reusable template that you can tailor to meet your specific use cases in the future!
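The fault tolerance mentioned above rests on checkpointing: persisting how far a stream has been processed, so a restarted job resumes where it left off instead of reprocessing or dropping data. Structured Streaming handles this for you via its checkpoint location; the toy sketch below, using a hypothetical in-memory list as the stream and a JSON file as the checkpoint, shows only the core idea:

```python
import json
import os
import tempfile

def process_stream(records, checkpoint_path):
    """Process records past the last checkpointed offset, persisting progress after each one."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]
    processed = []
    for i in range(offset, len(records)):
        processed.append(records[i].upper())  # stand-in for real refinement logic
        with open(checkpoint_path, "w") as f:
            json.dump({"offset": i + 1}, f)  # durable progress marker
    return processed

ckpt = os.path.join(tempfile.mkdtemp(), "offsets.json")
first = process_stream(["a", "b"], ckpt)        # processes both records
second = process_stream(["a", "b", "c"], ckpt)  # resumes: only "c" is new
```

A crash between records loses at most the current record's progress marker, so the restarted run picks up from the last committed offset — the same guarantee the checkpointed pipeline in the workshop relies on.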