Automating Unity Catalog Migration with UCX: Building Robust Python Applications on Databricks
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Breakout |
TRACK | Data Governance |
TECHNOLOGIES | Databricks Experience (DBX), Governance |
SKILL LEVEL | Advanced |
DURATION | 40 min |
We'll dive into the intricacies of crafting resilient Python applications on Databricks, drawing lessons from building a Terraform provider, UCX, and the SDK for Python. We'll show you how the Databricks Labs UCX application can help you find code that is not compatible with Unity Catalog, attempt to fix it automatically, and migrate hundreds of thousands of Hive Metastore tables to Unity Catalog, along with their permissions and updated cluster settings. This session will give developers an in-depth look into failure recovery, rate limits, logging, multithreading, and unified client authentication. Real-world examples from projects like Databricks Labs UCX provide tangible insights, so attendees gain a practical understanding of implementation and benefits. We'll explore strategies and best practices for handling failures gracefully, including error handling at the level of an API call or a background thread, reflecting failures in logs, and handling the failure of an entire Databricks Workflow. We'll explore how PyTest simplifies writing reproducible sandbox scenarios on a shared Workspace, where multiple developers can run their verification suites simultaneously. Learn how other toolkits in Databricks Labs help deal with these complexities.
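As a rough illustration of two of the themes above (recovering from rate limits on a single API call, and collecting failures from background threads instead of stopping at the first one), here is a minimal sketch using only the Python standard library. It is not the UCX or Databricks SDK implementation: `RateLimited`, `retried`, and `gather` are hypothetical names invented for this example, and `RateLimited` merely stands in for an HTTP 429 response.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def retried(fn, *, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Delay doubles on each attempt; jitter spreads out concurrent retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))


def gather(tasks, *, workers=4):
    """Run tasks on background threads and return (results, errors)."""
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(retried, task) for task in tasks]
        for future in futures:
            try:
                results.append(future.result())
            except Exception as err:
                # Collected so that one failing task does not hide the others.
                errors.append(err)
    return results, errors
```

Returning the errors alongside the results lets the caller log every failure once at the end and decide whether the overall run should be marked as failed.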
SESSION SPEAKERS
Serge Smertin
All things @ Databricks Labs
Databricks