Data + AI Summit 2022

Migrating Complex SAS Processes to Databricks - Case Study

On Demand

Type

  • Session

Format

  • Hybrid

Track

  • Industry and Business Use Cases

Industry

  • Public Sector

Difficulty

  • Intermediate

Room

  • Moscone South | Upper Mezzanine | 159

Duration

  • 35 min

Overview

Many federal agencies use SAS software for critical operational data processes (ETL). While SAS has historically been a leader in analytics, data analysts have often used it for ETL as well. However, the demands of modern data science on ever-increasing volumes and types of data require a shift to modern cloud architectures and to current data management tools and paradigms for ETL/ELT. In this presentation, we provide a case study from the Centers for Medicare & Medicaid Services (CMS) detailing the approach and results of migrating a large, complex legacy SAS process to modern, open-source/open-standard technology (Spark SQL and Databricks). The migrated process produces results ~75% faster, does not rely on proprietary constructs of the SAS language, scales better, and can more easily ingest old rules and better govern the inclusion of new rules and data definitions. Migrating to Databricks produced numerous benefits. These include:
- The application scales to accommodate its workloads, with enhanced query plans and proper exception/error handling
- Workflows are automated, with data pipelines that are native to Databricks and able to run in parallel
- Users can develop collaboratively, using an integrated workspace
- Integration of new data sources and/or the inclusion of new data definitions is easier and better governed
- The code is optimized to run in Databricks rather than simply migrated over to run sub-optimally
- Runtime performance gains of ~75% enable CMS to meet the needs of its state customers in a more timely manner
- Legacy developers are provided with a primer to follow when migrating current work processes to Databricks
- The process no longer relies on SAS programmers, specifically on individuals proficient in SAS macro programming

Session Speakers

Jesse Beaumont

Chief Technology Officer

Tensile AI LLC

Uday Kumar

Chief Digital Officer

Akira Technologies
