Session

From Days to Seconds — Reducing Query Times on Large Geospatial Datasets by 99%

Overview

Experience: In Person
Type: Breakout
Track: Data Engineering and Streaming
Industry: Energy and Utilities, Public Sector, Financial Services
Technologies: Apache Spark, Delta Lake, Databricks Workflows
Skill Level: Intermediate
Duration: 40 min

The Global Water Security Center translates environmental science into actionable insights for the U.S. Department of Defense. Before we adopted Databricks, responding to requests for these insights required querying approximately five hundred thousand raster files representing over five hundred billion points. By combining a lakehouse architecture, Databricks Auto Loader, Spark Streaming, Databricks Spatial SQL, H3 geospatial indexing and Databricks Liquid Clustering, we reduced our “time to analysis” from multiple business days to a matter of seconds. Our data scientists now run queries against pre-computed tables in Databricks, which makes “time to analysis” 99% faster and leaves our teams more time for deeper analysis of the data. We have also incorporated Databricks Workflows, Databricks Asset Bundles, Git and GitHub Actions to support CI/CD across workspaces. We completed this work in close partnership with Databricks.

Session Speakers

Chris Crawford

Sr. Solutions Architect
Databricks

Hobson Bryan

Associate Director of Technology
Global Water Security Center