Session

Using Databricks Geospatial Processing at Scale

Overview

Experience: In Person
Track: Data Engineering & Streaming
Industry: Transportation
Technologies: Databricks SQL
Skill Level: Intermediate

Our team began its geospatial journey on Databricks with DBR 16.X, starting from a legacy project migration and expanding to ingest and process three geospatial datasets: HD USHR maps, public OSM, and curated OSM (Overture). We applied various optimization techniques to make spatial joins more performant and worked closely with the Databricks Geospatial team to test incremental improvements in DBR. This culminated in a large-scale performance increase on DBR 17.X, which adds support for geospatial data (GEOMETRY, GEOGRAPHY) as dedicated data types that can be stored in Delta tables. Spatial joins improved significantly, to the point that the optimization techniques we previously relied on could be eliminated, making data pipelines easier to maintain. These improvements enabled us to enrich internal datasets with road attributes from these sources and use them in analytics, training, and potential reinforcement learning applications where needed.

Session Speakers

Chinmay Gupte

Lead Software Engineer, Data
Rivian

Filip Ilic

Senior Data Engineer, Autonomy
Rivian