Radical Speed on the Lakehouses: Photon under the hood
- Data Lakes, Data Warehouses and Data Lakehouses
- Moscone South | Level 3 | 306
- 35 min
Many organizations are standardizing on the lakehouse, however, this new architecture poses challenges with an underlying query execution engine for accessing structured and unstructured data. The execution engine needs to provide the performance of a data warehouse and the scalability of data lakes. To ensure optimum performance, the Databricks Lakehouse Platform offers Photon. This next-gen vectorized query execution engine outperforms existing data warehouses in SQL workloads and implements a more general execution framework for efficient processing of data with support of the Apache Spark™ API. With Photon, analytical queries are seeing a 3 to 5x speed increase, with a 40% reduction in compute hours for ETL workloads. In this session, we will dive into Photon, describe its integration with the Databricks Platform and Apache Spark™ runtimes, talk through customer use cases, and show how your SQL and DataFrame workloads can benefit from the performance of Photon.