December 13, 2022
New York City | 09:00 AM EST
New York Marriott Downtown
85 West Street At Albany Street,
New York, NY 10006
Data + AI World Tour New York
Lakehouse is fast emerging as the new standard for data architecture, but every region has its own unique stories and challenges.
Please join us for this event with speakers, customers and content designed with you in mind.
Our lineup of data and AI experts, leaders and visionaries includes Databricks executives, as well as Databricks customers from across the Americas.
Registration is now full for the New York in-person event. Please register for the virtual event here.
Co-Founder & SVP, Field Engineering
Principal Software Engineer
Staff Product Manager
- The Case for Lakehouse, Breaking down the data and governance silos
- The Case for Lakehouse, Customer Story
- Data Engineering on the Lakehouse
- Data Warehousing on the Lakehouse
- Machine Learning on the Lakehouse
The data lakehouse is the future for modern data teams seeking to innovate with a data architecture that simplifies data workloads, eases collaboration, and maintains the flexibility and openness to stay agile as a company scales. The Databricks Lakehouse Platform realizes this idea by unifying analytics, data engineering, machine learning, and streaming workloads across clouds on one simple, open data platform. In this session, learn how the Databricks Lakehouse Platform can meet your needs for every data and analytics workload, with examples of real customer applications, reference architectures, and demos to showcase how you can create modern data solutions of your own.
Discover the latest innovations from Databricks that can help you build and operationalize the next generation of machine learning solutions. This session will dive into Databricks Machine Learning, a data-centric AI platform that spans the full machine learning lifecycle - from data ingestion and model training to production MLOps. You'll learn about key capabilities that you can leverage in your ML use cases and see the product in action.
The data lakehouse is the next best data warehouse. In this session, learn how Databricks SQL can help you lower costs and get started in seconds with instant, elastic SQL serverless compute, and how to empower every analytics engineer and analyst to quickly find and share new insights using their favorite BI and SQL tools, like Fivetran, dbt, Tableau or Power BI.
Join the Databricks Lakehouse Overview session to discover how the Databricks Lakehouse Platform can help you compete in the world of big data and artificial intelligence. You’ll be introduced to foundational concepts in big data, learn which key roles and abilities to look for when building data teams, and familiarize yourself with all parts of a complete data landscape. We’ll also review how the Databricks Lakehouse Platform can help your organization streamline workflows, break down silos, and make the most of your data.
Data engineers have the difficult task of cleansing complex, diverse data, and transforming it into a usable source to drive data analytics, data science, and machine learning. They need to know the data infrastructure platform in depth, build complex queries in various languages and stitch them together for production. Join this talk to learn how Delta Live Tables (DLT) simplifies the complexity of data transformation and ETL. DLT is the first ETL framework to use modern software engineering practices to deliver reliable and trusted data pipelines at any scale. Discover how analysts and data engineers can innovate rapidly with simple pipeline development and maintenance, how to remove operational complexity by automating administrative tasks and gaining visibility into pipeline operations, how built-in quality controls and monitoring ensure accurate BI, data science, and ML, and how simplified batch and streaming can be implemented with self-optimizing and auto-scaling data pipelines.
To make the lakehouse a reality, the query engine needs to support both structured and unstructured data, while providing the performance of a data warehouse and the scalability of data lakes. In this session, learn how Photon, Databricks’ next-generation vectorized engine, outperforms existing data warehouses in SQL workloads and implements a more general execution framework for efficient processing of data with support for the Apache Spark™ API.
As companies roll out ML pervasively, operational concerns become the primary source of complexity. Machine Learning Operations (MLOps) has emerged as a practice to manage this complexity. At Databricks, we see firsthand how customers develop their MLOps approaches across a huge variety of teams and businesses.
In this session, we will share how Databricks addresses this complexity by bringing together the key aspects of MLOps, namely DataOps, ModelOps and DevOps, on a unified platform through the Lakehouse, enabling faster and more reliable production ML. We will show how your organization can build robust MLOps practices incrementally, and unpack general principles which can guide your organization’s decisions for MLOps, presenting the most common target architectures we observe across customers.
Orchestrating and managing end-to-end production pipelines has remained a bottleneck for many organizations. Data teams spend too much time stitching pipeline tasks together and manually managing and monitoring the orchestration process – with heavy reliance on external or cloud-specific orchestration solutions, all of which slows down the delivery of new data. In this session, we introduce you to Databricks Workflows: a fully managed orchestration service for all your data, analytics, and AI, built into the Databricks Lakehouse Platform. Join us as we dive deep into the new workflow capabilities and explore the integration with the underlying platform. You will learn how to create and run reliable production workflows, centrally manage and monitor them, and implement recovery actions such as repair and run, as well as other new features.
Streaming is the future of all data pipelines and applications. It enables businesses to make data-driven decisions sooner and react faster, develop data-driven applications previously considered impossible, and deliver new and differentiated experiences to customers. However, many organizations have not fully realized the promise of streaming because it requires them to completely redevelop their data pipelines and applications on new, complex, proprietary, and disjointed technology stacks.
See how you can scale blazing-fast Business Intelligence (BI) on a Databricks Lakehouse using AtScale as your universal semantic layer. This session will offer practical advice for running all types of large BI workloads on top of a Databricks Lakehouse by leveraging Databricks SQL with AtScale’s universal semantic layer. See how AtScale pushes down queries to the Databricks Lakehouse with no data movement off your Databricks infrastructure while delivering lightning-fast interactive queries at lower costs. In addition, you will see how using a universal semantic layer delivers consistent metrics across any BI/AI tool, like Power BI, Tableau, Excel, Looker, Jupyter Notebooks and more, using a live connection to the Databricks Lakehouse.
Customers around the world are experiencing tremendous success migrating from legacy on-premises Hadoop architectures to a modern Databricks Lakehouse in the cloud. At Databricks, we have formulated a migration methodology that helps customers sail through this migration journey with ease. In this talk, we will touch upon some of the key elements that minimize risks and simplify the process of migrating to Databricks, and will walk through some of the customer journeys and use cases.
Modern data assets take many forms: not just files or tables, but dashboards, ML models, and unstructured data like video and images, none of which can be governed and managed by legacy data governance solutions. Join this session to learn how data teams can use Unity Catalog to centrally manage all data and AI assets with a common governance model based on familiar ANSI SQL, ensuring much better native performance and security. Built-in automated data lineage provides end-to-end visibility into how data flows from source to consumption, so that organizations can identify and diagnose the impact of data changes. Unity Catalog delivers the flexibility to leverage existing data catalogs and solutions and establish future-proof, centralized governance without expensive migration costs. It also creates detailed audit reports for data compliance and security, while ensuring data teams can quickly discover and reference data for BI, analytics, and ML workloads, accelerating time to value.
Everyone wants to reduce the time it takes to turn data into actionable information. This often involves integrating several data tools in what has become known as the modern data stack (MDS). But most approaches have focused on only half the problem by rooting the MDS in the data warehouse. A true MDS should solve all modern problems, and this means tackling AI and streaming in addition to reporting and BI. In this deep dive demo session, we show you how easy it is to integrate the Databricks Lakehouse Platform into your modern data stack to connect all your data tools across SQL, AI/ML, and streaming, and discover new methods to unlock insights faster.