Project Lightspeed: Faster and Simpler Stream Processing With Apache Spark June 28, 2022 by Karthik Ramasamy, Matei Zaharia, Reynold Xin, Michael Armbrust, Awez Syed, Ray Zhu, Alexander Balikov, Jerry Peng, Shrikanth Shankar and Sameer Paranjpye in Engineering Blog Streaming data is a critical area of computing today. It is the basis for making quick decisions on the enormous amounts of incoming...
Architecting MLOps on the Lakehouse June 22, 2022 by Joseph Bradley, Rafi Kurlansik, Matthew Thomson and Niall Turbitt in Insights Here at Databricks, we have helped thousands of customers put Machine Learning (ML) into production. Shell has over 160 active AI projects saving...
Build Reliable Production Data and ML Pipelines With Git Support for Databricks Workflows June 21, 2022 by Vaibhav Sethi and Roland Fäustlin in Engineering Blog We are happy to announce native support for Git in Databricks Workflows, which enables our customers to build reliable production data and...
Introduction to Analyzing Crypto Data Using Databricks May 2, 2022 by Monica Lin, Christoph Meier, Matthew Parker and Kiran Ravella in Data Science and ML The market capitalization of cryptocurrencies increased from $17 billion in 2017 to $2.25 trillion in 2021. That's over a 13,000% ROI in...
Announcing General Availability of Databricks Feature Store April 29, 2022 by Maxim Lukiyanov, Matei Zaharia, Mani Parkhe and Ari Paul in Platform Blog Today, we are thrilled to announce that Databricks Feature Store is generally available (GA)! In this blog post, we explore how Databricks Feature...
How Wrong Is Your Model? April 28, 2022 by Srijith Rajamohan, Ph.D. and Vini Jaiswal in Engineering Blog In this blog, we look at the topic of uncertainty quantification for machine learning and deep learning. By no means is this a...
Simplifying Change Data Capture With Databricks Delta Live Tables April 25, 2022 by Mojgan Mazouchi in Engineering Blog This guide will demonstrate how you can leverage Change Data Capture in Delta Live Tables pipelines to identify new records and capture changes...
Model Evaluation in MLflow April 19, 2022 by Mark Zhang in Machine Learning Many data scientists and ML engineers today use MLflow to manage their models. MLflow is an open-source platform that enables users to govern...
Supercharge Your Machine Learning Projects With Databricks AutoML — Now Generally Available! April 18, 2022 by Kasey Uhlenhuth, Ari Paul, Nicolas Pelaez, Xiangrui Meng and Ying Xiong in Platform Blog Machine Learning (ML) is at the heart of innovation across industries, creating new opportunities to add value and reduce cost. At the same...
Building a Geospatial Lakehouse, Part 2 March 28, 2022 by Alex Barreto, Yong Sheng Huang and Jake Therianos in Engineering Blog In Part 1 of this two-part series on how to build a Geospatial Lakehouse, we introduced a reference architecture and design principles...