Scaling Hyperopt to Tune Machine Learning Models in Python
October 29, 2019 by Joseph Bradley and Max Pumperla in Solutions
Try the Hyperopt notebook to reproduce the steps outlined below and watch our on-demand webinar to learn more. Hyperopt is one of the...

Delta Lake Now Hosted by the Linux Foundation to Become the Open Standard for Data Lakes
October 16, 2019 by Michael Armbrust and Reynold Xin in Platform Blog
Get an early preview of O'Reilly's new ebook for the step-by-step guidance you need to start using Delta Lake. At today’s Spark +...

Simple, Reliable Upserts and Deletes on Delta Lake Tables using Python APIs
October 3, 2019 by Tathagata Das and Denny Lee in Solutions
We are excited to announce the release of Delta Lake 0.4.0 which introduces Python APIs for manipulating and managing data in Delta tables...

Parallelizing SAIGE Across Hundreds of Cores
October 2, 2019 by Karen Feng, Henry Davidge and Frank Austin Nothaft in Engineering Blog
As population genetics datasets grow exponentially, it is becoming impractical to work with genetic data without leveraging Apache Spark™. There are many ways...

Diving Into Delta Lake: Schema Enforcement & Evolution
September 24, 2019 by Burak Yavuz, Brenner Heintz and Denny Lee in Company Blog
Try this notebook series in Databricks. Data, like our experiences, is always evolving and accumulating. To keep up, our mental models of the...