Introducing the MLflow Model Registry (October 17, 2019, by Clemens Mewald, Matei Zaharia and Cyrielle Simeone, in Announcements): At today’s Spark + AI Summit in Amsterdam, we announced the availability of the MLflow Model Registry...
Delta Lake Now Hosted by the Linux Foundation to Become the Open Standard for Data Lakes (October 16, 2019, by Michael Armbrust and Reynold Xin, in Platform Blog): At today’s Spark +...
How Informatica Data Engineering Goes Hadoop-less with Databricks (October 10, 2019, by Hiral Jasani, in Company Blog): Back in May, we announced our partnership with Informatica to build out a rich set of integrations between our two platforms. It’s been...
Simple, Reliable Upserts and Deletes on Delta Lake Tables using Python APIs (October 3, 2019, by Tathagata Das and Denny Lee, in Solutions): We are excited to announce the release of Delta Lake 0.4.0, which introduces Python APIs for manipulating and managing data in Delta tables...
Parallelizing SAIGE Across Hundreds of Cores (October 2, 2019, by Karen Feng, Henry Davidge and Frank Austin Nothaft, in Engineering Blog): As population genetics datasets grow exponentially, it is becoming impractical to work with genetic data without leveraging Apache Spark™. There are many ways...