Next Generation Physical Planning in Apache Spark
April 1, 2017 by Aaron Davidson, Eric Liang and Thomas Desrosiers in Engineering Blog
"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." (Andrew Tanenbaum, 1981) Imagine a cold, windy...
On-Demand Webinar and FAQ: Apache Spark MLlib 2.x: How to Productionize your Machine Learning Models
March 28, 2017 by Richard Garris and Jules Damji in Engineering Blog
On March 9th, we hosted a live webinar, Apache Spark MLlib 2.x: How to Productionize your Machine Learning Models, to address the following...
Analyse One Year of Radio Station Songs Aired with Apache Spark, Spark SQL, Spotify, and Databricks
March 27, 2017 by Paul Leclercq in Engineering Blog
Voice from Facebook: Using Apache Spark for Large-Scale Language Model Training
February 28, 2017 by Tejas Patil and Jing Zheng in Engineering Blog
This is a guest post from Facebook. Tejas Patil and Jing Zheng, software engineers in the Facebook engineering team, show how to use...
Working with Complex Data Formats with Structured Streaming in Apache Spark 2.1
February 23, 2017 by Burak Yavuz, Michael Armbrust, Tathagata Das and Tyson Condie in Engineering Blog
In part 1 of this series on Structured Streaming blog posts, we demonstrated how easy it is to write an end-to-end streaming ETL...