Distributing the Singular Value Decomposition with Apache Spark
July 21, 2014 by Li Pu and Reza Zadeh in Engineering Blog
Guest post by Li Pu from Twitter and Reza Zadeh from Databricks on their recent contribution to Apache Spark's machine learning library. The...

The State of Apache Spark in 2014
July 18, 2014 by Matei Zaharia in Engineering Blog
This post originally appeared in insideBIGDATA and is reposted here with permission. With the second Spark Summit behind us, we wanted to take...

New Features in MLlib in Apache Spark 1.0
July 16, 2014 by Xiangrui Meng in Engineering Blog
MLlib is an Apache Spark component focusing on machine learning. It became a standard component of Spark in version 0.8 (Sep 2013). The...

Shark, Spark SQL, Hive on Spark, and the future of SQL on Apache Spark
July 1, 2014 by Reynold Xin in Engineering Blog
With the introduction of Spark SQL and the new Hive on Apache Spark effort (HIVE-7292), we get asked a lot about...

Exciting Performance Improvements on the Horizon for Spark SQL
June 2, 2014 by Michael Lumb and Zongheng Yang in Engineering Blog