Guest Blog: Streamliner - An Open Source Apache Spark Streaming Application
December 18, 2015 by Ankur Goyal in Company Blog
This is a guest blog from Ankur Goyal, VP of Engineering at MemSQL. Our always-on interconnected world constantly shuttles data between devices and...

Succinct Spark from AMPLab: Queries on Compressed RDDs
November 10, 2015 by Rachit Agarwal and Anurag Khandelwal in Engineering Blog
This is a guest post from Rachit Agarwal and Anurag Khandelwal of the UC Berkeley AMPLab, leads of an ongoing research project called...

Announcing the TFOCS for Spark Optimization Package
November 2, 2015 by Aaron Staple in Engineering Blog
Aaron is the developer of this Apache Spark package, with support from Databricks. Aaron is a freelance software developer with experience in data...

Introducing Redshift Data Source for Spark
October 19, 2015 by Sameer Wadkar and Josh Rosen in Engineering Blog
This is a guest blog from Sameer Wadkar, Big Data Architect/Data Scientist at Axiomine. The Spark SQL Data Sources API was introduced in...

Generalized Linear Models in SparkR and R Formula Support in MLlib
October 5, 2015 by Eric Liang in Engineering Blog
To get started with SparkR, download Apache Spark 1.5 or sign up for a 14-day free trial of Databricks today. Apache Spark...

Apache Spark 1.5.1 and What do Version Numbers Mean?
October 1, 2015 by Reynold Xin in Engineering Blog
The inaugural Spark Summit Europe will be held in Amsterdam on October 27 - 29. Check out the full agenda and get your...

Improved Frequent Pattern Mining in Apache Spark 1.5: Association Rules and Sequential Patterns
September 28, 2015 by Feynman Liang, Jiajin Zhang and Dandan Tu in Engineering Blog
We would like to thank Jiajin Zhang and Dandan Tu from Huawei for contributing to this blog. To get started mining patterns from...

Large Scale Topic Modeling: Improvements to LDA on Apache Spark
September 22, 2015 by Feynman Liang, Yuhao Yang and Joseph Bradley in Engineering Blog
This blog was written by Feynman Liang and Joseph Bradley from Databricks, and Yuhao Yang from Intel. To get started using LDA, download...

Apache Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time Intervals, and UDAFs
September 16, 2015 by Michael Armbrust, Yin Huai, Davies Liu and Reynold Xin in Engineering Blog
To try new features highlighted in this blog post, download Spark 1.5 or sign up for Databricks for a 14-day free trial today...

Announcing Apache Spark 1.5
September 9, 2015 by Reynold Xin and Patrick Wendell in Engineering Blog
The inaugural Spark Summit Europe will be held in Amsterdam this October. Check out the full agenda and get your ticket before it...