Apache Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time Intervals, and UDAFs
September 16, 2015 by Michael Armbrust, Yin Huai, Davies Liu and Reynold Xin in Engineering Blog
To try the new features highlighted in this blog post, download Spark 1.5 or sign up for a 14-day free trial of Databricks today...
Announcing Apache Spark 1.5
September 9, 2015 by Reynold Xin and Patrick Wendell in Engineering Blog
The inaugural Spark Summit Europe will be held in Amsterdam this October. Check out the full agenda and get your ticket before it...
From Pandas to Apache Spark's DataFrame
August 12, 2015 by Olivier Girardot in Engineering Blog
This is a cross-post from the blog of Olivier Girardot. Olivier is a software engineer and the co-founder of Lateral Thoughts, where he...
Diving into Apache Spark Streaming's Execution Model
July 30, 2015 by Tathagata Das, Matei Zaharia and Patrick Wendell in Engineering Blog
With so many distributed stream processing engines available, people often ask us about the unique benefits of Apache Spark Streaming. From early...
New Features in Machine Learning Pipelines in Apache Spark 1.4
July 29, 2015 by Joseph Bradley and Burak Yavuz in Engineering Blog
Apache Spark 1.2 introduced Machine Learning (ML) Pipelines to facilitate the creation, tuning, and inspection of practical ML workflows. Spark’s latest release, Spark...