Joint Blog Post: Bringing ORC Support into Apache Spark
July 16, 2015 by Zhan Zhang, Cheng Lian and Patrick Wendell in Engineering Blog
This is a joint blog post with our partner Hortonworks. Zhan Zhang is a member of technical staff at Hortonworks, where he collaborated...
Introducing Window Functions in Spark SQL
July 15, 2015 by Yin Huai and Michael Armbrust in Engineering Blog
In this blog post...
New Visualizations for Understanding Apache Spark Streaming Applications
July 8, 2015 by Tathagata Das, Shixiong Zhu and Andrew Or in Engineering Blog
Earlier, we presented new visualizations introduced in Apache Spark 1.4.0 to understand the behavior of Spark applications. Continuing the theme, this blog highlights...
Guest blog: PMML Support in Apache Spark's MLlib
July 2, 2015 by Vincenzo Selvaggio in Engineering Blog
This is a guest blog from our friend Vincenzo Selvaggio, who contributed this feature. He is a Senior Java Technical Architect and Project...
Understanding your Apache Spark Application Through Visualization
June 22, 2015 by Andrew Or in Engineering Blog
"The greatest value of a picture is when it forces us to notice what we never expected to see." - John Tukey
In...