Continuing with our bi-weekly digest series, here’s our recap of what’s transpired over the last two weeks with Apache Spark since our previous digest.
- Databricks announced general availability of Apache Spark 2.0 on its just-in-time data platform. Reynold Xin elaborated on the merits of Spark 2.0, describing it as easier, faster, and smarter.
- Databricks’ CTO and Co-founder Matei Zaharia envisaged the evolution of real-time streaming in Apache Spark 2.0 with Continuous Applications.
- Messrs Matei Zaharia, Tathagata Das, Michael Armbrust, and Reynold Xin elaborated on the Structured Streaming model and its API in Apache Spark 2.0: how to write end-to-end continuous applications using the DataFrame- and Dataset-based streaming API.
- Jax Magazine interviewed Xiangrui Meng about Apache Spark MLlib - Making practical machine learning easier and scalable.
- Xiangrui Meng talked about Apache SparkR at the Walmart Meetup: Peruse the tech-talk slides here.
- The MongoDB Connector for Apache Spark was released as a Spark package, and Sam Weaver’s guest blog demonstrated the connector using a Databricks notebook.
- Databricks released a new version of the spark-redshift package that works with Spark 2.0.
- Kaarthik Sivashanmugam, tech lead for the Mobius Project at Microsoft, explained in a guest blog how Mobius’ C# API extends and enables .NET developers to write Apache Spark applications.
- KDD named Matrix Computations and Optimization in Apache Spark the Best Paper runner-up in the Applied Data Science Track for 2016.
- Daniel Pape opined on Spark 2.0: Datasets and case classes in a community blog.
- Databricks announced the agenda for Spark Summit Europe 2016 in Brussels, Belgium. Register today!