August 8, 2016

Databricks Bi-Weekly Digest: 8/8/16

Continuing with our bi-weekly digest series, here’s our recap of what’s happened with Apache Spark in the two weeks since our previous digest.

Databricks announced general availability of Apache Spark 2.0 on its just-in-time data platform. Reynold Xin elaborated on how Spark 2.0 is easier, faster, and smarter.
Databricks’ CTO and Co-founder Matei Zaharia envisaged the evolution of real-time streaming in Apache Spark 2.0 with Continuous Applications.
Matei Zaharia, Tathagata Das, Michael Armbrust, and Reynold Xin explained the Structured Streaming model and its API in Apache Spark 2.0: how to write end-to-end continuous applications using the DataFrame- and Dataset-based streaming API.
Jax Magazine interviewed Xiangrui Meng about Apache Spark MLlib - Making practical machine learning easier and scalable.
Xiangrui Meng talked about Apache SparkR at the Walmart Meetup. Peruse the tech-talk slides here.
The MongoDB Connector for Apache Spark was released as a Spark package, and Sam Weaver’s guest blog demonstrated the connector using a Databricks notebook.
Databricks released a new version of the spark-redshift package that works with Spark 2.0.
Kaarthik Sivashanmugam, tech lead for the Mobius Project at Microsoft, explained in a guest blog how Mobius’ C# API extends and enables .NET developers to write Apache Spark applications.
KDD named Matrix Computations and Optimization in Apache Spark the Best Paper Runner-Up in the Applied Data Science Track for 2016.
Daniel Pape opined on Spark 2.0: Datasets and case classes in a community blog.
Databricks announced the agenda for Spark Summit Europe 2016 in Brussels, Belgium. Register today!

What’s Next?

To stay abreast of what’s happening with Apache Spark, follow us on Twitter @databricks and visit SparkHub.
