Engineering Blog | Databricks Blog

Announcing Apache Spark Packages

December 22, 2014 by Patrick Wendell in Solutions

Today, we are happy to announce Apache Spark Packages (http://spark-packages.org), a community package index to track the growing number of open source packages and libraries that work with Apache Spark. Spark Packages makes it easy for users to find, discuss, rate, and install packages for any version of Spark, and makes it easy for developers to contribute packages.

Announcing Apache Spark 1.2

December 19, 2014 by Patrick Wendell in Engineering Blog

We at Databricks are thrilled to announce the release of Apache Spark 1.2! Apache Spark 1.2 introduces many new features along with scalability...

Pearson uses Apache Spark Streaming for next generation adaptive learning platform

December 8, 2014 by Dibyendu Bhattacharya in Company Blog

This is a guest blog post from our friends at Pearson outlining their Apache Spark use case. Introduction of Pearson: Pearson is a...

Apache Spark Officially Sets a New Record in Large-Scale Sorting

November 5, 2014 by Reynold Xin in Engineering Blog

A month ago, we shared with you our entry to the 2014 Gray Sort competition, a 3rd-party benchmark measuring how fast a system...

Efficient Similarity Algorithm Now in Apache Spark, Thanks to Twitter

October 20, 2014 by Reza Zadeh in Engineering Blog

Our friends at Twitter have contributed to MLlib, and this post uses material from Twitter’s description of its open-source contribution, with permission...

Apache Spark the Fastest Open Source Engine for Sorting a Petabyte

October 10, 2014 by Reynold Xin in Engineering Blog

Update November 5, 2014: Our benchmark entry has been reviewed by the benchmark committee and Apache Spark has won the Daytona GraySort...

Sharethrough Uses Apache Spark Streaming to Optimize Advertisers' Return on Marketing Investment

October 7, 2014 by Russell Cardullo in Company Blog

This is a guest blog post from our friends at Sharethrough providing an update on how their use of Apache Spark has continued...

Apache Spark as a platform for large-scale neuroscience

October 1, 2014 by Jeremy Freeman in Engineering Blog

The brain is the most complicated organ of the body, and probably one of the most complicated structures in the universe. It’s millions...

Scalable Decision Trees in MLlib

September 29, 2014 by Manish Amde and Joseph Bradley in Engineering Blog

This is a post written together with one of our friends at Origami Logic. Origami Logic provides a Marketing Intelligence Platform that uses...

Apache Spark 1.1: MLlib Performance Improvements

September 22, 2014 by Burak Yavuz in Engineering Blog

With an ever-growing community, Apache Spark has had its 1.1 release. MLlib has had its fair share of contributions and now supports...