Databricks Blog


Tuning Java Garbage Collection for Apache Spark Applications

May 28, 2015 by Daoyuan Wang and Jie Huang in Partners

This is a guest post from our friends in the SSG STO Big Data Technology group at Intel. Join us at the Spark...

NTT DATA: Operating Apache Spark clusters at thousands-core scale and use cases for Telco and IoT

May 14, 2015 by Masaru Dobashi, Kousuke Saruta, Toru Shimogaki and Masayoshi Tsuzuki in Company Blog

This is a guest blog from one of our partners: NTT DATA Corporation. About NTT DATA Corporation: NTT DATA Corporation is a...

Project Tungsten: Bringing Apache Spark Closer to Bare Metal

April 28, 2015 by Reynold Xin and Josh Rosen in Engineering Blog

In a previous blog post, we looked back and surveyed performance improvements made to Apache Spark in the past year. In this...

Recent performance improvements in Apache Spark: SQL, Python, DataFrames, and More

April 24, 2015 by Reynold Xin in Engineering Blog


Big Graph Analytics with LynxKite & Apache Spark

April 23, 2015 by Daniel Darabos in Company Blog

This is a guest blog from one of our partners: Lynx Analytics. About Lynx Analytics: Lynx Analytics is a data analytics consultancy...

Analyzing Apache Access Logs with Databricks

April 21, 2015 by Ion Stoica and Vida Ha in Partners

Databricks provides a powerful platform to process, analyze, and visualize big and small data in one place. In this blog, we will illustrate...

New MLlib Algorithms in Apache Spark 1.3: FP-Growth and Power Iteration Clustering

April 17, 2015 by Jacky Li, Fan Jiang, Youhua Zhang, Stephen Boesch and Bing Xiao in Engineering Blog

This is a guest blog post from Huawei’s big data global team. Huawei, a Fortune Global 500 private company, has put together a...

The Easiest Way to Run Apache Spark Jobs

April 16, 2015 by Ion Stoica in Product

Recently, Databricks added a new feature, Jobs, to our cloud service. You can find a detailed overview of this feature here. This...

Celtra Scales Big Data Analysis Projects Six-Fold with Databricks

April 15, 2015 by Kavitha Mariappan and Dave Wang in Product

We are thrilled to announce that Celtra selected Databricks to scale its big data analysis projects, increasing the amount of ad-hoc analysis done...

Running Apache Spark GraphX algorithms on Library of Congress subject heading SKOS

April 14, 2015 by Bob DuCharme in Engineering Blog

This is a guest post from Bob DuCharme. Original article appeared in: http://www.snee.com/bobdc.blog/2015/04/running-spark-graphx-algorithm.html Well, one algorithm, but a very cool one. Last month...