Tuning Java Garbage Collection for Apache Spark Applications May 28, 2015 by Daoyuan Wang and Jie Huang in Partners This is a guest post from our friends in the SSG STO Big Data Technology group at Intel. Join us at the Spark...
NTT DATA: Operating Apache Spark clusters at thousands-core scale and use cases for Telco and IoT May 14, 2015 by Masaru Dobashi, Kousuke Saruta, Toru Shimogaki and Masayoshi Tsuzuki in Company Blog This is a guest blog from one of our partners: NTT DATA Corporation. About NTT DATA Corporation: NTT DATA Corporation is a...
Project Tungsten: Bringing Apache Spark Closer to Bare Metal April 28, 2015 by Reynold Xin and Josh Rosen in Engineering Blog In a previous blog post, we looked back and surveyed performance improvements made to Apache Spark in the past year. In this...
Recent performance improvements in Apache Spark: SQL, Python, DataFrames, and More April 24, 2015 by Reynold Xin in Engineering Blog
Big Graph Analytics with LynxKite & Apache Spark April 23, 2015 by Daniel Darabos in Company Blog This is a guest blog from one of our partners: Lynx Analytics. About Lynx Analytics: Lynx Analytics is a data analytics consultancy...
Analyzing Apache Access Logs with Databricks April 21, 2015 by Ion Stoica and Vida Ha in Partners Databricks provides a powerful platform to process, analyze, and visualize big and small data in one place. In this blog, we will illustrate...
New MLlib Algorithms in Apache Spark 1.3: FP-Growth and Power Iteration Clustering April 17, 2015 by Jacky Li, Fan Jiang, Youhua Zhang, Stephen Boesch and Bing Xiao in Engineering Blog This is a guest blog post from Huawei’s big data global team. Huawei, a Fortune Global 500 private company, has put together a...
The Easiest Way to Run Apache Spark Jobs April 16, 2015 by Ion Stoica in Product Recently, Databricks added a new feature, Jobs, to our cloud service. You can find a detailed overview of this feature here. This...
Celtra Scales Big Data Analysis Projects Six-Fold with Databricks April 15, 2015 by Kavitha Mariappan and Dave Wang in Product We are thrilled to announce that Celtra selected Databricks to scale its big data analysis projects, increasing the amount of ad-hoc analysis done...
Running Apache Spark GraphX algorithms on Library of Congress subject heading SKOS April 14, 2015 by Bob DuCharme in Engineering Blog This is a guest post from Bob DuCharme. Original article appeared in: http://www.snee.com/bobdc.blog/2015/04/running-spark-graphx-algorithm.html Well, one algorithm, but a very cool one. Last month...