Using sparklyr in Databricks. May 25, 2017, by Hossein Falaki in Engineering Blog. Try this notebook on Databricks, with all instructions as explained in this post. In September 2016, RStudio announced sparklyr, a new...
Detecting Abuse at Scale: Locality Sensitive Hashing at Uber Engineering. May 9, 2017, by Yun Ni, Kelvin Chu and Joseph Bradley in Solutions. This is a cross-blog post effort between Databricks and Uber Engineering. Yun Ni is a software engineer on Uber’s Machine Learning Platform...
Analyse One Year of Radio Station Songs Aired with Apache Spark, Spark SQL, Spotify, and Databricks. March 27, 2017, by Paul Leclercq in Engineering Blog.
Voice from CERN: Apache Spark 2.0 Performance Improvements Investigated With Flame Graphs. October 3, 2016, by Luca Canali in Engineering Blog. This is a guest post from CERN, the European Organization for Nuclear Research. In this blog, Luca Canali of CERN investigates performance improvements...
Apache Spark @Scale: A 60 TB+ production use case from Facebook. August 31, 2016, by Sital Kedia, Shuojie Wang and Avery Ching in Solutions. This is a guest Apache Spark community blog from Facebook Engineering. In this technical blog, Facebook shares their usage of Apache Spark...