Transactional Writes to Cloud Storage on Databricks
May 31, 2017 by Eric Liang, Srinath Shankar and Bill Chambers in Platform Blog
In another blog post published today, we showed the top five reasons for choosing S3 over HDFS. With the dominance of simple...
Entropy-based Log Redaction for Apache Spark on Databricks
May 30, 2017 by Weiluo Ren and Yu Peng in Engineering Blog
This blog post is part of our series of internal engineering blogs on the Databricks platform, infrastructure management, tooling, monitoring, and provisioning. We love...
Using sparklyr in Databricks
May 25, 2017 by Hossein Falaki in Engineering Blog
Try this notebook on Databricks, with all instructions as explained in this post. In September 2016, RStudio announced sparklyr, a new...
On-Demand Webinar and FAQ: Deep Learning and Apache Spark: Workflows and Best Practices
May 23, 2017 by Tim Hunter and Jules Damji in Engineering Blog
On May 4th, we hosted a live webinar, Deep Learning and Apache Spark: Workflows and Best Practices. Rather than comparing deep...
Running Streaming Jobs Once a Day For 10x Cost Savings
May 22, 2017 by Burak Yavuz and Tyson Condie in Engineering Blog
This is the sixth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark. Traditionally, when people...