Real-Time End-to-End Integration with Apache Kafka in Apache Spark’s Structured Streaming

April 4, 2017 by Sunil Sitaula
View the Notebook in Databricks Community Edition. Structured Streaming APIs enable building end-to-end streaming applications, called continuous applications, in a consistent, fault-tolerant manner...

Next Generation Physical Planning in Apache Spark

"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway." (Andrew Tanenbaum, 1981) Imagine a cold, windy...

On-Demand Webinar and FAQ: Apache Spark MLlib 2.x: How to Productionize your Machine Learning Models

On March 9th, we hosted a live webinar, Apache Spark MLlib 2.x: How to Productionize your Machine Learning Models, to address the following...

Analyse One Year of Radio Station Songs Aired with Apache Spark, Spark SQL, Spotify, and Databricks

March 27, 2017 by Paul Leclercq
Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...

Voice from Facebook: Using Apache Spark for Large-Scale Language Model Training

February 28, 2017 by Tejas Patil and Jing Zheng
This is a guest post from Facebook. Tejas Patil and Jing Zheng, software engineers in the Facebook engineering team, show how to use...

Working with Complex Data Formats with Structured Streaming in Apache Spark 2.1

In part 1 of this series on Structured Streaming blog posts, we demonstrated how easy it is to write an end-to-end streaming ETL...

Processing a Trillion Rows Per Second on a Single Machine: How Can Nested Loop Joins be this Fast?

This blog post describes our experience debugging a failing test case caused by a cross join query running “too fast.” Because the root...

Intel’s BigDL on Databricks

February 9, 2017 by Sue Ann Hong and Joseph Bradley
Try this notebook on Databricks. Intel recently released its BigDL project for distributed deep learning on Apache Spark. BigDL has native Spark integration...

Real-time Streaming ETL with Structured Streaming in Apache Spark 2.1

Explore why lakehouses are the data architecture of the future with the father of the data warehouse, Bill Inmon. Try this notebook in...

Top 10 Apache Spark Blog Posts from 2016

December 30, 2016 by Jules Damji
Spark Summit will be held in Dublin, Ireland on Oct 24-26, 2017. Get your ticket before it sells out! Here’s...