If you’ve worked through each section of this guide, you are well on your way to building your own Apache Spark applications on Databricks.
Your first next step should be Spark: The Definitive Guide. Written by the creator of the open-source cluster-computing framework, this comprehensive guide teaches you how to use, deploy, and maintain Apache Spark. With an emphasis on the improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break Spark topics down into distinct sections, each with unique goals. The notebooks from the guide are available on GitHub, and the datasets are available in the DBFS folder /databricks-datasets/definitive-guide.
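If you want to explore those datasets right away, here is a minimal sketch of how you might list and load them from a Databricks notebook. It assumes the dbutils, spark, and display objects that Databricks notebooks provide automatically; the CSV path in the read example is an assumption, so adjust it to whatever the listing actually shows.

```python
# List the Definitive Guide datasets that ship with every Databricks workspace.
# dbutils, spark, and display are provided automatically in Databricks notebooks.
display(dbutils.fs.ls("/databricks-datasets/definitive-guide"))

# Load one of the datasets with Spark once you know its path.
# The file path below is an assumption -- replace it with a path from the listing above.
flights = (spark.read
           .option("header", "true")
           .option("inferSchema", "true")
           .csv("/databricks-datasets/definitive-guide/data/flight-data/csv/2015-summary.csv"))
flights.show(5)
```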
There are many more resources within the Databricks documentation and the Databricks website; we recommend that you check these out at your leisure:
- In-depth documentation on the various Apache Spark APIs.
- The Databricks Guide, a comprehensive reference containing examples, sample applications, and other resources.
- Learn Apache Spark from the team that started the project.
- Apache Spark training for your team, delivered at your organization or online.
- Organized learning events, open to the public at conferences and in classrooms.
Below are some great blog posts and videos showcasing Apache Spark and Databricks use cases: