New eBook Released: Mastering Advanced Analytics with Apache Spark

Dave Wang

We are excited to announce that the second eBook in our technical blog book series, Mastering Advanced Analytics with Apache Spark, has been released today!

You can download the eBook here.

We focused on the topic of “Advanced Analytics” because of the challenges created by the continued growth in data. That growth, coupled with increasingly complex use cases, demands much more than running queries against a data set. Whether you’re scrutinizing the clickstream from millions of visitors to optimize online ad placements or sifting through billions of transactions to identify signs of fraud, more sophisticated approaches to automatically gleaning insights from enormous volumes of data, such as machine learning and graph processing, are more important than ever.

This eBook offers a collection of the most popular technical blog posts that provide an introduction to machine learning and other advanced techniques on Spark, including:

  • An introduction to machine learning in Apache Spark
  • Using Spark for advanced topics such as clustering, trees, and graph processing
  • How you can use SparkR to analyze data at scale with the R language

Screenshot from the Mastering Advanced Analytics with Apache Spark eBook

We’ve also augmented the blog posts with new code examples in Databricks notebooks, which are freely available with the eBook download. A sample of the new notebooks includes:

  • Scalable Decision Trees with MLlib
  • ML Import, Export, and Simple Operations
  • Generalized Linear Models in SparkR
  • Random Forests and Boosting in MLlib

Download the eBook to get started on your next advanced analytics project today. To try out the code examples, get on the waitlist for the Databricks Community Edition. If you have not read the first eBook in the series, be sure to check out Apache Spark Analytics Made Simple for technical content and code examples geared toward an introduction to data analytics with Apache Spark.
