What is SparkR?

Run R programs at scale using Apache Spark's distributed computing engine with familiar R syntax

Summary

  • SparkR brings Apache Spark's distributed computing to R programmers through familiar R syntax
  • Most capabilities of Spark's Python API are also available in SparkR, so R users can work with large datasets without switching languages
  • SparkR loads into an R environment like any other package, letting data scientists scale their existing R workflows

SparkR is an R package for running R code on Spark. It follows the same principles as Spark's other language bindings. To use SparkR, we load it into our R environment and run our code. The API closely resembles the Python API, except that it follows R's syntax instead of Python's. Almost everything available in Python is also available in SparkR.
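As a rough illustration of the load-and-run workflow described above, here is a minimal sketch. It assumes SparkR is installed alongside a local Spark distribution; the `local[*]` master, the app name `sparkr-sketch`, and the choice of R's built-in `faithful` data set are illustrative assumptions, not something the original text prescribes.

```r
# Load the SparkR package into the current R session
library(SparkR)

# Start (or reuse) a Spark session.
# Assumption: "local[*]" runs Spark on this machine using all cores;
# on a managed cluster a session is typically already provided.
sparkR.session(master = "local[*]", appName = "sparkr-sketch")

# Convert R's built-in `faithful` data set into a distributed SparkDataFrame
df <- as.DataFrame(faithful)

# Filter rows using R-style column references
short_waits <- filter(df, df$waiting < 50)

# Group and aggregate: count eruptions per waiting time
counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))

# Bring small results back into ordinary R objects
total_rows <- count(df)
local_counts <- collect(arrange(counts, desc(counts$count)))
print(head(local_counts))

# Shut the session down when finished
sparkR.session.stop()
```

The calls mirror what a PySpark user would write (`filter`, `groupBy`, an aggregate, `collect`), but with R function-call syntax and `$` column access. Note that `collect` pulls the result into local memory, so it is best reserved for small, already-aggregated data.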
 
