Make R Programming on Big Data Simpler with Databricks



Why Databricks + RStudio?

Boost productivity for R users

Seamless integration between Databricks and RStudio lets data scientists use familiar tools and languages to run R jobs on Databricks directly from the RStudio IDE.

Simplify access to large data sets

Unify datasets in Databricks for your R-based machine learning and AI projects with the ability to code in RStudio. Databricks provides scalable data processing with Delta Lake and optimized Apache Spark to clean, blend and join datasets in an open data format.

Enable distributed R computing at scale

Databricks supports R as a first-class language, offering unprecedented performance (up to 100x faster than Apache Spark) as well as the ability to auto-scale cloud-based clusters to handle the most demanding jobs, while keeping the total cost of ownership low.

RStudio and Databricks Integration

For data scientists looking to apply R-based computing to big data, Databricks provides a unified analytics platform that is simple to set up and integrates easily with popular R tools and frameworks.



Access RStudio IDE on Databricks

Install your desired RStudio Server version (open source or pro) on a Databricks cluster. Seamlessly use Apache Spark™ from RStudio IDE inside Databricks using either SparkR or sparklyr.
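The two ways of reaching Spark from RStudio can be sketched as follows. This is a minimal sketch, assuming the code runs in an RStudio Server session hosted on a Databricks cluster with the sparklyr and SparkR packages installed; the helper function names are hypothetical and exist only for illustration.

```r
# Hypothetical helpers. Both assume an RStudio session hosted on a
# Databricks cluster where sparklyr and SparkR are already installed.

# Option 1: sparklyr. method = "databricks" attaches to the Spark
# instance of the cluster that hosts this RStudio session.
connect_with_sparklyr <- function() {
  sparklyr::spark_connect(method = "databricks")
}

# Option 2: SparkR. sparkR.session() returns the existing Spark
# session or creates one if none is active.
connect_with_sparkr <- function() {
  SparkR::sparkR.session()
}
```

In practice a session would typically pick one of the two, for example `sc <- connect_with_sparklyr()`, and reuse that connection for the rest of the analysis.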

Prepare high-quality data sets for analyses

Clean, blend and join data sets using RStudio’s familiar interface and tools without the need for cluster management. Access data from Delta tables or external data sources using Apache Spark.
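As an illustration of cleaning and joining Delta data from R, here is a minimal sketch using sparklyr and dplyr. The function name, the table paths, and the `customer_id` column are assumptions made for the example, not part of any Databricks dataset; `sc` is assumed to be an existing sparklyr connection.

```r
# Hypothetical example: the paths and the customer_id column are
# placeholders chosen for illustration.
clean_and_join_orders <- function(sc, orders_path, customers_path) {
  # Register each Delta table as a Spark DataFrame reference.
  orders <- sparklyr::spark_read_delta(sc, path = orders_path, name = "orders")
  customers <- sparklyr::spark_read_delta(sc, path = customers_path, name = "customers")

  # Clean: drop orders that have no customer key.
  orders_clean <- dplyr::filter(orders, !is.na(customer_id))

  # Join: the dplyr verbs are translated to Spark SQL, so the work
  # runs on the cluster rather than in the local R session.
  dplyr::inner_join(orders_clean, customers, by = "customer_id")
}
```

The result is a lazy Spark table reference; calling `dplyr::collect()` on it would bring the joined rows into local R memory, which is usually only appropriate after aggregating or filtering down to a small result.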

Interactively develop and build Shiny applications

Develop and test Shiny applications inside a hosted RStudio Server using a high-bandwidth connection to a powerful Apache Spark cluster.

Inventory optimization analytics

Forecast warehouse stocking levels using R and Databricks to optimize safety stock levels.


Enable efficient cloud processing to turn population-scale genetic data into meaningful insights.

Predict portfolio performance

Use pre-built functions for performance and risk management calculations of large-scale portfolio data.


