Databricks FAQ

Basics

The business knows that there's gold in all that data, and your team's job is to find it. But playing detective with clunky tools and difficult-to-set-up infrastructure is hard. You want to be the hero who figures out what's going on with the business, but you're spending all your time wrestling with the tools.

We built Databricks to make big data simple. Apache Spark™ made a big step towards achieving this mission by providing a unified framework for building data pipelines. Databricks takes this further by providing a zero-management cloud platform built around Spark that delivers 1) fully managed Spark clusters, 2) an interactive workspace for exploration and visualization, 3) a production pipeline scheduler, and 4) a platform for powering your favorite Spark-based applications. So instead of tackling data headaches, you can finally focus on finding answers that make an immediate impact on your business.

Availability

Databricks pricing is detailed on our pricing page.

Technical

Databricks currently supports browser-based file uploads, as well as pulling data from Azure Blob Storage, AWS S3, Azure SQL Data Warehouse, Azure Data Lake Store, NoSQL data stores such as Cosmos DB, Cassandra, and Elasticsearch, JDBC data sources, HDFS, Sqoop, and a variety of other data sources supported natively by Apache Spark.

Deployment

Databricks is currently available on Microsoft Azure, Amazon Web Services (AWS), and Google Cloud.

Security

Databricks users read from and persist data to their own data stores, using their own credentials.
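In practice this means credentials are supplied per data store, typically via Spark configuration backed by a secret store rather than hard-coded in notebooks. A hypothetical sketch (the account name, secret scope, and key names are placeholders, not real values):

```python
# Configure access to an Azure Blob Storage account using a secret pulled
# from a Databricks secret scope -- names here are illustrative only.
spark.conf.set(
    "fs.azure.account.key.myaccount.blob.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-account-key"),
)
```

Because the credential lives in the user's own secret scope, data never flows through accounts that Databricks controls.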