Databricks Empowers Enterprises to Secure Their Apache Spark Workloads

Company Becomes First Vendor to Provide an End-to-End Security Framework for Apache Spark

June 8, 2016

SAN FRANCISCO, CA--(Marketwired - Jun 8, 2016) - Databricks, the company founded by the team that created Apache® Spark™, today announced the completion of the first phase of the Databricks Enterprise Security (DBES) framework, making it the first company to provide end-to-end enterprise security for open source Apache Spark. The announcement was made at Spark Summit 2016 in San Francisco.

Traditionally, enterprise organizations have only had security solutions that addressed parts of their big data infrastructure. This model is no longer sufficient, as enterprises demand holistic security that covers the full big data lifecycle: file processing, big data clusters, code management, job workflows, application deployments, dashboards, and reporting.

DBES combines encryption, integrated identity management, role-based access control, data governance, and compliance standards to secure Apache Spark workloads in an end-to-end security framework:

  • Encryption: Provides strong encryption at rest and in flight with best-in-class standards such as SSL and keys stored in AWS Key Management Service (KMS).
  • Integrated Identity Management: Facilitates seamless integration with enterprise identity providers via SAML 2.0 and Active Directory.
  • Role-Based Access Control: Enables fine-grained management of access to every component of the enterprise data infrastructure, including files, clusters, code, application deployments, dashboards, and reports.
  • Data Governance: Guarantees the ability to monitor and audit all actions taken in every aspect of the enterprise data infrastructure.
  • Compliance Standards: Achieves security compliance standards that exceed the high standards of FedRAMP as part of Databricks' ongoing DBES strategy.

"End-to-end security requirements are top-of-mind for today's enterprises that are building advanced analytics solutions," said Ali Ghodsi, CEO at Databricks. "Yet building a truly secure, multi-tenant, and cloud-based enterprise data platform proves to be an impossible undertaking for most. We're delighted to be the first vendor to solve this problem comprehensively for Apache Spark, allowing enterprises to maximize the value from their data without compromising compliance and security."

Databricks Enterprise Security builds upon the extensive Databricks access management and encryption functionalities that already exist. With the completion of DBES Phase One today, Databricks gains additional security capabilities such as:

  • Cluster Access Control Lists: Individuals can be granted permissions to create clusters, or to terminate or run code on existing Apache Spark clusters. This gives a central authority the ability to protect access to production resources, or to restrict the launching of new resources, and the associated expenditure, to a few trusted entities;
  • Single Sign-On: A central authority can grant and revoke access via a SAML 2.0 compatible identity provider service to safeguard enterprise resources as needed;
  • Audit Logs: A comprehensive record of activity, allowing users to monitor detailed usage patterns of Databricks as the business requires.

"ESG research shows the number one attribute sought in evaluating a big data/analytics solution is now security. As Apache Spark grows rapidly in production environments, satisfying the stringent operational requirements of the enterprise becomes critical. Databricks is accelerating the maturity of their just-in-time data platform built on top of open source Apache Spark in important ways," said Nik Rouda, Senior Analyst at Enterprise Strategy Group.

The new features are available today to all Databricks customers. To sign up for Databricks, visit or contact [email protected].

To learn more, read the Databricks blog post here:

About Databricks:

Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project, providing 10x more code than any other company. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact [email protected].
