What’s New with Data Sharing and Collaboration
At Databricks, our mission is to democratize data + AI. An open approach to sharing and collaboration is critical to maximizing reach and impact. Within our Data Intelligence Platform, the Delta Sharing open protocol helps our customers easily and securely share data and AI assets to accelerate innovation. For collaboration with third-party data, the Databricks Marketplace is the open marketplace for all your data, analytics and AI needs. With a growing ecosystem of data partners sharing a wide array of data and AI assets, the Databricks Marketplace helps data consumers deliver innovation. Databricks Clean Rooms lets businesses collaborate in a secure, privacy-safe environment on any cloud.
Last week, we announced 12 new industry-leading partners to expand Delta Sharing's open ecosystem. Today, we are excited to share how we are accelerating our ecosystem growth, along with updates on new Delta Sharing feature releases. We are also excited to announce that privacy-safe collaboration with Databricks Clean Rooms will be available in Public Preview (coming soon) on AWS and Azure.
Accelerating data sharing growth with Delta Sharing
Databricks customers are driving cross-platform, cross-cloud collaborations with their customers and partners on a flexible, secure and open ecosystem without vendor lock-in. Databricks' commitment to innovation and collaboration has yielded significant results in the past year, with the ecosystem seeing impressive growth.
More than 16,000 data recipients from a wide range of organizations have adopted Delta Sharing to collaborate with partners and customers. Today we are excited to announce 300%+ YoY growth in active Delta Shares across our open ecosystem, with 40% of Delta Shares using our cross-platform open connectors, which support Apache Spark, Pandas, Power BI, and the recently announced Tableau connector, to access and read shared data.
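As a rough illustration of what reading through one of these open connectors can look like on the recipient side, here is a minimal sketch using the open source `delta-sharing` Python package and its `load_as_pandas` function. The profile file name and the share, schema, and table names are hypothetical placeholders, not assets mentioned in this post; a real recipient would use the credential file and names supplied by their data provider.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile-file>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Read one shared table into a pandas DataFrame (needs network access and a valid profile)."""
    # Imported lazily so the URL helper above can be used without the package installed.
    import delta_sharing  # pip install delta-sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))


# Hypothetical names, for illustration only.
example_url = table_url("config.share", "sales_share", "retail", "orders")
```

With a valid profile file in place, `load_shared_table("config.share", "sales_share", "retail", "orders")` would return the shared table as a DataFrame without the recipient needing a Databricks workspace.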
Delta Sharing’s latest group of partners are building data sharing solutions, expanding existing Built on partnerships for new capabilities, and advancing technology partnerships that help joint customers seamlessly share between platforms. These new partnerships include Acxiom, Amperity, Atlassian, Aveva, HealthVerity, Shutterstock, Stocktwits, T-Mobile, TetraScience, and The Trade Desk. Databricks is also announcing expanded partnerships with Epsilon, LiveRamp, S&P Global, and Tableau.
"Atlassian Analytics recently launched Data Shares, leveraging Delta Sharing from Databricks, to boost flexibility and accelerate customers' time-to-insight. … Delta Sharing's open ecosystem of connectors, including Tableau, PowerBI, and Spark, enables customers to easily power their environments with data directly from the Atlassian Data Lake."— Ben Jackson, Senior Group Product Manager, Data & Analytics, Atlassian
New Delta Sharing Innovations Enable Data + AI Success
Three years ago, we announced the open source Delta Sharing project — the industry's first open protocol for secure data sharing. Since then, Delta Sharing has continued to innovate and make it easy for customers to share live data and AI across platforms, clouds and regions — with no need for replication.
Building on this open approach, our guiding principle is to make Delta Sharing the most open, secure, and flexible tool — where anyone can share any data asset to any recipient on any platform, for any use case ranging from SQL to AI. To this end, we've continued developing new open sharing capabilities for both data providers and data recipients and are delighted to announce several new Delta Sharing product innovations.
Two Delta Sharing features that were recently released in Public Preview are now generally available: Volume Sharing and Cloudflare R2 support. "Volumes" are a new object type in Unity Catalog for collections of directories and files. With Volume Sharing, you can share large amounts of unstructured or non-tabular data (e.g., images, audio, videos, or PDF files) across workspaces without the need for expensive replication. This helps accelerate innovation in processing unstructured and non-tabular data for data science, AI and machine learning workloads. Cloudflare R2 support lets joint customers of Cloudflare's distributed object storage offering take advantage of zero egress fees, with no costly replication across regions and no vendor lock-in. This strategic partnership with Cloudflare has already helped customers such as Allium save up to $645K per year using both Delta Sharing and Cloudflare R2.
Cross-Platform View Sharing is an exciting new feature that allows data providers to easily share views with any recipient. While views have been a popular mechanism for dynamic data sharing for years, view sharing is often confined to the same platform and cloud region, making it difficult to reach all users wherever they are. We are excited to share that Databricks customers will be able to securely share views with any recipient, regardless of which cloud, region, or platform they use. Cross-Platform View Sharing will be available in Private Preview soon, and you can sign up now to request access to the preview when it is available. Another Delta Sharing feature we're releasing in Private Preview is Materialized Views and Streaming Tables Sharing. Customers who use Delta Live Tables to build reliable and cost-effective data pipelines can now share the output of these pipelines with their recipients, without the need to create and maintain any additional copies or pipelines. Sign up to request access to the preview.
Customers told us that they need a sharing ecosystem that can access all the data they need, wherever it may live. We are very excited to announce Sharing for Lakehouse Federation, a new capability that enables customers to share data directly from where it is stored, without the need to copy it into Databricks. Data providers can easily grant access to data stored in their data warehouse or database (e.g., Snowflake, BigQuery, Redshift, MySQL, or PostgreSQL), giving Databricks customers access to the widest possible set of data sets without any additional overhead for providers. This feature will be available in Private Preview soon. Sign up to request access to the preview.
All of these new features add to the innovations of the past six months, including AI Model Sharing. Currently in Public Preview, AI Model Sharing allows you to share models with your partners and customers, who can deploy them in their Databricks environment using Mosaic AI. It makes it easy to share models across clouds and regions, while enabling recipients to protect the privacy of their data when using third-party models.
Announcing Clean Rooms Public Preview on AWS + Azure
Databricks Clean Rooms provides a privacy-safe environment for collaboration for all your data and AI assets without direct access to sensitive data. Today, we are announcing Databricks Clean Rooms will be in Public Preview (coming soon) on AWS and Azure. You can sign up here to get early access to the preview.
Organizations are looking for ways to securely exchange their data and collaborate with external partners to foster data-driven innovations. In the past, organizations had limited data sharing solutions, which forced them to relinquish control over how their sensitive data was shared with partners and left them with little to no visibility into how their data was consumed. This created a risk of data misuse and data privacy breaches. Customers who tried using other clean room solutions have told us these solutions are limited and do not meet their needs, as they often require all parties to copy their data into the same platform, do not allow sophisticated analysis beyond basic SQL queries, and offer limited visibility into or control over their data.
Organizations need an open, flexible, and privacy-safe way to collaborate on data, and Databricks Clean Rooms meets these critical needs.
- Any cloud, any platform: Clean Rooms provides secure, open, flexible collaboration powered by Delta Sharing. It allows you to collaborate across clouds, regions, and even across platforms using the new Sharing for Lakehouse Federation (see details above).
- Any language and workload of your choice: Unlike other data clean rooms on the market, Databricks Clean Rooms supports any language or workload, including native support for ML and AI with Python. Clean Rooms is a flexible interoperable solution, enabling organizations to collaborate with anyone, regardless of cloud or platform without the need for replication.
- Any scale: Clean Rooms also supports collaboration and operational capabilities at scale. With support for APIs, SQL commands, and built-in Databricks Workflows orchestration, you can easily automate Clean Room workloads. Collaborators also get approved output data directly in their Unity Catalog that can be conveniently used for subsequent use cases. Coming soon, multiple collaborators can work together in a Databricks Clean Room.
Databricks Marketplace ecosystem growth and product innovation
Many marketplaces are closed ecosystems, restricted to specific clouds or data warehouses, and often focused solely on data or simple applications. In June 2023, we launched the Databricks Marketplace, an open platform designed to meet all your data, analytics, and AI needs. Powered by Delta Sharing, the Marketplace offers a diverse array of datasets, AI models, notebooks, and solutions.
Over the past year, Databricks Marketplace has introduced several innovations such as AI Model Sharing on Marketplace, Volume Sharing on Marketplace (see recent blog, Shutterstock Uses Volume Sharing for Seamless Collaboration), Databricks to Open Sharing, Private Exchanges, and Solution accelerators to help data consumers discover and evaluate data products faster and accelerate their analytics and AI initiatives. The chart below provides a quick overview of these product feature releases and the benefits for customers.
Databricks Marketplace has also experienced remarkable growth: it now offers more than 2,000 listings of datasets, AI models, and solution accelerators, a 320% year-over-year increase in listings, along with a 300% increase in new data providers.
"Shutterstock is bringing its vast collection of nearly a billion creative content assets to the Databricks Marketplace, a platform renowned for fostering open data and AI collaboration. This integration provides unparalleled access to our extensive library of ethically-sourced visual content, propelling responsible AI and ML initiatives forward across various industries. We are excited to add Delta Sharing as a method to deliver data. Customers utilizing our rich dataset on Databricks can tap into new opportunities, catalyze product innovations, and secure a competitive advantage."— Aimee Egan, Chief Enterprise Officer, Shutterstock
Get started with Data Sharing and Collaboration in Databricks
Databricks enables open data sharing and collaboration, and we are looking forward to seeing how you use Delta Sharing, Databricks Marketplace, and Databricks Clean Rooms to innovate and deliver on your data and AI initiatives.
Be sure to stay connected with all our data sharing and collaboration updates at the Data and AI Summit from June 10-13, or watch livestreams of keynotes and select sessions.
Fill out our Databricks Clean Rooms interest form to register your interest before Public Preview is released. You can also enroll in the Delta Sharing Cross-Platform View Sharing private preview and the Delta Sharing Materialized Views and Streaming Tables Sharing private preview.