We’re pleased to announce Databricks Marketplace, an open marketplace for exchanging data products such as datasets, notebooks, dashboards, and machine learning models. To accelerate insights, data consumers can discover, evaluate, and access more data products from third-party vendors than ever before. Providers can now commercialize new offerings and shorten sales cycles by providing value-added services on top of their data. Databricks Marketplace is powered by Delta Sharing, which allows consumers to access data products without having to be on the Databricks platform. This open approach lets data providers broaden their addressable market without forcing consumers into vendor lock-in.
This blog will discuss the key limitations of existing data marketplaces and our vision for an open marketplace on the Databricks Lakehouse Platform.
Existing data marketplaces fail to maximize business value for data providers and data consumers
The demand for third-party data to drive data-driven innovation is greater than ever. Data marketplaces act as a bridge between data providers and data consumers, facilitating the discovery and delivery of datasets. However, as organizations continue to use more third-party data, the value these platforms provide has not kept up with the needs of either providers or consumers.
Challenges for data consumers
Data consumers value ease of data discovery and frictionless data evaluation from a data marketplace.
However, existing data marketplaces that provide only datasets miss one of the key considerations for data consumers: the context around the data. In most current data marketplaces, consumers receive a brief overview of the datasets and maybe a few sample queries. This often leads to frustration, as consumers have to spend time understanding the data model and going back and forth with the data provider’s support teams before they can determine whether the data is the right fit for their analytic needs.
Additionally, most current marketplaces operate as walled gardens. Data exchange can only be done on their closed platforms, and sometimes only within their proprietary data formats. There are limited options to access the data seamlessly from third-party tools or platforms, so data consumers are forced onto the platform, creating lock-in.
Challenges for data providers
From the data providers’ perspective, two important measures of success are increased sales and lower operational costs. However, most data marketplaces fall short on both measures.
With existing data marketplaces, data providers can only package and distribute datasets, and most marketplaces limit providers to a brief write-up or out-of-context query examples to augment their dataset product profiles. Data consumers end up incurring significant effort and downstream costs to evaluate these datasets. This results in cumbersome onboarding, unnecessarily long sales cycles, and eventually lost revenue opportunities.
Additionally, many data marketplaces require data providers to load data into their proprietary format, use their compute, and replicate data into the different clouds and regions in which their customers operate. This quickly increases compute costs and operational burden, as more and more moving parts are added to the system to maintain parity across cloud providers and regions. As the number of datasets and their volume grow, data providers must weigh these costs and trade-offs. Some data providers may be left with the decision to deprioritize potentially valuable datasets as the cost to commercialize them grows.
Unlock business value with Databricks Marketplace
The vision behind Databricks Marketplace is to address these problems and help consumers and providers achieve their business objectives.
Benefits for Data Consumers
Faster time to insights
With Databricks Marketplace, data consumers can access not only datasets but also other data assets, including dashboards, notebooks, and ML models. This gives data consumers an easy way to evaluate data and accelerate time to insights. For example, data consumers can use a starter notebook for exploratory data analysis, or a machine learning model that helps predict future rankings based on the dataset. Databricks-hosted dashboards let customers explore the data live, at no additional cost, before they request access. All of this helps speed up the evaluation, acquisition, and analysis cycle so consumers get more value from the data.
An open marketplace
Powered by Delta Sharing, Databricks Marketplace allows data consumers to access data products seamlessly without needing to be on the Databricks platform. There is no lock-in, and consumers can get the most value from the data using the tools of their choice.
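As a rough illustration of what access from outside the Databricks platform can look like, the sketch below uses the open-source `delta-sharing` Python connector to read a shared table into pandas. The profile file name (`config.share`) and the share, schema, and table names are hypothetical placeholders, and the helper functions are our own; how Databricks Marketplace itself issues credentials is not covered here.

```python
"""Sketch: reading a Delta Sharing table from plain Python, outside Databricks."""


def build_table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame.

    Assumes `pip install delta-sharing` and a valid profile file
    supplied by the data provider.
    """
    # Imported lazily so the URL helper above works without the package installed.
    import delta_sharing

    return delta_sharing.load_as_pandas(
        build_table_url(profile_path, share, schema, table)
    )


if __name__ == "__main__":
    # Hypothetical names: replace them with the values your provider gives you.
    print(build_table_url("config.share", "esg_share", "default", "company_rankings"))
```

With a real profile file in hand, calling `load_shared_table("config.share", "esg_share", "default", "company_rankings")` would return a DataFrame that any pandas-based tool can work with.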
Benefits for Data Providers
Distribute and monetize a wide array of data products
With Databricks Marketplace, providers can market and distribute not only datasets but also other data products, such as notebooks, dashboards, and models, that are essential to helping consumers realize the full value of a dataset.
Let’s say a provider is selling environmental, social, and governance (ESG) data. The provider can package the data with a notebook that shows how the data can be used for NLP analysis, a dashboard that visualizes the worst-polluting companies, and a model that shows how the shared ESG data can provide recommendations on when a company’s ESG ranking will change. With existing data marketplaces, there is no easy way for providers to share all of these highly valuable assets.
Broaden the reach of the data products
With Databricks Marketplace, data providers can expand their addressable market beyond consumers who are on the Databricks platform. This helps data providers increase the revenue potential of their data products.
No replication of data products
Databricks Marketplace allows data providers to share their data products without having to move or replicate them from their cloud storage. This allows providers to deliver data products to other clouds, tools, and platforms from a single source. Providers may still replicate data products if they wish, but doing so is their choice rather than a requirement that incurs additional costs.
What Databricks Partners are saying:
“Databricks Marketplace is a compelling platform for us. We like the fact that it is open and provides us a way to reach existing and new types of personas for our data offerings. We see the platform as a key enabler to accelerate value with our data offerings to our customers”
– Chris Anderson, CTO Intellectual Property Solutions, LexisNexis
“Customers need solutions, not only raw data. Being able to package raw data along with the code and analytics on top of it is how we see customers consuming raw data in the future”
– Ross Epstein, VP New Projects, SafeGraph
“Facteus is extremely excited to be part of the inception of the Databricks Marketplace. A marketplace built on their Delta Share protocol is a huge step forward in democratizing and simplifying data access.”
– Jonathan Chin, Co-Founder Head of Data and Growth, Facteus
“With more than 1.2B non-identified patient records, IQVIA has unparalleled healthcare data and is focused on advancing innovation for a healthier world. We are looking forward to the upcoming launch of Databricks’ Delta Sharing Marketplace to enable seamless data sharing with our customers, which will accelerate time to insights and value across the ecosystem.”
– Avinob Roy, VP & GM Product Management, IQVIA