Databricks Marketplace Apps and Packaged Clean Rooms let data providers distribute IP as installable applications, keeping brand data where it belongs.
by Sridhar Sundaresan and Suvan Kaul
Brands invest heavily in building first-party data assets, including purchase histories, CRM records, loyalty programs, and website interactions. That data is fragmented across systems and difficult to activate across channels. And even when it is unified, first-party data alone tells only part of the story.
To build complete audience profiles, brands need to match their records against identity providers' spines, which supply cross-channel identity graphs spanning email, device IDs, cookies, and offline touchpoints.
The traditional approach is painful. Brands export customer records to a third-party platform, the identity provider runs its matching algorithms, and results come back days later. Every step introduces risk: data leaves the brand's secure environment, PII travels across networks, and compliance teams must review data-sharing agreements that can take weeks to negotiate.
At the same time, privacy regulations and platform restrictions have made:
This creates a fundamental gap: brands have data but lack the ability to connect it safely to a unified identity layer.
To bridge this, brands need to:
The Marketing Cloud, a global marketing services agency and a Stagwell company, experienced this friction firsthand across its brand clients. It pushed for a better model: one where brands could access Stagwell's identity matching capabilities without ever sending their raw data outside their own infrastructure.
Traditional clean room implementations are high-touch, engineering-heavy, and can be slow to deploy.
Databricks Marketplace Apps flip the traditional data-sharing model. Instead of "send us your data and we will process it," the model becomes "install our app and it runs where your data already lives." Brands can now install a pre-built application, connect their data, and run identity matching workflows without a lengthy onboarding cycle.
When an application is published to the Databricks Marketplace, any brand with a Databricks workspace can request access and install it directly. The app runs inside the brand's own environment with its own auto-provisioned service principal. The brand's data never crosses a network boundary.
This is a fundamental shift for data providers. Previously, distributing proprietary algorithms meant either exposing source code (which partners will not do) or requiring brands to export data (which compliance teams resist). Marketplace Apps solve both problems: the app's code is containerized and opaque to the consumer, while the brand's data stays in their Unity Catalog.
With marketplace distribution, deployment time drops from months to minutes, standardized workflows improve usability, and governance is baked into the platform. Stagwell was among the first partners to put this model into production.
Stagwell built a marketplace-ready clean room application on Databricks that enables secure ingestion of brand first-party data, matching against the Stagwell Identity Spine, privacy-safe insights generation, and seamless transition to audience creation and activation.
At its core, the system combines Databricks Clean Rooms for secure collaboration, Unity Catalog for governance and access control, Jobs and Notebooks for identity matching execution, and a React and Express app layer for user experience.
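To make the matching step concrete, here is a minimal sketch of one common way to match brand records against an identity spine: normalize each email, hash it, look the hash up in the spine, and report only an aggregate match rate. Everything in it is an assumption for illustration. The post does not say which match keys, hashing scheme, or suppression thresholds Stagwell uses, and the function names, spine IDs, and sample addresses are hypothetical.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent addresses hash identically."""
    return email.strip().lower()


def hash_email(email: str) -> str:
    """SHA-256 hex digest of the normalized email (an assumed match key)."""
    return hashlib.sha256(normalize_email(email).encode("utf-8")).hexdigest()


def match_to_spine(brand_emails, spine):
    """Return {brand_email: spine_id} for brand records found in the spine.

    `spine` maps a hashed email to a hypothetical spine ID.
    """
    matches = {}
    for email in brand_emails:
        key = hash_email(email)
        if key in spine:
            matches[email] = spine[key]
    return matches


def match_rate(brand_emails, matches, min_count=2):
    """Aggregate match rate, suppressed (None) below a minimum record count."""
    total = len(brand_emails)
    if total < min_count:
        return None
    return len(matches) / total


# Hypothetical data for illustration only.
spine = {
    hash_email("ana@example.com"): "S-001",
    hash_email("bo@example.com"): "S-002",
}
brand_emails = [" Ana@Example.com ", "cy@example.com", "bo@example.com", "di@example.com"]

matches = match_to_spine(brand_emails, spine)
rate = match_rate(brand_emails, matches)
print(f"matched {len(matches)} of {len(brand_emails)} records")
```

In the architecture described above, logic of this kind would run as a notebook inside the clean room, so the brand sees match results and aggregates rather than the spine itself.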

Here’s how the end-to-end flow works.
The app uses four distinct identity layers, each scoped to its purpose:
On-Behalf-Of (OBO) user token - When the brand user logs in, the app receives their OAuth token via the x-forwarded-access-token header. This token is used for any operation that touches the brand's data: previewing tables, querying the SQL warehouse, retrieving the brand's sharing identifier. Unity Catalog ACLs apply based on the user's identity.
App service principal - The auto-provisioned SP handles app-level operations: telemetry, internal state management, and calls to Stagwell's backend API. This identity is scoped to the app itself and does not carry user-level permissions.
Stagwell backend service principal - Stagwell's own M2M OAuth credentials manage the clean room lifecycle on their side: creating the clean room, adding assets, contributing notebooks, and designating the brand as runner.
Brand user personal access token (PAT) - The brand's clean room collaborator generates a scoped PAT with clean room, SQL, and Unity Catalog permissions and provides it during app installation via secret resource binding. This token carries the generating user's identity, which means it works natively across workspaces and enables operations that require clean room-level authorization on the brand side - such as adding brand tables and running the matching notebook.
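The four identity layers above can be thought of as a routing table from operation to credential. The sketch below is a simplified illustration, not Stagwell's implementation: the header name comes from the text, but the operation names, the `Identity` enum, and both helper functions are hypothetical. It is written in Python for brevity, although the app layer described here is React and Express.

```python
from enum import Enum

OBO_HEADER = "x-forwarded-access-token"  # header name given in the text


class Identity(Enum):
    OBO_USER = "obo_user_token"
    APP_SP = "app_service_principal"
    BACKEND_SP = "stagwell_backend_service_principal"
    BRAND_PAT = "brand_user_pat"


# Hypothetical operation names, grouped the way the four layers are described.
OPERATION_IDENTITY = {
    "preview_table": Identity.OBO_USER,
    "query_warehouse": Identity.OBO_USER,
    "get_sharing_identifier": Identity.OBO_USER,
    "send_telemetry": Identity.APP_SP,
    "call_backend_api": Identity.APP_SP,
    "create_clean_room": Identity.BACKEND_SP,
    "contribute_notebook": Identity.BACKEND_SP,
    "add_brand_table": Identity.BRAND_PAT,
    "run_matching_notebook": Identity.BRAND_PAT,
}


def extract_obo_token(headers):
    """Return the forwarded user token, matching the header name case-insensitively."""
    for name, value in headers.items():
        if name.lower() == OBO_HEADER and value:
            return value
    return None


def identity_for(operation):
    """Look up which identity layer should perform the given operation."""
    try:
        return OPERATION_IDENTITY[operation]
    except KeyError:
        raise ValueError(f"unknown operation: {operation}") from None


# Example: a request that previews a brand table uses the user's own token.
request_headers = {"X-Forwarded-Access-Token": "tok-123"}
token = extract_obo_token(request_headers)
layer = identity_for("preview_table")
```

Keeping the mapping explicit makes it harder for a data-touching operation to run under the app's service principal by accident, since that identity does not carry user-level permissions.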
Standard Clean Rooms require an approval step: the collaborator reviews and approves before any notebook can run. This makes sense for ad-hoc partnerships, but it creates friction for a marketplace distribution model where hundreds of brands might install the same app.
Packaged Clean Rooms remove this friction. When Stagwell creates a clean room designated as a packaged clean room, the brand can run notebooks immediately after the clean room is set up. No approval queue, no back-and-forth, no delays.
This is what makes the marketplace model viable at scale. A brand installs the app, connects their data, and runs their first identity match in minutes - not weeks.
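The difference between the two modes can be expressed as a small gating rule. The following is a toy model of the behavior described above, not the Databricks Clean Rooms API: the `CleanRoom` class, its fields, and the notebook name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CleanRoom:
    packaged: bool                      # True for a packaged clean room
    setup_complete: bool = False        # set once the clean room is set up
    approved_notebooks: set = field(default_factory=set)

    def approve(self, notebook: str) -> None:
        """Collaborator approval, needed only in the standard flow."""
        self.approved_notebooks.add(notebook)

    def can_run(self, notebook: str) -> bool:
        """Nothing runs before setup; packaged rooms skip the approval check."""
        if not self.setup_complete:
            return False
        if self.packaged:
            return True
        return notebook in self.approved_notebooks


standard = CleanRoom(packaged=False, setup_complete=True)
packaged = CleanRoom(packaged=True, setup_complete=True)

blocked_before_approval = not standard.can_run("identity_match")
standard.approve("identity_match")
runs_after_approval = standard.can_run("identity_match")
runs_immediately = packaged.can_run("identity_match")
```

The per-notebook approval in the standard flow is the step that does not scale across hundreds of installs; the packaged mode removes it while still requiring that the clean room be set up first.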
The industry is seeing a fundamental shift, from static data sharing, manual onboarding, and risk-heavy integrations toward secure governed collaboration, on-demand identity resolution, and productized data workflows.
Stagwell's app demonstrates a pattern that any data provider can follow. Consider the possibilities:
In each case, the value proposition is the same: the data provider monetizes their IP through the Marketplace, while the consumer gets insights and activates audiences without the compliance overhead of data sharing.
Stagwell’s approach illustrates how data depth amplifies this model. Their ID Spine combines behavioral signals with attitudinal data from The Harris Poll, Harris Quest Brand, and National Research Group - blending what consumers do with what they think to deliver audience quality that goes beyond standard identity matching.
For brands, this means faster time to insight, better audience understanding, stronger privacy compliance, and new ways to activate their first-party data. For the ecosystem, clean rooms and marketplaces are becoming the operating system for data collaboration.
The building blocks are all part of the Databricks platform: Unity Catalog for governance, Marketplace for distribution, Packaged Clean Rooms for privacy-safe computation, Delta Sharing for results delivery, and Databricks Apps for the runtime environment. What is new is how they compose into a complete distribution channel for data-driven applications.
The future of identity isn't just about better graphs - it's about making identity resolution accessible, secure, and scalable through productized experiences. And that's exactly what marketplace-driven clean room apps unlock.
If you are a data provider looking to distribute your algorithms and models through the Databricks Marketplace, here’s what to do next: