Customer data is the lifeblood of modern organizations in every industry. As organizations level up their data teams and practices with the Data Lakehouse, they increasingly use the Lakehouse not just as the source of truth for analytics but also as an engine powering marketing, operations, personalization, and more.

Databricks Ventures invested in Hightouch to power the Data Lakehouse-native Customer Data Platform (CDP). Hightouch provides all the features that Databricks users need to collect, store, model, and activate customer data directly from the Lakehouse. This Lakehouse-centric architecture creates a complete Composable CDP centered on your own data infrastructure. Read this blog to learn more about what this Lakehouse-native Composable CDP really means, why it’s the best approach to customer data, and, most importantly, how you can get started building one yourself.

What is a Customer Data Platform (CDP)?

CDPs offer companies a way to gather, store, model, and activate customer data. Ultimately, they power use cases for this customer data by sending it to downstream tools that marketers, advertisers, and other business users rely on daily. CDPs promise to be the source of truth for customer data and help companies build a 360-degree view of each customer.

Generally, a CDP has several key components:

  • Event Tracking: Gather digital user interactions with an SDK that can be implemented on websites or mobile applications.
  • Identity Resolution: Deduplicate and unify disparate customer records from different devices, interactions over time, and more.
  • Audience Building: Define different segments of customer audiences in a marketer-friendly UI.
  • Data Activation: Sync customer audiences to downstream tools to power marketing automation, business operations, and more.
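To make the identity resolution component more concrete, here is a minimal sketch of one common approach: treating records that share any identifier (an email or a device ID) as the same customer and grouping them with a union-find structure. The record layout, field names, and the `resolve_identities` helper are all hypothetical and chosen for illustration; they are not Hightouch's or Databricks' API, and production identity resolution usually adds fuzzy matching and precedence rules on top of this.

```python
from collections import defaultdict

# Hypothetical raw records: each has a record id plus whatever identifiers were captured.
records = [
    {"id": "r1", "email": "ana@example.com", "device": "d-100"},
    {"id": "r2", "email": None, "device": "d-100"},
    {"id": "r3", "email": "ana@example.com", "device": "d-200"},
    {"id": "r4", "email": "bo@example.com", "device": "d-300"},
]


def resolve_identities(records, keys=("email", "device")):
    """Group records that share any identifier value, using union-find."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (key, value) -> first record id carrying that identifier
    for r in records:
        for key in keys:
            value = r.get(key)
            if value is None:
                continue  # a missing identifier never links two records
            if (key, value) in first_seen:
                union(first_seen[(key, value)], r["id"])
            else:
                first_seen[(key, value)] = r["id"]

    # Collect the record ids under each root into one unified profile.
    profiles = defaultdict(list)
    for r in records:
        profiles[find(r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in profiles.values())


profiles = resolve_identities(records)
print(profiles)
```

In this sample, r2 joins r1 through the shared device and r3 joins r1 through the shared email, so three raw records collapse into one profile while r4 stays on its own.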

Historically, CDP solutions were all-in-one bundled platforms. If you purchase a traditional CDP, it will collect and model customer data within its own dedicated data storage and provide tooling to build audiences and activate them from this separate storage layer.

What is a Composable Customer Data Platform?

The Composable CDP is a new approach to customer data that puts your existing data infrastructure, like the Data Lakehouse, at the center of your operations. While traditional CDPs are bundled platforms with their own data storage, Composable CDPs are unbundled, giving you more flexibility in your tech stack, and allowing you to use the Data Lakehouse for data storage and modeling.

Building a complete and composable CDP on the Lakehouse

The Composable CDP on the Data Lakehouse has become a powerful and popular solution for customer marketing. This popularity has led many CDPs to market with the word “Composable,” sometimes erroneously, so it’s essential to clearly define what Composability actually means.

The Composable CDP is different from a traditional CDP in 4 key ways:

  • It runs on your data infrastructure. The Composable CDP allows you to use all the data and modeling already in your Data Lakehouse without requiring you to copy data into an external black box.
  • It is schema agnostic. The Data Lakehouse and, therefore, the Composable CDP has no limitations or opinions on how data should look. You can organize customer data around whatever entities matter for your business, such as households, pets, bank accounts, or anything else. Traditional CDPs rely on rigid data models built around users and events.
  • It is modular and interoperable. Every enterprise has some data infrastructure (like event collection, ETL, dashboards, etc.) before deciding to buy a CDP. The Composable CDP works with what you already have and then fills in any gaps in capability that you need.
  • It has unbundled pricing. You only need to pay for the capabilities you’ll use—not the shelfware that comes with the platform.

Advantages of the Lakehouse Composable CDP

The Lakehouse Composable CDP benefits from the data investments and modeling your team is already making in the Data Lakehouse. This single source of truth, together with the machine learning built on it, can power all business use cases. This creates a virtuous feedback cycle between business and data teams: business teams can easily use existing data and then communicate with data teams about additional models or attributes that would help further innovation. For example, Mews, which runs products used by over 3,500 hospitality brands, uses the Lakehouse to unify its disparate data into a single source of truth before powering use cases directly from it.

The Lakehouse Composable CDP’s data comprehensiveness is matched by its data flexibility. The Lakehouse can match data to whatever schema your business needs. Traditional CDPs are limited to web events and other narrow user attributes that fit into their predefined schema. The Lakehouse is better equipped to give complex companies the correct data for their CDP use cases. For example, PetSmart runs marketing campaigns from the Lakehouse based on the pets each person owns, and that “pet” entity couldn’t be supported by a traditional CDP. A traditional CDP only has data models for events and users (people), so it’s not feasible to also track each user’s multiple “pets” and their associated traits like birthdays, medications, food brands, and more.
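As a rough illustration of what a schema-agnostic model allows, the sketch below treats pets as a first-class entity linked to customers and builds an audience from a pet trait rather than a user trait. The `Pet` class, its fields, the sample rows, and the `birthday_audience` helper are hypothetical; they are not PetSmart's actual data model or a Hightouch feature, and in practice this logic would more likely live in a Lakehouse table and query.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Pet:
    """A hypothetical 'pet' entity; one customer can own several."""
    owner_id: str
    name: str
    species: str
    birthday: date


# Illustrative sample data only.
pets = [
    Pet("c1", "Milo", "dog", date(2019, 3, 14)),
    Pet("c1", "Luna", "cat", date(2021, 7, 2)),
    Pet("c2", "Rex", "dog", date(2020, 7, 30)),
    Pet("c3", "Kiwi", "bird", date(2018, 11, 5)),
]


def birthday_audience(pets, month):
    """Map each owner id to the names of their pets born in the given month."""
    audience = {}
    for pet in pets:
        if pet.birthday.month == month:
            audience.setdefault(pet.owner_id, []).append(pet.name)
    return audience


july = birthday_audience(pets, 7)
print(july)
```

The audience is keyed by customer but defined by the pet entity, which is the kind of relationship a fixed users-and-events schema has no place for.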

The Data Lakehouse also excels at data governance, providing full transparency, assurance, and auditability at each step of your customer data architecture. Using a CDP powered by the Lakehouse, your data team fully controls and owns your customer data rather than delegating that ownership and power to a black box third-party system.

Building a Composable CDP architecture around the Lakehouse also ensures that you remain modular and future-proofed. If you want to change out parts of your CDP tech stack, like event collection, you can freely do so at your discretion, as your core data assets remain safe in the Lakehouse regardless of the rest of your tech stack. You don’t get locked into a monolithic CDP vendor but instead can choose the right tech provider for each CDP use case as your business evolves.

In addition, the Lakehouse Composable CDP delivers a stronger return on investment than a traditional CDP. You get a far faster time to value because you work with your existing infrastructure rather than starting from scratch with a new system. A Composable CDP is also more cost-effective. Some of this cost efficiency comes from vendor selection: you purchase only the CDP components you need rather than buying an all-in-one platform with redundant features. You also don’t have to pay for storage and compute in an additional, redundant platform, and you benefit from the economies of scale you have with your main source-of-truth Data Lakehouse.

Composable CDP Methodology: The Hightouch-Databricks Partnership

Hightouch and Databricks work better together and provide businesses with the best way to activate their customer data, which is why Databricks invested in Hightouch.

Hightouch provides all of the components an organization needs to compose its Lakehouse-native CDP, including:

  • Event Collection: collect events and load them into the Lakehouse
  • Identity Resolution: model data and unify Customer 360 profiles in the Lakehouse
  • Audience Building: group customers into audiences based on their attributes on the fly and coordinate marketing campaigns and experiments across audiences in a marketer-friendly interface
  • Data Activation: sync data from the Lakehouse to 200+ downstream tools to enable your CDP use cases like lifecycle marketing, advertising campaigns, operational analytics, and more

Hightouch also offers Lakehouse users features that traditional CDPs do not. For example, Match Booster enriches first-party data with third-party identifiers in flight to ad platforms, increasing match rates for Databricks customers and performing a role similar to that of a data onboarding platform like LiveRamp. The Personalization API also allows websites and apps to call out to predictive models in the Data Lakehouse to power real-time personalization.

Importantly, Hightouch fully embraces the idea of a composable CDP: you can build on the Lakehouse with as many or as few of these offerings as you need. If you’d rather perform identity resolution with dbt directly in the Lakehouse, you don’t need to buy a redundant service from Hightouch. Composability means choosing your own adventure: you focus on your organization's needs and add only what you need.

Getting Started

Building a Composable CDP on the Data Lakehouse has never been easier. You can get started with Databricks for free and speak with Hightouch’s solution engineers to determine an implementation plan for the Composable CDP features you need.
