
Customer Data Platform Architecture

September 11, 2024

Descriptions of Customer Data Platform architecture nearly always start with some version of the “butterfly diagram.”  One wing represents data sources, the other wing represents outputs, and the body that connects them is the CDP itself.

In more detailed CDP architecture diagrams, the butterfly body is expanded to show components for the specific functions that assemble, store, and activate the CDP’s unified customer profiles.  

These functions are the heart of the CDP.  They include:

  • Capture data: collect data directly from source systems. In particular, this includes capturing website behaviors that are not usually stored in a data warehouse.
  • Ingest data: load data from source systems via API connections, queries, streaming feeds, and batch files. Key requirements revolve around connector management and real-time updates.
  • Prepare data: apply data quality processes, standardize values, detect events, and extract structured features from unstructured inputs.
  • Link data: identify data from different sources that relates to the same customer. This includes several types of matching and maintaining a persistent customer ID over time.
  • Profile customers: build customer profiles by combining all information about each customer and placing it in accessible formats. This includes creating derived variables and providing real-time access to data stored outside of the system.
  • Store data: provide database functions such as storing all data types, retaining full detail, building a suitable data model, and fulfilling privacy requirements.
  • Share profiles: enable external systems to read the profile data. Methods may include direct queries, API access, or real-time data sharing. This also involves privacy enforcement.
  • Integrate profiles: build automated processes that use profile data. The processes may execute within the CDP or in an external system.
  • Select profiles: enable end users to select and extract customer segments based on profile data.
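
To make a few of these functions concrete, here is a minimal Python sketch of the prepare, link, profile, and select steps. It is an illustration under simplifying assumptions, not how any particular CDP works: the sample records, field names, and function names are all hypothetical, and linking is reduced to an exact match on a standardized email address, whereas real systems typically use several types of matching.

```python
from collections import defaultdict

# Hypothetical raw records from two source systems (illustrative data only).
RAW_EVENTS = [
    {"source": "web", "email": " Ana@Example.com ", "event": "page_view", "amount": 0.0},
    {"source": "crm", "email": "ana@example.com", "event": "purchase", "amount": 40.0},
    {"source": "web", "email": "bo@example.com", "event": "page_view", "amount": 0.0},
]


def prepare(record):
    """Prepare data: standardize the email so records can be matched."""
    cleaned = dict(record)
    cleaned["email"] = record["email"].strip().lower()
    return cleaned


def link(records):
    """Link data: assign one persistent ID per distinct email (exact match only)."""
    ids = {}
    linked = []
    for rec in records:
        # Reuse the existing ID for a known email, otherwise mint the next one.
        customer_id = ids.setdefault(rec["email"], f"cust-{len(ids) + 1}")
        linked.append({**rec, "customer_id": customer_id})
    return linked


def build_profiles(linked):
    """Profile customers: combine each customer's events and derive variables."""
    profiles = defaultdict(lambda: {"events": [], "total_spend": 0.0, "sources": set()})
    for rec in linked:
        profile = profiles[rec["customer_id"]]
        profile["events"].append(rec["event"])
        profile["total_spend"] += rec["amount"]  # derived variable
        profile["sources"].add(rec["source"])
    return dict(profiles)


def select(profiles, min_spend):
    """Select profiles: return the IDs in a segment defined on profile data."""
    return sorted(cid for cid, p in profiles.items() if p["total_spend"] >= min_spend)


profiles = build_profiles(link([prepare(r) for r in RAW_EVENTS]))
buyers = select(profiles, min_spend=1.0)
```

In this toy run, the two "Ana" records from different sources end up under one customer ID, which is the point of the link step: the profile and the segment are only as good as the matching that precedes them.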

For a more detailed explanation of these categories, see this CDP Institute white paper.

Integrated CDP vs. Composable CDP Architectures

All of these functions are needed regardless of whether the profiles are built by a single, integrated CDP with its own data store, or are assembled by separate “composable CDP” components tied to an independent data warehouse.  But there is a key difference between the two approaches:

  • An integrated CDP architecture has data flowing directly from one component to the next. 
  • A composable CDP architecture (a.k.a. “warehouse-native CDP” architecture) accesses the data store separately for each component.  This may result in substantially higher processing costs than an integrated system, and it may require more technical work to set up the component connections.  (That said, the actual difference in cost depends on the details.  Some component products perform several functions in an integrated flow, so fewer separate components are involved.)
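
The difference in data store access can be illustrated with a small Python sketch. Everything here is hypothetical: `CountingStore` is a stand-in that only counts reads and writes, and the two processing steps are placeholders. It shows the access pattern only and says nothing about real-world costs, which depend on the details noted above.

```python
class CountingStore:
    """Hypothetical stand-in for a data store that counts reads and writes."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.reads = 0
        self.writes = 0

    def read(self):
        self.reads += 1
        return list(self.rows)

    def write(self, rows):
        self.writes += 1
        self.rows = list(rows)


def standardize(rows):
    """Placeholder component: trim and lowercase each value."""
    return [r.strip().lower() for r in rows]


def deduplicate(rows):
    """Placeholder component: drop duplicate values."""
    return sorted(set(rows))


STEPS = [standardize, deduplicate]


def run_integrated(store):
    """Integrated: data flows directly from one component to the next."""
    rows = store.read()
    for step in STEPS:
        rows = step(rows)
    store.write(rows)


def run_composable(store):
    """Composable: each component accesses the shared store separately."""
    for step in STEPS:
        store.write(step(store.read()))


integrated = CountingStore([" A@x.com", "a@x.com"])
composable = CountingStore([" A@x.com", "a@x.com"])
run_integrated(integrated)
run_composable(composable)
```

Both runs leave the same rows in the store, but the integrated run touches the store once on the way in and once on the way out, while the composable run reads and writes once per component. With more components the gap widens, unless a single component product covers several steps.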

Of course, the cost to your own company will depend on what systems you already have in place.  Use the CDP Institute’s Composable CDP Self-Assessment tool to take a detailed inventory and receive some suggestions for what to do next.  You can also take the CDP Institute’s free Build, Buy or Compose online course on how to compare packaged CDP vs. composable CDP alternatives.

Marketing Data Architecture

The expanded butterfly offers a helpful tool for ensuring that any proposed CDP architecture includes all necessary components.  But it captures only part of a complete data architecture, which extends beyond data flows to include supporting features such as security, governance, and metadata.  The butterfly also leaves out a substantial number of marketing-related applications including campaigns, analytics, content management, and marketing administration.  A more complete Marketing Data Architecture diagram would touch on all these items. 

The version shown here is organized into four layers, showing sources, data management, decisions, and delivery systems.  Data generally flows from the bottom to the top, with some significant feedback loops along the way.  You’ll notice this diagram doesn’t distinguish an integrated CDP architecture from a composable CDP architecture, or customer data from other types of marketing data.  These remain important distinctions, and many companies will find that storing some customer data separately from other marketing data actually makes the most practical sense.  But providing a coherent overview of data requirements is one purpose of defining a data architecture, so it’s important to provide at least one version that shows everything in one place.  This provides a foundation for further refinements in the design of the actual systems that bring the architecture to life.

CDP Architecture Views – From Simple to Advanced

We’ve seen that a CDP architecture shows the sources, processes, and outputs of a CDP system.  The approach taken here treats the building of profiles as the main purpose of a CDP, even though many CDPs provide additional functions such as analytics and campaign execution.  The focus on profile building makes it easier to expand the architecture definition to include a complete list of the specific functions, or components, a CDP must include to deliver those profiles.  Other marketing functions can be included in a broader marketing data architecture, which can include other systems and clarify how those relate to the CDP.