CDP 2.0: Why Zero Waste Is Now
November 18, 2024
Introduction
As we head into 2025, the customer data platform (CDP) space is more dynamic—and more confusing—than ever. With a growing variety of CDPs and architectures, it’s imperative to understand what each solution truly offers. In this crowded landscape, marketers and IT teams alike need CDPs that provide a strong first-party data foundation while leveraging advanced machine learning and AI capabilities to drive customer experiences and data-driven decisions. But how can buyers distinguish real value from marketing hype?
mParticle’s new blog series, CDP 2.0: Why Zero-Waste Is Now, aims to cut through the noise by exploring major developments in the CDP space and offering guidance to help you choose the right solution for your organization’s unique goals and context.
From CDP 1.0 to CDP 2.0: Moving Beyond Operational Efficiency
The first generation of CDPs, or CDP 1.0, emerged over a decade ago to solve data fragmentation through integration. Integrating data sources, destinations, and customer profiles both created operational efficiency and solved challenges around data access.
These early CDPs offered essential functions, including:
- Data Unification: Combining data from multiple sources into one repository.
- Profile Creation: Building a unified view of each customer.
- Segmentation: Enabling targeted campaigns across channels.
- Simplified Integrations: Streamlining data flows into marketing tools.
Over time, however, this functionality became commoditized, and differentiation narrowed to vertical specialization, real-time capabilities, and integration quality. Outside of these aspects, choosing between providers based solely on functionality became difficult.
The Rise of Cloud Data Warehouses and the Composable CDP
In parallel with the rise of CDPs, cloud data warehouses (CDWs) like Snowflake, BigQuery, and Redshift became central to IT initiatives around data governance and data quality. The adoption of CDWs led to the birth of warehouse-native (“composable”) CDPs, designed to operate as overlays on CDWs.
These CDPs touted the virtues of modularity, allowing businesses to pay only for the features they needed. On the surface, this seems like a win—after all, why buy a standalone CDP when you can layer CDP-like capabilities on top of your CDW? However, the warehouse-native approach introduces hidden complexities in the form of exploding compute costs and architectural compromises, while failing to uniquely satisfy important marketing use cases.
Zero-Copy as a Data Strategy: Perception vs. Reality
In recent years, zero-copy deployments have become a common feature requirement in data and IT team-led CDP evaluations. One of the key selling points of warehouse-native CDPs is their ability to function in a “zero-copy” fashion, promising to eliminate data duplication. In theory, zero-copy architectures allow CDPs to operate directly on the data in the CDW, reducing storage costs and enhancing governance by consolidating operations in one place.
However, zero-copy isn’t always as advantageous as advertised. Many warehouse-native CDPs still create data “mirrors” for performance reasons, especially for low-latency needs, which contradicts the zero-copy promise. At scale, accessing data in the CDW will also be slower than using a dedicated, optimized data store, leading to bottlenecks and overhead.
Zero-copy is a noble undertaking for data storage optimization; however, promises of reduced maintenance burden and cost savings are unfounded.
The Hidden Costs of Warehouse-Native CDPs
Warehouse-native CDPs can be cost-effective for organizations without a sophisticated audience segmentation strategy. As usage grows and use cases become more demanding, however, costs scale linearly at first and then exponentially.
Based on our research comparing the cost of assembling audiences via a zero-copy approach versus using mParticle, hourly refreshes are 25x the cost, and aiming for near real-time (5-minute refreshes) drives up costs by 50x. Unoptimized queries, high query concurrency, and a greater number of refreshes (scans) combine to erode the progress the marketing team is making for the business.
These costs often become apparent only after deployment, when the strain on the CDW budget limits the organization’s ability to fully leverage CDP functionalities.
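To make the refresh-frequency effect concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the model behind the research figures above and does not reproduce them. It assumes the simplest possible case, in which every refresh rescans the same amount of warehouse data at a flat per-terabyte price. The `RefreshPlan` fields and every number in the example are hypothetical placeholders; substitute your own warehouse pricing and query profile.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class RefreshPlan:
    """Hypothetical inputs for a rough warehouse-side refresh estimate."""
    audiences: int                  # audience definitions refreshed each cycle
    tb_scanned_per_refresh: float   # TB scanned per audience per refresh (assumed)
    price_per_tb: float             # illustrative flat price per TB scanned


def refreshes_per_day(interval_minutes: int) -> int:
    """Number of refresh cycles that fit into one day at the given interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return MINUTES_PER_DAY // interval_minutes


def daily_scan_cost(plan: RefreshPlan, interval_minutes: int) -> float:
    """Naive daily cost: cycles x audiences x TB scanned x price per TB."""
    per_cycle = plan.audiences * plan.tb_scanned_per_refresh * plan.price_per_tb
    return refreshes_per_day(interval_minutes) * per_cycle


if __name__ == "__main__":
    # All values below are made up for illustration only.
    plan = RefreshPlan(audiences=40, tb_scanned_per_refresh=0.25, price_per_tb=5.0)
    baseline = daily_scan_cost(plan, MINUTES_PER_DAY)  # one refresh per day
    for label, interval in [("daily", MINUTES_PER_DAY), ("hourly", 60), ("5-minute", 5)]:
        cost = daily_scan_cost(plan, interval)
        print(f"{label:>8}: {refreshes_per_day(interval):>3} refreshes/day, "
              f"cost {cost:,.2f} ({cost / baseline:.0f}x daily baseline)")
```

Even under these simplified assumptions, moving from one refresh per day to a refresh every five minutes multiplies the number of scans by 288. Real bills also depend on caching, incremental processing, concurrency, and warehouse sizing, none of which this sketch models.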
Composable Gets Conflated
The term “composability” is widely used to describe platforms that are modular and flexible, but warehouse-native CDPs have co-opted it to include zero-copy support, rapid deployment, and CDW alignment. These additions blur the line between the virtues of composability and those of warehouse-native models, complicating buyer decisions.
While composable CDPs offer unbundled pricing, their primary purpose is to reallocate compute cycles back into cloud data warehouses. This may not matter for teams with basic needs or unlimited resources, but the hidden costs are real.
Embracing the Zero-Waste CDP Model
At mParticle, we are committed to delivering innovative solutions that support our customers’ adoption of the Lakehouse architecture. We recognize the need for governance and are committed to helping our customers eliminate unnecessary data copies; however, we advocate for a “zero-waste” strategy that aims to maximize net economic value. We believe there is an opportunity to bypass the cloud data warehouse altogether by connecting directly to the storage layer within your environment. Our approach focuses on three core principles:
- Optimized Compute: Efficient handling of compute-heavy tasks such as audience construction and updating.
- Integrated Intelligence: Leveraging AI and machine learning, zero-waste CDPs streamline workflows with predictive audience creation and NLP for segmentation.
- Enhanced Addressability: Real-time identity resolution enables multi-channel personalization and, combined with investments in identity enrichment, ensures that marketers can unlock the full potential of their audience data and run campaigns at unmatched scale.
Choosing the Right CDP in 2025
With many options in the CDP market, choosing the right solution can be challenging. Key factors to consider include:
- True Cost Transparency: Understand hidden compute costs, especially for frequent audience refreshes, along with the engineering resources required to implement and maintain the CDP as new use cases are added.
- Multi-Channel Strategies: Ensure the platform handles real-time identity resolution and can scale across all channels: paid, earned, and owned.
- Scalability & Security: Verify that the CDP can grow with your data needs without overwhelming your budget, and make sure that the vendor and its own key vendors have a flawless security history.
- Infrastructure Compatibility: Choose a CDP that complements your current tech stack and CDW.
- Data Governance and Quality: Confirm that the CDP supports governance and compliance, especially in regulated environments, and can keep data consistent across your environments.
Please feel free to reach out to marketing@mparticle.com with questions or comments.