
Getting Closer to Classifying CDP Systems

November 2, 2017

It’s nearly two months since my last post on building a classification scheme for CDP vendors.   Work has continued behind the scenes.  I still don’t have a final solution but have made enough progress to warrant an update.

Here is where things stand: several CDP vendors have provided lists of technical features they see as important differentiators.  I’ve reviewed these and classified them by the core CDP functions they support.  These functions fall into two categories:

  • building the unified customer database itself (ingestion, transformation and identity unification, storage, and sharing with external systems).  All CDPs do this.
  • applications that use the database (analytics, campaign management, and personalization).  Many CDPs offer one or more of these but it’s not a core requirement.

Looking at the specific differentiators, it turns out that about half of them apply to nearly all customers who might want a CDP.  I’ve labeled those “base”.  The rest relate to specific channels or uses.  I’ve identified these as:

  • semi-structured and unstructured data
  • real time processing
  • Web marketing
  • mobile marketing
  • advertising
  • B2B marketing
  • offline marketing

I’m perfectly aware that these categories overlap: B2B marketers use the Web, Web marketing involves unstructured data, and so on.  But each category is something that some marketers might use and others might not.  Since the ultimate goal of this exercise is to help marketers find CDPs that meet their particular needs, breaking the features into categories related to those needs seems reasonable.

Naturally, two sets of dimensions (functions and uses) lend themselves to a matrix.  In practice, though, such a matrix would have many blank cells because so many items fall into the “base” use case.  So to make the presentation a bit more manageable, I’ve presented two tables below: one that lists the base items only and one that has each of the other uses in a column of its own.  In both cases, the rows are grouped by function.  (To be clear, there’s no particular relationship among items on the same row in different columns.  That said, it might make sense to have one row list the connectors appropriate for each use: B2B systems should have connectors to Salesforce.com, mobile systems should have connectors to SMS gateways, advertising systems should have connectors to DMPs, etc.)
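For readers who think in data structures, here is a rough sketch of that split. It is only an illustration, not how the spreadsheet was actually built: each item carries a function and a use, and anything tagged "base" goes to the first table while everything else lands in a column for its use. The names `ITEMS` and `split_tables` are made up for this example, and the list holds just a handful of entries taken from the tables in this post.

```python
from collections import defaultdict

# (function, use, item): a small sample of entries from the tables in this post.
# "base" marks items that apply to nearly all CDP buyers.
ITEMS = [
    ("ingest", "base", "batch file load"),
    ("ingest", "un- and semi-structured", "JSON load"),
    ("ingest", "mobile", "SDK for mobile load"),
    ("transform/unify", "base", "deterministic matching/stitching"),
    ("transform/unify", "B2B", "lead to account match"),
    ("transform/unify", "offline", "postal address standardization & cleaning"),
    ("store", "base", "data stores supported (specify)"),
]


def split_tables(items):
    """Return (base_table, use_table), both grouped by function.

    base_table maps function -> list of base items.
    use_table maps function -> use -> list of items for that use.
    """
    base_table = defaultdict(list)
    use_table = defaultdict(lambda: defaultdict(list))
    for function, use, item in items:
        if use == "base":
            base_table[function].append(item)
        else:
            use_table[function][use].append(item)
    return base_table, use_table


base_table, use_table = split_tables(ITEMS)

# Print the base table, one function group at a time.
for function, entries in base_table.items():
    print(function, "|", "; ".join(entries))
```

Keeping the use-specific items in a nested mapping avoids storing the many blank cells a full matrix would have.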

If you read the entries carefully, you’ll see that I’ve marked several with questions, mostly about whether they’re too vague.  Ideally the items would be specific enough that we could easily determine whether a particular vendor offers that item.  So things like “has API” are too general, since APIs differ greatly.  Tightening up these definitions, and adding more items as needed, is the next step in this process.
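To show why yes/no items matter, here is a minimal sketch of how a marketer might eventually use them, assuming each vendor can be described as a simple set of items it offers. The vendors ("Vendor A", "Vendor B"), their item sets, and the function names are all hypothetical and for illustration only; nothing here describes a real product.

```python
# Hypothetical vendors and the items each one offers (illustrative only).
VENDOR_ITEMS = {
    "Vendor A": {"batch file load", "SQL access to data", "lead to account match"},
    "Vendor B": {"batch file load", "SDK for mobile load"},
}


def vendors_meeting(required, vendor_items):
    """Return the sorted names of vendors that offer every required item."""
    required = set(required)
    return sorted(
        vendor for vendor, items in vendor_items.items() if required <= items
    )


def missing_items(required, vendor_items):
    """Map each vendor to the sorted list of required items it lacks."""
    required = set(required)
    return {
        vendor: sorted(required - items) for vendor, items in vendor_items.items()
    }


# A B2B marketer's (hypothetical) requirements.
needs = {"batch file load", "lead to account match"}
print(vendors_meeting(needs, VENDOR_ITEMS))
print(missing_items(needs, VENDOR_ITEMS))
```

A check like this only works if every item can be answered yes or no, which is exactly why a vague entry such as “has API” needs to be broken into specific capabilities first.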

As always, I eagerly await your comments and suggestions.

(Note: tables converted from Excel using Tabelizer. Worth knowing about!)

| Function | Base |
| --- | --- |
| ingest | API load (specify features) |
| | batch file load |
| | client and server-side APIs for real time & batch connections; data exchange via webhook, data layer, pixels, firehose, CSV |
| | marketer can set up data collection without writing code |
| | prebuilt connectors (specify system, categories) |
| transform/unify | deterministic matching/stitching |
| | extract input deltas (adds, changes, deletes) |
| | identify best value per element (‘golden record’) |
| | system-assigned customer ID (too vague?) |
| store | data stores supported (specify) |
| | ingest and match 3rd party data (too vague? Not needed? Specify connectors?) |
| | modify data structure without technical skills (too vague?) |
| | open APIs to build custom features (need to specify API capabilities?) |
| | scalable tech (need specifics e.g. virtual servers? Dynamic load balancing?) |
| share/access | API access to data (specify features?) |
| | prebuilt connectors to access data (specify channel? Functions?) |
| | SQL access to data (specify details e.g. speed, scale, all tables, SQL-like e.g. HQL?) |
| analytics | automated data normalization for predictive models |
| | automated feature extraction for predictive models |
| | automated model building & deployment tools within system (specify types? Features?) |
| | automated model validation |
| | content analytics (specify details?) |
| | GUI interface for analysis, profiling, mapping, segmentation |
| | high throughput for batch recommendations (specify speed?) |
| | ideal customer profile |
| | interpretable models (specify details) |
| | manual model building & deployment tools within system (specify skill level needed) |
| | marketer can define segments without writing code |
| | natural language processing (too vague?) |
| | product recommendations work with arbitrary catalog schema (too vague?) |
| | real time segmentation (specify speed, capabilities, GUI) |
| | system-generated incremental attribution models |
| | user-selected fractional attribution models |
| campaigns | automated segment delivery (specify features? GUI?) |
| | campaigns decision tree built with GUI interface (specify minimum features e.g. branches?) |
| | campaigns with multiple channels in same campaign |
| | campaigns with multiple waves in same campaign |
| | predictive model-based message selection |
| | real time trigger campaigns |
| | rule-based message selection |
| personalize | personalize content (specify how e.g. variable substitution, rule-based dynamic content; specify channels) |

| Function \ Use | un- and semi-structured | real time | web | mobile | advertising | B2B | offline |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ingest | JSON load | push to API to ingest continuous real-time data | capture non-click behaviors (time spent, scrolling, page tags, product categories, etc.) | SDK for mobile load | | | |
| | ingest nested JSON objects | streaming load | extract UTM parameters | SDK: automatic collection of standard device attributes & location | | | |
| | | real time load (specify speed, features) | JavaScript site tag captures behaviors (specify other features?) | SDK: batch data collection to save battery (vs streaming) | | | |
| | | | tag management | | | | |
| transform/unify | convert unstructured to structured (feature extraction, etc.) | | unknown to known conversion (keep history) | recognize devices and attach to customer ID | | lead to account match | find name/address/company matches based on similarity |
| | | | | | | | postal address standardization & cleaning |
| store | graph database (specify products?) | | | | | account/contact data structure | |
| | NoSQL database (specify?) | | | | | | |
| share/access | query nested JSON fields | 20 millisecond latency on API calls for identification, etc. | JavaScript connectors to access data (specify features?) | SDK connectors to access data | API connectors to Google, Facebook, etc. | | |
| | | real time access (specify speed, capabilities) | | SDK: push messages and in-app content to mobile | continuously recompute audience assignments | | |
| | | | | | cookie synch w/DMP, DSP, etc. | | |
| | | | | | DMP functions (specify) | | |
| | | | | | ID synch with Google, Facebook, LiveRamp | | |
| analytics | | | templates let marketer build & deploy web experiences without writing HTML, CSS, JS | | include/exclude user segments for ad campaigns | | |
| | | | | | manage display, social audiences | | |
| personalize | | return recommendations in real time (specify speed?) | personalize on anonymous users (based on device, campaign, referrer, location, weather, history, etc.) | SDK for personalization (specify features) | | | |