Getting Closer to Classifying CDP Systems
November 2, 2017
It’s nearly two months since my last post on building a classification scheme for CDP vendors. Work has continued behind the scenes. I still don’t have a final solution but have made enough progress to warrant an update.
Here is where things stand: several CDP vendors have provided lists of technical features they see as important differentiators. I’ve reviewed these and classified them based on the core CDP functions they support. These functions fall into two categories:
- building the unified customer database itself (ingestion, transformation and identity unification, storage, and sharing with external systems). All CDPs do this.
- applications that use the database (analytics, campaign management, and personalization). Many CDPs offer one or more of these but it’s not a core requirement.
Looking at the specific differentiators, it turns out that about half of them apply to nearly all customers who might want a CDP. I’ve labeled those “base”. The rest relate to specific channels or uses. I’ve identified these as:
- semi-structured and unstructured data
- real time processing
- Web marketing
- mobile marketing
- advertising
- B2B marketing
- offline marketing
I’m perfectly aware that these categories overlap: B2B marketers use the Web, Web marketing involves unstructured data, and so on. But each category is something that some marketers might use and others might not. Since the ultimate goal of this exercise is to help marketers find CDPs that meet their particular needs, breaking the features into categories related to those needs seems reasonable.
Naturally, two sets of dimensions (functions and uses) lend themselves to a matrix. In practice, though, such a matrix would have many blank cells because so many items fall into the “base” use case. So to make the presentation a bit more manageable, I’ve presented two tables below: one that lists the base items only and one that has each of the other uses in a column of its own. In both cases, the rows are grouped by function. (To be clear, there’s no particular relationship among items on the same row in different columns. That said, it might make sense to have one row list the connectors appropriate for each use: B2B systems should have connectors to Salesforce.com, mobile systems should have connectors to SMS gateways, advertising systems should have connectors to DMPs, etc.)
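As a rough sketch of the split described above (not the actual spreadsheet behind the tables), each item can be held as a (function, use, item) record and then divided into a base list and a function-by-use matrix. The sample records are a small illustrative subset of the tables below, and the name `split_by_use` is made up for this example.

```python
from collections import defaultdict

# Illustrative subset only: a few items copied from the tables in this post.
# Each record is (function, use, item); "base" marks items that apply to
# nearly all CDP buyers.
FEATURES = [
    ("ingest", "base", "batch file load"),
    ("ingest", "un- and semi-structured", "JSON load"),
    ("ingest", "mobile", "SDK for mobile load"),
    ("transform/unify", "base", "deterministic matching/stitching"),
    ("transform/unify", "B2B", "lead to account match"),
]


def split_by_use(features):
    """Split records into a base table and a function-by-use matrix.

    base_table maps function -> list of base items.
    use_table maps function -> use -> list of items for that use.
    """
    base_table = defaultdict(list)
    use_table = defaultdict(lambda: defaultdict(list))
    for function, use, item in features:
        if use == "base":
            base_table[function].append(item)
        else:
            use_table[function][use].append(item)
    return base_table, use_table


base_table, use_table = split_by_use(FEATURES)

# Print the base table grouped by function, then the per-use items.
for function, items in base_table.items():
    print(function, "| base |", "; ".join(items))
for function, uses in use_table.items():
    for use, items in uses.items():
        print(function, "|", use, "|", "; ".join(items))
```

Keeping the base items in their own structure mirrors the reason for presenting two tables: a single matrix would be mostly blank cells outside the base column.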
If you read the entries carefully, you’ll see that I’ve marked several with questions, mostly about whether they’re too vague. Ideally the items would be specific enough that we could easily determine whether a particular vendor offers each one. So things like “has API” are too general, since APIs differ greatly. Tightening up these definitions, and adding more items as needed, is the next step in this process.
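To show why specificity matters, here is a minimal sketch of the kind of yes/no check that becomes possible once every item is concrete enough to answer per vendor. The vendor names and their feature sets are invented for illustration, and the helper names `missing_items` and `matching_vendors` are hypothetical.

```python
# Hypothetical data: "Vendor A" and "Vendor B" and their feature sets are
# invented for this example and do not describe any real product.
VENDOR_FEATURES = {
    "Vendor A": {
        "batch file load",
        "deterministic matching/stitching",
        "SDK for mobile load",
    },
    "Vendor B": {
        "batch file load",
        "deterministic matching/stitching",
        "lead to account match",
    },
}


def missing_items(required, offered):
    """Return the required items a vendor does not offer, sorted by name."""
    return sorted(set(required) - set(offered))


def matching_vendors(required, vendor_features):
    """Return the names of vendors that offer every required item."""
    return [
        name
        for name, offered in vendor_features.items()
        if not missing_items(required, offered)
    ]


# A B2B marketer's needs: one base item plus one B2B-specific item.
needs = {"batch file load", "lead to account match"}
print(matching_vendors(needs, VENDOR_FEATURES))
print(missing_items(needs, VENDOR_FEATURES["Vendor A"]))
```

A check like this only works when each item has a clear yes or no answer. An entry such as “has API” would be true for almost every vendor and so would not separate them.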
As always, I eagerly await your comments and suggestions.
(Note: tables converted from Excel using Tabelizer. Worth knowing about!)
| Function | base |
|---|---|
| ingest | API load (specify features) |
| | batch file load |
| | client and server-side APIs for real time & batch connections |
| | data exchange via webhook, data layer, pixels, firehose, CSV |
| | marketer can set up data collection without writing code |
| | prebuilt connectors (specify system, categories) |
| transform/ unify | deterministic matching/stitching |
| | extract input deltas (adds, changes, deletes) |
| | identify best value per element (‘golden record’) |
| | system-assigned customer ID (too vague?) |
| store | data stores supported (specify) |
| | ingest and match 3rd party data (too vague? Not needed? Specify connectors?) |
| | modify data structure without technical skills (too vague?) |
| | open APIs to build custom features (need to specify API capabilities?) |
| | scalable tech (need specifics e.g. virtual servers? Dynamic load balancing?) |
| share/ access | API access to data (specify features?) |
| | prebuilt connectors to access data (specify channel? Functions?) |
| | SQL access to data (specify details e.g. speed, scale, all tables, SQL-like e.g. HQL?) |
| analytics | Automated data normalization for predictive models |
| | Automated feature extraction for predictive models |
| | automated model building & deployment tools within system (specify types? Features?) |
| | Automated model validation |
| | content analytics (specify details?) |
| | GUI interface for analysis, profiling, mapping, segmentation |
| | high throughput for batch recommendations (specify speed?) |
| | ideal customer profile |
| | Interpretable models (specify details) |
| | manual model building & deployment tools within system (specify skill level needed) |
| | marketer can define segments without writing code |
| | natural language processing (too vague?) |
| | product recommendations work with arbitrary catalog schema (too vague?) |
| | real time segmentation (specify speed, capabilities, GUI) |
| | system-generated incremental attribution models |
| | user-selected fractional attribution models |
| campaigns | automated segment delivery (specify features? GUI?) |
| | campaigns decision tree built with GUI interface (specify minimum features e.g. branches?) |
| | campaigns with multiple channels in same campaign |
| | campaigns with multiple waves in same campaign |
| | predictive model-based message selection |
| | real time trigger campaigns |
| | rule-based message selection |
| personalize | personalize content (specify how e.g. variable substitution, rule-based dynamic content; specify channels) |
Use | |||||||
---|---|---|---|---|---|---|---|
Function | un- and semi-structured | real time | web | mobile | advertising | B2B | offline |
ingest | JSON load | push to API to ingest continuous real-time data | capture non-click behaviors (time spent, scrolling, page tags, product categories, etc.) | SDK for mobile load | |||
ingest nested JSON objects | streaming load | extract UTM parameters | SDK: automatic collection of standard device attributes & location | ||||
real time load (specify speed, features) | Javascript site tag captures behaviors (specify other features?) | SDK: batch data collection to save battery (vs streaming) | |||||
tag management | |||||||
transform/ unify | convert unstructured to structured (feature extraction, etc.) | unknown to known conversion (keep history) | recognize devices and attach to customer ID | lead to account match | find name/address/company matches based on similarity | ||
postal address standardization & cleaning | |||||||
store | graph database (specify products?) | account/contact data structure | |||||
NoSQL database (specify?) | |||||||
share/ access | query nested JSON fields | 20 millisecond latency on API calls for identification, etc. | Javascript connectors to access data (specify features?) | SDK connectors to access data | API connectors to Google, Facebook, etc. | ||
real time access (specify speed, capabilities) | SDK: push messages and in-app content to mobile | continuously recompute audience assignments | |||||
cookie synch w/DMP, DSP, etc. | |||||||
DMP functions (specify) | |||||||
ID synch with Google, Facebook, Liveramp | |||||||
analytics | templates let marketer build & deploy web experiences without writing HTML, CSS, JS | include/exclude user segments for ad campaigns | |||||
manage display, social audiences | |||||||
personalize | return recommendations in real time (specify speed?) | personalize on anonymous users (based on device, campaign, referrer, location, weather, history, etc.) | SDK for personalization (specify features) |