Glossary of CDP-Related Terms
February 3, 2021
Please comment to suggest changes or additions!
a/b test | a testing method that compares results from two or more test groups whose members are similar except for being given different test treatments |
adtech | any system used to support advertising activities; in particular, systems that work with digital media |
algorithmic attribution | a type of multi-touch attribution method that allocates fractions of total revenue to different marketing contacts based on statistical analysis of historical data that estimates the impact of each contact |
analytics CDP | a CDP whose primary functions include assembling, sharing and analyzing unified customer profiles, typically including predictive analytics |
anonymization | the process of removing all personal identifiers from a data set, making it impossible to connect personal data in the set with the individual who generated it |
anonymous individual | an individual that is not connected to any personal identifiers which can be linked to a specific person in the real world |
Application Programming Interface (API) | a method for communicating between systems (or between components of the same system) in which one system makes requests (“calls”) for another system to send data or take an action (cf Webhook) |
arbitration | the process of selecting which message to send to an individual who is eligible to receive messages from several separate campaigns. Users must specify the selection criteria (highest immediate value, highest renewal rate, etc.). The decision is usually based on a combination of decision rules and predictive models. |
artificial intelligence | computer processes that mimic human thought processes |
attribute data | data describing individual characteristics, such as birthdate, address, or education; one person typically has a single value for each attribute and the attributes change infrequently or never |
attribution | the process of estimating the revenue (or other measure) caused by a particular marketing contact (or other interaction with a customer) |
banner ad | a type of Web display ad that appears in a box at the top, bottom or side of a page. |
batch processing | processing a set of data that is accumulated over time and fed into the system at once, such as a file containing all transactions during the previous day. This precludes immediate response to events reflected in the data, such as someone visiting a Web site. |
behavioral data | data describing individual actions, such as purchases, Web page views, and customer service calls; one person may be associated with many behaviors of the same type |
big data | technology to capture, store, access, and analyze very large data volumes in general, and semi-structured or unstructured data in particular. |
buying stage | the current relationship of a customer to a specific purchase, where relationships are described as a sequence of states (awareness, interest, selection, purchase, use, replacement). Buying stages mark progress in the buyer journey. |
California Consumer Privacy Act | a California regulation that restricts how personal data is collected and used; it gives individuals rights to reject commercial use of their data |
campaign CDP | a CDP whose primary functions include assembling, sharing and analyzing unified customer profiles, and selecting personalized messages for individuals |
case study | a description of how an actual user completed a business task, typically including results. Used to illustrate the capabilities of a system and the value the system helps to create. |
CDP inside | a system that provides CDP functionality but whose primary functions include delivery and operational processes |
channel preference | the likelihood that a customer will engage with messages in a particular channel. Generally based on past behavior. Used to select the most effective channel for each individual. May vary by message type. |
churn | the process of ending a customer relationship; a measure of how many customers stop being customers |
citizen developer | a person who creates software without having acquired conventional programming skills; typically a business user rather than IT professional |
cloud-based | a system deployed on remote servers accessed through the internet and maintained by an external vendor. |
cluster analysis | any statistical analysis that classifies cases into groups whose members are in some way similar to each other |
collaborative filtering | a type of predictive modeling that identifies products an individual is likely to purchase, based on past purchases by that individual and by other individuals who have similar purchase histories |
consent management | the process of collecting, classifying, retaining, accessing, and updating individual consent for data use under privacy regulations. |
consent management system | software that manages the consent management process. May be a stand-alone system or part of a larger product such as a CDP. |
content management system (CMS) | software that manages and deploys formatted information, such as Web pages and documents. |
contextual advertising | advertising targeted on the basis of context, such as the type of editorial content the advertising accompanies. Requires no information about the individual viewing the ad. |
control group | a group that is held out from testing to provide a baseline for estimating results that would have occurred without any test |
cross device match | a match that links two devices to the same individual, based on either deterministic or probabilistic matches |
cross-channel marketing | a marketing program where the same campaign includes messages in different channels |
customer data platform (CDP) | packaged software that builds unified, persistent customer profiles accessible to other systems, including primarily first party data and known individuals |
customer experience | all interactions between a customer and a company, across all stages of the customer relationship. Includes both prepurchase events (marketing and sales) and post-purchase events (product use, customer service). |
customer journey analysis | the process of tracking customer interactions leading up to a specified event, such as a purchase, or interactions across their entire relationship with a company. Typically includes identifying more common sequences and differences between the sequences leading to different outcomes or taken by different customer segments. |
customer profile (single customer view, 360 degree view) | all data associated with a person, collected and organized for easy access |
customer relationship management software (CRM) | software that stores details of direct interactions between a company’s customers and its sales and service personnel |
data activation | making use of data; specifically, sharing customer data with systems that will use it for analytics, personalization, or marketing campaigns |
data CDP | a CDP whose primary functions are limited to assembling and sharing unified customer profiles |
data cleansing | the process of making data more usable through error correction, standardization, transformations, and other processes. Exact steps will depend on the intended purpose. |
data enrichment | the process of adding new information to customer data, most often by importing third party data and appending it to existing customer profiles |
data governance | the process of controlling how data is collected and used in a system, with particular focus on ensuring data quality |
data lake | a collection of data copied from company systems, stored in its original forms and accessible for analysis and further processing |
data management platform (DMP) | software that stores anonymous customer profiles, primarily to support Web display advertising |
data quality | the degree to which data is fit for its intended purpose(s); more broadly, how accurately data reflects the real-world entities it represents |
data standardization | the process of placing data in a consistent format so that all instances of the same item are the same. May be done by applying rules (e.g., ‘all phone numbers are divided into country code and domestic number, with no separators’) or reference data (e.g., list of formal first names and variations, all changed to the formal first name; all postal addresses changed to match postal agency standards). Important for accurate matching and reporting. |
data transformation | the process of converting data from one format to another. Enables disparate data to be combined. |
data warehouse | a collection of data copied from company systems, reorganized and often summarized for analysis |
delivery CDP | a CDP whose primary functions include assembling, sharing and analyzing unified customer profiles, and selecting and delivering personalized messages for individuals |
demand side platform (DSP) | a system used by ad buyers to purchase digital media, typically through automated bidding |
derived variables | data that is based on other data, usually through calculations such as summary of purchases over time. Predictive model scores are a sophisticated type of derived variable. |
descriptive analytics | statistical methods that find patterns and relationships within existing data sets, such as identifying customer segments |
deterministic match | a match that links two personal identifiers to the same individual, based on either information provided by the individual or the individual’s actions (e.g., logging into a customer account on a specific device; see ‘identity stitching’). |
device ID | an identifier linked to a device such as a computer, mobile phone, or smart TV. These may be a permanent attribute of the device itself, such as a serial number, or impermanent because they relate to software running on the device, such as a Web browser or operating system |
digital asset management (DAM) | software that manages and deploys any type of digital content, including documents, videos, sound files, etc. |
display advertising | Web advertising that appears on Web site or social media pages and is purchased by contract or by bidding on impressions. May be targeted by Web site or by individual. |
dynamic content | digital content that changes depending on the recipient and other variables, typically achieved by creating a content template that includes rules which select different elements based on data about the recipient and situation (time of day, local weather, product inventory, etc.) |
dynamic list | a customer list that is automatically updated as customers become qualified or disqualified for the list’s selection criteria. Membership may be adjusted continuously (as new data is received) or updated each time the list is used. |
earned media | marketing messages that are delivered by unpaid third parties, such as the press. These are often considered to be news rather than advertising. |
event triggered campaign | a marketing program that is started when a specified event occurs. Typically, the program is targeted at individuals and the trigger event initiates the program for a single individual (e.g., an onboarding program triggered when someone becomes a new customer). |
feature extraction | the process of identifying attributes within unstructured data so these can be treated as structured data. Typical examples include finding company names within a press release or products within a video. Extracted features are usually applied as tags to the original item. |
fingerprinting | a technique that uses device attributes such as operating system and build date to identify specific devices, even without a specific device ID. Generally done without user consent and potentially a privacy violation. |
first party cookie | a Web browser cookie set by the domain of the Web site the user is visiting |
first party data | personal data that an organization has acquired directly from an individual |
first touch attribution | an attribution method that allocates all revenue to the first marketing contact with a customer |
fractional attribution | a type of multi-touch attribution method that allocates specified fractions of total revenue to different marketing contacts based on when they occurred relative to a purchase (first, middle, last) |
fuzzy match | a match that links two sets of personal identifiers to the same individual, based on identifiers that are similar but not identical (e.g., two similar postal addresses) |
General Data Protection Regulation | a European Union regulation that restricts how personal data is collected, used, and protected; it gives individuals rights to consent, review, and demand deletion of personal data |
geofencing | targeting of marketing and advertising messages based on the recipient’s passage into or out of a specific physical location, such as entry to a retail store. Sometimes used in combination with data known about an individual. |
geotargeting | targeting of marketing and advertising messages based on the recipient’s location, often in combination with other data known about the individual |
golden record | a record containing the version of each item that is considered the most appropriate for use; this is usually the version judged most accurate and complete. Typically shared with other company systems. |
ideal customer profile | the set of personal data associated with a company’s best customers. Used to define targets for sales and marketing efforts. |
identity graph | a set of relations among personal identifiers, indicating how each has been linked to the others and which are linked to the same individual. |
identity resolution | the process of linking personal identifiers to individual identities, whether known or anonymous |
identity stitching | the process of connecting a personal identifier to an individual through an intermediary personal identifier (e.g., new device linked to an email address provided by a customer; the device is associated with the customer even though the customer has not herself reported the connection). |
incremental attribution | a type of multi-touch attribution that estimates the increase in total revenue resulting from a particular type of marketing contact. |
individual | a distinct person; more formally, an entity linked to at least one personal identifier that can distinguish it from other entities. Identity management systems assign a unique, permanent “master ID” to each individual and then connect all personal identifiers to that master ID. |
ingestion | the process of gathering data from one system and loading it into another |
in-memory data | data which is stored in system memory for immediate access. In-memory data is typically discarded after use, although it may be copied to persistent storage first. Some systems keep all data in-memory, to enable high-speed access. This becomes more affordable as memory costs drop, although it is still typically used for relatively small data volumes. |
integration platform | software that moves data between systems to support processes that span multiple systems, but does not store the data internally |
intent data | data that indicates how likely a person is to purchase a particular product. Generally based on behaviors such as store visits, social media comments, and consumption of related Web content. |
journey orchestration | coordinating customer treatments over time and across channels, either to achieve a specific purpose (e.g., a marketing campaign with a defined goal) or throughout a company’s relationship with a customer |
key performance indicator (KPI) | a measure that correlates with achievement of specific business goals. Separate KPIs are often defined for each business project or objective. |
known individual | an individual connected to at least one personal identifier that can be linked to a specific person in the real world |
last touch attribution | an attribution method that allocates all revenue to the last marketing contact with a customer |
lead to account match | the process of connecting individual records to business units associated with those individuals. Applies to business-to-business data and relates specifically to the data structure of Salesforce.com Sales Cloud CRM, which stores people as either “leads” (individuals not connected with an account within a business) or “contacts” (individuals associated with an account). Most B2B marketing programs expect all individuals to be associated with a business. |
life stage | the current relationship of a customer to a business, where relationships are described as a sequence of states (prospect, new customer, existing customer, at-risk customer, lapsed customer). Life stages mark progress in the customer journey. |
lifetime value | the total value generated by a customer throughout their relationship with a company. Often expressed in revenue although profit is more meaningful. May be measured in terms of future value only (e.g. for a new customer), past value only (e.g. to identify most important customers), or total value. Future values are typically discounted and may be limited to a specific time frame (e.g., next five years). |
location data | data that reports the physical location of an entity over time. Based on latitude and longitude but may also include derived data such as political jurisdiction or aisle within a retail store. Typically collected by mobile devices and used to target advertising and other marketing messages. |
look alike modeling | a type of predictive modeling that identifies individuals similar to a company’s current customers, used to select advertising audiences. |
machine learning | automated processes that build predictive models with little human assistance |
marketing automation system | software that maintains customer and prospect lists and runs campaigns against them. Primarily used for outbound campaigns (e.g., email) but some systems also support real time interactions (e.g. Web site messages). Largely limited to data generated within the system itself and to imports from CRM systems. |
martech (marketing technology) | any system used to support marketing activities; in particular, systems that work with customer-level data |
master data management (MDM) software | software that reconciles different versions of information about an entity (person, product, location, etc.), selects the version to be used as a standard across company systems, and shares this version (called a “golden record”) with those systems. MDM systems may perform identity matching as part of their function. |
multi-channel marketing | a marketing program where separate campaigns run in different channels (email, Web, etc.) |
multistep campaign | a marketing program including multiple messages over time, typically including the ability to adjust later messages based on each individual’s response to earlier messages |
multi-touch attribution | an attribution method that allocates fractions of total revenue to different marketing contacts; multiple allocation methods are possible |
multivariate analysis | any statistical analysis that uses multiple variables as inputs |
multivariate testing | a testing method that estimates the impact of different combinations of variables on results; can estimate results from combinations that have not actually been tested |
natural language processing | a branch of artificial intelligence that works with human language, typically to extract features (e.g., people mentioned) or meaning (events described, intent, sentiment, etc.) |
next best action | the treatment that a business believes will produce the most desirable result for an individual customer; typically based on a combination of rules and predictive analytics; requires specification of the measure that is desired |
no-code software | software that can be built or configured without using conventional programming skills |
NoSQL data store | a data store not organized into tables, rows, and columns. There are many types, optimized for different purposes. Generally more flexible than SQL databases because columns are not predefined. Used for structured, semi-structured, and unstructured data. |
offline data | data collected by physical interaction such as retail purchases, local events, shipments, etc. |
omni-channel marketing | a marketing program where the same campaign lets customers interact in whichever channels they choose |
onboarding | broadly, the process of adding people to a system; narrowly, attaching personal identifiers to individual profiles so each customer can be identified and contacted across multiple channels. In particular, it refers to sending offline identifiers (name, postal address, phone number) to third party vendors who match these with online identifiers (email address, device IDs, cookies, etc.) |
online data | data collected by digital systems including Web, mobile apps, smart TVs, etc. |
on-premises system | a system deployed on servers controlled by a company. May include “private cloud” deployments as well as deployments in a company’s own data center. |
operational CDP | a CDP whose primary functions include assembling, sharing and analyzing unified customer profiles; selecting and delivering personalized messages; and operational activities such as order processing or customer support |
out of the box data model | a predefined set of data objects and relationships provided with a system. Typically designed to meet the needs of a specific industry or company type. Its purpose is to save design effort compared with building a custom data model. |
owned media | marketing messages delivered through a company’s own channels, such as email or Web site |
paid media | marketing messages that are purchased, such as paid advertising |
persistent data | data which is stored in a stable format until the user decides to discard it. Actual retention period may be limited by legal requirements. |
persistent ID | a personal identifier that does not change over time and thus can be used as a permanent “master” ID. It is linked to other personal identifiers which may change (e.g., postal address) |
personal data | data that is linked to an individual, including attributes and behaviors |
personal identifier | information that can be used to identify a specific individual, either by itself or in combination with other information |
personalization | creating communications that are tailored to a specific individual based on data about that individual |
personally identifiable information (PII) | information that can be used to identify a specific individual; same as personal identifier |
predictive analytics/model | statistical methods that use data to predict outcomes such as response to promotion or membership in a group |
prescriptive analytics | statistical methods that use data to recommend decisions such as customer segments to contact or offers to develop |
privacy by design | a design approach that builds privacy requirements into system planning; this often includes collecting and exposing the least personal data needed to complete a business task |
probabilistic match | a match that links two personal identifiers to the same individual, based on behavior patterns that suggest but do not prove a relationship (e.g., two devices frequently used in the same places at the same times) |
programmatic advertising | a type of ad buying based on automated bidding for each impression, typically in real time. Originally developed for Web display advertising and now applied to other digital media. |
prospecting | the process of searching for new customers |
pseudonymization | the process of masking personal identifiers in a data set, so that someone with the right information (such as an encryption key or reference list) could reconnect personal data in the set with the individual who generated it |
reactivation campaign | a marketing program aimed at convincing a former customer to renew their relationship |
real time | responding to an event so quickly that there is no perceptible delay. Required time depends on the situation: for human interactions it is typically considered one to two seconds. For computer-to-computer interactions such as programmatic ad bidding, it may be less than 1/10th of a second. |
real time access | receiving a data request from an external system and returning the data to that system in real time |
real time decision | receiving a decision request from an external system and returning the decision in real time; often includes real time data access, calculations, and rule execution |
real time ingestion | loading data into a system, completing whatever processing is needed, and making the data available for use in real time |
real time interaction | exchanging data with a system or person in real time, such that each action takes into account all previous actions including the most recent |
RealCDP | the CDP Institute’s criteria used to certify that a system provides CDP functionality. Criteria include: load all data types; store all original detail; retain data as long as the user desires; assemble unified customer profiles; make profiles available to other systems. |
recommendation engine | a system that suggests which product to offer an individual. Selection criteria may differ (highest likelihood to purchase, highest expected value, highest future purchases, etc.). Selection method is usually a combination of business rules and predictive models. Selections are usually based on a combination of individual data (purchase history, behaviors, etc.) and context (inventory, product demand, season, etc.) |
regression model | a statistical method that finds estimated relationships between multiple inputs and a result and expresses these in a mathematical formula |
retargeting campaign | a marketing program aimed at convincing a customer to purchase a product they had apparently considered buying but did not purchase |
search engine optimization | the process of maintaining a Web site to achieve the highest possible ranking in Web search engines and thus attract as much organic traffic as possible. |
search marketing/paid search | Web advertising that appears on search engine pages and is purchased by bidding on keywords. |
second party data | personal data that an organization has acquired through a direct relationship with the organization that collected it as first party data |
segmentation | any method that divides an audience into groups of individuals who are in some way similar to each other, typically so they can be treated similarly for marketing purposes |
sell side platform (SSP) | a system used by ad sellers to offer digital media, typically through automated bidding |
semi-structured data | data that is presented and stored in a format where the elements and contents are defined together, as in event logs or key:value pairs |
shopping cart | the area of an ecommerce Web site where buyers assemble the set of products they plan to order. Placing a product in a cart is a strong indicator of purchase intent and is often the basis for retargeting an individual with offers for the same product if they do not complete the purchase. |
site tag | JavaScript code embedded in a Web site that collects specified information and sends it to an external destination, such as the site owner for analytics or an ad network to track visits and ad views. Site tags may also place cookies on a Web browser to track return visits. |
software development kit (SDK) | instructions and tools for building software, and in particular for building connectors between two pieces of software. Often used to enable mobile apps to send data to a customer database. |
SQL data store | a data store organized into tables with rows and columns, where each row represents a record and each column represents a predefined data element. Used for structured data, primarily to process transactions and store attributes. |
static list | a customer list that is selected once and not updated or is only updated on demand. |
stream test | a type of test where customers are divided into groups and each group receives different treatments over time. Used to measure results of fundamental differences, such as different price or service levels, which must be held constant over extended periods to show their results. |
streaming data | data received in a continuous flow, such as Web site activity or location history |
structured data | data that is presented and stored in a fixed format where each element is in a specified location, such as the columns of a relational database table or the fields of a data file |
tag manager | software that manages Web site tags, typically replacing individual tags with a single tag that captures data required by multiple tags and distributes that data to the appropriate destinations. |
third party cookie | a Web browser cookie set by a domain other than that of the Web site the user is visiting |
third party data | personal data that an organization has acquired through a marketplace relationship with an organization that acquired it directly or indirectly |
tracking pixel | an image link embedded in a Web site that calls an external server to return a single pixel. Used to track site visitors. Captures less information than a site tag. |
tree analysis | a statistical method that classifies cases into groups with different expected results by repeatedly splitting groups of cases into subgroups, using a single variable for each split |
unstructured data | data that is presented and stored in a format where the elements are not defined, such as a block of text, video, or audio files |
use case | a description of the steps that an agent takes to complete a business task. Used to illustrate the capabilities a system needs to support a task and to illustrate the tasks a system may support. |
Web content management (WCM) | software that manages and deploys Web pages and other Web site contents. |
Webhook | a method for communicating between Web systems that sends data to other systems, typically after an event in the originating system (cf API) |
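
The first touch, last touch, and fractional attribution entries above describe simple allocation arithmetic. The following is a minimal Python sketch of that arithmetic, not a standard implementation. It assumes a hypothetical list of marketing contacts ordered by time; the function names and the 40/20/40 position weights are illustrative assumptions that do not come from the glossary.

```python
def first_touch(contacts, revenue):
    """Allocate all revenue to the first marketing contact."""
    return {contacts[0]: revenue}


def last_touch(contacts, revenue):
    """Allocate all revenue to the last marketing contact."""
    return {contacts[-1]: revenue}


def fractional(contacts, revenue, first=0.4, middle=0.2, last=0.4):
    """Allocate fixed fractions of revenue by position (first, middle, last).

    The default 40/20/40 split is an assumption for illustration only.
    The middle share is divided evenly among the middle contacts.
    """
    n = len(contacts)
    if n == 1:
        weights = [1.0]
    elif n == 2:
        # No middle contacts: rescale first and last shares so they sum to 1.
        total = first + last
        weights = [first / total, last / total]
    else:
        per_middle = middle / (n - 2)
        weights = [first] + [per_middle] * (n - 2) + [last]

    # Accumulate, because the same contact type may appear more than once.
    allocation = {}
    for contact, weight in zip(contacts, weights):
        allocation[contact] = allocation.get(contact, 0.0) + revenue * weight
    return allocation


if __name__ == "__main__":
    journey = ["display ad", "email", "webinar", "paid search"]
    print(first_touch(journey, 100.0))
    print(last_touch(journey, 100.0))
    print(fractional(journey, 100.0))
```

Algorithmic attribution would replace the fixed weights with weights estimated from historical data, which is outside the scope of this sketch.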
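
The lifetime value entry above notes that future values are typically discounted and may be limited to a specific time frame. One common way to do this is a present-value sum, sketched below. The function name, the yearly granularity, and the 10% rate in the usage lines are assumptions made for illustration.

```python
def discounted_future_value(annual_values, discount_rate):
    """Sum expected yearly values, discounting year t by (1 + rate) ** t.

    annual_values: expected value for year 1, year 2, ... The length of the
    list sets the time frame (e.g., five entries for the next five years).
    discount_rate: annual rate as a fraction, e.g. 0.10 for 10%.
    """
    return sum(
        value / (1 + discount_rate) ** year
        for year, value in enumerate(annual_values, start=1)
    )


if __name__ == "__main__":
    # Hypothetical customer expected to generate 110 next year and 121 the year after.
    print(discounted_future_value([110.0, 121.0], 0.10))
```

Passing profit rather than revenue figures gives the more meaningful measure that the entry mentions.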
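
The identity graph, identity resolution, and identity stitching entries above describe linking personal identifiers, including links made through an intermediary identifier. A minimal sketch of that idea follows, using a union-find structure to group identifiers that are connected directly or indirectly. The function name and the `type:value` identifier strings are hypothetical, and real systems also track link confidence and history, which this sketch omits.

```python
def resolve_identities(links):
    """Group personal identifiers that are linked directly or via intermediaries.

    links: iterable of (identifier_a, identifier_b) pairs, each representing
    one observed match (deterministic or otherwise).
    Returns a list of sets, one set of identifiers per resolved individual.
    """
    parent = {}

    def find(x):
        # Return the representative identifier for x's group.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path as we walk it
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for identifier in parent:
        groups.setdefault(find(identifier), set()).add(identifier)
    return list(groups.values())


if __name__ == "__main__":
    observed = [
        ("device:abc", "email:pat@example.com"),    # login on a new device
        ("email:pat@example.com", "customer:123"),  # email given at signup
        ("cookie:zzz", "device:other"),
    ]
    for group in resolve_identities(observed):
        print(sorted(group))
```

In the usage lines, `device:abc` ends up in the same group as `customer:123` even though no link between them was reported directly, which is the stitching case the glossary describes.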
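
The pseudonymization entry above mentions a reference list that lets an authorized party reconnect masked data with the individual. The sketch below shows one simple way to do that, by replacing an identifier with a random token and keeping the token-to-identifier list separately. The function names and record fields are hypothetical, and a production system would also need secure storage and access control for the reference list.

```python
import uuid


def pseudonymize(records, id_field):
    """Replace a personal identifier with a random token.

    Returns (masked_records, reference), where reference maps each token
    back to the original identifier and must be stored separately.
    """
    token_by_id = {}
    masked = []
    for record in records:
        identifier = record[id_field]
        if identifier not in token_by_id:
            token_by_id[identifier] = uuid.uuid4().hex
        # Copy the record so the caller's data is not modified.
        masked.append({**record, id_field: token_by_id[identifier]})
    reference = {token: identifier for identifier, token in token_by_id.items()}
    return masked, reference


def reidentify(masked_record, id_field, reference):
    """Restore the original identifier using the reference list."""
    return {**masked_record, id_field: reference[masked_record[id_field]]}


if __name__ == "__main__":
    purchases = [
        {"email": "pat@example.com", "amount": 20},
        {"email": "pat@example.com", "amount": 35},
    ]
    masked, reference = pseudonymize(purchases, "email")
    print(masked)
    print(reidentify(masked[0], "email", reference))
```

Discarding the reference list so that reconnection is no longer possible moves this toward anonymization as defined earlier in the glossary, provided no other personal identifiers remain in the data.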