5 things about data collection and customer data platforms

I don't like to get into the specifics of customer data platform (CDP) features when defining the category as a whole. At the use case level, however, getting into feature specifics is critical.

It would be impossible to properly evaluate and then plan for the use of a customer data platform without really understanding how at least some features work. To be a customer data platform, there absolutely must be data collection capabilities. What things might a person interested in data collection want to know about customer data platform features? Here are five!

Thing 1: Some customer data platforms automatically collect a lot more behavioral (event) and experiential (state) data than the rest

All CDPs should automatically collect some mobile/web data, but they don't all do it in the same way. Right now the spectrum is skewed: at the far end sit just a couple of systems, like Celebrus Technologies, that automate data collection to serve a variety of auditability, machine learning and other data science applications. It looks like Snowplow is heading there, too.

Most customer data platforms are more like traditional digital analytics systems when it comes to automated data collection: if you are not collecting something, there is no going back to recover it later. You just have to start collecting it. For many use cases, you can start using the data immediately. For learning use cases, you have to wait until enough information has accumulated.

That doesn't mean the rest of the vendors are all alike. Whether you are still searching or already have a customer data platform, look really closely. There may be industry-specific features that you are not taking advantage of, or other capabilities that fit nicely with the way you think about customer data. For example, when context and user data are already being surfaced to a data layer, make sure the CDP uses the data layer as its source of truth rather than other sources that may be less reliable.
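
As a rough illustration of the "data layer as source of truth" idea, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function name `resolve_field`, the field names, and the notion of a "dom_scrape" fallback are hypothetical, not any vendor's API.

```python
def resolve_field(field, data_layer, fallback_sources):
    """Return (value, source_name), preferring the data layer.

    data_layer: dict of values the site deliberately publishes.
    fallback_sources: ordered list of (name, dict) pairs, such as
    values scraped from the page, which tend to be less reliable.
    """
    value = data_layer.get(field)
    if value not in (None, ""):
        return value, "data_layer"
    # Only consult other sources when the data layer has nothing usable.
    for name, source in fallback_sources:
        value = source.get(field)
        if value not in (None, ""):
            return value, name
    return None, None


# Hypothetical example values
data_layer = {"customer_tier": "gold", "page_type": ""}
scraped = {"customer_tier": "Gold Member!", "page_type": "product"}

print(resolve_field("customer_tier", data_layer, [("dom_scrape", scraped)]))
print(resolve_field("page_type", data_layer, [("dom_scrape", scraped)]))
```

Returning the source name alongside the value makes it easy to audit later how often the less reliable sources are actually being used.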

Speaking of data layers... whether data is being collected automatically or via CDP configuration, surfacing human-consumable information is key. Machines are great, but nobody knows what 1896zb-lg is. It's probably the large one, though!
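
One simple way to surface human-consumable information is to attach a readable label to each opaque code as events come in. The sketch below assumes a hypothetical lookup table and a hypothetical `product_code` event field; the label given to 1896zb-lg is invented purely for the example.

```python
# Hypothetical lookup table; in practice this might come from a product catalog.
PRODUCT_LABELS = {
    "1896zb-lg": "Example Item (Large)",
}


def enrich_event(event, labels=PRODUCT_LABELS):
    """Return a copy of the event with a readable product_label added.

    Falls back to the raw code when no label is known, so nothing is lost.
    """
    enriched = dict(event)
    code = event.get("product_code")
    enriched["product_label"] = labels.get(code, code)
    return enriched


print(enrich_event({"event": "add_to_cart", "product_code": "1896zb-lg"}))
```

Keeping the original code next to the label means machines and people can both use the same record.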

Thing 2: Some CDPs are not a system of record at all, some can be considered a system of record, and some are definitely a system of record

It is neither good nor bad for a CDP to be a system of record, but it sure is good to know what to expect in this area! Platforms that have focused more on marketing execution are especially likely to land in the "not a system of record" bucket. If the CDP does not maintain history, so that you can determine how a profile has changed over time, it cannot really be considered a system of record even a little bit.

Meanwhile, there are a host of other CDPs that are more CRM-like in nature. This does not mean you need to use the platform as a system of record, but it does mean that it likely comes with a host of capabilities that non-record-keepers don't have. There are an awful lot of these potential added capabilities, too many to list. One that ties closely to the first-party data program is replay/auditability. All of these CRM-like platforms should at least have the history needed to understand, to a reasonable approximation, what a profile looked like at a specific point in time.
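
To make "what a profile looked like at a specific point in time" concrete, here is a minimal sketch that replays a change history up to a given date. The change-log shape (`at`, `field`, `value`) and the function name `profile_as_of` are assumptions for illustration; real platforms store history in their own ways.

```python
from datetime import datetime


def profile_as_of(changes, as_of):
    """Rebuild a profile by replaying field changes up to and including as_of."""
    profile = {}
    for change in sorted(changes, key=lambda c: c["at"]):
        if change["at"] > as_of:
            break  # everything after this point happened later
        profile[change["field"]] = change["value"]
    return profile


# Hypothetical change history for one profile
changes = [
    {"at": datetime(2023, 1, 1), "field": "email", "value": "old@example.com"},
    {"at": datetime(2023, 6, 1), "field": "email", "value": "new@example.com"},
    {"at": datetime(2023, 3, 1), "field": "tier", "value": "silver"},
]

print(profile_as_of(changes, datetime(2023, 4, 1)))
```

A platform that only keeps the latest value for each field cannot answer this kind of question, which is the practical difference the system-of-record distinction points to.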

Thing 3: Original data may not be retained

24 platforms all had a Y in the CDP Institute's vendor comparison chart under "retain original data". There are many flavors of Y.

Nuance example one: All CDPs have profiles. Those profiles are multi-sourced. Once multi-sourced data is combined, it can be tough to undo. To change the way data is combined into profiles, the original data might be required. To "reload" your data, some CDPs require reprocessing, and for some this is not a simple task, or not possible at all.
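
The sketch below shows why retained original data matters here: with the raw events on hand, profiles can be rebuilt under a different combining rule. The event shape, the `identity_key` parameter, and the function name `build_profiles` are all hypothetical, and real identity resolution is far more involved than grouping on a single key.

```python
from collections import defaultdict


def build_profiles(raw_events, identity_key):
    """Group raw events into profiles using the chosen identity key."""
    profiles = defaultdict(lambda: {"sources": set(), "event_count": 0})
    for event in raw_events:
        key = event.get(identity_key)
        if key is None:
            continue  # this event cannot be matched under this rule
        profiles[key]["sources"].add(event["source"])
        profiles[key]["event_count"] += 1
    return dict(profiles)


# Hypothetical original events, kept exactly as they arrived
raw_events = [
    {"source": "web", "email": "a@example.com", "device_id": "d1"},
    {"source": "pos", "email": "a@example.com", "device_id": None},
    {"source": "app", "email": "b@example.com", "device_id": "d1"},
]

by_email = build_profiles(raw_events, "email")       # two profiles
by_device = build_profiles(raw_events, "device_id")  # one profile
print(len(by_email), len(by_device))
```

If only the email-based profiles had been stored, there would be no way to recover the device-based grouping afterward.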

Nuance example two: You may be expecting data in its original form for analytics functions. Some platforms heavily transform data on its way in, some transform it a little, and some for the most part retain the original data as it was.

Thing 4: Auxiliary data might not be a thing

If you have use cases that depend on auxiliary (non-customer) data, such as in many predictive / machine learning situations, you'll need a way to store and reference the data. Or, you'll need a solid connection to your analytics systems where these processes run.

Think about all the ways auxiliary data could be used, and determine whether each one is possible on your platform. Even simple mappings to external datasets can go a long way in extending customer profiles. Make a case to your vendor to support your use cases when you run into a wall! Systems are evolving month by month to support use cases, and raising these issues early and often is key.
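
A "simple mapping to an external dataset" can be as small as a lookup keyed on a field the profile already has. This sketch assumes a hypothetical postal-code table and hypothetical field names; it is not tied to any particular platform.

```python
def extend_profile(profile, auxiliary, join_field):
    """Return a new profile extended with auxiliary attributes.

    auxiliary maps a join value (for example a postal code) to extra
    attributes. Values already on the profile win on any conflict.
    """
    extra = auxiliary.get(profile.get(join_field), {})
    return {**extra, **profile}


# Hypothetical auxiliary (non-customer) dataset
postal_regions = {
    "10001": {"region": "Northeast", "nearest_store": "Store 12"},
}

profile = {"customer_id": "c-1", "postal_code": "10001"}
print(extend_profile(profile, postal_regions, "postal_code"))
```

Whether the lookup lives inside the CDP or in a connected analytics system, the question to ask is the same: is there somewhere to store the table, and a way to reference it?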

Thing 5: Data validation and cleansing work very differently

Some systems are wide open, accepting anything thrown at them, and some are heavily governed, raising alarms if data doesn't look as expected. Consider your needs in this area, as the garbage in, garbage out (GIGO) effect can be extreme in areas of personalized messaging.

Especially when no validation or cleansing options are available, field-level and profile-level checks should be put in place to make sure data stays as healthy as possible.
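
As a minimal sketch of what field-level and profile-level checks might look like when the platform offers none, consider the following. The specific rules (email shape, age range, placeholder names) and field names are assumptions chosen for illustration; your own checks would reflect your own data.

```python
import re

# Deliberately loose pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PLACEHOLDER_NAMES = {"test", "null", "n/a"}


def field_issues(profile):
    """Field-level checks: each field is judged on its own."""
    issues = []
    email = profile.get("email")
    if email is not None and not EMAIL_PATTERN.match(email):
        issues.append("email: malformed")
    age = profile.get("age")
    if age is not None and not (0 < age < 120):
        issues.append("age: out of range")
    return issues


def profile_issues(profile):
    """Profile-level checks: does the profile hold together as a whole?"""
    issues = []
    if not (profile.get("email") or profile.get("customer_id")):
        issues.append("profile: no usable identifier")
    first_name = profile.get("first_name")
    if first_name and first_name.strip().lower() in PLACEHOLDER_NAMES:
        issues.append("profile: placeholder first name")
    return issues


suspect = {"email": "not-an-email", "age": 250, "first_name": "TEST"}
print(field_issues(suspect) + profile_issues(suspect))
```

Checks like these can run before data is sent to the CDP or on exports from it, which keeps a "Hi TEST!" message from ever reaching a customer.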

Areas of possibility

There are dozens of opportunities any organization can take advantage of to evolve data collection and better support existing programs, or to create new programs altogether. The vendors calling themselves CDPs collect data in very different ways, so how data is collected, brought in, and maintained may differ quite a bit from one to the next.

Understanding how your platform works is key to growing first-party data programs so that you are focusing and planning in areas of possibility.
