As everyone knows, “big data” is all the rage in digital marketing nowadays. Marketing organizations across the globe are trying to find ways to collect and analyze user-level or touchpoint-level data in order to uncover insights about how marketing activity affects consumer purchase decisions and drives loyalty.

In fact, the buzz around big data in marketing has risen to the point where one could easily get the illusion that utilizing user-level data is synonymous with modern marketing.

This is far from the truth. Case in point, Gartner’s hype cycle as of last August placed “big data” for digital marketing near the apex of inflated expectations, about to descend into the trough of disillusionment.

It is important for marketers and marketing analysts to understand that user-level data is not the end-all be-all of marketing: as with any type of data, it is suitable for some applications and analyses but unsuitable for others.

Many companies look to "big data" as their savior but simply aren't ready to implement it. This leads to disenchantment with lower-level data. It reminds me of the early days of Campaign Management (now Marketing Automation), when there were so many failed implementations. The vendors were too inexperienced to know how to implement their products successfully, the technology was too nascent, and the customers were just not ready culturally to handle the products. This is "big data" in a nutshell.

1. User Data Is Fundamentally Biased

The user-level data that marketers have access to covers only individuals who have visited your owned digital properties or viewed your online ads, a group that is typically not representative of the total target consumer base.

Even within the pool of trackable cookies, the accuracy of the customer journey is dubious: many consumers now operate across devices, and it is impossible to tell for any given touchpoint sequence how fragmented the path actually is. Furthermore, those who operate across multiple devices are likely to be from a different demographic than those who use only a single device, and so on.

User-level data is far from being accurate or complete, which means that there is inherent danger in assuming that insights from user-level data apply to your consumer base at large.

I don't necessarily agree with this. While there are true statements here, having some data is better than having none. Would I change my entire digital strategy based on incomplete data? Maybe, if the data were very compelling. More often, this data will lead to testable hypotheses that will lead to better customer experiences. Never be afraid of not having all the data, and never search for all the data; that pearl is not worth the dive.

2. User-Level Execution Only Exists In Select Channels

Certain marketing channels are well suited for applying user-level data: website personalization, email automation, dynamic creatives, and RTB spring to mind.

Very true. Be careful to apply it to the correct channels, and don't make assumptions about everyone. When there is enough data to make a decision, use that data. If not, use the data you have been working with for all these years; it has worked up till now.

3. User-Level Results Cannot Be Presented Directly

More accurately, it can be presented via a few visualizations such as a flow diagram, but these tend to be incomprehensible to all but domain experts. This means that user-level data needs to be aggregated up to a daily segment-level or property-level at the very least in order for the results to be consumable at large.

Many new segments can come from this rich data and be aggregated. It is fine to aggregate data for reporting to executives; in fact, this is what they want to see. Every once in a while, throw in a decision tree or a naive Bayes output to show that more analysis is being done at a more granular level.
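As a rough illustration of the roll-up described above, here is a minimal Python sketch that aggregates touchpoint-level rows to one row per day and segment. The record layout, the segment names, and the function name aggregate_daily_segment are all assumptions made for the example, not anything prescribed by the article.

```python
from collections import defaultdict

# Hypothetical touchpoint-level records: (user_id, date, segment, converted).
# The layout and the values are invented purely for illustration.
touchpoints = [
    ("u1", "2024-01-01", "email", 0),
    ("u1", "2024-01-01", "display", 1),
    ("u2", "2024-01-01", "email", 1),
    ("u3", "2024-01-02", "email", 0),
    ("u3", "2024-01-02", "display", 0),
]


def aggregate_daily_segment(records):
    """Roll touchpoint rows up to one summary row per (date, segment)."""
    totals = defaultdict(lambda: {"touchpoints": 0, "conversions": 0})
    for _user_id, date, segment, converted in records:
        key = (date, segment)
        totals[key]["touchpoints"] += 1
        totals[key]["conversions"] += converted
    return dict(totals)


daily = aggregate_daily_segment(touchpoints)
for (date, segment), row in sorted(daily.items()):
    print(date, segment, row["touchpoints"], row["conversions"])
```

The user identifier is deliberately dropped in the roll-up: the executive-facing view only needs counts per day and segment, while the granular rows stay available for deeper analysis.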

4. User-Level Algorithms Have Difficulty Answering “Why”

Largely speaking, there are only two ways to analyze user-level data: one is to aggregate it into a “smaller” data set in some way and then apply statistical or heuristic analysis; the other is to analyze the data set directly using algorithmic methods.

Both can result in predictions and recommendations (e.g. move spend from campaign A to B), but algorithmic analyses tend to have difficulty answering “why” questions (e.g. why should we move spend) in a manner comprehensible to the average marketer. Certain types of algorithms, such as neural networks, are black boxes even to the data scientists who designed them. Which leads to the next limitation:

This is where the "art" comes into play when applying analytics to any dataset. Too many unknown variables go into a human being's purchase decision to predict an outcome with absolute certainty, so there should never be a decision to move all spending in some direction or change an entire strategy based on any data model. What should be done is to test the new data models against the old way of doing business and see if they perform better. If they do, great, you have a winner. If they don't, use that new data to create models that may produce better results than the current model. Marketing tactics and campaigns are living and breathing entities; they need to be cared for and changed constantly.
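One common way to run the "new model versus old way" test described above is a split test compared with a two-proportion z-test. The sketch below assumes that setup; the function name and the conversion counts are hypothetical, and the test is only one of several reasonable choices.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of group A (old approach) and B (new model).

    Returns the z statistic and a two-sided p-value under the usual
    pooled-variance normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 10,000 users per group, invented for illustration.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests that the difference is unlikely to be chance alone, but it says nothing about why the new model performs better, so the result is best treated as one input to the decision rather than the decision itself.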

5. User Data Is Not Suited For Producing Learnings

This will probably strike you as counter-intuitive. Big data = big insights = big learnings, right?

Actionable learnings that require user-level data – for instance, applying a look-alike model to discover previously untapped customer segments – are relatively few and far between, and require tons of effort to uncover. Boring, ol’ small data remains far more efficient at producing practical real-world learnings that you can apply to execution today.

In some cases yes, but don't discount the learnings that can come from this data. Running this data through multiple modeling techniques may not lead to production-ready models that will impact revenue streams overnight. Those rarely happen, and they take many hundreds of data scientists, with maybe 3% of the models making it into production. However, running data through data mining techniques can give you unique insights into your data that regular analytics could never produce. These are true learnings that create testable hypotheses that can be used to enhance the customer experience.

6. User-Level Data Is Subject To More Noise

If you have analyzed regular daily time series data, you know that a single outlier can completely throw off analysis results. The situation is similar with user-level data, but worse.

This is very true. There is so much noise in the data, which is why most of the time spent on data modeling goes to cleaning the data. This noise is why it is so hard to predict anything using this data. The pearl may not be worth the dive for predictive analytics, but for data mining it is certainly worth the effort.
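To make the outlier point concrete, here is a small Python sketch using the simpler daily time series case. The order counts are invented for illustration; the point is only that one bad value moves the mean a great deal while the median barely changes.

```python
import statistics

# Hypothetical daily order counts, invented for illustration.
daily_orders = [102, 98, 105, 99, 101, 97, 103]

# The same series with a single bad reading appended (e.g. a tracking glitch).
with_outlier = daily_orders + [1000]

mean_clean = statistics.mean(daily_orders)
mean_noisy = statistics.mean(with_outlier)
median_clean = statistics.median(daily_orders)
median_noisy = statistics.median(with_outlier)

print(f"mean:   {mean_clean:.1f} -> {mean_noisy:.1f}")
print(f"median: {median_clean} -> {median_noisy}")
```

This is part of why so much modeling time goes into cleaning: either the outliers are removed up front, or the analysis has to rely on statistics that are robust to them.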

7. User Data Is Not Easily Accessible Or Transferable

Oh so true. Take manageable chunks when starting to dive into these user-level data waters.

This level of data is much harder to work with than traditional data. In fact, executives usually don't appreciate the time and effort it takes to glean insights from large datasets. Clear expectations should be set at the start of the user-level data journey so that nobody begins with overinflated hopes. Underpromise and overdeliver for a successful implementation.