

Plus Report - By Thomas Baekdal - August 2013

The Future of Customer Attribution Models

Shared by Plus subscriber
Avinash Kaushik
This is Baekdal Plus content. It is shared with you for free by a member. Please reshare it.

The big topic for marketers these days is customer attribution models, and it is a fantastically important topic. Partly because we are only now getting to a point where our data modeling services can actually tell us anything useful, and partly because in the connected world of abundance (of channels), measuring ROI is critical to our success.

In short, customer attribution models serve the very simple purpose of attributing value to each one of our marketing channels. They help us understand, for instance, what the true impact of social is... or an email campaign, or advertising, or any other channel.

That, of course, is very important because without customer attribution models we would be completely in the dark and probably ruin our business.

In this report I'm going to take you beyond what we have today and illustrate to you what the future of customer attribution will be like, and why that is very important for your brand today.

One step forward, 1,000 more to go...

As you probably already know, in the old days we only had last-interaction models, which were hopeless for all kinds of reasons. Imagine that a friend of yours sees an ad for a product and then tells you about it via social media. Intrigued, you visit the store that afternoon after work, and end up buying the product.

That is a successful conversion, but what channel contributed to it? If we only look at the last interaction, you would attribute it to the location of the store itself. As in, you would assume that people just happened to walk by, completely ignoring the value of your ad.

Luckily, we no longer have this problem, because modern analytics tools have in recent years added multi-funnel analytics, allowing us to look at the entire interaction... which is just wonderful.

And if you want to know the extent of what our new world of multi-funnel attribution models can do, I highly recommend that you read Avinash's great posts, "Multi-Channel Attribution: Definitions, Models and a Reality Check" and "Multi-Channel Attribution Modeling: The Good, Bad and Ugly Models".

But while we have taken the very important step away from the hopeless last-interaction models, there are so many things yet to come.

So let's explore the future.

The purpose of customer attribution models

The purpose of customer attribution models is to answer three critical questions:

  1. What is the financial impact of our own sales activities?
  2. What is the financial impact of other people's sales activities?
  3. What's the rate of conversions which are based on uncertainties (unknowns), indicating how much risk is involved?

These are three simple questions, none of which the tools we have today can answer.

Let's start with the third question, the rate of uncertainties.

Analytics tools are never in doubt

If you go into any analytical tool, it will tell you in absolute numbers exactly how many conversions you had and where they came from.

It can look something like this:

The first problem with this is that while this looks nice, as soon as you compare it with your internal order system, the numbers don't match. The number of transactions is different from what your analytics system is telling you.

And if you use more than one analytics system, you end up with 3 different sets of conversion numbers for the same thing.

This, of course, is a known fact and has caused many analytics people to suggest that you should only look at one analytics system and then just stick with that. That is not a bad suggestion, but it really isn't a good one either.

The good news is that this problem might soon be a thing of the past. New advances in analytics, like Universal Analytics (and similar offerings from other suppliers), allow us to record conversions when the conversion actually happens and not when your browser thinks it does.

This is fantastically exciting.

Another problem is the dreaded 'direct visitor'. The way all analytics tools work is that when they can't identify where people are coming from, your analytics system will identify that as direct.

That is just wrong on so many levels. Let me illustrate how wrong it is to count anything you can't identify as 'direct':

While this is an issue in itself, we have to remember that what we are looking for here is not what source people come from, but what we can attribute customer value to. And even if all the direct traffic was truly direct, it's not relevant to look at in terms of attribution.

Truly direct traffic cannot, unlike all the other sources, work alone. Something else always has to cause it to happen.

Let's say you are in the market for a new pair of shoes. You open your browser and type in 'Z' 'A' 'P' 'P' 'O' 'S' '.com'. In doing so you end up on Zappos, where you find and buy a nice pair of sneakers.


In terms of analytics, this is a direct visitor. But in terms of customer attribution, it's not. You see, how did you know that typing 'zappos.com' into your browser would lead you to a website that is selling shoes? That information doesn't pop up in your head by magic. Something else had to have an influence for you to take that action.

It might have been a print ad from Zappos. A friend might have recommended it to you. You might have read an article about Tony Hsieh (the CEO of Zappos), and it might have been a number of other things.

But the direct action is never actually 'direct'. 'Direct' is not a source, it's an action... and as such it doesn't have any value on its own.

Now imagine that you are in the market for a seriously cool leather bag. You head over to Google and do a search. After looking around a bit, you come across Hard Graft (a spectacular brand) and you look at five different pages before deciding what to buy.

Again, we have a wonderful conversion which looks like this:

In this case, it's easy to attribute value to our channels because it's all coming from search.

Let's make it a bit more complicated. Imagine that the same person is doing exactly the same thing, but is interrupted in the process because he has to go to a meeting. So he saves a link so that he can come back later. And that very evening he returns and buys the bag he likes.

Now our model looks like this:

But wait a minute. That direct visit in the middle has nothing to do with 'direct'. It's merely a placeholder for a delay. And because that delay was longer than 20 minutes, the server counts the 'second' visit as a 'direct visit'.

This is completely wrong, because the time people wait between steps should never alter what source they came from. This person was influenced by search, and only search. There was no 'direct' anything involved, especially not something we can attribute value to.
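The idea in this example, that a delayed return visit should keep the source that originally influenced the person, can be sketched in a few lines of Python. This is only an illustration and not how any particular analytics tool works; the function name `carry_forward_source` and the list-of-labels path format are assumptions made for the example.

```python
def carry_forward_source(path):
    """Replace 'direct' visits with the most recent known source.

    `path` is a list of source labels for one person's visits, in order.
    A 'direct' visit with no earlier known source is relabelled 'unknown',
    because nothing in the data tells us what caused it.
    """
    cleaned = []
    last_known = None
    for source in path:
        if source == "direct":
            cleaned.append(last_known if last_known else "unknown")
        else:
            last_known = source
            cleaned.append(source)
    return cleaned


# The leather bag example: a search visit, then a delayed return visit.
print(carry_forward_source(["search", "direct"]))
# A visit with no known cause stays visible as 'unknown'.
print(carry_forward_source(["direct", "email", "direct"]))
```

With this rule the interrupted shopper is attributed to search only, while visits that truly have no identifiable cause are kept as a separate 'unknown' bucket instead of being counted as a channel.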

You see the problem here?

When it comes to customer behavior, 'direct' traffic is sometimes interesting because it can tell something about how our customers use our shop. For instance, when there was a delay, or when a person had to come back five times before he eventually decided to buy something.

That's very interesting.

But from a customer attribution perspective, direct traffic is just wrong. It's completely skewing your value.

On one hand, direct traffic is actually 'unknown' traffic. As in traffic coming from sources that we can't identify. On the other hand it might be traffic where an interruption or a second look was needed, in which case it isn't a source at all.

If we go back to the original graph, what this means is that we suddenly see a completely different picture.

At first, this might look incredibly scary. Look at how many of our conversions have a cause we know nothing about. That's just scary.

But it's actually wonderful.

It provides us with a much better picture of the rate of uncertainty involved, which in turn tells us all we need to know about the risks involved.

Imagine that you went to an executive meeting with the first graph and you told the other managers that you believe we should focus more on email (because it has the highest dollar amount per conversion). You tell them that you want to increase the share of email conversion by 15%. But when the other managers then look at the graph, all they see are absolute numbers, which means that there is no uncertainty involved.

Six months later you have another exec meeting where you present the result of an overall sales increase of 1.57% and an increase of email by just 6.3%. And you are promptly fired.

But a 1.57% increase in sales corresponds exactly to a 15% increase in email conversions, while the 6.3% reported increase corresponds exactly to the percentage of uncertainty involved.

You completely met your goals, but because the first graph did not illustrate the level of uncertainty the other managers didn't know that.

Of course, this is a simplified example, but you get the point. Knowing how much you don't know is just as important as knowing what you do know.

After launching your new email initiative you increased overall sales by what you promised, even though only 42% of the conversions could be directly attributed to your actions.
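The arithmetic behind this example can be checked with a short Python sketch. The 15%, 6.3% and 1.57% figures come from the text above. Email's share of total sales is not stated, so the sketch derives the value implied by those figures and treats it as an assumption.

```python
target_email_lift = 0.15      # the promised increase in email conversions
reported_email_lift = 0.063   # what the analytics tool could attribute to email
overall_sales_lift = 0.0157   # the increase seen in overall sales

# Share of the real email lift that the tool could actually see.
attributable_share = reported_email_lift / target_email_lift

# Email's share of total sales implied by the figures above (an assumption:
# it holds only if the whole sales increase came from email).
implied_email_share = overall_sales_lift / target_email_lift

print(f"attributable share of the lift: {attributable_share:.0%}")
print(f"implied email share of sales:   {implied_email_share:.1%}")
```

The first figure reproduces the 42% mentioned above: the tool only saw 6.3 of the 15 percentage points because the rest of the conversions ended up in the unknown bucket.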

Of course, you should also work to identify what the unknown traffic really is. For instance, if you do hold-out tests and your uncertainty level doesn't change, it would indicate that sales are mainly coming from existing customers who don't need to be influenced via marketing (as in, you are wasting your money).

If you do a hold-out test for social media, and the uncertainty level drops at 5 times the rate of social, you would know that a huge part of your unknown traffic is actually social interaction that you just couldn't identify as a source.

Of course, if you don't want to do hold-out tests, you could also do the opposite. Boost a particular channel to see how that affects the numbers of all the other channels.
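A hold-out comparison of this kind can be sketched as follows. The numbers are made up, and the function name `holdout_drop` and the per-1,000-people dictionaries are assumptions for illustration. A real test would also need groups large enough for the differences to be statistically meaningful.

```python
def holdout_drop(control, holdout):
    """Return the per-bucket drop in conversions when a channel is withheld.

    Both arguments map a bucket name (a channel, or 'unknown') to the number
    of conversions per 1,000 people in that group.
    """
    return {bucket: control[bucket] - holdout.get(bucket, 0) for bucket in control}


# Made-up numbers, per 1,000 people in each group.
control = {"social": 4, "email": 10, "unknown": 50}
holdout = {"social": 0, "email": 10, "unknown": 30}  # social withheld

drop = holdout_drop(control, holdout)
print(drop)
print(drop["unknown"] / drop["social"])
```

In this made-up case the unknown bucket falls five times as much as the social bucket itself, which is the pattern described above: much of the 'unknown' traffic was really social.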

New versus loyal customers

Let's move on to another area in which customer attribution models will change dramatically from what we have today: new versus loyal customers.

Today, multi-funnel reports treat everyone as a new customer. Every path starts with a source and ends with a conversion. Granted, you can segment your data into new versus returning, but that doesn't tell you anything. A person might be coming back a lot without ever buying something.

As a publisher of a paid-for magazine this is something I know all about. Most of my traffic is from people who don't pay and only read the free stuff. Case in point, 90% of my 'returning visitors' have never converted.

The problem with this is obvious to anyone. There is a huge difference in behavior between a person who pays and a person who doesn't. The channels people come from are different, the impact of our activities is different (and thus the calculation of value), and repeat customers are much more likely to come back on their own (thus higher levels of direct traffic) than people who just happen to come by.

When we see a path like this one, for instance, what does that actually tell us?

I would kind of expect that the 6th path is my subscribers, while the 5th is a new visitor. But I have no idea (besides, path analysis like this is completely useless because there are too many paths to consider).

Consider conversions dominated by email, for instance. We all know that email provides the highest level of dollar value per conversion of any channel. That is a known fact that we see practically everywhere.

But that is not really surprising, is it? The people who receive your emails are mostly people who already know your brand.

So customer attribution models for email almost exclusively target loyal customers, while other channels, like search, social (in part) and display, are dominated by new and mostly infrequent visitors.

Today, our tools can't differentiate between different types of conversion. They can't tell if a conversion is actually a follow-up to a previous conversion by an existing customer (a person coming back for more), or just a one time interaction by someone who happens to come by.

While our multi-funnel reports are impressive, they are entirely one-dimensional, which kind of defeats the point of using them. Sure, it's a lot better than what we used to have. But we still have far to go to make this perfect.

Consider this path for just one person:

Obviously I'm both simplifying and overgeneralizing things here, but it illustrates the concept of multi-conversion behavior. Here we have three macro-conversions and one micro-conversion for the same person.

With today's attribution models, all these paths would be displayed individually, and we would have no idea that it was actually the same person. But here you see not only that this person had three conversions, but also that his behavior changed dramatically as he became more and more loyal to your brand.
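Seeing several conversions as one person's history requires tying events to a person rather than to a single path. A minimal Python sketch of that grouping is shown here. It assumes every event carries a customer identifier, which many setups do not have; the event log, the field layout and the function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical event log: (customer_id, channel, purchase_made)
events = [
    ("c1", "search", False),
    ("c1", "display", True),
    ("c2", "social", True),
    ("c1", "email", True),
    ("c1", "email", True),
]


def conversions_per_person(events):
    """Group events by customer and list the channel behind each purchase."""
    history = defaultdict(list)
    for customer_id, channel, purchase_made in events:
        if purchase_made:
            history[customer_id].append(channel)
    return dict(history)


print(conversions_per_person(events))
```

For customer `c1` the result reads display, then email, then email: one person with three purchases whose channel shifts as he becomes more loyal, instead of three unrelated paths.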

Again, I will stress that I'm over-simplifying things here, but the point is that different channels have different types of value depending on the status of your customers.

Email, for instance, is mainly a customer retention tool, while search and display advertising are mainly customer acquisition tools. While email has a higher dollar value per conversion, it's not really good at bringing in new customers.

Social is kind of in the middle. Your direct social relationship is mostly for customer retention (like email), but when people share, retweet or +1 something, it becomes a customer acquisition tool (like search and display).

But that's not all. You might also find that each channel is good at selling different types of products, also depending on what status each individual customer has.

We cannot just look at our customer attribution model as single conversions.

Two is better than one

This leads to another very important realization. We all know, through countless studies, that when you combine two or more marketing channels, you get better results.

For instance, a while back I wrote about how combining TV advertising with display advertising caused brands to get a better result than either TV or display could provide on their own.

Not only did shifting TV ad budget to digital result in increased reach, but that reach was more effective. The combination of digital advertising and television commercials was found to be a particularly potent mix, with duplicated reach shown to be more effective on key brand effect metrics than either platform alone.

And we see this everywhere. It doesn't really matter what channels you combine. The magic is in the mix itself.

The reason why this is interesting is because it kind of invalidates our current attribution models. Our models today try to assign a value to our individual channels. As illustrated in this graph (again).

But if we know that two or more channels are better than one, why are we still trying to assign value to each channel individually?

Shouldn't we do this?

Of course, these are just made-up examples, but the point is that it isn't really relevant to just assign value to individual channels. We need to think of this as a mix.
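One simple way to look at mixes instead of individual channels is to total conversion value per combination of channels. The Python sketch below does that with made-up paths; the data layout and the name `value_by_mix` are assumptions for illustration, and ignoring the order of channels within a path is a deliberate simplification.

```python
from collections import defaultdict

# Hypothetical conversion paths: (channels touched, order value)
paths = [
    (["tv", "display"], 120.0),
    (["display", "tv"], 80.0),
    (["display"], 40.0),
    (["tv"], 30.0),
]


def value_by_mix(paths):
    """Sum order value per combination of channels, ignoring order."""
    totals = defaultdict(float)
    for channels, value in paths:
        mix = tuple(sorted(set(channels)))
        totals[mix] += value
    return dict(totals)


print(value_by_mix(paths))
```

Here the TV and display combination is reported as its own line, rather than having its value split between the two channels.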

The new world of multiple devices

Speaking of mixes: many people in analytics circles are still segmenting their traffic based on what device people use, and that is also true when it comes to customer attribution models.

You create separate reports: one each for smartphone, tablet and desktop conversions.

The problem here is twofold. First, desktop and mobile users are not different people. It's the same person just using two or more devices. I illustrated this in "Debunking The Unique Visitor: Finding Your Real Readers" with this graph:

All standard analytics services identify mobile and desktop as separate individuals, heavily inflating our results (a problem that is getting worse as we get more and more devices). In reality, most of that traffic is just the same person.

It's a huge difference.
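The inflation can be illustrated with a small Python sketch that counts the same visits twice, once per cookie and once per person. It assumes some visits carry a login or similar identifier that links devices together; the visit data and field layout are hypothetical.

```python
# Hypothetical visits: (cookie_id, device, user_id). user_id is None when the
# person was not logged in, so the visit cannot be tied to anyone.
visits = [
    ("ck1", "desktop", "anna"),
    ("ck2", "phone", "anna"),
    ("ck3", "tablet", "anna"),
    ("ck4", "desktop", "ben"),
    ("ck5", "phone", None),
]

# Counting cookies treats every device as a separate individual.
visitors_by_cookie = len({cookie for cookie, _, _ in visits})

# Counting people merges devices that share a user_id and keeps
# unidentified cookies as one visitor each.
known_people = {user for _, _, user in visits if user is not None}
unidentified = {cookie for cookie, _, user in visits if user is None}
visitors_by_person = len(known_people) + len(unidentified)

print(visitors_by_cookie, visitors_by_person)
```

In this made-up data five 'unique visitors' turn out to be at most three people.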

Secondly, more and more studies illustrate just how huge multiple screen use is. For instance, you can see several examples over at Google Think's Databoard.

So when you segment your data on devices, you are basically cutting up people's path to conversion based on an erroneous belief that people only use one device at a time. And the result of the remaining data set is likely to be completely misleading.

Of course, how bad this is depends heavily on what you are selling. But one of the key things we are learning about the new world of multi is that segmented data (at least the way it's done today) can do more harm than good.

The mix of people

Another mix that none of our analytics tools support today is the mix of people. In the old world of print and display advertising, people played almost no measurable role in our path to sale. It was almost entirely based on passive exposure.

But in today's world, people absolutely rule how we spend our time online.

Think of the past month or so. How many times has another person told you about a product or a brand? How many times have you told others about one?

The answer is 'almost every day'.

We communicate with other people all the time, and a huge amount of that is about brands and products.

So why is it that our current customer attribution models create a conversion path completely devoid of the influence of other people? They only look at what sources people come from, but not the nature of those sources.

Social is a good example (obviously). There are three types of social sources:

  1. Social traffic caused by you explicitly posting something on your own social page.
  2. Social traffic caused by people resharing your postings.
  3. Social traffic caused by other people, many of whom are not following you and have not seen what you posted.

Which one do you think creates the most traffic? Obviously the answer depends on many factors and the size and awareness of your brand, but in most cases, most of your traffic is caused by others.

And this doesn't just apply to social. It applies to all our channels. It's fantastically important that we differentiate between the conversions that are caused by our own marketing, and the activity that is caused by other factors.

Activity that is caused by others isn't influenced by your ads or what you posted on your Facebook Page, because they never saw it. It is instead influenced by word-of-mouth, the quality and the attention to detail of your products, and other factors.

This is a very important aspect of customer attribution modelling that we are completely missing today, although you can add some of it by being very careful to identify what is what with campaign variables and custom tags.
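As a rough sketch of what such tagging makes possible, the Python below separates social visits that arrive through links you tagged yourself from social visits that arrive without your tags. `utm_medium` is a standard campaign parameter. The rule that untagged social visits are caused by other people, the list of social hosts and the function name are all assumptions for illustration, and a reshare of your own tagged link still looks like your post.

```python
from urllib.parse import urlparse, parse_qs

# Illustrative list only; a real setup would maintain its own.
SOCIAL_HOSTS = {"twitter.com", "t.co", "facebook.com", "plus.google.com"}


def classify_social_visit(landing_url, referrer):
    """Rough split of social traffic into 'own' and 'others'."""
    ref_host = urlparse(referrer).hostname or ""
    if ref_host.startswith("www."):
        ref_host = ref_host[4:]
    if ref_host not in SOCIAL_HOSTS:
        return "not social"
    # Links you post yourself are assumed to carry utm_medium=social.
    params = parse_qs(urlparse(landing_url).query)
    if params.get("utm_medium") == ["social"]:
        return "your post or a reshare of it"
    return "caused by other people"


print(classify_social_visit(
    "https://example.com/bag?utm_source=twitter&utm_medium=social",
    "https://t.co/abc"))
print(classify_social_visit("https://example.com/bag", "https://www.facebook.com/"))
```

This only recovers part of the picture: it cannot tell your own post from a reshare, and it sees nothing of conversations that happen outside a trackable link.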

But, what does this all mean?

All of this leads us to the wonderful new world of customer attribution models that are coming soon. And, let me try to illustrate it in a very simple way:

We used to have this; the single-funnel customer attribution model:

This model was obviously hopeless because it only looked at the very last interaction and completely neglected to take into account all the wonderful things people did with advertising and social media. The kind of activities that build up momentum for a conversion to take place.

Today, we have taken a big step forward, and we now have multi-funnel models:

This is obviously much better. Now we can see how multiple channels affect a conversion and we can start to explore the complexity of the path that leads people to it.

But this model also has a gigantic flaw: it's one-dimensional. We are looking at one thing at a time and as such completely miss the bigger picture. We only look at one conversion, one person, one device type and so forth.

It's much better than what we had, but it's nowhere near where we need to be.

Another problem with this model is that it's also based on only one point of measurement, i.e. what people do in their browser, on your site, as assigned to a single cookie. So if a person moves to a different browser, that is a separate line, and everything people do outside the browser and outside your site isn't being considered.

This leads us to the future of customer attribution

The first step, which we are starting to see now, is with Universal Analytics (or similar technologies by other providers). The concept is simple. It allows us to measure things that don't happen in a browser.

As a simple example, you can tap into the Twitter API, measure mentions of your product, and then compare that with the number of clicks. Before, we could only measure the clicks but not the 3rd party exposure that caused them.

But the real fancy part is that we can tie it into the offline world of retail and our backend world of order management systems. That means that instead of relying on the browser to guess when we have a conversion, we can track exactly when that conversion happens. We can create much smarter shopping carts, we can combine exposure from different sites, and so many other things.

It's wonderful.

But with this new world of analytics, we also need a new customer attribution model. What would that look like? Well, here is one concept:

This model is not looking at conversions, it's looking at people - which is much more important. It builds up a pattern of behavior as people become more and more loyal.

Each dot represents not just one channel, but a mix of channels. The green dots represent when people bought a product, and the path illustrates the different types of influences on the group of people.

Of course, this looks kind of weird. So let's expand it with a few more labels and a trend line, and you will start to see what I'm on about:

What this customer attribution model is doing is combining the level of engagement of people with the type of marketing activity and what influences it. Three exceptionally important metrics.

At the top left we have the mix of channels that are predominantly about customer acquisition, caused by your marketing team. These are things like paid search, display advertising, sponsored social posts, print advertising, posters and direct mail. All things that you push out to create awareness.

In the middle we have the mix of channels that is caused mostly by other people. This is your social activity not initiated by you; when people write about you on their blog or in a magazine, or when friends discuss your brand on the bus.

And at the bottom right, we have the channels that are predominantly caused by customer retention activities, by people who follow you in some form. These are email newsletters and what you do on your own social channels, enhanced by the community that you have built up around you. They are all initiated by you but enhanced by your followers.

Now this graph suddenly starts to make a lot of sense. We see that to acquire customers in the first place, we predominantly have to focus on the mix of channels in the top left. We see the importance of PR, the quality of our products, and the feeling of goodwill people have towards our brand (it's what causes the middle part to happen).

We also see that as people become more and more engaged, their behavior shifts towards much more powerful customer retention channels (far more conversions).

But we can also see another pattern (in blue), which is the shift from customer acquisition to customer retention. People don't actually follow the trend line.

At some point they shift from one to the other caused by 'marketing by you' ... as in, something you did specifically to nurture your existing customers.

This could be a special promotion only for existing customers. This could be a loyalty follow-up program or another activity that caused people to become loyal.

How spectacular is that?

Now, all of this is just a simplified concept using completely made-up data (although this is a very common behavior). But just imagine if our customer attribution models worked like this.

In reality, of course, the model would be much more complex with far more lines (when we look across groups of people). But this leads us to my final point:

The future of customer attribution models is all about pattern recognition. It's not about the dots, it's about those behavioral lines that you see underneath. And understanding the value of our marketing activities is illustrated in how it affects those patterns.

You might do an experiment in which you use social media more for cheap discounts, which might push the conversion up to the top, but you might also learn that this would decrease the much more valuable long term loyalty in the customer retention zones.

You might learn that one type of channel leads to a higher dollar value per sale, but also a lower sales volume. So is that the right thing to focus on?

What about today?

At this point you are probably asking how you can use this today, which is a great question. The point of this article was not to give you a tool, but to change the way you think. Most people see customer attribution models as some fancy page in their analytics, but most people don't know what to do with it.

With this article I wanted you to think about what customer attribution is really all about. It's not about data points, sources, or channels. It's about the mix of activity in relation to people and their behavior. It's about understanding why your customers act in a certain way, not what source they came from.

It's about the shift from numbers to patterns.

And with this I hope that when you look at your analytics today, you think about what you see in a different way. For instance, when you see that most of your conversions are direct, you realize that something else had to cause that, because it didn't just happen on its own.

And if you are an analytics ninja, I hope this article inspires you to think about what we could do in the future. I hope that it energizes you to experiment and create the tools that we all so desperately need.

And most of all, I hope it encourages you to put pressure on the rest of the analytics community to change faster.

The shift from single to multi-funnel analytics was amazing, but it's only the first step of a much bigger shift.




