The way we define people in analytics and the way we define them in real life are very often not the same thing, and I have talked about this before. The most visible example is the unique user counts reported by media companies. Often their total traffic exceeds their total market, which, even at the best of times, isn't very realistic.
There are many reasons for this. We have problems with how to measure people across browsers and devices; we have problems with ad blockers; and bots are a growing problem.
So what is the problem with bots?
Well, in simple terms, bots are traffic that isn't coming from real people, and they make up a staggering amount of the total. To get real insights, we need to identify them and filter them out. Otherwise, we end up with very inaccurate results.
Some bots, like the Googlebot (for Google Search), are easily identified, but there is quite a lot of activity that isn't that simple.
To give you an example, if I look at my data, only 8% of my traffic is identified as real people. 31% show up as unique visitors but fail to behave in a way that a human would, and 61% are directly identified as bots (using the system I have for identifying them).
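To put those numbers in perspective, here is a small back-of-the-envelope calculation. It assumes, purely for illustration, that a list-based filter removes exactly the 61% of traffic identified as bots and nothing else, so everything left over would be reported as people.

```python
# Shares of total traffic, taken from the example above (in percent).
real_people = 8
human_looking_but_not_human = 31
identified_bots = 61

# Assumption for illustration: only the identified bots get filtered out.
reported_as_people = real_people + human_looking_but_not_human

# How much of the "filtered" traffic is actually real people?
share_real = real_people / reported_as_people

# By what factor would the audience be overstated?
overcount_factor = reported_as_people / real_people

print(f"Reported as people: {reported_as_people}% of all traffic")
print(f"Actually real: {share_real:.1%} of the reported figure")
print(f"Audience overstated by a factor of {overcount_factor:.1f}")
```

Under that assumption, only about one in five of the remaining visitors would be a real person.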
Many analytics systems are pretty good at filtering out most of the 'bad traffic'. Tools like Google Analytics and Adobe Analytics have built-in systems for detecting and removing bot traffic. Adobe Analytics, for instance, uses the IAB/ABC International Spiders and Bots List (but there are plenty of open-source and free lists available). GA and Adobe also filter out all the bots that don't activate your scripts.
The problem is that these systems only identify and filter out bots that are marked as such. They can't detect traffic that looks legitimate but isn't.
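Although this article stays away from code, a minimal sketch can make the limitation concrete. The pattern list below is a hypothetical, heavily shortened stand-in for a real bot list, and the function name is made up for illustration. It is not how any particular analytics product works internally.

```python
import re

# Hypothetical, heavily shortened bot list. Real lists contain far more patterns.
KNOWN_BOT_PATTERNS = ["googlebot", "bingbot", "crawler", "spider"]
BOT_RE = re.compile(
    "|".join(re.escape(pattern) for pattern in KNOWN_BOT_PATTERNS),
    re.IGNORECASE,
)


def is_listed_bot(user_agent: str) -> bool:
    """Return True if the user-agent matches a pattern on the list."""
    return bool(BOT_RE.search(user_agent))


hits = [
    # Announces itself as a bot, so the list catches it.
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    # Looks like an ordinary browser. A script sending this exact string
    # would pass the filter, because nothing in it is on the list.
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
]

flags = [is_listed_bot(user_agent) for user_agent in hits]
print(flags)
```

The second entry is the whole problem in miniature: a list can only flag traffic that declares itself, so anything using a normal browser string is counted as a person.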
This is obviously not a new problem, but as publishers spend more and more resources on their own data and analytics capabilities, understanding how big an impact this has is vital to their future plans.
For instance, just last week, we heard about how Hearst is opening a 20-person data studio, focusing on bringing their '1st-party data' to advertisers. But are they filtering out all the invalid data?
I don't know how they work, but I can tell you that most publishers don't filter out the data correctly. Not because they don't want to, but because they don't realize just how much of their internal (1st-party) data is filled with invalid views.
So, in this somewhat technical article, let's talk about how to accurately identify real traffic, and why you really need to look at your data in stages.
I'm not going to show you any coding, but we need to talk about how the internet works.
One of the amazing things about doing your own analytics is that you have access to the raw data. This means that you can do a lot of detailed analysis that you can't really do with an external service.
One thing that I do, for instance, is to output a list of 'user-agents' (which is what identifies which browser/device people are using), and then compare it to what actually happens on my site.
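As a rough illustration of that kind of comparison, here is a minimal sketch. The records, the user-agent names, and the rule "requests pages but never loads any assets" are all assumptions made for this example. The article does not say which behavioral signals are actually used.

```python
from collections import defaultdict

# Hypothetical parsed log records: (user_agent, requested_path).
records = [
    ("BrowserA/1.0", "/article"),
    ("BrowserA/1.0", "/style.css"),
    ("BrowserA/1.0", "/image.jpg"),
    ("ScriptB/0.1", "/article"),
    ("ScriptB/0.1", "/article-2"),
    ("ScriptB/0.1", "/article-3"),
]

# Assumed file endings for assets that a real browser would normally fetch.
ASSET_SUFFIXES = (".css", ".js", ".jpg", ".png")


def summarize_by_user_agent(records):
    """Count page requests and asset requests for each user-agent."""
    summary = defaultdict(lambda: {"pages": 0, "assets": 0})
    for user_agent, path in records:
        kind = "assets" if path.endswith(ASSET_SUFFIXES) else "pages"
        summary[user_agent][kind] += 1
    return dict(summary)


summary = summarize_by_user_agent(records)

# Assumed heuristic: a user-agent that reads pages but never loads
# a single stylesheet, script, or image deserves a closer look.
suspicious = sorted(
    user_agent
    for user_agent, counts in summary.items()
    if counts["pages"] > 0 and counts["assets"] == 0
)
print(suspicious)
```

The point is not this particular rule, but that raw data lets you put the claimed identity (the user-agent) next to the observed behavior and see where the two disagree.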
Founder, media analyst, author, and publisher.
"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé