A week ago I wrote that I was in the process of writing an article about how to make sense of Google Analytics. Since then a number of people have asked me when I planned to publish it. Well, soon...
But in my studies of the inner workings of Google Analytics, I have run into a number of strange things - especially in terms of calculating loyalty, visitors and page visits. The numbers reported by Google Analytics do not add up to the number of requests in the server logs (and no, I am not confusing page views with page visits).
To find out what is going on, I have created a test site in which I am going to perform a carefully selected number of tasks, from 5 different sources (at different locations).
My test setup is as follows:

I have 4 computers on 2 different IP addresses, running 3 different browsers, plus 2 bots. One of these bots is activated from an external source (a server located in the US), the other from work. Neither of them should be included in the stats (they represent automated requests, not real people - hence they have no statistical relevance).
The test will reveal how many times a single person is counted as a visitor. The correct number is 1 (I am only one person), but since Google Analytics is unlikely to be able to tell that I am one person using 4 computers, it should report at most 4 people (absolute unique visitors) during the 5-day test period.
The other thing we will be able to see is how good Google Analytics is at calculating visitor loyalty, not to mention recording browser and OS usage (I do not expect any problems here).
During the test, an independent statistics system will record exactly what is going on at the test site. The test site has never been used with Google Analytics before.

I will run these tests next week - then give me another week or two to write the final report.
Stay tuned.
Jonathan - May. 25, 2007
This is noble work and I await the results.
However, the fact that you (and others) have noticed these discrepancies illustrates why I have a rather cautious view (to say the least) of web analytics. Every time I have used such systems (and I've used many over the years) I have come to the conclusion that they conform to the law of diminishing returns. If you expect a web analytics package to tell you some subtle or hidden "truth" about your site, you are doomed to be consumed by doubt: either about the accuracy of what it's saying, or your capacity to understand it.
The best you can do to avoid insanity is to keep half an eye on the general trends such systems show you. Do not agonise about whether report X is consistent with report Y, or whether it's undercounting proxy requests from AOL or something. Therein madness lies.
And don't even think about the issue of whether a page-based tracking model is appropriate for asynchronous interactions...!
Thomas Baekdal - May. 25, 2007
Uffe, Thanks!
Jonathan, My report will be about two main topics.
1: The results of my study (above)
2: An illustration of some of the more misleading reports (e.g. use of AJAX vs. pageviews - good point) and how to make some sense of them.
Thomas Baekdal - Jun. 6, 2007
The analysis has been delayed - I am having some problems with it (Google doesn't find the tracker in my test site).
Gamermk - Jul. 5, 2007
Average Loyalty = Page Views / Uniques
Doesn't the above make page view count useful?
Thomas Baekdal - Jul. 5, 2007
Nope... :)
How loyal people are has nothing to do with how many times they see a page.
Thomas Baekdal - Jul. 5, 2007
Here is an example: If you have 10 people visiting your site - one of them sees 100 pages in one day, the rest of them only one page in a month - then the average loyalty is 10.9 (109 page views / 10 uniques).
In reality you have 0 loyal visitors. The one who saw 100 pages cannot be said to be loyal, because it is not the number of pages that counts, but how often people revisit the site. And while he read 100 pages (or more likely something like 17 pages, because the rest were counted just by navigating), he didn't return the day afterwards.
The 9 remaining people were not loyal because none of them returned either.
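The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the `visits` structure (one list entry per day a visitor showed up, holding that day's page views) and the rule "returning means seen on more than one day" are assumptions made for illustration, not how Google Analytics stores or computes anything.

```python
# Hypothetical data: for each visitor, the page views on each day they visited.
# Visitor 0 sees 100 pages on a single day; the other nine see one page once.
visits = {0: [100]}
visits.update({visitor: [1] for visitor in range(1, 10)})

page_views = sum(sum(days) for days in visits.values())
uniques = len(visits)

# The formula suggested in the comment above: page views divided by uniques.
average_loyalty = page_views / uniques

# Assumption: a visitor is "returning" only if seen on more than one day.
returning = sum(1 for days in visits.values() if len(days) > 1)

print(average_loyalty)  # 10.9
print(returning)        # 0
```

The ratio comes out at 10.9 even though nobody came back, which is the gap between page views and loyalty described above.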
To calculate loyalty you have to look at three things:
1: Does a specific person read all your new stuff? That is, count the number of unique visits by a person in relation to what pages he reads - and only count the new ones.
2: Does a specific person come back several times each week, just to check if anything new has happened? That is a frequent visit with no page counts, unless something new has been added to the site.
3: Has he subscribed to RSS feeds or email newsletters?
General stats are useless. It is like saying that people buy for $50 on average (in a web shop), when in reality a few people buy a huge amount of products while others buy very little - thus nobody buys for $50.
If we believe in the $50 average, managers will probably start making products that cost $50 - and destroy the business (too cheap for those who spend big bucks, and too pricey for those who spend very little).
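To make the $50 point concrete, here is a small sketch with made-up order values (the numbers are purely hypothetical, chosen so the mean lands on exactly $50). It shows a mean that no single customer comes close to.

```python
from statistics import mean, median

# Hypothetical orders in dollars: two big spenders and eight small ones.
orders = [210, 210] + [10] * 8

average = mean(orders)
typical = median(orders)

# Customers whose order is anywhere near the average (within $10 of it).
near_average = [order for order in orders if abs(order - average) <= 10]

print(average)       # 50
print(typical)       # 10.0
print(near_average)  # []
```

The mean is $50, the median is $10, and not one order falls within $10 of the average. Looking at the distribution (or at least the median) tells you more than the single average does.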
mandeep - Aug. 1, 2007
Hello, I am also waiting for your results. To date I have only liked one stats service, and that is webceo. It is instant and reports accurately no matter how often. But as it was a paid service, I used Google Analytics. Then I was clueless: where were 30% of my viewers going? Why had my sites lost traffic? An example of the brainwashing you get from using a big company's product is that I trusted that Google Analytics must be the best, so there had to be some other reason for my loss of visitors. Then I found your blog and used webceo again, and you are right: Google Analytics is nowhere near good, never mind perfect.
I will keep an eye on your experiment. But one drawback of Google Analytics has already been proven: the damn thing is so stupid that it cannot find its own code even after 24 to 48 hours.
Published: May. 23, 2007 in notes
Uffe - May. 24, 2007
Hi Thomas,
For some time I have been confused about the discrepancy between the number of visitors indicated by the internal statistics of my webshop and by Google Analytics. I think it is very interesting and relevant to do a controlled and independent test of GA.
I am looking forward to reading the result and hopefully I will find out who to trust...
Keep up the good work :-)