
Is Twitter using Misleading Stats?

I was watching a video about Twitter, and one thing that caught my attention was the claim that "People Twitter in average 3 times per day". My guess is that they are extremely wrong - or rather that the stats are misleading them.

Note: I do not know the details of Twitter's usage patterns - I am simply speculating.

Averages are, unfortunately, one of the most frequently used forms of statistics. They are also one of the most frequent causes of being misled. You should never rely on an average unless you are sure that it really represents the majority.

In a perfect world, stats would look something like the graph below. The majority is centered on the average, trailing off at either end. If this is your graph, you can safely say that the average is such and such - and people can rely on it.

But it is much more likely that your graph looks like something else. Here are two examples. The first one has the same average, but the majority of users is not centered on the middle; it is instead concentrated at the edges. In this case the average gives you a completely wrong picture of what is really going on.

Note: If this is your graph, it is much more accurate to say that the "average" user is either not doing anything at all, or doing a lot.

Here is another example, where many people do very little and the numbers trail off as usage gets higher - only to be broken by a relatively small group of heavy users. The thing to notice in this case is that the average is still the same as in the other two graphs. But again, the average is directly misleading.

In this case you should say that the "average" user does only a little, and that there is a small group of power users.
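To make the three shapes concrete, here is a minimal sketch in Python. The numbers are made up for illustration (they are not Twitter data), and the names `samples` and `summarize` are just placeholders. Each sample has ten hypothetical users and the same average of 3 posts per day.

```python
from statistics import mean, median, multimode

# Hypothetical posts-per-day counts for ten users each (made-up data,
# chosen only so that all three samples share the same average).
samples = {
    "bell":   [1, 2, 2, 3, 3, 3, 3, 4, 4, 5],    # centered on the average
    "edges":  [0, 0, 0, 0, 0, 6, 6, 6, 6, 6],    # nothing at all, or a lot
    "skewed": [1, 1, 1, 1, 1, 1, 1, 2, 2, 19],   # mostly little, one heavy user
}

def summarize(values):
    """Return the mean, median and most common value(s) of a sample."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
    }

summary = {name: summarize(values) for name, values in samples.items()}
for name, stats in summary.items():
    print(name, stats)
```

All three samples report the same mean, even though nobody in the "edges" sample posts 3 times a day and almost nobody in the "skewed" sample does. Note that in the "edges" case the median lands in the empty middle as well, so there the most common values say more than either single number.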

Saying that the average user Twitters 3 times per day is very likely to be wrong. I would not base my strategy on an average usage figure.

Comments

1

George - Apr. 22, 2008

The word you're looking for is: median.

Using averages can be very misleading, but it's still accurate to say it's the average. Twitter should comment on the statistics for the median Twitter usage per user. My guess is the same as yours: that the average is higher than the median, and thus more sexy to use in promoting the site.

2

Thomas Baekdal - Apr. 22, 2008

George, Yes - median is the proper term for what I want. Thanks!

3

Jonathan - Apr. 23, 2008

Thomas - you are saying that medians are misleading. I'm sure you're correct, but what's your point here? Nobody uses medians for presumably exactly the reasons you state. At least, I don't recall anyone talking about "The median man in the street..." or "The median European wage" etc.

4

Thomas Baekdal - Apr. 23, 2008

Jonathan, No I am saying that "averages" are misleading - you should use "median" :o)

http://en.wikipedia.org/wiki/Median

5

Thomas Baekdal - Apr. 23, 2008

E.g. the average of the following numbers is "43.6", which is misleading because nobody has that number. Whereas the median is "1", which is much more representative, since that is indeed what most people have.

1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 200, 200, 200
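The arithmetic for this list can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are arbitrary.

```python
from statistics import mean, median

# The fourteen numbers from the comment: eleven 1s and three 200s.
values = [1] * 11 + [200] * 3

average = mean(values)    # sum of all values divided by how many there are
middle = median(values)   # the value separating the lower half from the upper half

# How many values sit at or below the median.
at_or_below_median = sum(1 for v in values if v <= middle)

print(f"mean = {average:.1f}")
print(f"median = {middle}")
print(f"{at_or_below_median} of {len(values)} values are at or below the median")
```

The mean is pulled far above what most of the list looks like by just three large values, while the median stays at the value that eleven of the fourteen entries actually have.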

6

Jonathan - Apr. 24, 2008

Well, yes and no. I think the thing that confuses me is the line you have in your graphs. You label them the same when in fact they are different things. In the first one (a "normal distribution") the line would represent both the average and the median. That is, the distribution is symmetric, so the two coincide.

In the second graph, the line is the median. That is (to quote your Wikipedia article) "the number separating the higher half of a sample ... from the lower half." But in this case, the average would be way off the median, having a large standard deviation (similar to the example you give in your comment above).

In the third graph, the line might be the average, but it's too hard to say without the data to calculate. It would be entirely coincidental if it was in the centre of the graph though.

In short, your words don't match your pictures :-)

Anyway, I think the discussion is flawed for another reason because there are a number of different ways to compute averages. But that would take us off topic...

7

Thomas Baekdal - Apr. 24, 2008

Anyway, the thing I am speaking out against is the common approach of calculating the average as the sum of all the numbers divided by how many numbers there are - and then going on to say "the average user does such and such". I would not trust such stats.

8

Jonathan - Apr. 24, 2008

Indeed, as I say, a discussion of the type of average in question would take us off topic.

Meanwhile (no pun intended) the graphs are really not helping you make your point.

 

Published: Apr. 22, 2008
in work notes


Thomas Baekdal

Thomas Baekdal is a Writer, Interaction Designer, Change Advocate and Project Manager.
