The Result Of Testing Twitter's Algorithm

Written by Thomas Baekdal on March 11, 2016

On a number of occasions, I have urged people to do controlled tests of the value of their social stream, for two reasons. Firstly, I want you to realize what these algorithms actually do, and secondly, I want you to base your insights on real data rather than just 'what you think'.

For instance, when Twitter launched 'while you were away', I noticed many people saying that it was the best thing that ever happened to their Twitter stream. It was a sentiment that many have repeated now that Twitter has gone all-in on 'stream' ranking.

And it's not surprising that they say this. Most people's stream consists of an unruly mess of irrelevant people, so having even a slight form of ranking tends to eliminate the worst parts. But while the worst content is removed, does that mean you also get the best? Or are you just getting the best average?

Or, worse, do the social algorithms eliminate both the best and the worst, leaving you with a snackable middle that doesn't really mean much?

So, let's test it.

Obviously everyone's stream is different, which is why you should test this yourself. In this article, I'm just going to test my stream, but I am going to explain how I did it so that you can do it too.

The Twitter Timeline Quality Test

The first thing we need to do is to give our social stream some 'time off' so that it can accumulate enough content for it to rank. So I closed my Twitter app (completely) and went off Twitter for 8 hours. This means that when I returned to it, I had 8 hours of tweets that I hadn't seen before.

I then started Twitter, and I wrote down every tweet in my stream for both the ranked and the unranked stream (and I wrote down the ranked stream first, so that nothing I did in the unranked stream could influence the ranking).

The result is this:

Let's start with the unranked stream. Over the 8 hours, I measured 191 tweets in my timeline. Of those, only 26 were something I wanted to read about.

This gives us a quality score of 13.6%.
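If you want to do the same arithmetic for your own stream, here is a minimal sketch in Python. The function name `quality_score` is just an illustrative choice, not something from Twitter or any library; the numbers are the ones from my unranked stream above.

```python
def quality_score(worth_reading: int, total: int) -> float:
    """Return the share of tweets worth reading, as a percentage.

    Hypothetical helper for illustration: 'worth_reading' is your own
    subjective count, 'total' is every tweet you wrote down.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of tweets")
    if not 0 <= worth_reading <= total:
        raise ValueError("worth_reading must be between 0 and total")
    return 100.0 * worth_reading / total


# Unranked stream from the test above: 26 good tweets out of 191.
unranked = quality_score(26, 191)
print(f"Unranked quality score: {unranked:.1f}%")  # Unranked quality score: 13.6%
```

Swap in your own two counts and you have your baseline before looking at the ranked stream.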

Having only 13.6% of the tweets be 'worth reading' might seem low, but it's actually quite high. Try visiting any newspaper site and see what percentage of the articles on the front page you are really interested in. It's likely going to be much lower than 13.6%.

Keep in mind that I'm very picky about who I follow. I only follow 321 people. This would obviously be a lot worse if I were following thousands of people.

So, is the ranked stream better? Well, let's see:

The ranked stream reduced the number of tweets from 191 to only 31. In other words, Twitter's ranking system removed 84% of the tweets from my stream. And of the 31 tweets that remained, only 4 were worth reading. Thus, instead of only removing the bad tweets, it also removed most of the good ones.

This gives us a quality score of 12.9%... which is lower than before. Yep, Twitter's ranked stream made my timeline worse. Not better.
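The comparison can be sketched in a few lines of Python. This is only an illustration using the counts from my test; the function name `compare_streams` and the dictionary keys are made up for this example. The 'good tweets kept' figure also assumes that the 4 good tweets in the ranked stream are among the 26 good ones in the unranked stream, which I did not track separately.

```python
def compare_streams(unranked_total, unranked_good, ranked_total, ranked_good):
    """Compare an unranked and a ranked stream (all values in percent).

    Hypothetical helper: the inputs are hand-counted tweets, and
    'good' means 'worth reading' in your own judgment.
    """
    unranked_quality = 100.0 * unranked_good / unranked_total
    ranked_quality = 100.0 * ranked_good / ranked_total
    return {
        # Share of all tweets that the ranking removed.
        "removed": 100.0 * (unranked_total - ranked_total) / unranked_total,
        "unranked_quality": unranked_quality,
        "ranked_quality": ranked_quality,
        # Positive means the ranked stream scored higher.
        "quality_change": ranked_quality - unranked_quality,
        # Assumption: ranked good tweets are a subset of unranked good tweets.
        "good_kept": 100.0 * ranked_good / unranked_good,
    }


result = compare_streams(191, 26, 31, 4)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
# removed: 83.8
# unranked_quality: 13.6
# ranked_quality: 12.9
# quality_change: -0.7
# good_kept: 15.4
```

The last number is the one to watch: under that assumption, only about 15% of the tweets worth reading survived the ranking.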

More to the point, those 4 tweets that Twitter picked weren't the ones I would have chosen to be the top ones. If I had to personally pick 4 tweets that I absolutely wanted to see over that 8 hour period, none of them were in the ranked stream.

Just as important, because I'm now only seeing 31 tweets in total, I give up sooner because I reach old tweets a lot faster than before. The result of this is a drop in reach.

We saw the same thing on Facebook. Back when Facebook started its NewsFeed, most brands experienced a drop in reach because if you give people fewer options, the reach per post goes down.

It's simple math.

It's the same with engagement. If you reduce the number of tweets people see by 84%, you need a lot of extra engagements to make up for the difference. Of course, I'm dramatically over-simplifying it here because the reality is that we don't see every tweet in our unranked timelines either. I get about 1,000 tweets in my timeline per day, and I only see a fraction of them.
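To put a rough number on 'a lot of extra engagements', here is a small Python sketch. It rests on the same over-simplification: it assumes every tweet in the stream is actually seen and that engagement is spread evenly across tweets. The function name is hypothetical, and the counts are the 191 and 31 tweets from my test.

```python
def required_engagement_multiplier(tweets_before: int, tweets_after: int) -> float:
    """How much more engagement each remaining tweet needs so that
    total engagement stays the same.

    Simplified model (an assumption, not measured data): every tweet
    shown is seen, and engagement per tweet is uniform.
    """
    if tweets_before <= 0 or tweets_after <= 0:
        raise ValueError("tweet counts must be positive")
    return tweets_before / tweets_after


multiplier = required_engagement_multiplier(191, 31)
print(f"Each remaining tweet needs about {multiplier:.1f}x the engagement")
# Each remaining tweet needs about 6.2x the engagement
```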

So, the actual result is tricky to estimate. For some people, the tweets they end up seeing might be an improvement. For others, they might not be. But if we look at social channels that have already done this, we see a general drop in value, reach and total impact. The algorithms have a tendency to make everyone part of the gray mass, where you become less involved and less interested. You still engage as much as always, but it's with less feeling and dedication.

As weird as it might sound, it's far better to have a wide span of good and bad content because this way you really feel the power of good content. But if everything is at the same level, you lack the distinction, and you become a social zombie... especially if the base level is no better than before (which is what you see here).

But my point with all of this is that you should do this test for yourself, and not just for Twitter. You should do this test on all your social channels because it's often an eye-opener.

All the social channels are spending a considerable amount of resources on designing their newsfeed algorithms, and I'm sure that for them (and at scale) it works. But I have now tested Facebook's Newsfeed four times, Twitter's timeline three times (twice after 'while you were away' was introduced and now with the fully ranked stream), and Google+'s stream once.

The result every single time, for me, is that the ranked stream is more condensed, but also less valuable. They all seem to be doing the same thing. They reduce the feed into the 'best average', removing both good posts that you really wanted to see and bad posts that you didn't.

Try to do this test yourself. It's quite fascinating to realize what you see and what you don't.


Thomas Baekdal


Founder of Baekdal, author, writer, strategic consultant, and new media advocate.
