By Thomas Baekdal - February 2021

A lesson in data reporting when reporting about COVID-19

This is an archived version of a Baekdal Plus newsletter (it's free). It is sent out about once per week and features the latest articles as well as unique insights written specifically for the newsletter. If you want to get the next one, don't hesitate to add your email to the list.

Welcome back to the newsletter. I have no new Plus report this week, but I plan to start writing the next one later this week. I plan to write about 'delayed subscription models'. (Also, don't forget that I just published "Paid-for strategies: Defining what people pay for", if you haven't read it yet).

I do, however, have one quick story that I personally find very interesting, but it's also very important from a journalistic perspective. It's about COVID and data journalism.

What about the unknowns?

One of the special things about being a media analyst is that you never have accurate or consistent data. For instance, if you look at three different studies about how much people trust the news, they are nowhere close to each other.

I have talked about this before. The way you deal with this is by trying to understand the patterns and the causes behind this data.

I'm reminded of this every day when I look at how newspapers are reporting about COVID-19.

Over the past year, we have seen a huge amount of amazing data journalism because of the pandemic. No other news story has been this focused on measurable data, updated daily, and many newspapers have centered their reporting around data dashboards, graphs, and the like.

As an analyst, of course, I'm thrilled about this. I absolutely love this new world of journalism where data has this amazing role. But, it is not without mistakes.

There are specifically three types of mistakes that seriously misinform the public.

The first mistake is when journalists are so focused on the day-to-day changes that they forget to zoom out and look at the whole. One example from my country is how often the press have reported that 'things are stable'.

Technically, you might say that this is correct, because ... look ... the infection rate was flat. But then look at the graph, and you clearly see how problematic this is. Every time we reported to the public that things were stable, people stopped being careful, and then the virus started spreading once more.

I'm reminded of a quote that Mathew Ingram posted:

"The curve is flattening, we can start lifting restrictions now" = "The parachute has slowed our descent, we can take it off now."

Similar to this is the second mistake we make in the press, which is how we think about the R-number.

By now, you should be very familiar with what the R-number is: the average number of people that each infected person goes on to infect. If it's more than 1, infections are growing; if it's below 1, we are (slowly) containing it.
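To make that arithmetic concrete, here is a minimal sketch in Python. It assumes a deliberately simple model where each 'generation' of infections is just the previous one multiplied by a constant R. The starting point of 1,000 cases and the function name `project_cases` are made up for illustration and do not come from any real dataset.

```python
from __future__ import annotations


def project_cases(start_cases: float, r: float, generations: int) -> list[float]:
    """Return new cases per generation, assuming each generation is the
    previous one multiplied by a constant R (a simplification)."""
    cases = [float(start_cases)]
    for _ in range(generations):
        cases.append(cases[-1] * r)
    return cases


if __name__ == "__main__":
    # Hypothetical starting point: 1,000 cases, followed for 10 generations.
    for r in (1.1, 1.0, 0.7):
        final = project_cases(1000, r, 10)[-1]
        print(f"R={r}: {final:.0f} new cases in generation 10")
```

Under these assumptions, R=1.1 grows to roughly 2,600 cases per generation, R=0.7 shrinks to about 28, and R=1 stays at exactly 1,000, generation after generation.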

Here is an example from the BBC (and look, once again, they say that 'things are stable' ... which they are not).

The mistake we have made in the media is to think that R=1 is the optimal place. Every time the R number is above one, we have reported this to the public, and told everyone that things are getting worse ... and, as a result, the public have taken more care to stop the spread.

But then, as soon as the R-number goes below one, we flip the script, and now we tell the public that "it's the lowest number since...", "things are stable", "everything is under control". And then we start to push for restrictions to be eased, and we create a public sentiment that it's time to go back to normal.

So what happens? Well, the public becomes less careful about avoiding the spread of the virus, and the infections start to rise again.

In fact, we can see this very clearly in this graph.

We have been hovering around R=1 since the very beginning, and a big reason for that is how we are reporting about this in the press.

Here is another graph, this time for my country (Denmark), and you see exactly the same thing. Since July, every single time the R-number rose above 1, we started doing something about it. But then, as soon as we just hit R=1, in the press, we reported "things are stable again", causing people to stop doing what they were doing, and R went right back up again.

I mean, just look at this. We have effectively kept the pandemic at R=1 for six months.

The good news here in Denmark is that it's currently only R=0.7, but the wild card here is that the mutated viruses are still at R=1.1. So what do you think will happen if we ease up right now, push for an end to restrictions, and tell the public that things are stable?

Yep... the number will go up again. This is not rocket science. If you have a mutated strain of a virus that is far more infectious, and you tell people to be less careful, it will start to infect even more people. You don't have to be a virologist to understand that.
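You can check this with a few lines of Python. The sketch below uses the two R values quoted above (0.7 and 1.1), but everything else is an assumption for illustration: the 900/100 split between the two strains is hypothetical, the strains are treated as spreading independently, and each R is held constant.

```python
def two_strain_totals(old_cases, new_cases, r_old, r_new, generations):
    """Total new cases per generation when two strains spread independently,
    each with its own constant R (a deliberately simple assumption)."""
    totals = []
    for _ in range(generations + 1):
        totals.append(old_cases + new_cases)
        old_cases *= r_old
        new_cases *= r_new
    return totals


if __name__ == "__main__":
    # R values are the ones quoted in the text; the 900/100 split is hypothetical.
    totals = two_strain_totals(900, 100, r_old=0.7, r_new=1.1, generations=12)
    low_point = totals.index(min(totals))
    print(f"Total cases fall until generation {low_point}, then rise again")
    for generation, total in enumerate(totals):
        print(generation, round(total))
```

With these made-up starting numbers, the combined total drops for several generations, which looks like good news, and then turns upward again once the more infectious strain dominates, even though nobody changed their behavior.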

Update: It has been a few more days since I wrote the above, and now the R-number looks like this.

R=1 means that the pandemic lives on forever, that it keeps spreading at a steady rate, and that however many people died of the virus last month, the same number will also die the next month.

R=1 is terrible!!

What we need to do is to get to R=0. That's the real goal. You do not ever want to be at R=1 (or anywhere close to it).

Wait a minute, you say, it's not us in the press who are saying this; it's the health authorities who say this. We are just reporting it. And yes, that's true.

But this is where we make another mistake. The reason the health authorities say that R=1 is 'stable' is that it means the influx of new patients to hospitals is not going up. And so, as long as it stays at R=1, the hospitals will be able to manage with their current level of resources. So, for a health official charged with managing public health resources, R=1 means that their job is under control.

They are talking about this:

However, from a societal perspective, this is not the goal at all. Having 2,000 new people hospitalized every single day is not a stable situation. That's a terrible situation.

Imagine if we had 2,000 people hospitalized every day because McDonald's were serving bad food. As a journalist, you would be all over them in the press, demanding actions to stop this insane level of damage caused to the public.

It is not acceptable and it's not a stable situation to have 2,000 people hospitalized every day. But in the press, this is what most newspapers have been reporting by 'quoting' health authorities.

This is insane.

But let's talk about the third mistake, which is about contact tracing. Here in Denmark, the health authorities have released a report about how their contact tracing efforts are going. One of the key data points is where people think they were infected.

For the month of December 2020, it looks like this:

Keep in mind, this is where people think they got infected, but we have no way of knowing if that is actually true.

Okay, so let me do a test here. If you are a journalist or an editor, how would you report about this?

Would you report it like this?

A new study shows us where people are getting infected by COVID-19. Many places show very low levels of infections, and experts say that this indicates where we can ease restrictions.

Would you report it like that? The answer, I hope, is 'no'.

Look again at these numbers. Do you see something odd about them?

Well, the first thing you should notice is that 28.5% are unknown, which is a larger percentage than any other category. So overall, we don't know.

This also applies if the newspaper reported "In some areas, only a few people are infected". You don't know that ... because they may be in the 28.5% 'unknown' group instead.

This, for instance, is the case with the 'while using public transport' segment, which is reported to be only 0.1%. This number cannot be used for anything.

We know that the incubation period takes a few days between the time you get infected to when you start to have symptoms. So, the public would never know if they got COVID-19 while taking the bus. And this is likely why that number is so low. So the 0.1% for public transport doesn't tell us anything.

It's the same for 'large events', which is reported at 0.3%. Well, here in Denmark, we have had a lockdown since September where large events were not allowed, so obviously that number is low. But again, that doesn't mean that large events are safe. If people were allowed to gather in large groups in confined spaces for several hours, it's highly likely that such events would infect people at least at the same level as workplaces (16.9%) ... and because of the volume, even more people would be infected in total.

So that number is also useless.
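Here is a small Python sketch of why the 'unknown' group makes these small numbers unusable. The percentages are the ones quoted above; the assumption (mine, not the report's) is the worst case, where any part of the unknown 28.5% could belong to any single category. It does not even account for people guessing wrong about where they were infected.

```python
UNKNOWN_SHARE = 28.5  # percent of infections with no known source (from the report)

# Reported shares quoted in the text, in percent.
reported = {
    "public transport": 0.1,
    "large events": 0.3,
    "workplaces": 16.9,
}


def possible_range(reported_share, unknown_share=UNKNOWN_SHARE):
    """Lowest and highest true share if any part of the unknown group
    could belong to this category (a worst-case assumption)."""
    return reported_share, round(reported_share + unknown_share, 1)


if __name__ == "__main__":
    for place, share in reported.items():
        low, high = possible_range(share)
        print(f"{place}: reported {share}%, could be anywhere from {low}% to {high}%")
```

Under that assumption, a category reported at 0.1% could in reality be anywhere between 0.1% and 28.6%, which is not a number you can build a headline on.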

But let's look at the top of the list. The top source of infections is reported as being at home (27.4%).

Here is the problem with this metric.

Ask yourself this: How did the virus get into their homes in the first place? Right? It didn't get in there by itself. The virus doesn't magically materialize out of thin air inside your home. One of the other family members had to bring it there. They had to bring it with them from somewhere outside the home.

So this metric doesn't tell you what you think it tells you.

This is something that everyone who has ever worked with analytics knows all about. What you see here is what we call 'last-click attribution'. Instead of measuring how something happened, you measure the last place where it was known to be. And anyone who has ever worked with analytics knows that this is a highly misleading model.

But you see the problem here? If we, as journalists, look at a study like the one above, and then report that "most people are infected at home" and "almost nobody is infected in these other places", thus calling for those other places to be opened up even more, we end up causing even more people to bring the virus home.

That's how that works. The reason most people say they got the virus at home is that someone in the household brought the virus into the home ... from somewhere else.
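The last-click effect is easy to reproduce with a toy model in Python. Everything in this sketch is hypothetical: it assumes every household chain starts with exactly one person infected outside the home, who then infects a fixed number of family members at home. The numbers (100 chains, 2 family members each) are invented for illustration.

```python
def attribute_infections(index_cases, household_infections_per_index):
    """Compare 'where were you infected?' answers with where each chain began.

    Assumes every household chain starts with one person infected outside
    the home, who then infects a fixed number of people at home.
    """
    infected_outside = index_cases
    infected_at_home = index_cases * household_infections_per_index
    total = infected_outside + infected_at_home
    return {
        "answered 'at home' (%)": round(100 * infected_at_home / total, 1),
        "answered 'outside' (%)": round(100 * infected_outside / total, 1),
        # By construction in this toy model, every chain began outside the home.
        "chains that began outside (%)": 100.0,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 100 chains, each infecting 2 family members at home.
    for label, value in attribute_infections(100, 2).items():
        print(f"{label}: {value}")
```

In this made-up scenario, two thirds of the answers say 'at home', yet every single chain of infection started somewhere else. The survey answer and the actual cause are measuring two different things.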

And so, reporting that it's safe to open up 'somewhere else' is not at all safe to do. The data does not support that conclusion in any way.

Schools are another unknown.

In the study, we see that only 3.1% of the infections were reported as happening in primary schools, and another 1% in daycare. Both very low numbers. We also know from other studies that kids have a lower percentage of positive test results. And we know that, of those kids who do get the virus, many of them are asymptomatic (meaning, they never get any symptoms).

This is great if you are a kid, and it might mean that these very low numbers are true. But it might also mean that they are bringing the virus home to their parents, who then get sick, and report that they got sick while staying home, and that they don't know where it came from, because nobody else around them seemed to be ill.

The problem is that, if you look at this purely from a data perspective, there is no way to tell which one is true.

The problem that I have as an analyst is that I see these mistakes almost every single day. Last week alone, I saw it every day, and not just in one or two newspapers, but across a wide selection.

Under normal circumstances, we could just look at this and shrug our shoulders ¯\_(ツ)_/¯ ... because it's 'just reporting', and these examples are just a few of many. Who cares if some of the articles are not entirely accurate?

But these are not normal times. We are in the middle of a pandemic. There are hundreds of thousands of people across Europe that are being infected every single day, and thousands of them are dying.

The level of deaths is higher right now than during the first wave.

You don't call for the end of restrictions when this is happening. That would be irresponsible. You do whatever you can as the press to help get this number down, and to get as close to R=0 as quickly as possible.

Right now, we need the press more than ever ... but instead, we are reporting that 'things are stable' and we mischaracterize COVID-19 data. We interview experts who support the notion that 'this is fine'.

I mean, take a story like this one:

This newspaper must have gone mad. How can they even write something like this? It's articles like this that keep us at R=1 instead of helping us get closer to R=0. But worse than that, this is killing people. If the number of infected people doubles, the number of deaths also goes up.

And don't give me the excuse that you are "just reporting", or that you are providing people with "different sides of the story". That's not an excuse; that's saying that you don't think the press should act responsibly.

But more than that, this is not a new thing. This has been going on for a year. Back in December, I called upon the press to take part in "Operation: Save Christmas", and use our influence as journalists to help focus the public to get as close to R=0 as we could.

Instead, many newspapers did the exact opposite. They called upon the government to ease restrictions, and they interviewed experts who argued that Christmas was "just one day", and so maybe we should just let people do what they wanted.

The result was this:

What you see here is that we were starting to get the second wave under control, but then came Christmas, and we lost it all. In other words, the reason why people are so fed up with being in a lockdown is because we prolonged it for another two months.

If we had made the public understand the significance of this back in December, we would not be where we are today. It's the same with all the anti-lockdown protests we see right now. They are protesting because it has been going on for so long. But if we had taken the right steps back then, there would be nothing to protest about right now.

And you can try to blame everyone else for this. You can try to blame the government, social media, or whatever, but the fact is that we - the press - play a significant role here.

And today, I see editorials from journalists saying that "we can't do anything about this, so maybe we should just learn to live with COVID as it is right now?"

No, editors and journalists. This is not something we have to do.

And so, let me do another call to action.

In December, it was "Operation: Save Christmas" ... so now let's do "Operation: Save 2021".

We have a choice and a responsibility right now. We can either do the same thing we have done over the past several months, reporting that "things are stable"... in which case the pandemic will go on, fluctuating around R=1, for the entirety of 2021 (until eventually the vaccine helps us reach herd-immunity levels). But I don't want to spend another year in a pandemic. This is not an acceptable situation. You don't want to do this either.

So, the other thing we can do is to get our act together and laser-focus the public on getting us to R=0 as quickly and as efficiently as possible.

This doesn't seem like a difficult choice to make. So, if you are a journalist or an editor, what is it going to be?


Thomas Baekdal

Founder, media analyst, author, and publisher.

"Thomas Baekdal is one of Scandinavia's most sought-after experts in the digitization of media companies. He has made himself known for his analysis of how digitization has changed the way we consume media."
Swedish business magazine, Resumé