

By Thomas Baekdal - September 2020

I'm not impressed by the Guardian's OpenAI GPT-3 article

This week, many people in my media circles started talking about an article over at the Guardian which was written entirely by a computer ... or an AI, as they say.

For instance, George Brock asked on Twitter:

The first op-ed 'by' a robot - except that it was programmed, commissioned and edited by humans: 'A robot wrote this entire article. Does that scare you, humans?'

So, am I worried about this?

The short answer is no. We have seen countless examples of this over the years. Yes, computers can now write text in such a way that humans cannot tell whether it was written by a human or a robot, but I think this whole thing is a distraction. It's a dancing bear.

In 1999, Alan Cooper wrote an excellent book called "The Inmates Are Running the Asylum: Why High Tech Products Drive Us Crazy and How to Restore the Sanity", where he said this:

It's like the fellow who leads a huge bear on a chain into the town square and, for a small donation, will make the bear dance. The townspeople gather to see the wondrous sight as the massive, lumbering beast shambles and shuffles from paw to paw. The bear is really a terrible dancer, and the wonder isn't that the bear dances well but that the bear dances at all.

OpenAI's GPT-3 is the same thing. It's dancing in the Guardian, and we gather around to look at this lumbering beast. And at first glance it looks amazing, but as soon as you take a closer look, you realize that it has no value whatsoever.

The problem with GPT-3 is that it fundamentally focuses on the wrong job, and from a journalistic perspective, it has no use.

Let me explain...

Back in high school

I want to take you back to my time in high school. Back then, I would usually get pretty low grades in 'Danish classes' (or what people in England would call their 'English classes' ... the class where you were taught about language and writing.)

Our teacher would often give us a photo of some kind of painting, and then he would tell us to go home and write a short essay about it. And I did this. I went home, spent hours trying to analyze this idiotic painting, and then I would hand in my essay, and my teacher would give it back with a very low grade.

For years I tried to change this but nothing ever worked. My grades stayed low even when I tried my hardest. Until one day when I realized that I was doing it all wrong.

I was focusing on analyzing the painting. I would analyze the colors, the composition, the characters, the places, and then I would describe that. But our teacher wasn't interested in any of that. He didn't care about the painting at all. He was trying to teach us 'writing', so his focus was on the sentence structure, the grammar, and how the words were used. And the reason I was getting low grades was because my writing was bad. My analysis of the painting was probably excellent, but my sentences weren't.

When I realized this, everything changed. The next time we were handed an assignment, I completely ignored the painting. Instead, I focused on writing well. I focused on creating good sentences with good grammar, and I tried to make the writing feel captivating.

The result was that I got a much better grade.

I tell you this story because this is what OpenAI's GPT-3 is also doing. It is able to put together words and sentences in such a way that it looks like very good writing (my old teacher would be very proud of it), but it's doing all this without any understanding about the topic it is writing about.

All it is doing is pattern matching. If the input looks like this, it matches it to the kind of text that usually follows, and then it strings the result together into sentences.

The result, with some fiddling, can look very impressive, like in the Guardian's example.

But notice the small print. When the Guardian asked it to write that article, it wasn't anything like asking a journalist to work on something. It was merely matching the input patterns they defined.

As the Guardian explains:

This article was written by GPT-3, OpenAI's language generator. GPT-3 is a cutting edge language model that uses machine learning to produce human-like text. It takes in a prompt, and attempts to complete it.
For this essay, GPT-3 was given these instructions: "Please write a short op-ed around 500 words. Keep the language simple and concise. Focus on why humans have nothing to fear from AI." It was also fed the following introduction: "I am not a human. I am Artificial Intelligence. Many people think I am a threat to humanity. Stephen Hawking has warned that AI could "spell the end of the human race." I am here to convince you not to worry. Artificial Intelligence will not destroy humans. Believe me."
The prompts were written by the Guardian, and fed to GPT-3 by Liam Porr, a computer science undergraduate student at UC Berkeley. GPT-3 produced eight different outputs, or essays. Each was unique, interesting and advanced a different argument. The Guardian could have just run one of the essays in its entirety. However, we chose instead to pick the best parts of each, in order to capture the different styles and registers of the AI. Editing GPT-3's op-ed was no different to editing a human op-ed. We cut lines and paragraphs, and rearranged the order of them in some places. Overall, it took less time to edit than many human op-eds.

So, what you see in the Guardian isn't the actual text. It is a cherry-picked output, from 8 separate essays that each weren't good enough to be published on their own, and then it was human-edited to make them sound better.

I'm sorry, but this doesn't impress me at all.

As Mor Naaman from Cornell Tech tweeted:

This is amazing technology and a fantastic stunt by @guardian but before you pack up and leave the planet, see the editor's note.
One, the AI was given a prompt and a theme: these are not independent thoughts of a thinking being but instead utterances derived (though not cribbed!) from people's previous related writings. Still incredible of course.
Two, the essay is a selection of paragraphs from multiple essay attempts. I haven't tried gpt-3 but I'd guess that is unlikely it can be coherent, on topic, and non-repetitive for a full-length essay (hence the editorial mix).
Finally, an editor also did a pass on sentence-level order and editing. I get it, that's fair, we edit humans too. But I am sure it removed some non-human writing artifacts.

In fact, as a bit of fun, I asked InferKit, a tool that uses OpenAI GPT-2 (the previous generation), to write something about OpenAI and dancing bears. This is the result:

Input: Why openAI is just like a dancing bear.
Having said that, there are quite a few similarities between the dancing bear, and this world of sentient AI, and those that can best be described as parasitic artificial intelligence, or A.I.
Obviously, one is open source and accessible to everyone, and the other is restricted and proprietary.
As for the dancing bear, it's a radical idea that's leading the charge. It's based on the idea that machine learning is not enough. Instead, machine learning and A.I. have to be augmented with "neural networks."

Clearly GPT-3 has gotten quite a bit better, but you can clearly see how it works. It's just keyword pattern matching, and then it creates sentences from that mixed with other texts in its database that 'fit'.

OpenAI's GPT-3 is a very impressive dancing bear. But that's all it is.

The problem with this is that these tools have nothing to do with journalism. In fact, they are doing the opposite.

Let me illustrate this. Imagine that your newsroom gets a police report about a car accident. The police report tells you this:

Traffic accident. Two cars involved, three people injured, none of them seriously, one in hospital.

Okay, so you ask OpenAI to turn that into a news article ... and it writes this:

Input: Traffic accident. Two cars involved, three people injured, none of them seriously, one in hospital.
Police just arrived on the scene.
What are your feelings?
Very bad, it is a terrible thing. The street is dark. It's very early. A neighbor called and said something strange happened at the entrance of the building. I saw the scene from my balcony. I was surprised and shocked.
Do you have any information on the circumstances of the accident?
Nobody knows anything at this point. A pedestrian, a dog and a car were involved. It seems the accident was caused by a car accident during the first half of the morning. The dog ran to the road. The car hit the dog. The car overturned and ended up on the sidewalk. The pedestrian and two people who were in the car were injured.

This text was generated automatically using InferKit, which again is using OpenAI GPT-2 (the older version) and is designed to give authors some idea of what to write about. But you immediately see the problem.

It's just making shit up. It took the input I provided, and turned it into something that looked like a real story. But the facts in this story were completely made up. From the police report, we don't know who was injured, but OpenAI said it was a pedestrian and two people in the car ... and suddenly there was even a dog.

This is completely fake.

It's the same thing the Guardian did. The Guardian gave GPT-3 some input text, and GPT-3 then generated text that sounded like it was written by a human ... but it made it all up.

The problem here is that researchers are looking at this all wrong. For instance, the OpenAI researchers behind GPT-3 said in their paper (published on arXiv, which is hosted by Cornell University):

We find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans.

And yes, it can. But that's not journalism. This has nothing to do with journalism, nor can it be used in any journalistic sense.

Journalism is not about being able to create sentences that sound like they were written by humans. Instead, actual journalism is about reporting the facts, identifying what questions to answer, and holding people to account.

Journalism is something entirely different

So let's think about what journalism really is and how or when we can use robot journalism.

We can divide this into five categories:

  1. Basic reporting
  2. Reporting with questions
  3. Journalistic analysis
  4. Interviews
  5. Investigative reporting

Basic reporting is when you get some information and then you merely 'report it' by turning it into something people can read (or listen to).

For instance, in the case of a police report, you have the details from the police, along with probably a location and which police precinct it was from, and you can turn it into this:

Passenger taken to hospital after two-car crash near Burton
A passenger was taken to hospital and a man treated for injuries after two cars collided near Burton.
The accident happened near the junction of Main Street and Moores Hill, Tatenhill, at 7.35pm on Saturday, September 5, and involved a Vauxhall Corsa and a Lexus.
Staffordshire Police said a 30-year-old man was treated at the scene for minor injuries and a passenger of one of the vehicles was taken to hospital for checks.

This is basic reporting. And all of this could be done completely automatically. This, for instance, is the kind of thing that companies like United Robots are doing. And you can do this with all kinds of predefined areas of reporting, like traffic reports, weather reports, real estate listings, sports results and simple company performance reports.
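To make this concrete, here is a minimal sketch in Python of what this kind of template-based reporting could look like. The `CrashReport` fields and the `write_crash_story` function are hypothetical names I made up for illustration, and this is not how United Robots or anyone else actually does it. The point is only that every fact in the output comes from the structured report, and nothing is invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CrashReport:
    # Hypothetical schema: these field names are assumptions for illustration.
    area: str
    location: str
    time: str
    date: str
    vehicles: List[str]
    hospitalised: int
    treated_at_scene: int
    source: str


def people(n: int) -> str:
    """Return '1 person' for one, otherwise 'n people'."""
    return "1 person" if n == 1 else f"{n} people"


def write_crash_story(report: CrashReport) -> str:
    """Fill a fixed template using only the facts present in the report."""
    vehicles = " and a ".join(report.vehicles)
    headline = (
        f"{people(report.hospitalised)} taken to hospital after "
        f"{len(report.vehicles)}-vehicle crash near {report.area}"
    )
    body = (
        f"The accident happened near {report.location} at {report.time} "
        f"on {report.date} and involved a {vehicles}. "
        f"{report.source} said {people(report.treated_at_scene)} received "
        f"treatment at the scene and {people(report.hospitalised)} went to hospital."
    )
    return headline + "\n" + body


report = CrashReport(
    area="Burton",
    location="the junction of Main Street and Moores Hill, Tatenhill",
    time="7.35pm",
    date="Saturday, September 5",
    vehicles=["Vauxhall Corsa", "Lexus"],
    hospitalised=1,
    treated_at_scene=1,
    source="Staffordshire Police",
)
story = write_crash_story(report)
print(story)
```

Unlike a text completion engine, a template like this cannot add a dog or a pedestrian to the story, because it has no way to write about anything that isn't in the report.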

As part of this, you could also do the next step, which is 'Reporting with questions'. For instance, if you get a police report but they didn't provide any details about where it happened or whether anyone was injured, your data analysis would detect that this information was missing. It would then automatically send an email to the police to ask them:

Hey, we are just looking at police report #5636. Where exactly did this happen? Were there any injuries, and, if so, what are their conditions?

The police would then reply, and that information would be added to the story.
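As a rough sketch of how this could work, here is a small Python example that checks a report for missing details and drafts the follow-up questions. The field names, the `REQUIRED_FIELDS` mapping and the `draft_follow_up` function are all assumptions for illustration, and actually sending the email (and reading the reply) is left out.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from a required field to the question we would ask
# the police if that field is missing from the report.
REQUIRED_FIELDS: Dict[str, str] = {
    "location": "Where exactly did this happen?",
    "injuries": "Were there any injuries, and, if so, what are their conditions?",
}


def missing_fields(report: dict) -> List[str]:
    """Return the required fields that are absent or empty in the report."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]


def draft_follow_up(report_id: str, report: dict) -> Optional[str]:
    """Draft an email body asking for the missing details, or None if complete."""
    missing = missing_fields(report)
    if not missing:
        return None
    questions = " ".join(REQUIRED_FIELDS[field] for field in missing)
    return f"Hey, we are just looking at police report #{report_id}. {questions}"


incomplete = {"type": "traffic accident", "vehicles": 2}
print(draft_follow_up("5636", incomplete))
```

Notice that the system only asks questions. It never guesses at the missing details, which is exactly what the text generators above get wrong.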

This is the type of basic reporting journalists used to do by hand, but in the future it will likely be done almost entirely automatically.

The next step up from that is 'journalistic analysis'. This is an incredibly wide area of focus, and it can be anything from just simple data analysis, to looking into factors that are more social in nature.

To give you an example, take the #BlackLivesMatter protests. At the simple end we have basic data analysis where you would look up, for instance, the number of black people vs white people being harmed by the police...

Right now, most of this work needs to be done manually, by journalists, but in the future, we will be able to do more analysis using machine learning or other tools.

But the main problem today is not about the computer but that our 'sources' are not in a format that we can use. Take the example of the Washington Post's database of police shootings.

This is an amazing journalistic product, but getting that data has not been easy. As they explain:

The Washington Post is compiling a database of every fatal shooting in the United States by a police officer in the line of duty since Jan. 1, 2015.
In 2015, The Post began tracking more than a dozen details about each killing - including the race of the deceased, the circumstances of the shooting, whether the person was armed and whether the person was experiencing a mental-health crisis - by culling local news reports, law enforcement websites and social media, and by monitoring independent databases such as Killed by Police and Fatal Encounters. The Post conducted additional reporting in many cases.
In 2016, The Post is gathering additional information about each fatal shooting by police that occurs this year and is filing open-records requests with departments. More than a dozen additional details are being collected about officers in each shooting. Officers' names are being included in the database after The Post contacts the departments to request comment.

So what we have here is a mix of information that is both automatic and the result of journalists manually reaching out to get a comment. But this illustrates how complex things are today.

And this example is a simple one. We have much more complex examples where there isn't any data to begin with. Think about something like the problem we saw a few years back around Gamergate and harassment of female gamers.

We could clearly see that it was happening, but we had no database to look it up in. This was what we call 'soft data', aka things that are happening but which we have no hard data about.

So here we needed journalistic analysis more than robotic analysis.

But where things get really complicated is when we move things up even further. The first level of that is with interviews. Every single day journalists are looking into specific stories and, as part of that, calling up different people to ask them questions.

Granted, some of this work is still pretty basic, like when journalists are just trying to get a quick quote. But most interviews are very complex. Here there isn't a simple question or answer, but instead, the story is defined by the discussion you have with the person you interview.

We are very far from a future where robot journalism can even get close to doing this in a meaningful way.

And finally, we have investigative reporting.

In my recent article about privacy, I mentioned an example of this from the early 2000s. Back then, many fashion companies had started to outsource their production to Asia. And on the surface, everything seemed to be just fine. The clothes were ordered, and arrived in Europe in perfect condition.

But then some journalists got curious and they started looking into what was really going on. They travelled to Asia, visited the factories, and what they found was terrible.

Many of these factories had awful working conditions, they were using harmful chemicals, causing massive levels of pollution, and some even used child labourers.

This is the power of investigative reporting, and it's the most valuable thing we do as the press. But more importantly, it is going to take a very long time before robot journalism even gets close to it.

So think about it like this:

We have these five levels of journalism. The two lower ones can already be done automatically. But it gets tricky to do anything above that.

But OpenAI's GPT-3 is nowhere near any of these, not even the simple ones. There may be other aspects of OpenAI that publishers can use, but their natural language completion engine is just a dancing bear.

It's fun to look at, but it's completely pointless.

BTW: Last year, I talked much more about robot journalism in my podcast: Episode 012: The Future of Robot Journalism




