Deciphering clues in a news article to understand how it was reported

22nd November 2023

Written journalism is full of conventions that hint at the underlying reporting process, many of which are not entirely obvious. Learning how to read and interpret these can help you get a lot more out of the news.

I’m going to use a recent article about the ongoing OpenAI calamity to illustrate some of these conventions.

I’ve personally been bewildered by the story that’s been unfolding since Sam Altman was fired by the board of directors of the OpenAI non-profit last Friday. The single biggest question for me has been why—why did the board make this decision?

Before Altman’s Ouster, OpenAI’s Board Was Divided and Feuding by Cade Metz, Tripp Mickle and Mike Isaac for the New York Times is one of the first articles I’ve seen that felt like it gave me a glimmer of understanding.

It’s full of details that I hadn’t heard before, almost all of which came from anonymous sources.

But how trustworthy are these details? If you don’t know the names of the sources, how can you trust the information that they provide?

This is where it’s helpful to understand the language that journalists use to hint at how they gathered the information for the story.

The story starts with this lede:

Before Sam Altman was ousted from OpenAI last week, he and the company’s board of directors had been bickering for more than a year. The tension got worse as OpenAI became a mainstream name thanks to its popular ChatGPT chatbot.

The job of the rest of the story is to back that up.

Anonymous sources

Sources in these kinds of stories are either named or anonymous. Anonymous sources have a good reason to stay anonymous. Note that they are not anonymous to the journalist, and probably not to their editor either (except in rare cases).

There needs to be a legitimate reason for them to stay anonymous, or the journalist won’t use them as a source.

This raises a number of challenges for the journalist:

How can you trust the information that the source is providing, if they’re not willing to attach their name and reputation to it?
How can you confirm that information?
How can you convince your editors and readers that the information is trustworthy?

Anything coming from an anonymous source needs to be confirmed. A common way to confirm it is to get that same information from multiple sources, ideally from sources that don’t know each other.

This is fundamental to the craft of journalism: how do you determine the likely truth, in a way that’s robust enough to publish?

Hints to look out for

The language of a story like this will include crucial hints about how the information was gathered.

Try scanning for words like according to or email or familiar.

Let’s review some examples (emphasis mine):

Mr. Altman complained that the research paper seemed to criticize OpenAI’s efforts to keep its A.I. technologies safe while praising the approach taken by Anthropic, according to an email that Mr. Altman wrote to colleagues and that was viewed by The New York Times.

“according to an email [...] that was viewed by The New York Times” means a source showed them an email. In that case they likely treated the email as a primary source document, without finding additional sources.

Senior OpenAI leaders, including Mr. Sutskever, who is deeply concerned that A.I. could one day destroy humanity, later discussed whether Ms. Toner should be removed, a person involved in the conversations said.

Here we only have a single source, “a person involved in the conversations”. This speaks to the journalist’s own judgement: this person here is likely deemed credible enough that they are acceptable as the sole data point.

But shortly after those discussions, Mr. Sutskever did the unexpected: He sided with board members to oust Mr. Altman, according to two people familiar with the board’s deliberations.

Now we have two people “familiar with the board’s deliberations”—which is better, because this is a key point that the entire story rests upon.

Familiar with comes up a lot in this story:

Mr. Sutskever’s frustration with Mr. Altman echoed what had happened in 2021 when another senior A.I. scientist left OpenAI to form the company Anthropic. That scientist and other researchers went to the board to try to push Mr. Altman out. After they failed, they gave up and departed, according to three people familiar with the attempt to push Mr. Altman out.

This is one of my favorite points in the whole article. I know that Anthropic was formed by a splinter-group from OpenAI who had disagreements about OpenAI’s approach to AI safety, but I had no idea that they had first tried to push Sam Altman out of OpenAI itself.

“After a series of reasonably amicable negotiations, the co-founders of Anthropic were able to negotiate their exit on mutually agreeable terms,” an Anthropic spokeswoman, Sally Aldous, said.

Here we have one of the few named sources in the article—a spokesperson for Anthropic. This named source at least partially confirms those details from anonymous sources. Highlighting their affiliation helps explain their motivation for speaking to the journalist.

After vetting four candidates for one position, the remaining directors couldn’t agree on who should fill it, said the two people familiar with the board’s deliberations.

Another revelation (for me): the reason OpenAI’s board was so small, just six people, is that the board had been disagreeing on who to add to it.

Note that we have repeat anonymous characters here: “the two people familiar with...” were introduced earlier on.

Hours after Mr. Altman was ousted, OpenAI executives confronted the remaining board members during a video call, according to three people who were on the call.

That’s pretty clear. Three people who were on that call talked to the journalist, and their accounts matched.

Let’s finish with two more “familiar with” examples:

There were indications that the board was still open to his return, as it and Mr. Altman held discussions that extended into Tuesday, two people familiar with the talks said.

And:

On Sunday, Mr. Sutskever was urged at OpenAI’s office to reverse course by Mr. Brockman’s wife, Anna, according to two people familiar with the exchange.

The phrase “familiar with the exchange” means the journalist has good reason to believe that the sources are credible regarding what happened—they are in a position where they would likely have heard about it from people who were directly involved.

Relationships and reputation

Carefully reading this story reveals a great deal of detail about how the journalists gathered the information.

It also helps explain why this single article is credited to three reporters: talking to all of those different sources, and verifying and cross-checking the information, is a lot of work.

Even more work is developing those sources in the first place. For a story this sensitive and high profile the right sources won’t talk to just anyone: journalists will have a lot more luck if they’ve already built relationships, and have a reputation for being trustworthy.

As news consumers, the credibility of the publication itself is important. We need to know which news sources have high editorial standards, such that they are unlikely to publish rumors that have not been verified using the techniques described above.

I don’t have a shortcut for this. I trust publications like the New York Times, the Washington Post, the Guardian (my former employer) and the Atlantic.

One sign that helps is retractions. If a publication writes detailed retractions when they get something wrong, it’s a good indication of their editorial standards.

There’s a great deal more to learn about this topic, and the field of media literacy in general. I have a pretty basic understanding of this myself—I know enough to know that there’s a lot more to it.

I’d love to see more material on this from other experienced journalists. I think journalists may underestimate how much the public wants (and needs) to understand how they do their work.

Simon Willison’s Weblog