Simon Willison’s Weblog

Subscribe

32 items tagged “journalism”

2024

AI for Data Journalism: demonstrating what we can do with this stuff right now

I gave a talk last month at the Story Discovery at Scale data journalism conference hosted at Stanford by Big Local News. My brief was to go deep into the things we can use Large Language Models for right now, illustrated by a flurry of demos to help provide starting points for further conversations at the conference.

[... 6080 words]

On the zombie edition of the Washington Independent I discovered, the piece I had published more than ten years before was attributed to someone else. Someone unlikely to have ever existed, and whose byline graced an article it had absolutely never written.

[...] Washingtonindependent.com, which I’m using to distinguish it from its namesake, offers recently published, article-like content that does not appear to me to have been produced by human beings. But, if you dig through its news archive, you can find work human beings definitely did produce. I know this because I was one of them.

Spencer Ackerman # 7th March 2024, 2:59 am

2023

Simon Willison (Part Two): How Datasette Helps With Investigative Reporting. The second part of my Newsroom Robots podcast conversation with Nikita Roy. This episode includes my best audio answer yet to the “what is Datasette?” question, plus notes on how to use LLMs in journalism despite their propensity to make things up. # 5th December 2023, 8:27 pm

Deciphering clues in a news article to understand how it was reported

Written journalism is full of conventions that hint at the underlying reporting process, many of which are not entirely obvious. Learning how to read and interpret these can help you get a lot more out of the news.

[... 1456 words]

Weeknotes: the Datasette Cloud API, a podcast appearance and more

Datasette Cloud now has a documented API, plus a podcast appearance, some LLM plugins work and some geospatial excitement.

[... 1243 words]

The New York Times launches “enhanced bylines,” with more information about how journalists did the reporting. I really like these: “Elian Peltier and Yagazie Emezi visited refugee sites on Chad’s Sudan border, where tens of thousands of people have found refuge since a war started in Sudan last month.” I’m a fan of anything that helps people better appreciate the details of how quality reporting is produced. # 19th May 2023, 4:16 am

Other tech-friendly journalists I know have been going through something similar: Suddenly, we’ve got something like a jetpack to strap to our work. Sure, the jetpack is kinda buggy. Yes, sometimes it crashes and burns. And the rules for its use aren’t clear, so you’ve got to be super careful with it. But sometimes it soars, shrinking tasks that would have taken hours down to mere minutes, sometimes minutes to seconds.

Farhad Manjoo # 21st April 2023, 8:41 pm

Microsoft declined further comment about Bing’s behavior Thursday, but Bing itself agreed to comment — saying “it’s unfair and inaccurate to portray me as an insulting chatbot” and asking that the AP not “cherry-pick the negative examples or sensationalize the issues.”

Matt O'Brien, Associated Press # 19th February 2023, 9:25 pm

2021

Stanford School Enrollment Project (via) This is Project Pelican: I’ve been working with the Big Local News team at Stanford helping bundle up and release the data they’ve been collecting on school enrollment statistics around the USA. This Datasette instance has data from 33 states for every year since 2015—3.3m rows total. Be sure to check out the accompanying documentation! # 8th August 2021, 12:23 am

M1RACLES: M1ssing Register Access Controls Leak EL0 State. You need to read (or at least scan) all the way to the bottom: this security disclosure is a masterpiece. It not only describes a real flaw in the M1 silicon but also deconstructs the whole culture of over-hyped name-branded vulnerability reports. The TLDR is that you don’t really need to worry about this one, and if you’re writing this kind if thing up for a news article you should read all the way to the end first! # 26th May 2021, 3:25 pm

2020

I’ve often joked with other internet culture reporters about what I call the “normie tipping point.” In every emerging internet trend, there is a point at which “normies” — people who don’t spend all day online, and whose brains aren’t rotted by internet garbage — start calling, texting and emailing us to ask what’s going on. Why are kids eating Tide Pods? What is the Momo Challenge? Who is Logan Paul, and why did he film himself with a dead body?

The normie tipping point is a joke, but it speaks to one of the thorniest questions in modern journalism, specifically on this beat: When does the benefit of informing people about an emerging piece of misinformation outweigh the possible harms?

Kevin Roose # 5th October 2020, 3:40 pm

You always get the name of the dog, the editor explained. The dog is a character in your story, and names tell readers a lot about your characters. It’s a crucial storytelling detail, and if you’re alert and inquisitive enough to ask for the name of the dog, you’ll surely not miss any other important details.

Justin Willett # 22nd July 2020, 2:29 pm

What do you call the parts of a story? Or: why can’t journalists spell “lead”? (via) Carl M. Johnson’s analysis of what journalists call different elements of a story, useful for data modeling a CMS for a news organization. # 3rd January 2020, 1:13 am

2019

Guide To Using Reverse Image Search For Investigations (via) Detailed guide from Bellingcat’s Aric Toler on using reverse image search for investigative reporting. Surprisingly Google Image Search isn’t the state of the art: Russian search engine Yandex offers a much more powerful solution, mainly because it’s the largest public-facing image search engine to integrate scary levels of face recognition. # 30th December 2019, 10:23 pm

My JSK Fellowship: Building an open source ecosystem of tools for data journalism

I started a new chapter of my career last week: I began a year long fellowship with the John S. Knight Journalism Fellowships program at Stanford.

[... 876 words]

JSK Journalism Fellowships names Class of 2019-2020 (and I’m in it!) (via) In personal news... I’ve been accepted for a ten month journalism fellowship at Stanford (starting September)! My work there will involve “Improving the impact of investigative stories by expanding the open-source ecosystem of tools that allows journalists to share the underlying data”. # 1st May 2019, 4:43 pm

2018

If I tweeted a throwaway comment in appreciation for McDonald’s apple pies and some other randos on Twitter happened to also tweet similar thoughts over the last few months, it doesn’t mean by extrapolation that ‘Millennials Can’t Get Enough Of McDonald’s Apple Pies’. 
The Twitter search box is not a polling agency and Twitter doesn’t include everybody’s thoughts on everything. Just some people’s thoughts on some things.

Nick Walker # 28th January 2018, 4:18 pm

2017

The Story Behind the Chicago Newspaper That Bought a Bar (via) Absolutely fascinating story—the Chicago Sun-Times bought a bar back in 1976 to investigate corrupt city inspectors, staffing it with journalists and with photographers hidden in a back room. # 3rd November 2017, 3:27 pm

2010

Journalism Warning Labels. These are absolutely fantastic. “I’ve been putting them on copies of the free papers that I find on the London Underground. You might want to as well.” # 14th August 2010, 11:16 am

If journalism is the first draft of history, live blogging is the first draft of journalism.

Andrew Sparrow # 10th May 2010, 4:28 pm

Live blogging the general election. The Guardian’s ongoing live blogs covering the UK election have been the best way of following events that I’ve seen (yes, better than Twitter). Live-blog author Andrew Sparrow explains his approach. # 10th May 2010, 4:27 pm

2009

Most journalists have grown up with a fortress mindset. They have lived and worked in proud institutions with thick walls. Their daily knightly task has been simple: to battle journalists from other fortresses. But the fortresses are crumbling and courtly jousts with fellow journalists are no longer impressing the crowds.

Peter Horrocks # 20th July 2009, 5:20 pm

#DataJourn part 1: a new conversation. Journalism.co.uk report on the first instance of a Guardian story that was driven by an external developer’s work with data originally released on our Datablog. # 9th April 2009, 10:57 am

A few notes on the Guardian Open Platform

This morning we launched the Guardian Open Platform at a well attended event in our new offices in Kings Place. This is one of the main projects I’ve been helping out with since joining the Guardian last year, and it’s fantastic to finally have it out in the open.

[... 839 words]

Learning to Think Like A Programmer. Outstanding advice aimed mainly at journalists, but important to anyone who collects information for a living and might want it to be automatically processed at some point in the future. # 22nd January 2009, 6:06 pm

2008

Google apps for your newsroom. How the LJ World team use online tools like Google Spreadsheet, Swivel, ManyEyes and Google MyMaps to collaborate with the newsroom and build data-heavy applications even faster. # 7th January 2008, 9:24 pm

2007

journa-list.com. Fantastic new site that indexes UK news stories by the person who wrote them. Being able to track a journalist’s output like this makes it much easier to figure out their personal biases over time. # 11th October 2007, 4:04 pm

Times to Stop Charging for Parts of Its Web Site. The New York Times finally acknowledges that you can’t be the “paper of record” if no one can link to you. # 18th September 2007, 8:40 am

Making the “24-hour newsroom” work (via) More on the Lawrence Journal-World, this time from the point of view of the reporters in the newsroom. # 16th June 2007, 12:27 am