Simon Willison’s Weblog


15 items tagged “nytimes”


NYT Flash-based visualizations work again. The New York Times are using the open source Ruffle Flash emulator—built using Rust, compiled to WebAssembly—to get their old archived data visualization interactives working again. # 21st January 2024, 5:58 am

OpenAI and journalism. Bit of a misleading title here: this is OpenAI’s first public response to the lawsuit filed by the New York Times concerning their use of unlicensed NYT content to train their models. # 8th January 2024, 6:33 pm

Does GPT-2 Know Your Phone Number? (via) This report from Berkeley Artificial Intelligence Research in December 2020 showed GPT-3 outputting a full page of chapter 3 of Harry Potter and the Philosopher’s Stone—similar to how the recent suit from the New York Times against OpenAI and Microsoft demonstrates memorized news articles from that publication as outputs from GPT-4. # 8th January 2024, 5:26 am


The New York Times launches “enhanced bylines,” with more information about how journalists did the reporting. I really like these: “Elian Peltier and Yagazie Emezi visited refugee sites on Chad’s Sudan border, where tens of thousands of people have found refuge since a war started in Sudan last month.” I’m a fan of anything that helps people better appreciate the details of how quality reporting is produced. # 19th May 2023, 4:16 am


nyt-2020-election-scraper. Brilliant application of git scraping by Alex Gaynor and a growing team of contributors. Takes a JSON snapshot of the NYT’s latest election poll figures every five minutes, then runs a Python script to iterate through the history and build an HTML page showing the trends, including what percentage of the remaining votes each candidate needs to win each state. This is the perfect case study in why it can be useful to take a “snapshot of the world right now” data source and turn it into a git revision history over time. # 6th November 2020, 2:24 pm
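The "percentage of the remaining votes each candidate needs" figure can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the `Snapshot` class and its field names are hypothetical, and it assumes a two-candidate race where every remaining vote goes to one of the two.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One scraped result for a single state. Field names are hypothetical."""
    timestamp: str
    leader_votes: int
    trailer_votes: int
    remaining_votes: int


def needed_share(snap: Snapshot) -> float:
    """Fraction of the remaining votes the trailing candidate needs to draw level.

    Assumes every remaining vote goes to one of the two candidates.
    A result above 1.0 means the trailing candidate can no longer catch up.
    """
    if snap.remaining_votes <= 0:
        raise ValueError("no votes remaining")
    margin = snap.leader_votes - snap.trailer_votes
    # Solve trailer + x*R = leader + (1 - x)*R for x.
    return 0.5 + margin / (2 * snap.remaining_votes)


def trend(history: list[Snapshot]) -> list[tuple[str, float]]:
    """Walk the snapshots in order and report the needed share at each one."""
    return [(snap.timestamp, needed_share(snap)) for snap in history]
```

Feeding `trend` one `Snapshot` per commit in the git history would give the kind of over-time series the generated page displays.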


Breakfast Instapaper. Handy tool for selecting and bulk-submitting stories from today’s Guardian and NYTimes to your Instapaper account, by Daniel Vydra. # 29th April 2010, 11:49 am

The making of the NYT’s Netflix graphic. A database dump from Netflix, some clever hackery in ArcView GIS, hpricot to scrape Metacritic and a lot of careful thought about the UI for navigating the data. # 25th January 2010, 1:11 pm


How Different Groups Spend Their Day. Classy interactive infographic from the New York Times. # 10th August 2009, 3:37 pm

Announcing the Article Search API. The most interesting API from the NYTimes yet—search against 2.8 million articles from 1981 until today using 35 searchable fields and get back detailed metadata as well as the first paragraph of the articles themselves. # 5th February 2009, 11:06 pm


Represent. Andrei Scheinkman and Derek Willis describe how they built the NYTimes Represent feature using GeoDjango and PostGIS. # 29th December 2008, 10:10 pm

Represent and GeoDjango. The NYTimes’ new Represent application is built on GeoDjango. # 20th December 2008, 9:07 pm

Represent. Superb new application from the NYTimes, a sort of cross between TheyWorkForYou and a news archive search. Enter your address in New York and it tells you your local representatives and shows both their votes and their mentions in the newspaper. # 19th December 2008, 4:22 pm

Announcing the New York Times Campaign Finance API (via) The New York Times have released their first data API, exposing campaign finance data from the Federal Election Commission. # 15th October 2008, 2:05 pm

Popular Websites Vulnerable to Cross-Site Request Forgery Attacks. Ed Felten and Bill Zeller announce four CSRF holes, in ING Direct, YouTube, MetaFilter and the New York Times. The ING Direct hole allowed transfer of funds out of a user’s bank accounts! The first three were fixed before publication; the New York Times hole still exists (despite being reported a year ago), and allows you to silently steal e-mail addresses by CSRFing the “E-mail this” feature. # 29th September 2008, 1:08 pm


Times to Stop Charging for Parts of Its Web Site. The New York Times finally acknowledges that you can’t be the “paper of record” if no one can link to you. # 18th September 2007, 8:40 am