Simon Willison’s Weblog

Subscribe

16th March 2026 - Link Blog

Coding agents for data analysis. Here's the handout I prepared for my NICAR 2026 workshop "Coding agents for data analysis" - a three hour session aimed at data journalists demonstrating ways that tools like Claude Code and OpenAI Codex can be used to explore, analyze and clean data.

Here's the table of contents:

I ran the workshop using GitHub Codespaces and OpenAI Codex, since it was easy (and inexpensive) to distribute a budget-restricted API key for Codex that attendees could use during the class. Participants ended up burning $23 of Codex tokens.

The exercises all used Python and SQLite and some of them used Datasette.

One highlight of the workshop was when we started running Datasette such that it served static content from a viz/ folder, then had Claude Code start vibe coding new interactive visualizations directly in that folder. Here's a heat map it created for my trees database using Leaflet and Leaflet.heat, source code here.

Screenshot of a "Trees SQL Map" web application with the heading "Trees SQL Map" and subheading "Run a query and render all returned points as a heat map. The default query targets roughly 200,000 trees." Below is an input field containing "/trees/-/query.json", a "Run Query" button, and a SQL query editor with the text "SELECT cast(Latitude AS float) AS latitude, cast(Longitude AS float) AS longitude, CASE WHEN DBH IS NULL OR DBH = '' THEN 0.3 WHEN cast(DBH AS float) <= 0 THEN 0.3 WHEN cast(DBH AS float) >= 80 THEN 1.0" (query is truncated). A status message reads "Loaded 1,000 rows and plotted 1,000 points as heat map." Below is a Leaflet/OpenStreetMap interactive map of San Francisco showing a heat map overlay of tree locations, with blue/green clusters concentrated in areas like the Richmond District, Sunset District, and other neighborhoods. Map includes zoom controls and a "Leaflet | © OpenStreetMap contributors" attribution.

I designed the handout to also be useful for people who weren't able to attend the session in person. As is usually the case, material aimed at data journalists is equally applicable to anyone else with data to explore.

This is a link post by Simon Willison, posted on 16th March 2026.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe