Simon Willison’s Weblog

Subscribe

40 items tagged “datajournalism”

2019

Generating a commit log for San Francisco’s official list of trees

San Francisco has a neat open data portal (as do an increasingly large number of cities these days). For a few years my favourite file on there has been Street Tree List, a list of all 190,000 trees in the city maintained by the Department of Public Works.

[... 1051 words]

Publish the data behind your stories with SQLite and Datasette. I presented a workshop on Datasette at the IRE and NICAR CAR 2019 data journalism conference yesterday. Here’s the worksheet I prepared for the tutorial. # 9th March 2019, 6:27 pm

socrata2sql (via) Phenomenal new open source tool released by Andrew Chavez at the Dallas Morning News. Socrata is the open data portal software used by huge numbers of local governments worldwide. socrata2sql is a tool that interacts with the standard Socrata API and can use it to suck down a dataset and save it as a SQLite, PostgreSQL, MySQL or other SQLAlchemy-supported database. I just tried this and it took a single command to create a SQLite database of every police arrest in Dallas in the past five years. # 8th February 2019, 3:27 pm

2018

Helicopter accident analysis notebook (via) Ben Welsh worked on an article for the LA Times about helicopter accident rates, and has published the underlying analysis as an extremely detailed Jupyter notebook. Lots of neat new (to me) notebook tricks in here as well. # 19th November 2018, 6:25 pm

How to Instantly Publish Data to the Internet with Datasette

I spoke about my Datasette project at PyBay in August and they’ve just posted the video of my talk.

[... 58 words]

Notes from my appearance on the Changelog podcast

After I spoke at Zeit Day SF last weekend I sat down with Adam Stacoviak to record a 25 minute segment for episode 296 of the Changelog podcast, talking about Datasette. We covered a lot of ground!

[... 536 words]

Baltimore Sun Public Salary Records (via) The Baltimore Sun have published an interactive search engine for public salaries of Maryland state employees, and it’s powered by Datasette! Since data journalism is one of my key use-cases for Datasette I’m incredibly excited to see this in the wild. They’ve also published the underlying source code (see the via link) which is a really nice example of how to use Datasette’s custom templates and canned query functionality. # 28th March 2018, 5:12 pm

2009

Learning to Think Like A Programmer. Outstanding advice aimed mainly at journalists, but important to anyone who collects information for a living and might want it to be automatically processed at some point in the future. # 22nd January 2009, 6:06 pm

Train Crash Leads LA Times to Create Django Database on Deadline. A story from last September. I didn’t know the LA Times used Django. UPDATE: Yes I did, I introduced their panel about it at DjangoCon. Sorry, mind like a sieve sometimes. # 21st January 2009, 5:19 pm

Washington Post Update. Peter Harkins summarises the large number of Django-powered database journalism projects released by the Post since September 2007. # 16th January 2009, 12:18 pm