Simon Willison’s Weblog

Subscribe

Weeknotes: Datasette, sqlite-utils, Datasette Desktop

20th July 2022

A flurry of releases this week, including a new Datasette alpha and a fixed Datasette Desktop.

datasette 0.62a1

Work on Datasette Cloud continues—the last 10% of the work needed for the beta launch is trending towards taking 90% of the time. It’s been driving all sorts of fixes to the wider Datasette ecosystem, which is nice.

I ran into a bug which would have been easier to investigate using Sentry. The datasette-sentry plugin wasn’t working right, and it turned out I needed a new handle_exception() plugin hook to fix it. This was the impetus I needed to push out a new Datasette alpha—I like to get new hooks into an alpha as quickly as possible so I can depend on that Datasette version from alpha releases of plugins.

Here are some other highlights from the alpha’s release notes:

  • The render_cell() plugin hook is now also passed a row argument, representing the sqlite3.Row object that is being rendered. (#1300)

A neat thing about Pluggy is that you can add new arguments to existing plugin hooks without breaking plugins that already use them.

  • New --nolock option for ignoring file locks when opening read-only databases. (#1744)

Since the very start of the project Datasette has suggested trying the following command to start exploring your Google Chrome history, which is stored using SQLite:

datasette ~/Library/Application\ Support/Google/Chrome/Default/History

I’m not sure when this changed, but I tried running the command recently and got the following error:

sqlite3.OperationalError: database is locked

Since Datasette opens databases in read-only mode I didn’t see why a lock like this should be respected. It turns out SQLite can be told to ignore locks like so:

sqlite3.connect(
    "file:places.sqlite?mode=ro&nolock=1",
    uri=True
)

So I added a --nolock option to Datasette which does exactly that:

datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock

Inspired by 6 More Things I Learned Building Snowpack to 20,000 Stars (Part 2) by Fred K. Schott I finally setup a chat community for Datasette, using Discord.

It’s attracted 88 members already! You can join it here. I wrote detailed notes on how I configured it in this issue.

  • Database file downloads now implement conditional GET using ETags. (#1739)

This is a change I made to support Datasette Lite—I noticed that the WASM version of Datasette was downloading a fresh database every time, so I added ETag support to encourage browsers to avoid a duplicate download and use a cached copy of the database file instead, provided it hasn’t changed.

Datasette Desktop

Datasette Desktop was hanging on launch. Paul Everitt figured out a fix, which it took me way too long to get around to applying.

I finally shipped that in Datasette Desktop 0.2.2, but I wanted to reduce the chances of this happening again as much as possible. Datasette Desktop’s Elecron tests used the spectron test harness, but that’s marked as deprecated.

I’m a big fan of Playwright and I was optimistic to see that it has support for testing Electron apps. I figured out how to use that with Datasette Desktop and run the tests in GitHub Actions: I wrote up what I learned in a TIL, Testing Electron apps with Playwright and GitHub Actions.

sqlite-utils 3.28

Annotated release notes:

davidleejy suggested the table.duplicate() method and contributed an implementation. This was the impetus for pushing out a fresh release.

I added the CLI equivalent, sqlite-utils duplicate.

  • sqlite_utils.utils.rows_from_file() is now a documented API. It can be used to read a sequence of dictionaries from a file-like object containing CSV, TSV, JSON or newline-delimited JSON. It can be passed an explicit format or can attempt to detect the format automatically. (#443)
  • sqlite_utils.utils.TypeTracker is now a documented API for detecting the likely column types for a sequence of string rows, see Detecting column types using TypeTracker. (#445)
  • sqlite_utils.utils.chunks() is now a documented API for splitting an iterator into chunks. (#451)

I have a policy that any time I need to use an undocumented method from sqlite-utils in some other project I file an issue to add that to the documented API surface.

I had used rows_from_file() and TypeTracker in datasette-socrata.

  • sqlite-utils enable-fts now has a --replace option for replacing the existing FTS configuration for a table. (#450)
  • The create-index, add-column and duplicate commands all now take a --ignore option for ignoring errors should the database not be in the right state for them to operate. (#450)

This was inspired by my TIL Ignoring errors in a section of a Bash script—a trick I had to figure out because one of my scripts needed to add columns and enable FTS but only if those changes had not been previously applied.

In looking into that I spotted inconsistencies in the design of the sqlite-utils commands, so I fixed those as much as I could while still maintaining backwards compatibility with the 3.x releases.

Releases this week

TIL this week