Weeknotes: Datasette, sqlite-utils, Datasette Desktop
A flurry of releases this week, including a new Datasette alpha and a fixed Datasette Desktop.
Work on Datasette Cloud continues—the last 10% of the work needed for the beta launch is trending towards taking 90% of the time. It’s been driving all sorts of fixes to the wider Datasette ecosystem, which is nice.
I ran into a bug which would have been easier to investigate using Sentry. The datasette-sentry plugin wasn’t working right, and it turned out I needed a new handle_exception() plugin hook to fix it. This was the impetus I needed to push out a new Datasette alpha—I like to get new hooks into an alpha as quickly as possible so I can depend on that Datasette version from alpha releases of plugins.
Here are some other highlights from the alpha’s release notes:
- The render_cell() plugin hook is now also passed a row argument, representing the sqlite3.Row object that is being rendered. (#1300)
A neat thing about Pluggy is that you can add new arguments to existing plugin hooks without breaking plugins that already use them.
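To illustrate why that works, here is a minimal sketch of the general idea: the caller inspects each implementation's signature and passes only the arguments it names. This is not Pluggy's actual code, and call_hook() and both render_cell_* functions are hypothetical names made up for this example.

```python
import inspect


def call_hook(implementations, **available):
    """Call each implementation with only the arguments it declares."""
    results = []
    for impl in implementations:
        wanted = inspect.signature(impl).parameters
        kwargs = {name: available[name] for name in wanted}
        results.append(impl(**kwargs))
    return results


# An older plugin, written before the "row" argument existed.
def render_cell_old(value):
    return str(value).upper()


# A newer plugin that opts in to the extra "row" argument.
def render_cell_new(value, row):
    return "{} (id={})".format(value, row["id"])


row = {"id": 1, "name": "cleo"}
print(call_hook([render_cell_old, render_cell_new], value=row["name"], row=row))
```

The older implementation never sees row, so adding that argument to the hook does not break it.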
- --nolock option for ignoring file locks when opening read-only databases. (#1744)
Since the very start of the project Datasette has suggested trying the following command to start exploring your Google Chrome history, which is stored using SQLite:
datasette ~/Library/Application\ Support/Google/Chrome/Default/History
I’m not sure when this changed, but I tried running the command recently and got the following error:
sqlite3.OperationalError: database is locked
Since Datasette opens databases in read-only mode I didn’t see why a lock like this should be respected. It turns out SQLite can be told to ignore locks like so:
sqlite3.connect("file:places.sqlite?mode=ro&nolock=1", uri=True)
So I added a --nolock option to Datasette which does exactly that:
datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock
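The effect of nolock=1 can be reproduced with nothing but Python's sqlite3 module. The following is a rough sketch, not Datasette's own code: open_readonly(), read_urls(), demo(), the history.db file name and the urls table are all assumptions made up for the example. A second connection holding an exclusive transaction stands in for Chrome.

```python
import pathlib
import sqlite3
import tempfile


def open_readonly(path, nolock=False):
    """Open a SQLite file read-only, optionally ignoring file locks."""
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    if nolock:
        uri += "&nolock=1"
    # timeout=0 makes a locked database fail immediately instead of waiting
    return sqlite3.connect(uri, uri=True, timeout=0)


def read_urls(path, nolock=False):
    """Read every row from the (hypothetical) urls table, then close."""
    conn = open_readonly(path, nolock=nolock)
    try:
        return conn.execute("select url from urls").fetchall()
    finally:
        conn.close()


def demo():
    path = pathlib.Path(tempfile.mkdtemp()) / "history.db"
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("create table urls (url text)")
    writer.execute("insert into urls values ('https://datasette.io/')")
    # Simulate another program holding a lock on the file
    writer.execute("begin exclusive")

    try:
        read_urls(path)
        locked = False
    except sqlite3.OperationalError:
        locked = True

    rows = read_urls(path, nolock=True)
    writer.execute("rollback")
    writer.close()
    return locked, rows
```

Calling demo() should report that the plain read-only connection was refused while the nolock one could read the row. Ignoring locks is only safe if nothing is writing to the file at that moment; otherwise a read may see inconsistent data.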
- Datasette now has a Discord community.
Inspired by 6 More Things I Learned Building Snowpack to 20,000 Stars (Part 2) by Fred K. Schott I finally set up a chat community for Datasette, using Discord.
It’s attracted 88 members already! You can join it here. I wrote detailed notes on how I configured it in this issue.
- Database file downloads now implement conditional GET using ETags. (#1739)
This is a change I made to support Datasette Lite—I noticed that the WASM version of Datasette was downloading a fresh database every time, so I added ETag support to encourage browsers to avoid a duplicate download and use a cached copy of the database file instead, provided it hasn’t changed.
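For anyone unfamiliar with conditional GET, the server-side logic can be sketched in a few lines. This is a simplified illustration under my own assumptions, not how Datasette implements it: etag_for() and conditional_get() are hypothetical names, and hashing the full content is just one possible way to derive an ETag.

```python
import hashlib


def etag_for(content: bytes) -> str:
    """Build an ETag from a hash of the content (an assumed scheme)."""
    return '"{}"'.format(hashlib.sha256(content).hexdigest())


def conditional_get(content: bytes, request_headers: dict):
    """Return (status, headers, body) for a GET that may be conditional."""
    etag = etag_for(content)
    headers = {"ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current, so send no body
        return 304, headers, b""
    return 200, headers, content


# First request: no cached copy, so the full body is sent
status, headers, body = conditional_get(b"database bytes", {})
# Second request: the browser echoes the ETag back in If-None-Match
status_again, _, body_again = conditional_get(
    b"database bytes", {"If-None-Match": headers["ETag"]}
)
```

The second response is a 304 with an empty body, which is what lets the browser reuse its cached copy. Hashing a large file on every request would be expensive, so a real server would likely derive the ETag from something cheaper.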
Datasette Desktop was hanging on launch. Paul Everitt figured out a fix, which it took me way too long to get around to applying.
I finally shipped that in Datasette Desktop 0.2.2, but I wanted to reduce the chances of this happening again as much as possible. Datasette Desktop’s Electron tests used the Spectron test harness, but that’s marked as deprecated.
I’m a big fan of Playwright and I was optimistic to see that it has support for testing Electron apps. I figured out how to use that with Datasette Desktop and run the tests in GitHub Actions: I wrote up what I learned in a TIL, Testing Electron apps with Playwright and GitHub Actions.
Annotated release notes for sqlite-utils 3.28:
- New table.duplicate(new_name) method for creating a copy of a table with a matching schema and row contents. Thanks, David. (#449)
- sqlite-utils duplicate data.db table_name new_name CLI command for duplicating tables. (#454)
davidleejy suggested the table.duplicate() method and contributed an implementation. This was the impetus for pushing out a fresh release.
I added the CLI equivalent, sqlite-utils duplicate.
- sqlite_utils.utils.rows_from_file() is now a documented API. It can be used to read a sequence of dictionaries from a file-like object containing CSV, TSV, JSON or newline-delimited JSON. It can be passed an explicit format or can attempt to detect the format automatically. (#443)
- sqlite_utils.utils.TypeTracker is now a documented API for detecting the likely column types for a sequence of string rows, see Detecting column types using TypeTracker. (#445)
- sqlite_utils.utils.chunks() is now a documented API for splitting an iterator into chunks. (#451)
I have a policy that any time I need to use an undocumented method from sqlite-utils in some other project I file an issue to add that to the documented API surface.
I had used TypeTracker in datasette-socrata.
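To give a feel for what two of these utilities do, here is a standard-library sketch of the underlying ideas: splitting an iterator into chunks, and guessing column types from rows of strings. It is not the sqlite-utils implementation; chunks() here is my own stand-in, and guess_types() and _parses() are hypothetical helpers whose rules are assumptions.

```python
import itertools


def chunks(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


def _parses(converter, value):
    """Return True if converter(value) succeeds."""
    try:
        converter(value)
        return True
    except ValueError:
        return False


def guess_types(rows):
    """Guess integer, float or text for each column of string-valued dicts."""
    seen_by_column = {}
    for row in rows:
        for key, value in row.items():
            if _parses(int, value):
                seen = "integer"
            elif _parses(float, value):
                seen = "float"
            else:
                seen = "text"
            seen_by_column.setdefault(key, set()).add(seen)
    result = {}
    for key, seen in seen_by_column.items():
        if seen == {"integer"}:
            result[key] = "integer"
        elif seen <= {"integer", "float"}:
            # A mix of whole and decimal numbers fits a float column
            result[key] = "float"
        else:
            result[key] = "text"
    return result
```

A column only counts as integer if every value parses as one, and any value that is not a number pushes the whole column to text.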
This was inspired by my TIL Ignoring errors in a section of a Bash script—a trick I had to figure out because one of my scripts needed to add columns and enable FTS but only if those changes had not been previously applied.
In looking into that I spotted inconsistencies in the design of the sqlite-utils commands, so I fixed those as much as I could while still maintaining backwards compatibility with the 3.x releases.
Releases this week
s3-ocr: 0.5—(5 releases total)—2022-07-19
Tools for running OCR against files stored in S3
datasette-graphql: 2.1.1—(36 releases total)—2022-07-18
Datasette plugin providing an automatic GraphQL API for your SQLite databases
datasette-sentry: 0.2a0—(3 releases total)—2022-07-18
Datasette plugin for configuring Sentry
datasette: 0.62a1—(112 releases total)—2022-07-18
An open source multi-tool for exploring and publishing data
sqlite-utils: 3.28—(102 releases total)—2022-07-15
Python CLI utility and library for manipulating SQLite databases
datasette-publish-vercel: 0.14—(21 releases total)—2022-07-13
Datasette plugin for publishing data using Vercel
datasette-app: 0.2.2—(4 releases total)—2022-07-13
The Datasette macOS application
datasette-app-support: 0.11.6—(19 releases total)—2022-07-12
Part of https://github.com/simonw/datasette-app
TIL this week