Weeknotes: Datasette 0.63.3, datasette-ripgrep
20th December 2022
We’re back in the UK to see family over Christmas (our first trip back since 2019). Here are a few notes from the past couple of weeks.
- Fixed a bug where datasette --root, when running in Docker, would only output the URL to sign in as root when the server shut down, not when it started up. (#1958)
- You no longer need to ensure await datasette.invoke_startup() has been called in order for Datasette to start correctly serving requests; this is now handled automatically the first time the server receives a request. This fixes a bug experienced when Datasette is served directly by an ASGI application server such as Uvicorn or Gunicorn. It also fixes a bug with the datasette-gunicorn plugin. (#1955)
That second fix ended up taking longer than expected.
The root cause was that back in Datasette 0.63 I introduced the need to call await datasette.invoke_startup() as part of Datasette’s setup process, mainly to trigger plugins that might need to run their own async setup code.
This turned out to break a bunch of unexpected things. Most notably, it affected anyone who wanted to run Datasette using an ASGI handler such as Gunicorn or Uvicorn.
It broke my own datasette-gunicorn plugin too.
The core problem was that the Datasette() class constructor can be called synchronously, but needed a subsequent await ... call to run those async def setup methods.
I realized that a neater way to handle this would be to introduce a mechanism such that the first time anyone attempted to run an HTTP request through Datasette (an operation that always involves an await) the invoke_startup() method would be called automatically.
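Here’s a minimal sketch of that pattern, not Datasette’s actual implementation. LazyStartupApp, its startup_hooks argument and the demo() helper are hypothetical names made up for illustration; only the idea (a synchronous constructor, with async setup triggered by the first request) comes from the description above.

```python
import asyncio


class LazyStartupApp:
    """Hypothetical ASGI-style app, for illustration only."""

    def __init__(self, startup_hooks):
        # The constructor stays synchronous: it only records the hooks.
        self._startup_hooks = list(startup_hooks)
        self._startup_invoked = False
        self._startup_lock = None  # created later, inside the event loop

    async def invoke_startup(self):
        # Safe to call repeatedly; the hooks only run the first time.
        if self._startup_invoked:
            return
        if self._startup_lock is None:
            self._startup_lock = asyncio.Lock()
        async with self._startup_lock:
            # Re-check in case a concurrent request finished startup first.
            if self._startup_invoked:
                return
            for hook in self._startup_hooks:
                await hook()
            self._startup_invoked = True

    async def __call__(self, scope, receive, send):
        # Every request passes through here, already inside async code,
        # so this is a natural place to trigger startup.
        await self.invoke_startup()
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": b"ok"})


async def demo():
    calls = []

    async def hook():
        calls.append("started")

    app = LazyStartupApp([hook])  # no await needed to construct it
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http", "method": "GET", "path": "/"}
    await app(scope, receive, send)
    await app(scope, receive, send)  # second request does not re-run hooks
    return calls, sent
```

Calling asyncio.run(demo()) sends two requests through the app and the hook runs once. The lock is only there so that two requests arriving at the same moment cannot both run the startup hooks.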
I got that working, but in doing so I ran into a longer-running set of problems.
Datasette has around 1,200 tests at this point, and parts of the test suite date back to the start of the project and no longer reflect my preferred way of writing tests.
I’ve started running into “too many open files” errors running the test suite on macOS, and have so far not quite tracked down the best way to keep open file handles under control.
Test failures were hampering my efforts to fix the issue, so I used this as the impetus to refactor a large chunk of the test suite.
Several hundred of Datasette’s tests now share a single in-memory fixtures database. Previously, these tests were using a fixtures.db database file created in a temporary directory.
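For a sense of what sharing an in-memory database can look like, here is a rough sketch using only Python’s sqlite3 module and SQLite’s shared-cache URI syntax. It is an assumption-laden illustration rather than the code in Datasette’s test suite: FIXTURES_URI, the facets table and the helper functions are all invented for this example.

```python
import sqlite3

# Hypothetical name: "fixtures" just labels the shared in-memory database.
FIXTURES_URI = "file:fixtures?mode=memory&cache=shared"


def connect():
    # uri=True tells sqlite3 to treat the string as a URI, not a filename.
    return sqlite3.connect(FIXTURES_URI, uri=True)


# Keep one connection open for as long as the tests run: a shared in-memory
# database is discarded when its last connection closes.
keeper = connect()
keeper.execute("drop table if exists facets")
keeper.execute("create table facets (id integer primary key, name text)")
keeper.execute("insert into facets (name) values ('one'), ('two')")
keeper.commit()


def count_rows():
    # A separate connection sees the same data, with no file on disk.
    conn = connect()
    try:
        return conn.execute("select count(*) from facets").fetchone()[0]
    finally:
        conn.close()
```

Because nothing is written to a temporary directory, tests that only read the fixtures do not each need their own database file handle on disk.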
There’s still more test refactoring that I want to do, described in this issue, but I’m happy with the progress I’ve made so far.
datasette-ripgrep, cosmetic upgrade
I built datasette-ripgrep a couple of years ago. It’s a Datasette plugin that provides a UI for running ripgrep code search queries and linking to the results. It’s very handy for finding uses of APIs that I might want to deprecate.
In using it to investigate Datasette’s error output I spotted that the results would be more readable if they included a gap between non-consecutive line numbers, so I shipped an update with that improvement.
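The idea of a gap between non-consecutive line numbers can be sketched in a few lines. This is not the plugin’s code: with_gaps() and its (line_number, text) input format are hypothetical, chosen only to show the logic.

```python
def with_gaps(lines, gap=None):
    """Yield (line_number, text) pairs, inserting `gap` wherever numbers jump."""
    previous = None
    for number, text in lines:
        # A jump of more than one line means the results are not adjacent.
        if previous is not None and number != previous + 1:
            yield gap
        yield (number, text)
        previous = number


matches = [(10, "def a():"), (11, "    pass"), (40, "def b():")]
rendered = list(with_gaps(matches))
```

A template could then render each gap marker as a blank row or a divider between the blocks of matching lines.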
Releases this week
datasette-gunicorn: 0.1.1—(2 releases total)—2022-12-18
Plugin for running Datasette using Gunicorn
datasette: 0.63.3—(122 releases total)—2022-12-18
An open source multi-tool for exploring and publishing data
datasette-ripgrep: 0.8—(13 releases total)—2022-12-15
Web interface for searching your code using ripgrep, built as a Datasette plugin
datasette-media: 0.5.1—(7 releases total)—2022-12-13
Datasette plugin for serving media based on a SQL query
datasette-secret-santa: 0.1—(2 releases total)—2022-12-11
Run secret santa gift circles using Datasette
datasette-render-binary: 0.3.1—(4 releases total)—2022-12-10
Datasette plugin for rendering binary data
TIL this week
- Finding uses of an API with the new GitHub Code Search
- Reformatting text with Copilot
- Show files opened by pytest tests
- Viewing GeoPackage data with SpatiaLite and Datasette
- SQLite can use more than one index for a query
- Comparing database rows before and after with SQLite JSON functions
- Start, test, then stop a localhost web server in a Bash script