Simon Willison’s Weblog

Datasette 0.60: The annotated release notes

14th January 2022

I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.

filters_from_request plugin hook

The inspiration for this hook was my ongoing quest to simplify and refactor Datasette’s TableView, the most complex page in the project, which provides an interface for filtering and paginating through a table of data.

The main job of that page is to convert a query string—with things like ?country_long=China and &capacity_mw__gt=200 in it—into a SQL query.
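As a rough illustration of that conversion, here is a minimal sketch using only the Python standard library. It is not Datasette’s implementation: the function name where_from_query_string, the plants table and the small operator table are all assumptions made for the example, and only exact matches plus a handful of __op suffixes are handled.

```python
from urllib.parse import parse_qsl

# Hypothetical subset of lookup suffixes, for illustration only.
OPERATORS = {
    "exact": "=",
    "gt": ">",
    "gte": ">=",
    "lt": "<",
    "lte": "<=",
}


def where_from_query_string(query_string):
    """Return (where_clauses, params) for a query string of column filters."""
    clauses = []
    params = {}
    for i, (key, value) in enumerate(parse_qsl(query_string.lstrip("?"))):
        if key.startswith("_"):
            # In this sketch, keys with a leading underscore are treated
            # as options rather than column filters, so they are skipped.
            continue
        column, _, op = key.partition("__")
        sql_op = OPERATORS[op or "exact"]
        param = f"p{i}"
        # Quote the column name and bind the value as a named parameter.
        quoted = '"' + column.replace('"', '""') + '"'
        clauses.append(f"{quoted} {sql_op} :{param}")
        params[param] = value
    return clauses, params


clauses, params = where_from_query_string(
    "?country_long=China&capacity_mw__gt=200"
)
# "plants" is a made-up table name.
sql = "select * from plants where " + " and ".join(clauses)
```

The values travel as named parameters rather than being pasted into the SQL string, which keeps the generated query safe to run against user-supplied input.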

So I extracted part of that logic out into a new plugin hook. I’ve already started using it in datasette-leaflet-freedraw to help support filtering a table by drawing on a map, demo here.

I also used the new hook to refactor Datasette itself. The filters.py module now registers where_filters(), search_filters() and through_filters() implementations against that hook, to support various core pieces of Datasette functionality.

Tracing, write API improvements and performance

I built a new plugin called datasette-pretty-traces to help with my refactoring. It takes Datasette’s existing ?_trace=1 feature, which dumps out a big blob of JSON at the bottom of the page, and turns it into something that’s a bit easier to understand.

The plugin quickly started highlighting all sorts of interesting potential improvements!

After I added tracing to write queries it became apparent that Datasette’s schema introspection code—which runs once when the server starts, and then re-runs any time it notices a change to a database schema—was painfully inefficient.

It writes information about the schema into an in-memory database, which I hope to use in the future to power features like search of all attached tables.

I ended up adding two new documented internal methods for speeding up those writes: db.execute_write_script() and db.execute_write_many(). These are now available for plugins to use as well.
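The names of these two methods echo executescript() and executemany() in Python’s built-in sqlite3 module. The sketch below uses sqlite3 directly rather than Datasette’s db object, so it only illustrates the batching idea; the table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One call runs several semicolon-separated statements.
conn.executescript(
    """
    create table tables (database_name text, table_name text);
    create table columns (table_name text, name text, type text);
    """
)

# One call runs the same statement once per parameter tuple,
# instead of issuing a separate write for every row.
rows = [
    ("plants", "country_long", "text"),
    ("plants", "capacity_mw", "real"),
    ("plants", "name", "text"),
]
conn.executemany(
    "insert into columns (table_name, name, type) values (?, ?, ?)", rows
)
conn.commit()

column_count = conn.execute("select count(*) from columns").fetchone()[0]
```

When many rows describe a schema, collapsing them into one batched call avoids paying the per-write overhead once for every row.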

  • The db.execute_write() internal method now defaults to blocking until the write operation has completed. Previously it defaulted to queuing the write and then continuing to run code while the write was in the queue. (#1579)

Spending time with code that wrote to the database highlighted a design flaw in Datasette’s original write method. I realized that every line of code I had written that used it looked like this:

db.execute_write("insert into ...", block=True)

The block=True parameter means “block until the write has completed”. Without it, the write goes into a queue and code continues executing whether or not the write has been made.

This was clearly the wrong default. I used GitHub code search to check if changing it would be disruptive—it would not—and made the change. I’m glad I caught this before Datasette 1.0!
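The difference between the two behaviours can be sketched with a toy single-writer queue built from the standard library. This is a simplified model, not Datasette’s code: the WriteQueue class and its log table are hypothetical, and here block defaults to True to match the new behaviour.

```python
import queue
import sqlite3
import threading


class WriteQueue:
    """Toy single-writer queue; a sketch, not Datasette's implementation."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The connection is created and used on the writer thread only.
        conn = sqlite3.connect(":memory:")
        conn.execute("create table log (message text)")
        while True:
            sql, params, reply = self._tasks.get()
            try:
                with conn:  # commit on success, roll back on error
                    result = conn.execute(sql, params).fetchall()
                reply.put(result)
            except Exception as ex:
                reply.put(ex)

    def execute_write(self, sql, params=(), block=True):
        reply = queue.Queue()
        self._tasks.put((sql, params, reply))
        if not block:
            # Fire and forget: the caller cannot tell when, or whether,
            # the write succeeded.
            return None
        result = reply.get()  # wait for the writer thread to finish
        if isinstance(result, Exception):
            raise result
        return result


writer = WriteQueue()
writer.execute_write("insert into log values (?)", ["queued"], block=False)
writer.execute_write("insert into log values (?)", ["waited for"])
rows = writer.execute_write("select message from log order by rowid")
```

With block=False any error raised by the write is silently lost, which is one reason blocking is the safer default.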

I noticed that writes to a database with SpatiaLite were failing with an error because the SpatiaLite module was not being loaded correctly. This release fixes that.

Faceting

A bunch of different fixes for Datasette’s Faceting made it into this release:

  • The number of unique values in a facet is now always displayed. Previously it was only displayed if the user specified ?_facet_size=max. (#1556)
  • Facets of type date or array can now be configured in metadata.json, see Facets in metadata.json. Thanks, David Larlet. (#1552)
  • New ?_nosuggest=1 parameter for table views, which disables facet suggestion. (#1557)
  • Fixed bug where ?_facet_array=tags&_facet=tags would only display one of the two selected facets. (#625)
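In SQL terms, a facet can be thought of as a grouped count, and the number of unique values as a separate count(distinct ...) query. The sketch below shows that idea with the standard sqlite3 module. The plants table, its rows and the facet_size variable are invented for the example, and the queries are not necessarily the ones Datasette runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table plants (name text, country_long text)")
conn.executemany(
    "insert into plants values (?, ?)",
    [
        ("A", "China"),
        ("B", "China"),
        ("C", "India"),
        ("D", "Brazil"),
        ("E", "China"),
        ("F", "India"),
    ],
)

facet_size = 2  # hypothetical cap on how many values to list

# Most common values first, capped at facet_size.
top_values = conn.execute(
    """
    select country_long, count(*) as n
    from plants
    group by country_long
    order by n desc, country_long
    limit ?
    """,
    [facet_size],
).fetchall()

# Total number of distinct values, available even when the list is truncated.
unique_count = conn.execute(
    "select count(distinct country_long) from plants"
).fetchone()[0]
```

Here the list of values is cut off at two, while the unique count still reports all three countries.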

Other, smaller changes

  • The Datasette() constructor no longer requires the files= argument, and is now documented at Datasette class. (#1563)

A tiny usability improvement, mainly for tests. It means you can write a test that looks like this:

import pytest
from datasette.app import Datasette

@pytest.mark.asyncio
async def test_datasette_homepage():
    ds = Datasette()
    response = await ds.client.get("/")
    assert "<title>Datasette" in response.text

Previously the files= argument was required, so you would have to use Datasette(files=[]).

  • The query string variables exposed by request.args will now include blank strings for arguments such as foo in ?foo=&bar=1 rather than ignoring those parameters entirely. (#1551)

This came out of the refactor—this commit tells the story.
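Python’s own query string parser shows the same distinction. This sketch uses urllib.parse.parse_qs from the standard library; whether Datasette’s request.args relies on this exact flag is an assumption on my part.

```python
from urllib.parse import parse_qs

query_string = "foo=&bar=1"

# Default behaviour: parameters with blank values are dropped.
without_blanks = parse_qs(query_string)

# keep_blank_values=True retains them as empty strings.
with_blanks = parse_qs(query_string, keep_blank_values=True)
```

Keeping the blank value lets code tell the difference between ?foo= and a request where foo was never sent at all.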

  • Upgraded Pluggy dependency to 1.0. (#1575)

I needed this because Pluggy 1.0 allows multiple implementations of the same hook to be defined within the same file, like this:

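from datasette import hookimpl

# Two separate functions in the same file, both registered
# against the filters_from_request hook via specname=
@hookimpl(specname="filters_from_request")
def where_filters(request, database, datasette):
    ...  # implementation elided

@hookimpl(specname="filters_from_request")
def search_filters(request, database, table, datasette):
    ...  # implementation elided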

I really like Plausible as an analytics product. It does a great job of respecting user privacy while still producing useful numbers. It’s cookie-free, which means it doesn’t trigger a need for GDPR banners in Europe. I’m increasingly using it on all of my projects.

  • New CLI reference page showing the output of --help for each of the datasette sub-commands. This led to several small improvements to the help copy. (#1594)

I first built this for sqlite-utils and liked it so much I brought it to Datasette as well. It’s generated by cog, using this inline script in the reStructuredText.

And the rest

  • Label columns detected for foreign keys are now case-insensitive, so Name or TITLE will be detected in the same way as name or title. (#1544)
  • explain query plan is now allowed with varying amounts of whitespace in the query. (#1588)
  • Fixed bug where writable canned queries could not be used with custom templates. (#1547)
  • Improved fix for a bug where columns with an underscore prefix could result in unnecessary hidden form fields. (#1527)
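The explain query plan change in the list above amounts to tolerating any run of whitespace between the keywords. One way to picture it is a regular expression built on \s+. The pattern and the is_explain_query_plan helper below are illustrative assumptions, not Datasette’s own validation code.

```python
import re

# \s+ accepts any run of spaces, tabs or newlines between keywords.
# This pattern is an illustrative assumption, not Datasette's own.
EXPLAIN_QUERY_PLAN = re.compile(
    r"^\s*explain\s+query\s+plan\s+select\b", re.IGNORECASE
)


def is_explain_query_plan(sql):
    """Return True if sql starts with "explain query plan select"."""
    return EXPLAIN_QUERY_PLAN.match(sql) is not None
```

With this pattern "explain query plan select 1" and "EXPLAIN  QUERY\n PLAN\tselect 1" are both accepted, while "explain queryplan select 1" is not.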

This is Datasette 0.60: The annotated release notes by Simon Willison, posted on 14th January 2022.

