Datasette 0.60: The annotated release notes
14th January 2022
I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.
filters_from_request plugin hook
- New plugin hook: filters_from_request(request, database, table, datasette), which runs on the table page and can be used to support new custom query string parameters that modify the SQL query. (#473)
The inspiration for this hook was my ongoing quest to simplify and refactor Datasette’s TableView, the most complex page in the project, which provides an interface for filtering and paginating through a table of data.
The main job of that page is to convert a query string (with things like ?country_long=China and &capacity_mw__gt=200 in it) into a SQL query.
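To make that conversion concrete, here is a minimal sketch using only the Python standard library. It is not Datasette's implementation: the where_from_query_string() function, the OPERATORS table and the plants table name are all hypothetical, and only a handful of comparison suffixes are handled.

```python
from urllib.parse import parse_qsl

# Hypothetical subset of lookup suffixes mapped to SQL operators
OPERATORS = {"exact": "=", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}


def where_from_query_string(query_string):
    """Turn 'col=value&col__gt=value' into (where_clauses, params)."""
    clauses = []
    params = {}
    for key, value in parse_qsl(query_string.lstrip("?")):
        if key.startswith("_"):
            # Assumption: underscore-prefixed keys are options, not column filters
            continue
        column, _, lookup = key.partition("__")
        operator = OPERATORS.get(lookup or "exact")
        if operator is None:
            continue  # unknown suffix, ignored in this sketch
        name = f"p{len(params)}"
        quoted = column.replace('"', '""')  # escape quotes in the identifier
        clauses.append(f'"{quoted}" {operator} :{name}')
        params[name] = value  # values are passed as named parameters
    return clauses, params


clauses, params = where_from_query_string("?country_long=China&capacity_mw__gt=200")
# "plants" is a made-up table name for illustration
sql = "select * from plants where " + " and ".join(clauses)
print(sql, params)
```

Values travel as named parameters rather than being interpolated into the SQL string, which is the property that matters most when a query string is the input.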
So I extracted part of that logic out into a new plugin hook. I’ve already started using it in datasette-leaflet-freedraw to help support filtering a table by drawing on a map, demo here.
I also used the new hook to refactor Datasette itself. The filters.py module now registers where_filters(), search_filters() and through_filters() implementations against that hook, to support various core pieces of Datasette functionality.
Tracing, write API improvements and performance
- The tracing feature now traces write queries, not just read queries. (#1568)
- Added two additional methods for writing to the database: await db.execute_write_script(sql, block=True) and await db.execute_write_many(sql, params_seq, block=True). (#1570)
- Made several performance improvements to the database schema introspection code that runs when Datasette first starts up. (#1555)
I built a new plugin called datasette-pretty-traces to help with my refactoring. It takes Datasette’s existing ?_trace=1 feature, which dumps out a big blob of JSON at the bottom of the page, and turns it into something that’s a bit easier to understand.
The plugin quickly started highlighting all sorts of interesting potential improvements!
After I added tracing to write queries it became apparent that Datasette’s schema introspection code—which runs once when the server starts, and then re-runs any time it notices a change to a database schema—was painfully inefficient.
It writes information about the schema into an in-memory database, which I hope to use in the future to power features like search of all attached tables.
I ended up adding two new documented internal methods for speeding up those writes: db.execute_write_script() and db.execute_write_many(). These are now available for plugins to use as well.
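The two method names mirror executescript() and executemany() in Python's sqlite3 module. The sketch below shows those standard library calls directly, on the assumption that the Datasette methods behave in a similar way; the plants table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# executescript() runs several semicolon-separated statements in one call
conn.executescript(
    """
    create table plants (name text, capacity_mw real);
    create index idx_plants_name on plants (name);
    """
)

# executemany() runs one parameterized statement once per item in a sequence
rows = [("Alpha", 120.0), ("Beta", 250.5), ("Gamma", 75.0)]
with conn:  # commits on success, rolls back on error
    conn.executemany("insert into plants (name, capacity_mw) values (?, ?)", rows)

count = conn.execute("select count(*) from plants").fetchone()[0]
print(count)
```

Batching many inserts into a single call avoids paying the per-call overhead once for every row.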
- The db.execute_write() internal method now defaults to blocking until the write operation has completed. Previously it defaulted to queuing the write and then continuing to run code while the write was in the queue. (#1579)
Spending time with code that wrote to the database highlighted a design flaw in Datasette’s original write method. I realized that every line of code I had written that used it looked like this:
db.execute_write("insert into ...", block=True)
The block=True parameter means “block until the write has completed”. Without it, the write goes into a queue and code continues executing whether or not the write has been made.
This was clearly the wrong default. I used GitHub code search to check if changing it would be disruptive—it would not—and made the change. I’m glad I caught this before Datasette 1.0!
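The difference between the two modes can be sketched with a toy single-writer queue built on the standard library. This is not Datasette's code: the WriteQueue class and its submit() method are hypothetical names, and the example assumes one worker thread that owns the only connection.

```python
import queue
import sqlite3
import threading


class WriteQueue:
    """Toy write queue: a single worker thread owns the SQLite connection."""

    def __init__(self, path=":memory:"):
        self._tasks = queue.Queue()
        worker = threading.Thread(target=self._run, args=(path,), daemon=True)
        worker.start()

    def _run(self, path):
        conn = sqlite3.connect(path)
        while True:
            fn, done, result = self._tasks.get()
            try:
                result["value"] = fn(conn)
                conn.commit()
            except Exception as ex:
                result["error"] = ex
            finally:
                done.set()

    def submit(self, fn, block=True):
        done = threading.Event()
        result = {}
        self._tasks.put((fn, done, result))
        if not block:
            # Fire and forget: the caller never learns whether the write worked
            return None
        done.wait()  # block until the worker has run fn
        if "error" in result:
            raise result["error"]
        return result["value"]


def create_table(conn):
    conn.execute("create table t (id integer)")


def insert_row(conn):
    conn.execute("insert into t (id) values (1)")


def count_rows(conn):
    return conn.execute("select count(*) from t").fetchone()[0]


wq = WriteQueue()
wq.submit(create_table)             # blocks until the table exists
wq.submit(insert_row, block=False)  # queued, returns immediately
count = wq.submit(count_rows)       # runs after the queued insert (FIFO order)
print(count)
```

With block=False any exception raised by the write is silently dropped in this sketch, which is one reason blocking is the safer default.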
- Database write connections now execute the prepare_connection(conn, database, datasette) plugin hook. (#1564)
I noticed that writes to a database with SpatiaLite were failing with an error, because the SpatiaLite module was not being correctly loaded. This fixes that.
Faceting
A bunch of different fixes for Datasette’s Faceting made it into this release:
- The number of unique values in a facet is now always displayed. Previously it was only displayed if the user specified ?_facet_size=max. (#1556)
- Facets of type date or array can now be configured in metadata.json, see Facets in metadata.json. Thanks, David Larlet. (#1552)
- New ?_nosuggest=1 parameter for table views, which disables facet suggestion. (#1557)
- Fixed bug where ?_facet_array=tags&_facet=tags would only display one of the two selected facets. (#625)
Other, smaller changes
- The Datasette() constructor no longer requires the files= argument, and is now documented at Datasette class. (#1563)
A tiny usability improvement, mainly for tests. It means you can write a test that looks like this:
    import pytest
    from datasette.app import Datasette

    @pytest.mark.asyncio
    async def test_datasette_homepage():
        ds = Datasette()
        response = await ds.client.get("/")
        assert "<title>Datasette" in response.text
Previously the files= argument was required, so you would have to use Datasette(files=[]).
- The query string variables exposed by request.args will now include blank strings for arguments such as foo in ?foo=&bar=1 rather than ignoring those parameters entirely. (#1551)
This came out of the refactor—this commit tells the story.
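The same distinction exists in Python's standard library, where urllib.parse.parse_qs() drops blank values unless keep_blank_values=True is passed. This sketch illustrates the behavior and makes no claim about how request.args is implemented:

```python
from urllib.parse import parse_qs

query_string = "foo=&bar=1"

# Default behavior: the blank foo parameter is dropped entirely
without_blanks = parse_qs(query_string)

# keep_blank_values=True retains foo with an empty string value
with_blanks = parse_qs(query_string, keep_blank_values=True)

print(without_blanks)
print(with_blanks)
```

Keeping blank values lets code tell a parameter that was submitted empty apart from one that was never sent.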
- Upgraded Pluggy dependency to 1.0. (#1575)
I needed this because Pluggy 1.0 allows multiple implementations of the same hook to be defined within the same file, like this:
    @hookimpl(specname="filters_from_request")
    def where_filters(request, database, datasette):
        # ...

    @hookimpl(specname="filters_from_request")
    def search_filters(request, database, table, datasette):
        # ...
- Now using Plausible analytics for the Datasette documentation.
I really like Plausible as an analytics product. It does a great job of respecting user privacy while still producing useful numbers. It’s cookie-free, which means it doesn’t trigger a need for GDPR banners in Europe. I’m increasingly using it on all of my projects.
- New CLI reference page showing the output of --help for each of the datasette sub-commands. This led to several small improvements to the help copy. (#1594)
I first built this for sqlite-utils and liked it so much I brought it to Datasette as well. It’s generated by cog, using this inline script in the reStructuredText.
And the rest
- Label columns detected for foreign keys are now case-insensitive, so Name or TITLE will be detected in the same way as name or title. (#1544)
- explain query plan is now allowed with varying amounts of whitespace in the query. (#1588)
- Fixed bug where writable canned queries could not be used with custom templates. (#1547)
- Improved fix for a bug where columns with an underscore prefix could result in unnecessary hidden form fields. (#1527)
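The case-insensitive label column detection can be illustrated with a short sketch. The detect_label_column() helper is hypothetical and much simpler than whatever Datasette does internally; it only shows matching name or title without regard to case.

```python
def detect_label_column(column_names):
    """Return the first column called 'name' or 'title', ignoring case."""
    for column in column_names:
        if column.lower() in {"name", "title"}:
            return column  # keep the original casing for use in SQL
    return None


print(detect_label_column(["id", "TITLE", "created"]))
print(detect_label_column(["id", "Name"]))
print(detect_label_column(["id", "created"]))
```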