Datasette 0.60: The annotated release notes
I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.
filters_from_request plugin hook
- New plugin hook: filters_from_request(request, database, table, datasette), which runs on the table page and can be used to support new custom query string parameters that modify the SQL query. (#473)
The inspiration for this hook was my ongoing quest to simplify and refactor Datasette’s
TableView, the most complex page in the project, which provides an interface for filtering and paginating through a table of data.
The main job of that page is to convert a query string—with things like
&capacity_mw__gt=200 in it—into a SQL query.
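That conversion can be sketched like this (a simplified toy version, not Datasette's actual code; the real implementation supports many more operators and edge cases):

```python
# Toy sketch: turn table-page query string arguments such as
# capacity_mw__gt=200 into a SQL where clause plus bound parameters.
OPERATORS = {"gt": ">", "gte": ">=", "lt": "<", "lte": "<=", "exact": "="}

def filters_to_sql(args):
    wheres, params = [], {}
    for key, value in args.items():
        # "capacity_mw__gt" splits into column "capacity_mw" and operator "gt"
        column, _, op = key.partition("__")
        op = op or "exact"
        param = f"p{len(params)}"
        wheres.append(f'"{column}" {OPERATORS[op]} :{param}')
        params[param] = value
    return " and ".join(wheres), params

where, params = filters_to_sql({"capacity_mw__gt": "200"})
# where  -> '"capacity_mw" > :p0'
# params -> {"p0": "200"}
```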
So I extracted part of that logic out into a new plugin hook. I’ve already started using it in datasette-leaflet-freedraw to help support filtering a table by drawing on a map, demo here.
I also used the new hook to refactor Datasette itself. The filters.py module now registers its own filter functions, including through_filters(), as implementations of that hook, supporting various core pieces of Datasette functionality.
Tracing, write API improvements and performance
- The tracing feature now traces write queries, not just read queries. (#1568)
- Added two additional methods for writing to the database: await db.execute_write_script(sql, block=True) and await db.execute_write_many(sql, params_seq, block=True). (#1570)
- Made several performance improvements to the database schema introspection code that runs when Datasette first starts up. (#1555)
I built a new plugin called datasette-pretty-traces to help with my refactoring. It takes Datasette’s existing ?_trace=1 feature, which dumps out a big blob of JSON at the bottom of the page, and turns it into something that’s a bit easier to understand.
The plugin quickly started highlighting all sorts of interesting potential improvements!
After I added tracing to write queries it became apparent that Datasette’s schema introspection code—which runs once when the server starts, and then re-runs any time it notices a change to a database schema—was painfully inefficient.
It writes information about the schema into an in-memory database, which I hope to use in the future to power features like search of all attached tables.
I ended up adding two new documented internal methods for speeding up those writes: db.execute_write_script() and db.execute_write_many(). These are now available for plugins to use as well.
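Under the hood these correspond to the standard sqlite3 executescript() and executemany() calls; here's a minimal sqlite3-only sketch (plain Python, not Datasette code) of what each operation does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# execute_write_script() maps to executescript(): multiple SQL
# statements executed from a single string.
conn.executescript("""
    create table places (id integer primary key, name text);
    create index idx_places_name on places(name);
""")

# execute_write_many() maps to executemany(): one parameterized
# statement run against a sequence of parameter tuples.
conn.executemany(
    "insert into places (name) values (?)",
    [("London",), ("Paris",), ("Tokyo",)],
)

count = conn.execute("select count(*) from places").fetchone()[0]
# count -> 3
```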
- The db.execute_write() internal method now defaults to blocking until the write operation has completed. Previously it defaulted to queuing the write and then continuing to run code while the write was in the queue. (#1579)
Spending time with code that wrote to the database highlighted a design flaw in Datasette’s original write method. I realized that every line of code I had written that used it looked like this:
db.execute_write("insert into ...", block=True)
The block=True parameter means “block until the write has completed”. Without it, the write goes into a queue and code continues executing whether or not the write has been applied yet.
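To illustrate the queue-and-block design, here's a hypothetical sketch (not Datasette's actual implementation) using a queue consumed by a single writer thread, where block=True waits for the write to finish before returning:

```python
import queue
import threading

write_queue = queue.Queue()
applied = []  # stand-in for the database, records completed writes

def writer_thread():
    # Single dedicated thread that applies writes in order.
    while True:
        sql, done = write_queue.get()
        applied.append(sql)  # stand-in for actually executing the SQL
        done.set()           # signal that this write has completed

threading.Thread(target=writer_thread, daemon=True).start()

def execute_write(sql, block=True):
    done = threading.Event()
    write_queue.put((sql, done))
    if block:
        done.wait()  # only return once the write has been applied

execute_write("insert into logs values (1)")
# With block=True (now the default) the write is guaranteed to have
# happened by the time execute_write() returns.
```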
This was clearly the wrong default. I used GitHub code search to check if changing it would be disruptive—it would not—and made the change. I’m glad I caught this before Datasette 1.0!
- Database write connections now execute the prepare_connection(conn, database, datasette) plugin hook. (#1564)
I noticed that writes to a database with SpatiaLite were failing with an error, because the SpatiaLite module was not being correctly loaded. This fixes that.
A bunch of different fixes for Datasette’s Faceting made it into this release:
- The number of unique values in a facet is now always displayed. Previously it was only displayed if the user explicitly requested it.
- Facets of type array can now be configured in metadata.json, see Facets in metadata.json. Thanks, David Larlet. (#1552)
- New ?_nosuggest=1 parameter for table views, which disables facet suggestion. (#1557)
- Fixed bug where ?_facet_array=tags&_facet=tags would only display one of the two selected facets. (#625)
Other, smaller changes
- The Datasette() constructor no longer requires the files= argument, and is now documented at Datasette class. (#1563)
A tiny usability improvement, mainly for tests. It means you can write a test that looks like this:
import pytest
from datasette.app import Datasette

@pytest.mark.asyncio
async def test_datasette_homepage():
    ds = Datasette()
    response = await ds.client.get("/")
    assert "<title>Datasette" in response.text
Previously the files= argument was required, so you would have to pass an empty list of files just to construct an instance.
- The query string variables exposed by request.args will now include blank strings for arguments such as ?foo=&bar=1, rather than ignoring those parameters entirely. (#1551)
This came out of the refactor—this commit tells the story.
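The new behavior matches what Python's standard library does when you pass keep_blank_values=True to parse_qs; a quick illustration:

```python
from urllib.parse import parse_qs

qs = "foo=&bar=1"

# Default behavior drops blank-valued parameters entirely:
parse_qs(qs)                          # {'bar': ['1']}

# keep_blank_values=True preserves them as empty strings, which is
# the behavior request.args now follows:
parse_qs(qs, keep_blank_values=True)  # {'foo': [''], 'bar': ['1']}
```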
- Upgraded Pluggy dependency to 1.0. (#1575)
I needed this because Pluggy 1.0 allows multiple implementations of the same hook to be defined within the same file, like this:
@hookimpl(specname="filters_from_request")
def where_filters(request, database, datasette):
    # ...

@hookimpl(specname="filters_from_request")
def search_filters(request, database, table, datasette):
    # ...
- Now using Plausible analytics for the Datasette documentation.
I really like Plausible as an analytics product. It does a great job of respecting user privacy while still producing useful numbers. It’s cookie-free, which means it doesn’t trigger a need for GDPR banners in Europe. I’m increasingly using it on all of my projects.
- New CLI reference page showing the output of --help for each of the datasette sub-commands. This led to several small improvements to the help copy. (#1594)
I first built this for sqlite-utils and liked it so much I brought it to Datasette as well. It’s generated by cog, using this inline script in the reStructuredText.
And the rest
- Label columns detected for foreign keys are now case-insensitive, so TITLE will be detected in the same way as title.
- explain query plan is now allowed with varying amounts of whitespace in the query. (#1588)
- Fixed bug where writable canned queries could not be used with custom templates. (#1547)
- Improved fix for a bug where columns with an underscore prefix could result in unnecessary hidden form fields. (#1527)
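The whitespace-tolerant matching for explain query plan can be sketched with a regular expression (a hypothetical pattern in the same spirit as the fix, not necessarily the exact one Datasette uses):

```python
import re

# Match an "explain query plan" prefix regardless of how much
# whitespace separates the keywords, case-insensitively.
EXPLAIN_RE = re.compile(r"^\s*explain\s+query\s+plan\s+", re.IGNORECASE)

def is_explain_query_plan(sql):
    return bool(EXPLAIN_RE.match(sql))

is_explain_query_plan("explain  query   plan select * from t")  # True
is_explain_query_plan("select * from t")                        # False
```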