Simon Willison’s Weblog

Subscribe

Weeknotes: datasette-tiddlywiki, filters_from_request

24th December 2021

I made some good progress on the big refactor this week, including extracting some core logic out into a new Datasette plugin hook. I also got distracted by TiddlyWiki and released a new Datasette plugin that lets you run TiddlyWiki inside Datasette.

datasette-tiddlywiki

TiddlyWiki is a fascinating and unique project. Jeremy Ruston has been working on it for 17 years now and I’ve still not seen another piece of software that works even remotely like it.

It’s a full-featured wiki that’s implemented entirely as a single 2.3MB page of HTML and JavaScript, with a plugin system that allows it to be extended in all sorts of interesting ways.

The most unique feature of TiddlyWiki is how it persists data. You can create a brand new wiki by opening tiddlywiki.com/empty.html in your browser, making some edits... and then clicking the circle-tick “Save changes” button to download a copy of the page with your changes baked into it! Then you can open that up on your own computer and keep on using it.

There’s actually a lot more to TiddlyWiki persistence than that: The GettingStarted guide lists dozens of options that vary depending on operating system and browser—it’s worth browsing through them just to marvel at how much innovation has happened around the project just in the persistence space.

One of the options is to run a little server that implements the WebServer API and persists data sent via PUT requests. SQLite is an obvious candidate for a backend, and Datasette makes it pretty easy to provide APIs on top of SQLite... so I decided to experiment with building a Datasette plugin that offers a full persistant TiddlyWiki experience.

datasette-tiddlywiki is the result.

You can try it out by running datasette install datasette-tiddlywiki and then datasette tiddlywiki.db --create to start the server (with a tiddlywiki.db SQLite database that will be created if it does not already exist.)

Then navigate to http://localhost:8001/-/tiddlywiki to start interacting with your new TiddlyWiki. Any changes you make there will be persisted to the tiddlywiki database.

Animated demo showing creating a new tiddler

I had a running research issue that I updated as I was figuring out how to build it—all sorts of fun TiddlyWiki links and TILs are embedded in that thread. The issue started out in my private “notes” GitHub repository but I transferred it to the datasette-tiddlywiki repository after I had created and published the first version of the plugin.

filters_from_request() plugin hook

My big breakthrough in the ongoing Datasette Table View refactor project was a realization that I could simplify the table logic by extracting some of it out into a new plugin hook.

The new hook is called filters_from_request. It acknowledges that the primary goal of the table page is to convert query string parameters—like ?_search=tony or ?id__gte=6 or ?_where=id+in+(1,+2+,3) into SQL where clauses.

(Here’s a full list of supported table arguments.)

So that’s what filters_from_request() does—given a request object it can return SQL clauses that should be added to the WHERE.

Datasette now uses those internally to implement ?_where= and ?_search= and ?_through=, see datasette/filters.py.

I always try to accompany a new plugin hook with a plugin that actually uses it—in this case I’ve been updating datasette-leaflet-freedraw to use that hook to add a “draw a shape on a map to filter this table” interface to any table that it detects has a SpatiaLite geometry column. There’s a demo of that here:

https://calands.datasettes.com/calands/CPAD_2020a_SuperUnits?_freedraw=%7B%22type%22%3A%22MultiPolygon%22%2C%22coordinates%22%3A%5B%5B%5B%5B-121.92627%2C37.73597%5D%2C%5B-121.83838%2C37.68382%5D%2C%5B-121.64063%2C37.45742%5D%2C%5B-121.57471%2C37.19533%5D%2C%5B-121.81641%2C36.80928%5D%2C%5B-122.146%2C36.63316%5D%2C%5B-122.56348%2C36.65079%5D%2C%5B-122.89307%2C36.79169%5D%2C%5B-123.06885%2C36.96745%5D%2C%5B-123.09082%2C37.33522%5D%2C%5B-123.0249%2C37.562%5D%2C%5B-122.91504%2C37.77071%5D%2C%5B-122.71729%2C37.92687%5D%2C%5B-122.58545%2C37.96152%5D%2C%5B-122.10205%2C37.96152%5D%2C%5B-121.92627%2C37.73597%5D%5D%5D%5D%7D

Animated demo of drawing a shape on a map and then submitting the form to see items within that map region

Note the new custom ?_freedraw={...} parameter which accepts a GeoJSON polygon and uses it to filter the table—that’s implemented using the new hook.

This isn’t in a full Datasette release yet, but it’s available in the Datasette 0.60a1 alpha (added in 0.60a0) if you want to try it out.

Optimizing populate_table_schemas()

I introduced the datasette-pretty-traces plugin last week—it makes it much easier to see the queries that are running on any given Datasette page.

This week I realized it wasn’t tracking write queries, so I added support for that—and discovered that on first page load after starting up Datasette spends a lot of time populating its own internal database containing schema information (see Weeknotes: Datasette internals from last year.)

Example trace showing a cavalcade of write SQL

I opened a tracking ticket and made a bunch of changes to optimize this. The new code in datasette/utils/internal_db.py uses two new documented internal methods:

db.execute_write_script() and db.execute_write_many()

These are the new methods that were created as part of the optimization work. They are documented here:

They are Datasette’s async wrappers around the Python sqlite3 module’s executemany() and executescript() methods.

I also made a breaking change to Datasette’s existing execute_write() and execute_write_fn() methods: their block= argument now defaults to True, where it previously defaulted to False.

Prior to this change, db.execute_write(sql) would put the passed SQL in a queue to be executed once the write connection became available... and then return control to the calling code, whether or not that SQL had actually run- a fire-and-forget mechanism for executing SQL.

The block=True option would change it to blocking until the query had finished executing.

Looking at my own code, I realized I had never once used the fire-and-forget mechanism: I always used block=True to ensure the SQL had finished writing before I moved on.

So clearly block=True was a better default. I made that change in issue 1579.

This is technically a breaking change... but I used the new GitHub code search to see if anyone was using it in a way that would break and could only find one example of it in code not written by me, in datasette-webhook-write—and since they use block=True there anyway this update won’t break their code.

If I’d released Datasette 1.0 I would still consider this a breaking change and bump the major version number, but thankfully I’m still in the 0.x range where I can be a bit less formal about these kinds of thing!

Releases this week

TIL this week