Simon Willison’s Weblog

Weeknotes: cookiecutter templates, better plugin documentation, sqlite-generate

26th June 2020

I spent this week spreading myself between a bunch of smaller projects, and finally getting familiar with cookiecutter. I wrote about my datasette-plugin cookiecutter template earlier in the week; here’s what else I’ve been working on.

sqlite-generate

Datasette is supposed to work against any SQLite database you throw at it, no matter how weird the schema or how unwieldy the database shape or size.

I built a new tool called sqlite-generate this week to help me create databases of different shapes. It’s a Python command-line tool which uses Faker to populate a new database with random data. You run it something like this:

sqlite-generate demo.db \
    --tables=20 \
    --rows=100,500 \
    --columns=5,20 \
    --fks=0,3 \
    --pks=0,2 \
    --fts

This command creates a database containing 20 tables, each with between 100 and 500 rows and between 5 and 20 columns. Each table will also have between 0 and 3 foreign key columns pointing to other tables, and between 0 and 2 primary key columns. SQLite full-text search will be configured against all of the text columns in each table.
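To make the "low,high" range options concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not the sqlite-generate implementation: the `generate()` function, the `table_N` and `col_N` names, and the small word list standing in for Faker are all assumptions made for illustration, and it ignores foreign keys, compound primary keys and full-text search.

```python
import random
import sqlite3

# Hypothetical stand-in for Faker: a tiny fixed vocabulary.
WORDS = ["newspaper", "river", "window", "garden", "engine", "yellow", "market"]


def random_range(bounds, rng):
    """Pick an integer from an inclusive (low, high) pair such as (100, 500)."""
    low, high = bounds
    return rng.randint(low, high)


def generate(conn, tables=3, rows=(5, 10), columns=(2, 4), seed=0):
    """Create `tables` tables, each with a random number of rows and text columns."""
    rng = random.Random(seed)  # seeded so repeated runs build the same database
    for t in range(tables):
        col_names = [f"col_{i}" for i in range(random_range(columns, rng))]
        col_defs = ", ".join(f"{name} TEXT" for name in col_names)
        conn.execute(f"CREATE TABLE table_{t} (id INTEGER PRIMARY KEY, {col_defs})")
        col_list = ", ".join(col_names)
        placeholders = ", ".join("?" for _ in col_names)
        for _ in range(random_range(rows, rng)):
            values = [rng.choice(WORDS) for _ in col_names]
            conn.execute(
                f"INSERT INTO table_{t} ({col_list}) VALUES ({placeholders})",
                values,
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
generate(conn, tables=3, rows=(5, 10), columns=(2, 4))
```

Each option maps to an inclusive range, so every table ends up with its own row and column counts drawn independently.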

I always try to include a live demo with any of my projects, and sqlite-generate is no exception. This GitHub Action runs on every push to main and deploys a demo to https://sqlite-generate-demo.datasette.io/ showing the latest version of the code in action.

The demo runs my datasette-search-all plugin in order to more easily demonstrate full-text search across all of the text columns in the generated tables. Try searching for newspaper.
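As a rough picture of what "search across every table" means at the SQL level, the sketch below runs one MATCH query per full-text table and collects the hits. This is not how datasette-search-all is implemented; the table names, the sample rows and the `search_all()` helper are hypothetical, and it assumes a SQLite build with the FTS5 extension available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical full-text tables with made-up content.
conn.execute("CREATE VIRTUAL TABLE articles_fts USING fts5(title, body)")
conn.execute("CREATE VIRTUAL TABLE people_fts USING fts5(name, bio)")
conn.execute(
    "INSERT INTO articles_fts VALUES (?, ?)",
    ("Local news", "The newspaper reported record rainfall"),
)
conn.execute(
    "INSERT INTO people_fts VALUES (?, ?)",
    ("A. Example", "Worked at a newspaper for ten years"),
)
conn.execute("INSERT INTO people_fts VALUES (?, ?)", ("B. Example", "Keeps bees"))


def search_all(conn, term):
    """Return {table_name: matching_rows} for every FTS5 table with at least one hit."""
    fts_tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND sql LIKE 'CREATE VIRTUAL TABLE%fts5%' "
            "ORDER BY name"
        )
    ]
    results = {}
    for table in fts_tables:
        # Table names come from sqlite_master here, so they are interpolated directly.
        rows = conn.execute(
            f"SELECT * FROM {table} WHERE {table} MATCH ?", (term,)
        ).fetchall()
        if rows:
            results[table] = rows
    return results


results = search_all(conn, "newspaper")
```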

click-app cookiecutter template

I write quite a lot of Click-powered command-line tools like this one, so inspired by datasette-plugin I created a new click-app cookiecutter template that bakes in my own preferences about how to set up a new Click project (complete with GitHub Actions). sqlite-generate is the first tool I’ve built using that template.

Improved Datasette plugin documentation

I’ve split Datasette’s plugin documentation into five separate pages, one of which is a new page about patterns for testing plugins.

The five pages are:

  • Plugins describing how to install and configure plugins
  • Writing plugins showing how to write one-off plugins, how to use the datasette-plugin cookiecutter template and how to package templates for release to PyPI
  • Plugin hooks documenting all of the available plugin hooks
  • Testing plugins describing my preferred patterns for writing tests for them (using pytest and HTTPX)
  • Internals for plugins describing the APIs Datasette makes available for use within plugin hook implementations

There’s also a list of available plugins on the Datasette Ecosystem page of the documentation, though I plan to move those to a separate plugin directory in the future.

datasette-block-robots

The datasette-plugin template practically eliminates the friction involved in starting a new plugin.

sqlite-generate generates random names for people. I don’t particularly want people who search for their own names stumbling across the live demo and being weirded out by their name featured there, so I decided to block it from search engine crawlers using robots.txt.

I wrote a tiny plugin to do this: datasette-block-robots, which uses the new register_routes() plugin hook to add a /robots.txt page.

It’s also a neat example of the simplest possible plugin that uses that hook, along with the simplest possible unit test for exercising such a page.
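The moving parts are small enough to sketch without Datasette at all. The example below is an illustration rather than the plugin's source: it pairs a route pattern with a function that returns a block-everything robots.txt body, uses a hypothetical `dispatch()` helper as a stand-in for Datasette's router, and checks the body with the standard library's robots.txt parser. In a real plugin the (pattern, view) pair would be returned from a register_routes() hook implementation and the view would return a Datasette response object.

```python
import re
from urllib.robotparser import RobotFileParser

# A robots.txt body asking every crawler to stay away from every path.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

# Regular expression for the one path this sketch handles.
ROUTE_PATTERN = r"^/robots\.txt$"


def robots_txt_body():
    """View function: returns the text to serve at /robots.txt."""
    return ROBOTS_TXT


# Same (pattern, view) shape that a register_routes() hook returns.
routes = [(ROUTE_PATTERN, robots_txt_body)]


def dispatch(path):
    """Hypothetical stand-in router: body of the first matching route, else None."""
    for pattern, view in routes:
        if re.match(pattern, path):
            return view()
    return None


# Confirm the served body really blocks crawlers.
parser = RobotFileParser()
parser.parse(dispatch("/robots.txt").splitlines())
blocked = not parser.can_fetch("Googlebot", "/demo/some_table")
```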

datasette-saved-queries

Another new plugin, this time with a bit more substance to it. datasette-saved-queries exercises the new canned_queries() hook I described last week. It uses the new startup() hook to create tables on startup (if they are missing), then lets users insert records into those tables to save their own queries. Queries saved in this way are then returned as canned queries for that particular database.
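The flow described above (create the table if it is missing, let users insert rows, then hand the rows back as named queries) can be sketched with plain sqlite3. This is an illustration under assumptions, not the plugin's code: the `saved_queries` table layout and the three helper functions are hypothetical, and the real plugin does this work inside Datasette's startup() and canned_queries() hooks. The returned mapping of query name to a dictionary with a "sql" key follows the shape Datasette uses for canned queries.

```python
import sqlite3


def ensure_table(conn):
    """Startup step: create the storage table only if it does not already exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS saved_queries ("
        "name TEXT PRIMARY KEY, sql TEXT NOT NULL)"
    )


def save_query(conn, name, sql):
    """Store one user-supplied query under a unique name."""
    conn.execute(
        "INSERT INTO saved_queries (name, sql) VALUES (?, ?)", (name, sql)
    )
    conn.commit()


def saved_as_canned_queries(conn):
    """Return saved rows as {name: {"sql": ...}}, the canned-query shape."""
    return {
        name: {"sql": sql}
        for name, sql in conn.execute(
            "SELECT name, sql FROM saved_queries ORDER BY name"
        )
    }


conn = sqlite3.connect(":memory:")
ensure_table(conn)
ensure_table(conn)  # safe to call again: IF NOT EXISTS makes it a no-op
save_query(conn, "recent_rows", "select * from demo order by rowid desc limit 10")
canned = saved_as_canned_queries(conn)
```

Running the table creation on every startup keeps the plugin usable against a brand new database without a separate setup step.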

main, not master

main is a better name for the main GitHub branch than master, which has unpleasant connotations (it apparently derives from master/slave in BitKeeper). My datasette-plugin and click-app cookiecutter templates both include instructions for renaming master to main in their READMEs: it’s as easy as running git branch -m master main before your first push to GitHub.

I’m working towards making the switch for Datasette itself.