Simon Willison’s Weblog

Weeknotes: Learning Kubernetes, learning Web Components

I’ve been mainly climbing the learning curve for Kubernetes and Web Components this week. I also released Datasette 0.59.1 with Python 3.10 compatibility and an updated Docker image.

Datasette 0.59.1

A few weeks ago I wrote about finding and reporting an asyncio bug in Python 3.10 that I discovered while trying to get Datasette to work on the latest release of Python.

Łukasz Langa offered a workaround which I submitted as a PR to the Janus library that Datasette depends on.

Andrew Svetlov landed and shipped that fix, which unblocked the release of Datasette 0.59.1 with Python 3.10 support.

The last step of the Datasette release process, after the package has been released to PyPI, is to build a new Docker image and publish it to Docker Hub. Here’s the GitHub Actions workflow that does that.

It turns out this stopped working when I released Datasette 0.59! I was getting this cryptic error message halfway through the image build process:

/usr/bin/perl: error while loading shared libraries: cannot open shared object file: No such file or directory

I opened an issue for myself and started investigating.

The culprit was this section of the Datasette Dockerfile:

# software-properties-common provides add-apt-repository
# which we need in order to install a more recent release
# of libsqlite3-mod-spatialite from the sid distribution
RUN apt-get update && \
    apt-get -y --no-install-recommends install software-properties-common && \
    add-apt-repository "deb sid main" && \
    apt-get update && \
    apt-get -t sid install -y --no-install-recommends libsqlite3-mod-spatialite && \
    apt-get remove -y software-properties-common

This was a hack I introduced seven months ago in order to upgrade the bundled SpatiaLite to version 5.0.

SpatiaLite 5.0 wasn’t yet available in Debian stable back then, so I used the above convoluted hack to install it from Debian unstable (“Sid”) instead.

When the latest stable version of Debian, Debian Bullseye, came out on October 9th my hack stopped working! I guess that’s what I get for messing around with unstable software.

Thankfully, Bullseye now bundles SpatiaLite 5, so the hack I was using is no longer necessary. I upgraded the Datasette base image from python:3.9.2-slim-buster to python:3.9.7-slim-bullseye, installed SpatiaLite the non-hacky way and fixed the issue.

Doing so also dropped the size of the compressed Datasette image from 94.37MB to 78.94MB, which is nice.

Learning Kubernetes

Datasette has been designed to run in containers from the very start. I have dozens of instances running on Google Cloud Run, and I’ve done a bunch of work with Docker as well, including trying out mechanisms to programmatically launch new Datasette containers via the Docker API.

I’ve dragged my heels on really getting into Kubernetes due to the infamously tough learning curve, but I think it’s time to dig in, figure out how to use it and work out what new abilities it can provide me.

I’ve spun up a small Kubernetes cluster on DigitalOcean, mainly because I trust their UI to help me not spend hundreds of dollars by mistake. Getting the initial cluster running was very pleasant.

Now I’m figuring out how to do things with it.

DigitalOcean’s Operations-ready DigitalOcean Kubernetes (DOKS) for Developers course (which started as a webinar) starts OK and then gets quite complicated quite fast.

I got Paul Bouwer’s hello-kubernetes demo app working. It introduced me to Helm, but that operates at a higher level than I’m comfortable with: learning my way around kubectl and Kubernetes YAML is enough of a mental load already without adding an extra abstraction on top.

I’m reading Kubernetes: Up and Running which is promising so far.

My current goal is to figure out how to run a Datasette instance in a Kubernetes container with an attached persistent volume, so it can handle SQLite writes as well as reads. It looks like StatefulSets will be key to getting that to work. (Update: apparently not! Graham Dumpleton and Frank Wiles assure me that I can do this with just a regular Deployment.)

I’ll be sure to write this up as a TIL once I get it working.

Learning Web Components

Datasette’s visualization plugins—in particular datasette-vega—are long overdue for some upgrades.

I’ve been trying to find a good pattern for writing plugins that avoids too much (ideally any) build tool complexity, and that takes advantage of modern JavaScript—in particular JavaScript modules, which Datasette has supported since Datasette 0.54.

As such, I’m deeply intrigued by Web Components—which had a big moment this week when it was revealed that Adobe had used them extensively for Photoshop on the web.

One of my goals for Datasette visualization plugins is for them to be usable on other external pages—since Datasette can expose JSON data over CORS, being able to drop a visualization into an HTML page would be really neat (especially for newsroom purposes).

Imagine being able to import a JavaScript module and add something like this to get a map of all of the power plants in Portugal:
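
Here is a minimal sketch of how such a component might be defined in plain JavaScript, with no build step. The `power-plant-map` tag name, its `data` attribute, the `rowsToPoints` helper and the `latitude`, `longitude` and `name` column names are all hypothetical, chosen purely for illustration. It assumes the URL returns a JSON array of row objects (the shape Datasette produces with `?_shape=array`), and it renders a plain list rather than a real map.

```javascript
// Hypothetical sketch: the tag name, attribute name and column names
// are assumptions for illustration, not an existing Datasette plugin API.

// Pure helper: keep only rows that have numeric coordinates and
// reduce them to the fields a map would need.
function rowsToPoints(rows) {
  return rows
    .filter(
      (row) => Number.isFinite(row.latitude) && Number.isFinite(row.longitude)
    )
    .map((row) => ({
      lat: row.latitude,
      lon: row.longitude,
      label: String(row.name ?? ""),
    }));
}

// Only register the element when running in a browser.
if (typeof HTMLElement !== "undefined" && typeof customElements !== "undefined") {
  class PowerPlantMap extends HTMLElement {
    // Called by the browser when the element is added to the page.
    async connectedCallback() {
      const response = await fetch(this.getAttribute("data"));
      const points = rowsToPoints(await response.json());

      // A real component would draw a map here; this sketch lists the points.
      const list = document.createElement("ul");
      for (const point of points) {
        const item = document.createElement("li");
        item.textContent = `${point.label}: ${point.lat}, ${point.lon}`;
        list.appendChild(item);
      }
      this.replaceChildren(list);
    }
  }
  customElements.define("power-plant-map", PowerPlantMap);
}
```

With that file loaded through a `<script type="module">` tag, a page could include a `<power-plant-map data="...">` element whose `data` attribute points at any CORS-enabled JSON URL.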


I’m hoping to be able to build components using regular, unadorned modern JavaScript, without the complexity of a build step.

As such, I’ve been exploring Skypack (TIL) and Snowpack which help bridge the gap between build-tooling-dependent npm packages and the modern world of native browser ES modules.

I was also impressed this week by Tonic, a framework for building components without a build step that weighs in at just 350 lines of code and makes extremely clever use of tagged template literals and async generators.
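
To get a feel for those two language features, here is a small sketch that combines them. It is not Tonic’s code: `html`, `escapeHtml`, `renderGreeting` and `collect` are names invented for this example. It only shows the general idea of a tag function that escapes interpolated values and an async generator that yields a placeholder before the final markup.

```javascript
// Characters that must be escaped before a value is placed into HTML.
const ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

// Tag function: the literal parts of the template arrive in `strings`,
// the interpolated values in `values`. Only the values are escaped.
function html(strings, ...values) {
  return strings.reduce(
    (out, part, i) =>
      out + part + (i < values.length ? escapeHtml(values[i]) : ""),
    ""
  );
}

// Async generator: yield a placeholder immediately, then the final
// markup once the (possibly slow) data has loaded.
async function* renderGreeting(loadName) {
  yield html`<p>Loading...</p>`;
  const name = await loadName();
  yield html`<p>Hello, ${name}!</p>`;
}

// Helper that gathers every state the generator yields, in order.
async function collect(generator) {
  const states = [];
  for await (const state of generator) {
    states.push(state);
  }
  return states;
}
```

A component framework could swap each yielded string into the DOM as it arrives, so the placeholder shows while the data is still loading.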

This morning I saw this clever example of a Single File Web Component by Kristofer Joseph—I ended up creating my own annotated version of his code which I shared in this TIL.

Next step: I need to write some web components of my own!
