Simon Willison’s Weblog

31 items tagged “docker”


logpaste (via) Useful example of how to use the Litestream SQLite replication tool in a Dockerized application: S3 credentials are passed to the container on startup, it then attempts to restore the SQLite database from S3 and starts a Litestream process in the same container to periodically synchronize changes back up to the S3 bucket. # 17th March 2021, 3:48 pm


New for AWS Lambda – Container Image Support. “You can now package and deploy Lambda functions as container images of up to 10 GB in size”—can’t wait to try this out with Datasette. # 1st December 2020, 5:34 pm

Sandboxing and Workload Isolation (via) run other people’s code in containers, so workload isolation is a Big Deal for them. This blog post goes deep into the history of isolation and the various different approaches you can take, and fills me with confidence that the team at know their stuff. I got to the bottom and found it had been written by Thomas Ptacek, which didn’t surprise me in the slightest. # 30th July 2020, 10:19 pm

GitHub Actions: Manual triggers with workflow_dispatch (via) New GitHub Actions feature which fills a big gap in the offering: you can now create “workflow dispatch” events which provide a button for manually triggering an action—and you can specify extra UI form fields that can customize how that action runs. This turns Actions into an interactive automation engine for any code that can be wrapped in a Docker container. # 7th July 2020, 4:33 am

datasette-publish-fly (via) Fly is a neat new Docker hosting provider with a very tempting pricing model: Just $2.67/month for their smallest always-on instance, and they give each user $10/month in free credit. datasette-publish-fly is the first plugin I’ve written using the publish_subcommand plugin hook, which allows extra hosting providers to be added as publish targets. Install the plugin and you can run “datasette publish fly data.db” to deploy SQLite databases to your Fly account. # 19th March 2020, 3:40 am

Weeknotes: Datasette Cloud and zero downtime deployments

Yesterday’s piece on Deploying a data API using GitHub Actions and Cloud Run was originally intended to be my weeknotes, but ended up getting a bit too involved.

[... 1428 words]

How to do Zero Downtime Deployments of Docker Containers. I’m determined to get reliable zero-downtime deploys working for a new project, because I know from experience that even a few seconds of downtime during a deploy changes the project mentality from “deploy any time you want” to “don’t deploy too often”. I’m using Docker containers behind Traefik, which means new containers should have traffic automatically balanced to them by Traefik based on their labels. After much fiddling around the pattern described by this article worked best for me: it lets me start a new container, then stop the old one and have Traefik’s “retry” mechanism send any requests to the stopped container over to the new one instead. # 16th January 2020, 11:12 pm

Weeknotes: Improv at Stanford, planning Datasette Cloud

Last week was the first week of the quarter at Stanford—which is called “shopping week” here because students are expected to try different classes to see which ones they are going to stick with.

[... 806 words]


Dockerfile for creating a Datasette of NHS dentist information (via) Really neat Dockerfile example by Alf Eaton that uses multi-stage builds to pull dentist information from the NHS, compile to SQLite using csvs-to-sqlite and serve the results with Datasette. TIL the NHS like to use ¬ as their CSV separator! # 26th April 2019, 2:09 pm

Smaller Python Docker Containers with Multi-Stage Builds and Python Wheels (via) Clear tutorial on how to use Docker’s multi-stage build feature to create smaller final images by taking advantage of Python’s wheel format—so an initial stage can install a full compiler toolchain and compile C dependencies into wheels, then a later stage can install those pre-compiled wheels into a slimmer container without including the C compiler. # 26th April 2019, 2:05 pm

Ministry of Silly Runtimes: Vintage Python on Cloud Run (via) Cloud Run is an exciting new hosting service from Google that lets you define a container using a Dockerfile and then run that container in a “scale to zero” environment, so you only pay for time spent serving traffic. It’s similar to the now-deprecated Zeit Now 1.0 which inspired me to create Datasette. Here Dustin Ingram demonstrates how powerful Docker can be as the underlying abstraction by deploying a web app using a 25 year old version of Python 1.x. # 9th April 2019, 5:33 pm


repo2docker (via) Neat tool from the Jupyter project team: run “jupyter-repo2docker” and it will pull a GitHub repository, create a new Docker container for it, install Jupyter and launch a Jupyter instance for you to start trying out the library. I’ve been doing this by hand using virtual environments, but using Docker for even cleaner isolation seems like a smart improvement. # 28th November 2018, 10:06 pm

dive (via) Handy command-line tool (as with so much of the Docker ecosystem it’s written in Go, which means you can download a Darwin binary directly from the GitHub releases page and run it directly on your Mac) for visually exploring the different layers of a given Docker image. # 19th November 2018, 4:41 am

Building smaller Python Docker images

Changes are afoot at Zeit Now, my preferred hosting provider for the past year (see previous posts). They have announced Now 2.0, an intriguing new approach to providing auto-scaling immutable deployments. It’s built on top of lambdas, and comes with a whole host of new constraints: code needs to fit into a 5MB bundle for example (though it looks like this restriction will soon be relaxed a littleupdate November 19th you can now bump this up to 50MB).

[... 1872 words]

elasticsearch-dump. Neat open source utility by TaskRabbit for importing and exporting data in bulk from Elasticsearch. It can copy data from one Elasticsearch cluster directly to another or to an intermediary file, making it a swiss-army knife for migrating data around. I successfully used the “docker run” incantation to execute it without needing to worry about having the correct version of Node.js installed. # 9th April 2018, 10:10 pm

rubber-docker/linux.c. rubber-docker is a workshop that talks through building a simply Docker clone from scratch in Python. I particularly liked this detail: linux.c is a Python extension written in C that exposes a small collection of Linux syscalls that are needed for the project—clone, mount, pivot_root, setns, umount, umount2 and unshare. Just reading through this module gives a really nice overview of how some of Docker’s underlying magic actually work. # 2nd April 2018, 6:18 pm

Cloud-first: Rapid webapp deployment using containers (via) The Research Software Engineering group at ICL have written a tutorial on deploying web apps as Docker containers using Azure and they use Datasette as the example application. # 28th March 2018, 3:50 pm

Andrew Godwin’s www-router Docker container (via) Really clever Docker trick: a container that runs Nginx and uses it to route traffic to other containers based on the hostname—but the hostnames to be routed are configured using environment variables which the CMD script uses to dynamically construct an nginx config when the container starts. # 21st February 2018, 5:04 am

Building a Full-Text Search App Using Docker and Elasticsearch. Deep, comprehensive tutorial from Patrick Triest showing how to use docker-compose to run three containers (Node API, nginx static content, elasticsearch) and then use that to build a neat Vue.js web search UI against 100 books from Project Gutenberg. # 1st February 2018, 3:41 pm


How to compile and run a Pony program using Docker. My notes on using the Docker ponylang/ponyc container to compile and execute a Pony program without needing to install anything (since Docker will download and run the image the first time you run the command). # 18th December 2017, 9:47 pm

Inside Docker’s “FROM scratch” (via) I’m a big fan of understanding your abstractions. Here’s a neat tutorial that dives deep into Docker’s “scratch” image which offers the smallest possible Docker image, and hence provides a great opportunity to understand what a Docker container at its most minimal does for you. # 27th November 2017, 4:33 pm

Run the First Edition of Unix (1972) with Docker (via) This is so cool... just run “docker run --rm -it bahamat/unix-1st-ed” to drop into a simulation of a PDP-11 running genuine 1972 era Unix! If you haven’t got into Docker yet, Docker for Mac is a single click install these days and works incredibly well. # 22nd November 2017, 3:36 pm

gzthermal-web (via) I built a quick web application wrapping the gzthermal gzip visualization tool and deployed it to Zeit Now wrapped up in a Docker container. Give it a URL and it shows you a PNG visualization of how gzip encodes that page. # 21st November 2017, 6:24 pm

tuxracer-web. Brilliant Docker hack from David Cooper: just run “docker run -p 8008:80 dtcooper/tuxracer-web” to get Tux Racer (the 3D game) running in your browser, on top a cunning mix of the noVNC HTML5 VNC client and icecast for sound. # 14th November 2017, 11:28 pm

Datasette: instantly create and publish an API for your SQLite databases

I just shipped the first public version of datasette, a new tool for creating and publishing JSON APIs for SQLite databases.

[... 968 words]

Docker.qcow2 never shrinks—disk space usage leak in docker for mac (via) Interesting year-long thread on disk usage by Docker for Mac, including a bunch of potential workarounds for if it swallows too much disk space. # 5th November 2017, 3:06 pm

Docker Containers on the Desktop (via) Jessie Frazelle’s classic explanation from 2015 of how she runs every desktop application on her Linux machine in its own Docker container. # 5th November 2017, 4:16 am

Running a load testing Go utility using Docker for Mac

I’m playing around with Zeit Now at the moment (see my previous entry) and decided to hit it with some traffic using Apache Bench. I got this SSL handshake error:

[... 818 words]

arxiv-vanity (via) Beautiful new project from Ben Firshman and Andreas Jansson: “Arxiv Vanity renders academic papers from Arxiv as responsive web pages so you don’t have to squint at a PDF”. It works by pulling the raw LaTeX source code from Arxiv and rendering it to HTML using a heavily customized Pandoc workflow. The real fun is in the architecture: it’s a Django app running on Heroku which fires up on-demand Docker containers for each individual rendering job. # 25th October 2017, 8:06 pm

A Brief Intro to Docker for Djangonauts (via) This is great—a really clear introduction to both Docker and Docker Compose, aimed at Django developers. Includes line-by-line annotations of an example Dockerfile and docker-compose.yml. # 18th October 2017, 9:06 pm