Simon Willison’s Weblog


Datasette Desktop—a macOS desktop application for Datasette

8th September 2021

I just released version 0.1.0 of the new Datasette macOS desktop application, the first version that end-users can easily install. I would very much appreciate your help testing it out!

Datasette Desktop screenshot

Datasette is “an open source multi-tool for exploring and publishing data”. It’s a Python web application that lets you explore data held in SQLite databases, plus a growing ecosystem of plugins for visualizing and manipulating those databases.

Datasette is aimed at data journalists, museum curators, archivists, local governments, scientists, researchers and anyone else who has data that they wish to explore and share with the world.

There’s just one big catch: since it’s a Python web application, those users have needed to figure out how to install and run Python software in order to use it. For people who don’t live and breathe Python and the command line this turns out to be a substantial barrier to entry!

Datasette Desktop is my latest attempt at addressing this problem. I’ve packaged up Datasette, SQLite and a full copy of Python so that users can download and unzip a file, drag the application into their /Applications folder and start using Datasette, without needing to know that there’s a Python web server running under the hood (or even understand what a Python web server is).

Please try it out, and send me feedback and suggestions on GitHub.

What the app does

This initial release has a small but useful set of features:

  • Open an existing SQLite database file and offer all of Datasette’s functionality, including the ability to explore tables and to execute arbitrary SQL queries.
  • Open a CSV file and offer the Datasette table interface (example here). By default this uses an in-memory database that gets cleared when the app shuts down, or you can...
  • Import CSV files into tables in on-disk SQLite databases (including creating a new blank database first).
  • By default the application runs a local web server which only accepts connections from your machine... but you can change that in the “File -> Access Control” menu to allow connections from anyone on your network. This includes Tailscale networks too, allowing you to run the application on your home computer and then access it securely from other devices such as your mobile phone anywhere in the world.
  • You can install plugins! This is the most exciting aspect of this initial release: it’s already in a state where users can customize it and developers can extend it, either with Datasette’s existing plugins (69 and counting) or by writing new ones.
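To make the CSV options above concrete, here is a minimal sketch of loading a CSV file into a SQLite table using only the Python standard library. It is an illustration of the general idea, not the code the app uses: the `import_csv` helper, the `pets` table and the sample rows are all made up for this example.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, csv_file):
    """Load rows from a CSV file object into a new table; return the row count."""
    reader = csv.reader(csv_file)
    headers = next(reader)
    # Quote identifiers so column names containing spaces still work.
    columns = ", ".join('"{}"'.format(h.replace('"', '""')) for h in headers)
    placeholders = ", ".join("?" for _ in headers)
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    conn.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
    rows = list(reader)
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders), rows
    )
    conn.commit()
    return len(rows)


# ":memory:" gives a database that disappears when the connection closes.
# Passing a file path instead would keep the imported table on disk.
conn = sqlite3.connect(":memory:")
sample = io.StringIO("name,species\nCleo,dog\nPancakes,cat\n")
count = import_csv(conn, "pets", sample)
```

Swapping `":memory:"` for a path is the whole difference between the throwaway and the on-disk variants in this sketch.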

How the app works

There are three components to the app:

  • A macOS wrapper application
  • Datasette itself
  • The datasette-app-support plugin

The first is the macOS application itself. This is currently written with Electron, and bundles a full copy of Python 3.9 (based on python-build-standalone by Gregory Szorc). Bundling Python is essential: the principal goal of the app is to allow people to use Datasette who aren’t ready to figure out how to install their own Python environment. Having an isolated and self-contained Python is also a great way of avoiding making XKCD 1987 even worse.

The macOS application doesn’t actually include Datasette itself. Instead, on first launch it creates a new Python virtual environment (currently in ~/.datasette-app/venv, feedback on that location welcome) and installs the other two components: Datasette and the datasette-app-support plugin.
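The first-launch steps can be sketched in Python with the standard library `venv` module. This is an assumption-laden illustration rather than the app's real code (the real shell is an Electron application): the helper names `venv_python`, `ensure_venv` and `pip_install_command` are hypothetical, and only the `~/.datasette-app/venv` location and the two package names come from the description above.

```python
import os
import venv
from pathlib import Path

PACKAGES = ["datasette", "datasette-app-support"]


def venv_python(venv_dir):
    """Path to the interpreter inside a virtual environment."""
    venv_dir = Path(venv_dir)
    if os.name == "nt":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def ensure_venv(venv_dir):
    """Create the virtual environment if it is missing; reuse it otherwise."""
    python = venv_python(venv_dir)
    if not python.exists():
        venv.EnvBuilder(with_pip=True).create(str(venv_dir))
    return python


def pip_install_command(venv_dir, packages):
    """Build the pip command that installs packages into that environment."""
    return [str(venv_python(venv_dir)), "-m", "pip", "install", *packages]


command = pip_install_command(Path.home() / ".datasette-app" / "venv", PACKAGES)
# To actually perform the install, call ensure_venv(...) first and then
# subprocess.run(command, check=True).
```

Running pip through the environment's own interpreter (`python -m pip`) means the packages land inside that environment rather than in any other Python on the machine.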

Having a dedicated virtual environment is what enables the “Install Plugin” menu option. When a plugin is installed the macOS application runs pip install name-of-plugin and then restarts the Datasette server process, causing it to load that new plugin.
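The install-then-restart cycle could look roughly like this in Python. It is a sketch under stated assumptions, not the application's implementation: `ServerProcess` and `install_plugin` are hypothetical names, and the command passed to `ServerProcess` stands in for whatever actually launches the Datasette server.

```python
import subprocess


class ServerProcess:
    """Owns one child process and can replace it with a fresh one."""

    def __init__(self, command):
        self.command = command
        self.process = None

    def start(self):
        self.process = subprocess.Popen(self.command)

    def stop(self):
        # Ask politely first, then force the process to exit if it hangs.
        if self.process is not None and self.process.poll() is None:
            self.process.terminate()
            try:
                self.process.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self.process.kill()
                self.process.wait()
        self.process = None

    def restart(self):
        self.stop()
        self.start()


def install_plugin(python, plugin, server):
    """Install a plugin with pip, then restart the server so it is loaded."""
    subprocess.run([python, "-m", "pip", "install", plugin], check=True)
    server.restart()
```

The restart matters because a Python process only discovers installed plugins when it starts up, so a freshly installed package is invisible to the server that was already running.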

The datasette-app-support plugin is designed exclusively to work with this application. It adds API endpoints that the Electron shell can use to trigger specific actions, such as “import from this CSV file” or “attach this SQLite database”—these are generally triggered by macOS application menu items.

It also adds a custom authentication mechanism. The user of the app should have special permissions: only they should be able to import a CSV file from anywhere on their computer into Datasette. But for the “network share” feature I want other users to be able to access the web application.
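One way to picture that split between the app's user and everyone else is a per-launch secret token. The sketch below is purely illustrative and assumes a token scheme that the post does not describe: `APP_TOKEN`, the action names and both functions are hypothetical, and the plugin's real mechanism may work quite differently.

```python
import hmac
import secrets

# Assumption: a random token generated once per launch and shared only
# with the desktop shell, never with visitors arriving over the network.
APP_TOKEN = secrets.token_hex(32)

# Hypothetical names for actions reserved for the person running the app.
OWNER_ONLY_ACTIONS = {"import-csv", "open-database", "install-plugin"}


def is_app_user(presented_token):
    """True if the request carries the token the shell was given."""
    if presented_token is None:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(presented_token.encode(), APP_TOKEN.encode())


def is_allowed(action, presented_token=None):
    """Owner-only actions need the token; everything else is open to all."""
    if action in OWNER_ONLY_ACTIONS:
        return is_app_user(presented_token)
    return True
```

With this shape, someone on the local network can browse tables, while a request to import a file from disk is refused unless it came from the shell itself.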

An interesting consequence of installing Datasette on first-run rather than bundling it with the application is that the user will be able to upgrade to future Datasette releases without needing to re-install the application itself.

How I built it

I’ve been building this application completely in public over the past two weeks, writing up my notes and research in GitHub issues as I went (here’s the initial release milestone).

I had to figure out a lot of stuff!

First, Electron. Since almost all of the user-facing interface is provided by the existing Datasette web application, Electron was a natural fit: I needed help powering native menus and bundling everything up as an installable application, which Electron handles extremely well.

I also have ambitions to get a Windows version working in the future, which should share almost all of the same code.

Electron also has fantastic initial developer onboarding. I’d love to achieve a similar level of quality for Datasette some day.

The single biggest challenge was figuring out how to bundle a working copy of the Datasette Python application to run inside the Electron application.

My initial plan (touched on last week) was to compile Datasette and its dependencies into a single executable using PyInstaller or PyOxidizer or py2app.

These tools strip down a Python application to the minimal required set of dependencies and then use various tricks to compress that all into a single binary. They are really clever. For many projects I imagine this would be the right way to go.

I had one big problem though: I wanted to support plugin installation. Datasette plugins can have their own dependencies, and could potentially use any of the code from the Python standard library. This means that a stripped-down Python isn’t actually right for this project: I need a full installation, standard library and all.

Telling the user they had to install Python themselves was an absolute non-starter: the entire point of this project is to make Datasette available to users who are unwilling or unable to jump through those hoops.

Gregory Szorc built PyOxidizer, and as part of that he built python-build-standalone:

This project produces self-contained, highly-portable Python distributions. These Python distributions contain a fully-usable, full-featured Python installation as well as their build artifacts (object files, libraries, etc).

Sounds like exactly what I needed! I opened a research issue, built a proof-of-concept and decided to commit to that as the approach I was going to use. Here’s a TIL that describes how I’m doing this: Bundling Python inside an Electron app

(I find GitHub issue threads to be the ideal way of exploring these kinds of areas. Many of my repositories have a research label specifically to track them.)

The last key step was figuring out how to sign the application, so I could distribute it to other macOS users without them facing this dreaded dialog: “can’t be opened because Apple cannot check it for malicious software”.

It turns out there are two steps to this these days: signing the code with a developer certificate, and then “notarizing” it, which involves uploading the bundle to Apple’s servers, having them scan it for malicious code and attaching the resulting approval to the bundle.

I expected this to be a nightmare to figure out. It ended up not too bad: I spent two days on it, but most of the work was done by electron-builder. One of the biggest advantages of working within the Electron ecosystem is that a lot of people have put a lot of effort into these final steps.

I was adamant that my eventual signing and notarization solution should be automated using GitHub Actions: nothing defangs a frustrating build process more than good automation! This made things a bit harder because all of the tutorials and documentation assumed you were working with a GUI, but I got there in the end. I wrote this all up as a TIL: Signing and notarizing an Electron app for distribution using GitHub Actions (see also Attaching a generated file to a GitHub release using Actions).

What’s next

I announced the release last night on Twitter and I’ve already started getting feedback. This has resulted in a growing number of issues under the usability label.

My expectation is that most improvements made for the benefit of Datasette Desktop will benefit the regular Datasette web application too.

There’s also a strategic component to this. I’m investing a lot of development work in Datasette, and I want that work to have the biggest impact possible. Datasette Desktop is an important new distribution channel, which also means that any time I add a new feature to Datasette or build a new plugin the desktop application should see the same benefit as the hosted web application.

If I’m unlucky I’ll find this slows me down: every feature I build will need to include consideration as to how it affects the desktop application.

My intuition currently is that this trade-off will be worthwhile: I don’t think ensuring desktop compatibility will be a significant burden, and the added value from getting new features almost for free through a whole separate distribution channel should hopefully be huge!
