Datasette 0.64, with a warning about SpatiaLite
9th January 2023
I release Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.
Here are the release notes quoted in full:
- Datasette now strongly recommends against allowing arbitrary SQL queries if you are using SpatiaLite. SpatiaLite includes SQL functions that could cause the Datasette server to crash. See SpatiaLite for more details.
- New default_allow_sql setting, providing an easier way to disable all arbitrary SQL execution by end users:
datasette --setting default_allow_sql off
. See also Controlling the ability to execute arbitrary SQL. (#1409)- Building a location to time zone API with SpatiaLite is a new Datasette tutorial showing how to safely use SpatiaLite to create a location to time zone API.
- New documentation about how to debug problems loading SQLite extensions. The error message shown when an extension cannot be loaded has also been improved. (#1979)
- Fixed an accessibility issue: the
<select>
elements in the table filter form now show an outline when they are currently focused. (#1771)
The problem with SpatiaLite
Datasette allows arbitrary SQL execution as a core feature. It takes a bunch of steps to provide this safely: database connections are opened in read-only mode, it imposes a strict time limit on SQL queries and Datasette is designed to be run in containers for a further layer of protection.
SQLite itself is an excellent platform for this feature: it has a set of default functionality that supports this well, protected by a legendarily thorough test suite.
SpatiaLite is a long running third-party extension for SQLite that adds a bewildering array of additional functionality to SQLite—much of it around GIS, but with a whole host of extras as well. It includes debugging routines, XML parsers and even it’s own implementation of stored procedures!
Unfortunately, not all of this functionality is safe to expose to untrusted queries—even for databases that have been opened in read-only mode.
After identifying functions which could crash the Datasette instance, I decided that Datasette should make a strong recommendation not to expose SpatiaLite in an unprotected manner.
In addition to the new documentation, I also added a feature I’ve been planning for a while: a simple setting for disabling arbitrary SQL queries entirely:
datasette --setting default_allow_sql off
Prior to 0.64 you could achieve the same thing by adding the following line to your metadata.json file:
{
"allow_sql": false
}
Or in metadata.yml
:
allow_sql: false
The new setting achieves the same thing, but is more obvious and can be easily applied even for Datasette instances that don’t use metadata.
A new SpatiaLite tutorial
The documentation now recommends running SpatiaLite instances with pre-approved SQL implemented using Datasette’s canned queries feature.
To help clarify how this works, I decided to publish a new entry in the official series of Datasette tutorials:
Building a location to time zone API with SpatiaLite
This is an updated version of a tutorial I first wrote back in 2017.
The new tutorial now includes material on Chris Amico’s datasette-geojson-map plugin, SpatiaLite point-in-polygon queries, polygon intersection queries, spatial indexes and how to use the simplify()
function to reduce huge polygons down to a size that is more practical to display on a map.
I’m really happy with this new tutorial. Not only does it show a safe way to run SpatiaLite, but it also illustrates a powerful pattern for using Datasette to create and deploy custom APIs.
The resulting API can be accessed here:
https://timezones.datasette.io/timezones
It’s hosted on Fly, using their $1.94/month instance size with 256MB of RAM—easily powerful enough to host this class of application.
I also updated the datasette-publish-fly plugin to make it easier to deploy instances with SQL execution disabled, see the 1.3 release notes.
More recent articles
- Gemini 2.0 Flash: An outstanding multi-modal LLM with a sci-fi streaming mode - 11th December 2024
- ChatGPT Canvas can make API requests now, but it's complicated - 10th December 2024
- I can now run a GPT-4 class model on my laptop - 9th December 2024