Weeknotes: Publishing data using Datasette Cloud
12th October 2022
My initial preview releases of Datasette Cloud (the SaaS version of my Datasette open source project) have focused on private data collaboration.
Users can create private spaces for their data, and then invite in team members to collaborate in that space.
Each space runs in its own Firecracker container (hosted on Fly), completely isolated from other instances.
The more time I’ve spent with the preview, the more I’ve felt that something crucial is missing.
Datasette can do a lot of things, but in my opinion the thing it does better than anything else is publishing data.
A question I like to ask people in my office hours sessions is “what job have you hired Datasette to solve?”. One of the most common answers is “to publish data online”.
So Datasette Cloud really needs to be able to help people publish their data to a wider audience! Leaving that feature out means leaving a lot of the value of the open source product off the table.
I’d been contemplating a number of more elaborate strategies for this. For example, I’m looking forward to being able to use LiteFS to host globally distributed read replicas.
But the simplest thing that could possibly work would just be to allow people to toggle individual tables from private to public.
I decided to build that first.
I try to build as much of the functionality of Datasette Cloud as possible as open source plugins. So I built a new plugin: datasette-public.
The plugin hooks into Datasette’s authentication and permissions system, which is already used for the existing private spaces feature.
It adds a new SQLite table called _public_tables, and a new permission rule which grants access to any user if the table they are trying to access is listed there.
Then it adds a little bit of UI which users can use to add or remove a table from that list of public tables.
And that’s pretty much it! It’s a simple implementation but it works very nicely as a first draft of the new feature.
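To make the idea concrete, here is a minimal sketch of that rule using only Python's sqlite3 module. It is not the plugin's actual code: the column names (database_name, table_name) and the helper functions are assumptions made for illustration. Returning None for "no opinion" is modelled on how Datasette permission hooks can defer to other rules.

```python
import sqlite3


def ensure_public_tables(conn):
    # Assumed schema: one row per (database, table) pair that is public.
    conn.execute(
        """
        create table if not exists _public_tables (
            database_name text,
            table_name text,
            primary key (database_name, table_name)
        )
        """
    )


def set_public(conn, database, table, public):
    # Hypothetical helper standing in for the UI toggle.
    if public:
        conn.execute(
            "insert or ignore into _public_tables (database_name, table_name) "
            "values (?, ?)",
            (database, table),
        )
    else:
        conn.execute(
            "delete from _public_tables "
            "where database_name = ? and table_name = ?",
            (database, table),
        )


def public_table_rule(conn, database, table):
    # Return True if the table is listed as public, otherwise None
    # ("no opinion"), so other permission rules still get to decide.
    row = conn.execute(
        "select 1 from _public_tables "
        "where database_name = ? and table_name = ?",
        (database, table),
    ).fetchone()
    return True if row else None
```

Because the rule only ever grants access and never denies it, removing a row from the table simply hands the decision back to the existing private-space permissions.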
Here’s an example of a table I published using the feature:
https://simon.datasette.cloud/data/global-power-plants
Working on Datasette Cloud has been fantastic for ironing out details in Datasette itself. Here are a couple of new Datasette issues that emerged from this work:
- #1829: Table/database that is private due to inherited permissions does not show padlock
- #1831: If user can see table but NOT database/instance nav links should not display
My next area of focus is going to be around the Datasette and Datasette Cloud APIs.
Datasette Cloud gets a lot more interesting once it’s possible to use authenticated API calls to write data to it—rather than just supporting uploaded CSVs.
Working on this will drive some long-needed work around writable APIs in Datasette itself—something that until now has been entirely the realm of plugins such as datasette-insert.
Releases this week
- datasette-public: 0.2 (2 releases total), 2022-10-07
  Make specific Datasette tables visible to the public
- datasette-sentry: 0.3 (6 releases total), 2022-10-06
  Datasette plugin for configuring Sentry
- datasette-search-all: 1.1 (8 releases total), 2022-10-05
  Datasette plugin for searching all searchable tables at once