449 posts tagged “datasette”
Datasette is an open source tool for exploring and publishing data.
2023
Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle.
We’ve built many tools for publishing to the web - but I want to make the claim that we have underdeveloped the tools and platforms for publishing collections, indexes and small databases. It’s too hard to build these kinds of experiences, too hard to maintain them and a lack of collaborative tools.
Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model
Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).
[... 1,323 words]datasette-granian (via) Granian is a new Python web server—similar to Gunicorn—written in Rust. I built a small plugin that adds a “datasette granian” command starting a Granian server that serves Datasette’s ASGI application, using the same pattern as my existing datasette-gunicorn plugin.
Datasette is my data hammer (via) Jeremia Kimelman—a data journalist at CalMatters in Sacramento—enthuses about how he uses Datasette as his default hammer for all kinds of data projects—in particular how much he appreciates Datasette’s focus on URLs. So nice to see this!
Weeknotes: AI hacking and a SpatiaLite tutorial
Short weeknotes this time because the key things I worked on have already been covered here:
How to implement Q&A against your documentation with GPT3, embeddings and Datasette
If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have it use that as part of its input for answering questions.
[... 3,447 words]Datasette 0.64, with a warning about SpatiaLite
I release Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.
[... 675 words]2022
Weeknotes: Datasette 0.63.3, datasette-ripgrep
We’re back in the UK to see family over Christmas (our first trip back since 2019). Here are a few notes from the past couple of weeks.
[... 801 words]Datasette 1.0a2: Upserts and finely grained permissions
I’ve released the third alpha of Datasette 1.0. The 1.0a2 release introduces upsert support to the new JSON API and makes some major improvements to the Datasette permissions system.
[... 2,844 words]Over-engineering Secret Santa with Python cryptography and Datasette
We’re doing a family Secret Santa this year, and we needed a way to randomly assign people to each other without anyone knowing who was assigned to who.
[... 2,044 words]Weeknotes: datasette-ephemeral-tables, datasette-export
Most of what I’ve been working on for the past week and a half is already documented:
[... 603 words]Datasette’s new JSON write API: The first alpha of Datasette 1.0
This week I published the first alpha release of Datasette 1.0, with a significant new feature: Datasette core now includes a JSON API for creating and dropping tables and inserting, updating and deleting data.
[... 2,817 words]Weeknotes: Implementing a write API, Mastodon distractions
Everything is so distracting at the moment. The ongoing Twitter catastrophe, the great migration (at least amongst most of the people I pay attention to) to Mastodon, the FTX calamity. It’s been very hard to focus!
[... 916 words]Tracking Mastodon user numbers over time with a bucket of tricks
Mastodon is definitely having a moment. User growth is skyrocketing as more and more people migrate over from Twitter.
[... 1,534 words]Datasette Lite: Loading JSON data (via) I added a new feature to Datasette Lite: you can now pass it the URL to a JSON file (hosted on a CORS-compatible hosting provider such as GitHub or GitHub Gists) and it will load that file into a database table for you. It expects an array of objects, but if your file has an object as the root it will search through it looking for the first key that is an array of objects and load those instead.
Datasette is 5 today: a call for birthday presents
Five years ago today I published the first release of Datasette, in Datasette: instantly create and publish an API for your SQLite databases.
[... 548 words]Designing a write API for Datasette
Building out Datasette Cloud has made one thing clear to me: Datasette needs a write API for ingesting new data into its attached SQLite databases.
[... 1,493 words]Datasette 0.63: The annotated release notes
I released Datasette 0.63 today. These are the annotated release notes.
[... 1,531 words]Weeknotes: DjangoCon, SQLite in Django, datasette-gunicorn
I spent most of this week at DjangoCon in San Diego—my first outside-of-the-Bay-Area conference since the before-times.
[... 1,184 words]Measuring traffic during the Half Moon Bay Pumpkin Festival
This weekend was the 50th annual Half Moon Bay Pumpkin Festival.
[... 2,693 words]Automating screenshots for the Datasette documentation using shot-scraper
I released shot-scraper back in March as a tool for keeping screenshots in documentation up-to-date.
[... 1,810 words]Weeknotes: Publishing data using Datasette Cloud
My initial preview releases of Datasette Cloud (the SaaS version of my Datasette open source project) have focused on private data collaboration.
[... 582 words]Figure out how to serve an AWS Lambda function with a Function URL from a custom subdomain (via) This took me five hours and 77 issue comments to figure out, but I finally managed to serve an AWS Lambda function running Datasette on a custom subdomain with an HTTPS certificate. I was going to write this up as a TIL but I’m exhausted so I decided to share my private notes thread instead.
Weeknotes: Datasette Cloud preview invitations
This week I finally started sending out invitations for people to try out the preview of the new Datasette Cloud, my SaaS offering for Datasette.
[... 713 words]Exploring 10m scraped Shutterstock videos used to train Meta’s Make-A-Video text-to-video model
Make-A-Video is a new “state-of-the-art AI system that generates videos from text” from Meta AI. It looks incredible—it really is DALL-E / Stable Diffusion for video. And it appears to have been trained on 10m video preview clips scraped from Shutterstock.
[... 923 words]Deploying Python web apps as AWS Lambda functions. After literally years of failed half-hearted attempts, I finally managed to deploy an ASGI Python web application (Datasette) to an AWS Lambda function! Here are my extensive notes.
Weeknotes: Datasette Lite, s3-credentials, shot-scraper, datasette-edit-templates and more
Despite distractions from AI I managed to make progress on a bunch of different projects this week, including new releases of s3-credentials and shot-scraper, a new datasette-edit-templates plugin and a small but neat improvement to Datasette Lite.
[... 1,562 words]Spevktator: OSINT analysis tool for VK. This is a really cool project that came out of a recent Bellingcat hackathon. Spevktator takes 67,000 posts from five popular Russian news channels on VK (a popular Russian social media platform) and makes them available in Datasette, along with automated translations to English, post sharing metrics and sentiment analysis scores. This README includes some detailed analysis of the data, plus a link to an Observable notebook that implements custom visualizations against queries run directly against the Datasette instance.
Exploring the training data behind Stable Diffusion
Two weeks ago, the Stable Diffusion image generation model was released to the public. I wrote about this last week, in Stable Diffusion is a really big deal—a post which has since become one of the top ten results for “stable diffusion” on Google and shown up in all sorts of different places online.
[... 2,897 words]













