Simon Willison’s Weblog

Subscribe

Items tagged datasette in Jan

Filters: Month: Jan × datasette × Sorted by date


Weeknotes: datasette-test, datasette-build, PSF board retreat

I wrote about Page caching and custom templates in my last weeknotes. This week I wrapped up that work, modifying datasette-edit-templates to be compatible with the jinja2_environment_from_request() plugin hook. This means you can edit templates directly in Datasette itself and have those served either for the full instance or just for the instance when served from a specific domain (the Datasette Cloud case).

[... 757 words]

Weeknotes: Page caching and custom templates for Datasette Cloud

My main development focus this week has been adding public page caching to Datasette Cloud, and exploring what custom template support might look like for that service.

[... 924 words]

datasette-scraper, Big Local News and other weeknotes

In addition to exploring the new MusicCaps training and evaluation data I’ve been working on the big Datasette JSON refactor, and getting excited about a Datasette project that I didn’t work on at all.

[... 1744 words]

datasette-scraper walkthrough on YouTube (via) datasette-scraper is Colin Dellow’s new plugin that turns Datasette into a powerful web scraping tool, with a web UI based on plugin-driven customizations to the Datasette interface. It’s really impressive, and this ten minute demo shows quite how much it is capable of: it can crawl sitemaps and fetch pages, caching them (using zstandard with optional custom dictionaries for extra compression) to speed up subsequent crawls... and you can add your own plugins to extract structured data from crawled pages and save it to a separate SQLite table! # 29th January 2023, 5:23 am

Examples of sites built using Datasette (via) I gave the examples page on the Datasette website a significant upgrade today: it now includes screenshots (taken using shot-scraper) of six projects chosen to illustrate the variety of problems Datasette can be used to tackle. # 29th January 2023, 3:40 am

We’ve built many tools for publishing to the web—but I want to make the claim that we have underdeveloped the tools and platforms for publishing collections, indexes and small databases. It’s too hard to build these kinds of experiences, too hard to maintain them and a lack of collaborative tools.

Tom Critchlow # 28th January 2023, 4:43 pm

Exploring MusicCaps, the evaluation data released to accompany Google’s MusicLM text-to-music model

Google Research just released MusicLM: Generating Music From Text. It’s a new generative AI model that takes a descriptive prompt and produces a “high-fidelity” music track. Here’s the paper (and a more readable version using arXiv Vanity).

[... 1323 words]

datasette-granian (via) Granian is a new Python web server—similar to Gunicorn—written in Rust. I built a small plugin that adds a “datasette granian” command starting a Granian server that serves Datasette’s ASGI application, using the same pattern as my existing datasette-gunicorn plugin. # 20th January 2023, 2:12 am

Datasette is my data hammer (via) Jeremia Kimelman—a data journalist at CalMatters in Sacramento—enthuses about how he uses Datasette as his default hammer for all kinds of data projects—in particular how much he appreciates Datasette’s focus on URLs. So nice to see this! # 17th January 2023, 5:23 pm

Weeknotes: AI hacking and a SpatiaLite tutorial

Short weeknotes this time because the key things I worked on have already been covered here:

[... 477 words]

How to implement Q&A against your documentation with GPT3, embeddings and Datasette

If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have it use that as part of its input for answering questions.

[... 3491 words]

Datasette 0.64, with a warning about SpatiaLite

I release Datasette 0.64 this morning. This release is mainly a response to the realization that it’s not safe to run Datasette with the SpatiaLite extension loaded if that Datasette instance is configured to enable arbitrary SQL queries from untrusted users.

[... 675 words]

Weeknotes: python_requires, documentation SEO

Fixed Datasette on Python 3.6 for the last time. Worked on documentation infrastructure improvements. Spent some time with Fly Volumes.

[... 1497 words]

Weeknotes: s3-credentials prefix and Datasette 0.60

A new release of s3-credentials with support for restricting access to keys that start with a prefix, Datasette 0.60 and a write-up of my process for shipping a feature.

[... 1134 words]

Datasette 0.60: The annotated release notes

I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.

[... 1119 words]

Weeknotes: datasette-leaflet, datasette-plugin cookiecutter upgrades

This week I shipped Datasette 0.54, sent out the latest Datasette Newsletter and then mostly worked on follow-up projects.

[... 552 words]

Datasette 0.54: The annotated release notes

Datasette 0.54 is out today, with a number of foundational new features. As I’ve done for previous releases, here are the annotated release notes with some additional commentary.

[... 1276 words]

Drawing shapes on a map to query a SpatiaLite database (and other weeknotes)

This week I built a Datasette plugin that lets you query a database by drawing shapes on a map!

[... 950 words]

Weeknotes: Still pretty distracted

Not a lot to report this week. U.S. politics continues to make it extremely difficult to focus on anything substantial.

[... 727 words]

Weeknotes: datasette-export-notebook, PyInstaller packaged Datasette, CBSAs

What a terrible week. I’ve found it hard to concentrate on anything substantial. In a mostly futile attempt to distract myself from doomscrolling I’ve mainly been building some experimental output plugins, fiddling with PyInstaller and messing around with shapefiles.

[... 732 words]

APIs from CSS without JavaScript: the datasette-css-properties plugin

I built a new Datasette plugin called datasette-css-properties. It’s very, very weird—it adds a .css output extension to Datasette which outputs the result of a SQL query using CSS custom property format. This means you can display the results of database queries using pure CSS and HTML, no JavaScript required!

[... 891 words]

datasette-css-properties (via) My new Datasette plugin defines a “.css” output format which returns the data from the query as a valid CSS stylesheet defining custom properties for each returned column. This means you can build a page using just HTML and CSS that consumes API data from Datasette, no JavaScript required! Whether this is a good idea or not is left as an exercise for the reader. # 7th January 2021, 7:42 pm

Weeknotes: A flurry of not-quite-finished features

My Christmas present to myself this year was to allow myself to spend a week working on stuff I found interesting, rather than sticking to the most important things. This may have been a mistake: it’s left me with a flurry of interesting but not-quite-finished features.

[... 2249 words]

Weeknotes: datasette-auth-existing-cookies and datasette-sentry

Work on Datasette Cloud continues—I’m tantalizingly close to having a MVP I can start to invite people to try out.

[... 701 words]

Weeknotes: Datasette Cloud and zero downtime deployments

Yesterday’s piece on Tracking FARA by deploying a data API using GitHub Actions and Cloud Run was originally intended to be my weeknotes, but ended up getting a bit too involved.

[... 1431 words]

Tracking FARA by deploying a data API using GitHub Actions and Cloud Run

I’m using the combination of GitHub Actions and Google Cloud Run to retrieve data from the U.S. Department of Justice FARA website and deploy it as a queryable API using Datasette.

[... 1599 words]

Building a sitemap.xml with a one-off Datasette plugin

One of the fun things about launching a new website is re-learning what it takes to promote a website from scratch on the modern web. I’ve been thoroughly enjoying using Niche Museums as an excuse to explore 2020-era SEO.

[... 1078 words]

SQLite in 2018: A state of the art SQL dialect (via) In 2018 SQLite gained boolean literals, window functions, filter clauses, upserts and the ability to rename a column. If you want to try it out the latest official datasetteproject/datasette Docker image now bundles SQLite 3.26. # 15th January 2019, 4:21 pm

Usable Data (via) A Paul Ford essay from February 2016 in which he advocates for SQLite as the ideal format for sharing interesting data. I don’t know how I missed this one—it predates Datasette, but it perfectly captures the benefits that I’m trying to expose with the project. “In my dream universe, there would be a massive searchable torrent site filled with open, explorable data sets, in SQLite format, some with full text search indexes already in place.” # 11th January 2019, 6:33 pm

Exploring search relevance algorithms with SQLite

SQLite isn’t just a fast, high quality embedded database: it also incorporates a powerful full-text search engine in the form of the FTS4 and FTS5 extensions. You’ve probably used these a bunch of times already: many iOS, Android and desktop applications use SQLite under-the-hood and use it to implement their built-in search.

[... 1390 words]