Simon Willison’s Weblog


Release datasette-llm 0.1a6 — LLM integration plugin for other plugins to depend on
  • The same model ID no longer needs to be repeated in both the default model and allowed models lists - setting it as a default model automatically adds it to the allowed models list. #6
  • Improved documentation for Python API usage.
Release datasette-enrichments-llm 0.2a1 — Enrich data by prompting LLMs
  • The actor who triggers an enrichment is now passed to the llm.model(... actor=actor) method. #3
Release datasette-extract 0.3a0 — Import unstructured data (text and images) into structured tables
Release datasette-enrichments-llm 0.2a0 — Enrich data by prompting LLMs
  • This plugin now uses datasette-llm to configure and manage models. This means it's possible to specify which models should be made available for enrichments, using the new enrichments purpose.
Release datasette-llm-usage 0.2a0 — Track usage of LLM tokens in a SQLite table
  • Removed features relating to allowances and estimated pricing. These are now the domain of datasette-llm-accountant.
  • Now depends on datasette-llm for model configuration. #3
  • Full prompts and responses and tool calls can now be logged to the llm_usage_prompt_log table in the internal database if you set the new datasette-llm-usage.log_prompts plugin configuration setting.
  • Redesigned the /-/llm-usage-simple-prompt page, which now requires the llm-usage-simple-prompt permission.
Release datasette-llm 0.1a5 — LLM integration plugin for other plugins to depend on
  • The llm_prompt_context() plugin hook wrapper mechanism now tracks prompts executed within a chain as well as one-off prompts, which means it can be used to track tool call loops. #5
Release datasette-llm 0.1a4 — LLM integration plugin for other plugins to depend on

I released llm-echo 0.3 to provide an API key testing utility I needed for the tests for this new feature.

Release llm-all-models-async 0.1 — Register async versions of models from LLM plugins that only provide a sync version

LLM plugins can define new models in both sync and async varieties. The async variants are most common for API-backed models - sync variants tend to be things that run the model directly within the plugin.

My llm-mrchatterbox plugin is sync only. I wanted to try it out with various Datasette LLM features (specifically datasette-enrichments-llm) but Datasette can only use async models.

So... I had Claude spin up this plugin that turns sync models into async models using a thread pool. This ended up needing an extra plugin hook mechanism in LLM itself, which I shipped just now in LLM 0.30.
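The thread pool idea can be illustrated in a few lines. This is only a minimal sketch of the general technique, not the plugin's actual code: `SyncEchoModel` and `AsyncWrapper` are hypothetical names invented for this example and are not classes from LLM or llm-all-models-async.

```python
import asyncio
import time


class SyncEchoModel:
    """Hypothetical stand-in for a sync-only model (not a real LLM class)."""

    def prompt(self, text: str) -> str:
        time.sleep(0.05)  # simulate blocking work such as local inference
        return text.upper()


class AsyncWrapper:
    """Hypothetical wrapper exposing a blocking prompt() as a coroutine."""

    def __init__(self, sync_model):
        self.sync_model = sync_model

    async def prompt(self, text: str) -> str:
        # asyncio.to_thread runs the blocking call in the default thread
        # pool, so the event loop stays free to serve other tasks.
        return await asyncio.to_thread(self.sync_model.prompt, text)


async def main() -> list[str]:
    model = AsyncWrapper(SyncEchoModel())
    # Both prompts run concurrently in worker threads; gather keeps order.
    return await asyncio.gather(model.prompt("hello"), model.prompt("world"))


results = asyncio.run(main())
print(results)  # ['HELLO', 'WORLD']
```

A thread pool suits this case because the sync call blocks, while the async caller only needs the event loop to stay responsive during that time.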

Release llm 0.30 — Access large language models from the command-line
  • The register_models() plugin hook now takes an optional model_aliases parameter listing all of the models, async models and aliases that have been registered so far by other plugins. A plugin with @hookimpl(trylast=True) can use this to take previously registered models into account. #1389
  • Added docstrings to public classes and methods and included those directly in the documentation.
Release llm-echo 0.4 — Debug plugin for LLM providing an echo model
  • Prompts now have the input_tokens and output_tokens fields populated on the response.
Release llm-echo 0.3 — Debug plugin for LLM providing an echo model
Release datasette-files 0.1a3 — Upload files to Datasette

I'm working on integrating datasette-files into other plugins, such as datasette-extract. This necessitated a new release of the base plugin.

  • The owners_can_edit and owners_can_delete configuration options, plus the files-edit and files-delete actions, are now scoped to a new FileResource, which is a child of FileSourceResource. #18
  • The file picker UI is now available as a <datasette-file-picker> Web Component. Thanks, Alex Garcia. #19
  • New from datasette_files import get_file Python API for other plugins that need to access file data. #20
Release datasette-llm 0.1a3 — LLM integration plugin for other plugins to depend on

Adds the ability to configure which LLMs are available for which purpose, which means you can restrict the list of models that can be used with a specific plugin. #3

Release llm-mrchatterbox 0.1.1 — Chat with Mr Chatterbox, trained on a corpus of over 28,000 Victorian-era British texts published between 1837 and 1899
Release llm-mrchatterbox 0.1 — Chat with Mr Chatterbox, trained on a corpus of over 28,000 Victorian-era British texts published between 1837 and 1899
Tool Python Vulnerability Lookup — Search Python packages for known security vulnerabilities by pasting a `pyproject.toml` or `requirements.txt` file, or by loading dependencies directly from a GitHub repository. The tool queries the OSV.dev vulnerability database and displays detailed information about any identified vulnerabilities, including severity levels, affected version ranges, and links to full disclosure reports.

I learned that the OSV.dev open source vulnerability database has an open CORS JSON API, so I had Claude Code build this HTML tool for pasting in a pyproject.toml or requirements.txt file (or name of a GitHub repo containing those) and seeing a list of all reported vulnerabilities from that API.
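For a rough idea of what such a lookup involves, here is a small Python sketch (the tool itself is HTML and JavaScript, so this is not its code). It assumes the OSV.dev `POST /v1/query` endpoint with a `package` and `version` payload. The helper names are hypothetical, and the parser only handles simple `name==version` pins.

```python
import json
import re
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"

# Matches simple pins such as "requests==2.19.0"; extras, markers and
# version ranges are deliberately ignored in this sketch.
PINNED = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def parse_requirements(text: str) -> list[tuple[str, str]]:
    """Return (name, version) pairs for exactly-pinned requirements."""
    pins = []
    for line in text.splitlines():
        match = PINNED.match(line)
        if match:
            pins.append((match.group(1), match.group(2)))
    return pins


def build_query(name: str, version: str) -> dict:
    """Build the JSON body for one OSV query against the PyPI ecosystem."""
    return {"package": {"name": name, "ecosystem": "PyPI"}, "version": version}


def query_osv(name: str, version: str) -> list[dict]:
    """Send the query over the network and return any reported vulnerabilities."""
    body = json.dumps(build_query(name, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("vulns", [])


sample = "# deps\nrequests==2.19.0\nflask>=2.0\njinja2 == 2.10  # old\n"
pins = parse_requirements(sample)
print(pins)
```

Calling `query_osv(*pins[0])` would perform the actual lookup; it is left out of the example so that the sketch runs without network access.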

Release datasette-showboat 0.1a2 — Datasette plugin for SHOWBOAT_REMOTE_URL

I added an option to export a Markdown file from my app that lets Showboat incrementally publish updates to a remote server.

Release datasette-llm 0.1a2 — LLM integration plugin for other plugins to depend on
  • actor is now available to the llm_prompt_context plugin hook. #2
Release datasette-files-s3 0.1a2 — datasette-files S3 backend
Release datasette-files-s3 0.1a1 — datasette-files S3 backend

A backend for datasette-files that adds the ability to store and retrieve files using an S3 bucket. This release added a mechanism for fetching S3 configuration periodically from a URL, which means we can use time limited IAM credentials that are restricted to a prefix within a bucket.

Release datasette-llm 0.1a1 — LLM integration plugin for other plugins to depend on

New release of the base plugin that makes models from LLM available for use by other Datasette plugins such as datasette-enrichments-llm.

One of the responsibilities of this plugin is to configure which models are used for which purposes, so you can say in one place "data enrichment uses GPT-5.4-nano but SQL query assistance happens using Sonnet 4.6", for example.

Plugins that depend on this can use model = await llm.model(purpose="enrichment") to indicate the purpose of the prompts they wish to execute against the model. Those plugins can now also use the new register_llm_purposes() hook to register those purpose strings, which means future plugins can list those purposes in one place to power things like an admin UI for assigning models to purposes.

Release datasette-files 0.1a2 — Upload files to Datasette

The most interesting alpha of datasette-files yet, a new plugin which adds the ability to upload files directly into a Datasette instance. Here are the release notes in full:

  • Columns are now configured using the new column_types system from Datasette 1.0a26. #8
  • New file_actions plugin hook, plus ability to import an uploaded CSV/TSV file to a table. #10
  • UI for uploading multiple files at once via the new documented JSON upload API. #11
  • Thumbnails are now generated for image files and stored in an internal datasette_files_thumbnails table. #13
Research Starlette 1.0 skill — Starlette 1.0 Skill offers a concise guide for building robust web applications with Starlette, a lightweight ASGI framework. The accompanying demo showcases a task management app featuring projects, tasks, comments, and labels, illustrating Starlette's flexibility in handling routing, templating (Jinja2), async database operations (aiosqlite), and real-time updates.
Research PCGamer Article Performance Audit — A performance audit of the March 2026 PCGamer article on RSS readers reveals severe page bloat, with over 82% of network traffic and transferred bytes traced to ad-tech, tracking, and programmatic advertising scripts. Despite the core content consisting of just 10-15 KB of text and a handful of images (~150 KB total), the page triggers over 431 network requests and 5.5 MB of transfer (18.8 MB decoded) within 60 seconds—ballooning to 200+ MB in Firefox due to autoplay video carousels and…

Stuart Breckenridge pointed out that PC Gamer Recommends RSS Readers in a 37MB Article That Just Keeps Downloading, highlighting a truly horrifying example of web bloat that added up to hundreds of additional megabytes thanks to auto-playing video ads. I decided to have Claude Code for web use Rodney to investigate the page - prompt here.

Research JavaScript Sandboxing Research — Analyzing current JavaScript sandboxing options for running untrusted code, this research compares core approaches in Node.js (including worker_threads, node:vm, and the Permission Model), prominent npm packages (isolated-vm, vm2), and alternative engines like quickjs-emscripten.

Aaron Harper wrote about Node.js worker threads, which inspired me to run a research task to see if they might help with running JavaScript in a sandbox. Claude Code went way beyond my initial question and produced a comparison of isolated-vm, vm2, quickjs-emscripten, QuickJS-NG, ShadowRealm, and Deno Workers.

Tool DNS Lookup — DNS Lookup Documentation

TIL that Cloudflare's 1.1.1.1 DNS service (and 1.1.1.2 and 1.1.1.3, which block malware and malware + adult content respectively) has a CORS-enabled JSON API, so I had Claude Code build me a UI for running DNS queries against all three of those resolvers.
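As a rough illustration of what a query against that JSON API looks like, here is a Python sketch (the tool itself runs in the browser, so this is not its code). The three hostnames and the `application/dns-json` Accept header are my assumptions about Cloudflare's DNS-over-HTTPS endpoints, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed DNS-over-HTTPS endpoints for the three resolvers.
RESOLVERS = {
    "1.1.1.1": "https://cloudflare-dns.com/dns-query",
    "1.1.1.2": "https://security.cloudflare-dns.com/dns-query",  # blocks malware
    "1.1.1.3": "https://family.cloudflare-dns.com/dns-query",  # malware + adult content
}


def build_url(resolver: str, name: str, record_type: str = "A") -> str:
    """Build the query URL for one resolver, hostname and record type."""
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return f"{RESOLVERS[resolver]}?{query}"


def lookup(resolver: str, name: str, record_type: str = "A") -> list[dict]:
    """Run the query over the network and return the Answer records, if any."""
    request = urllib.request.Request(
        build_url(resolver, name, record_type),
        headers={"Accept": "application/dns-json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response).get("Answer", [])


url = build_url("1.1.1.1", "example.com", "AAAA")
print(url)  # https://cloudflare-dns.com/dns-query?name=example.com&type=AAAA
```

Comparing `lookup()` results across the three keys of `RESOLVERS` for the same hostname is the kind of side-by-side view the tool provides.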

Tool Merge State Visualizer — CRDT Merge State Visualizer

Bram Cohen wrote about his coherent vision for the future of version control using CRDTs, illustrated by 470 lines of Python.

I fed that Python (minus comments) into Claude and asked for an explanation, then had it use Pyodide to build me an interactive UI for seeing how the algorithms work.

Tool TURBO.COM — 39,731 Bytes Deconstructed — Explore an interactive breakdown of Turbo Pascal 3.02A's 39,731-byte executable, mapping each function from the header and display system through the parser and symbol table. Navigate the binary segments with an overhead visualization and detailed hex dumps with annotated disassembly, revealing how a complete compiler, editor, and runtime fit within the constraints of 1986 DOS memory while routing all file I/O through a single INT 21h dispatcher gateway.
Research SQLite Tags Benchmark: Comparing 5 Tagging Strategies — Benchmarking five tagging strategies in SQLite reveals clear trade-offs between query speed, storage, and implementation complexity for workflows involving tags (100,000 rows, 100 tags, average 6.5 tags/row). Indexed approaches—materialized lookup tables on JSON and classic many-to-many tables—easily outperform others, handling single-tag queries in under 1.5 milliseconds, while raw JSON and LIKE-based solutions are much slower.

I had Claude Code run a micro-benchmark comparing different approaches to implementing tagging in SQLite. Traditional many-to-many tables won, but FTS5 came a close second. Full table scans with LIKE queries performed better than I expected, but full table scans with JSON arrays and json_each() were much slower.
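To make the compared query shapes concrete, here is a tiny sketch of three of them against an in-memory database: a many-to-many table, a LIKE scan and a json_each() scan. The schema and data are invented for illustration and are not the benchmark's own; FTS5 is omitted because not every SQLite build includes it, and the sketch assumes SQLite's JSON functions are available.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, tags_json TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE post_tags (
        post_id INTEGER REFERENCES posts(id),
        tag_id INTEGER REFERENCES tags(id),
        PRIMARY KEY (post_id, tag_id)
    );
    CREATE INDEX post_tags_by_tag ON post_tags (tag_id, post_id);
    """
)

rows = [
    (1, "First", ["python", "sqlite"]),
    (2, "Second", ["sqlite"]),
    (3, "Third", ["datasette"]),
]
for post_id, title, tags in rows:
    # Store the same tags twice: as a JSON array and as many-to-many rows.
    conn.execute("INSERT INTO posts VALUES (?, ?, ?)", (post_id, title, json.dumps(tags)))
    for tag in tags:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (tag,))
        conn.execute(
            "INSERT INTO post_tags SELECT ?, id FROM tags WHERE name = ?",
            (post_id, tag),
        )


def ids(sql: str, params: tuple) -> list[int]:
    """Run a query and return the sorted first column."""
    return sorted(row[0] for row in conn.execute(sql, params))


# 1. Classic many-to-many join, served by the index on (tag_id, post_id).
many_to_many = ids(
    "SELECT post_id FROM post_tags JOIN tags ON tags.id = post_tags.tag_id "
    "WHERE tags.name = ?",
    ("sqlite",),
)
# 2. Full table scan matching the quoted tag inside the JSON text.
like_scan = ids("SELECT id FROM posts WHERE tags_json LIKE ?", ('%"sqlite"%',))
# 3. Full table scan expanding each JSON array with json_each().
json_scan = ids(
    "SELECT posts.id FROM posts, json_each(posts.tags_json) WHERE json_each.value = ?",
    ("sqlite",),
)
print(many_to_many, like_scan, json_scan)  # [1, 2] [1, 2] [1, 2]
```

All three return the same rows here. The benchmark's point is how differently they scale, which a three-row table cannot show.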