Simon Willison’s Weblog

Subscribe

December 2025

175 posts: 12 entries, 43 links, 19 quotes, 6 notes, 95 beats

Dec. 22, 2025

Release preview-server 0.2a2 — Proxy server for previewing web app changes via uv
Release preview-server 0.2a3 — Proxy server for previewing web app changes via uv

I just had my first success using a browser agent - in this case the Claude in Chrome extension - to solve an actual problem.

A while ago I set things up so anything served from the https://static.simonwillison.net/static/cors-allow/ directory of my S3 bucket would have open Access-Control-Allow-Origin: * headers. This is useful for hosting files online that can be loaded into web applications hosted on other domains.

Problem is I couldn't remember how I did it! I initially thought it was an S3 setting, but it turns out S3 lets you set CORS at the bucket-level but not for individual prefixes.

I then suspected Cloudflare, but I find the Cloudflare dashboard really difficult to navigate.

So I decided to give Claude in Chrome a go. I installed and enabled the extension (you then have to click the little puzzle icon and click "pin" next to Claude for the icon to appear, I had to ask Claude itself for help figuring that out), signed into Cloudflare, opened the Claude panel and prompted:

I'm trying to figure out how come all pages under http://static.simonwillison.net/static/cors/ have an open CORS policy, I think I set that up through Cloudflare but I can't figure out where

Off it went. It took 1m45s to find exactly what I needed.

Screenshot of Chrome - on the left is the Cloudflare dashboard. On the right a Claude chat panel which shows the final result: Summary  Found it! Your open CORS policy for the /static/cors/ directory on static.simonwillison.net is configured through Cloudflare Transform Rules, specifically a Response Header Transform Rule named static.simonwillis.net/static/cors-allow/*.  Here's exactly what's set up:  Rule Configuration:      Type: HTTP Response Header Transform Rule     Applies to: Any URL matching https://static.simonwillison.net/static/cors-allow/* (wildcard match)     Action: Modifies the response header access-control-allow-origin and sets it to *  This is why all pages under that path have an open CORS policy — the * value for Access-Control-Allow-Origin means any origin can access those resources.  Where to find/modify it in the dashboard:      Go to your domain settings (simonwillison.net)     Navigate to Rules > Rules Overview (in the left sidebar)     Scroll down to Response Header Transform Rules (should show 2 active)     Click on the rule named static.simonwillis.net/static/cors-allow/* to view or edit it

Claude's conclusion:

Found it! Your open CORS policy for the /static/cors/ directory on static.simonwillison.net is configured through Cloudflare Transform Rules, specifically a Response Header Transform Rule named static.simonwillis.net/static/cors-allow/*

There's no "share transcript" option but I used copy and paste and two gnarly Claude Code sessions (one, two) to turn it into an HTML transcript which you can take a look at here.

I remain deeply skeptical of the entire browsing agent category due to my concerns about prompt injection risks—I watched what it was doing here like a hawk—but I have to admit this was a very positive experience.

# 4:10 pm / anthropic, claude, browser-agents, cors, ai, llms, generative-ai, chrome, cloudflare, prompt-injection, ai-agents

Dec. 23, 2025

Cooking with Claude

Visit Cooking with Claude

I’ve been having an absurd amount of fun recently using LLMs for cooking. I started out using them for basic recipes, but as I’ve grown more confident in their culinary abilities I’ve leaned into them for more advanced tasks. Today I tried something new: having Claude vibe-code up a custom application to help with the timing for a complicated meal preparation. It worked really well!

[... 1,313 words]

Tool llm-lib demo — This is an interactive demonstration of a unified JavaScript library that provides a consistent interface for working with multiple large language model providers. The tool allows users to seamlessly switch between OpenAI, Anthropic Claude, and Google Gemini while managing API keys and configuring system prompts. Both streaming and non-streaming response modes are supported, with real-time status updates and syntax-highlighted code examples showing how to integrate the library into projects.
Release llm-gemini 0.28.2 — LLM plugin to access Google's Gemini family of models

MicroQuickJS. New project from programming legend Fabrice Bellard, of ffmpeg and QEMU and QuickJS and so much more fame:

MicroQuickJS (aka. MQuickJS) is a Javascript engine targetted at embedded systems. It compiles and runs Javascript programs with as low as 10 kB of RAM. The whole engine requires about 100 kB of ROM (ARM Thumb-2 code) including the C library. The speed is comparable to QuickJS.

It supports a subset of full JavaScript, though it looks like a rich and full-featured subset to me.

One of my ongoing interests is sandboxing: mechanisms for executing untrusted code - from end users or generated by LLMs - in an environment that restricts memory usage and applies a strict time limit and restricts file or network access. Could MicroQuickJS be useful in that context?

I fired up Claude Code for web (on my iPhone) and kicked off an asynchronous research project to see explore that question:

My full prompt is here. It started like this:

Clone https://github.com/bellard/mquickjs to /tmp

Investigate this code as the basis for a safe sandboxing environment for running untrusted code such that it cannot exhaust memory or CPU or access files or the network

First try building python bindings for this using FFI - write a script that builds these by checking out the code to /tmp and building against that, to avoid copying the C code in this repo permanently. Write and execute tests with pytest to exercise it as a sandbox

Then build a "real" Python extension not using FFI and experiment with that

Then try compiling the C to WebAssembly and exercising it via both node.js and Deno, with a similar suite of tests [...]

I later added to the interactive session:

Does it have a regex engine that might allow a resource exhaustion attack from an expensive regex?

(The answer was no - the regex engine calls the interrupt handler even during pathological expression backtracking, meaning that any configured time limit should still hold.)

Here's the full transcript and the final report.

Some key observations:

  • MicroQuickJS is very well suited to the sandbox problem. It has robust near and time limits baked in, it doesn't expose any dangerous primitive like filesystem of network access and even has a regular expression engine that protects against exhaustion attacks (provided you configure a time limit).
  • Claude span up and tested a Python library that calls a MicroQuickJS shared library (involving a little bit of extra C), a compiled a Python binding and a library that uses the original MicroQuickJS CLI tool. All of those approaches work well.
  • Compiling to WebAssembly was a little harder. It got a version working in Node.js and Deno and Pyodide, but the Python libraries wasmer and wasmtime proved harder, apparently because "mquickjs uses setjmp/longjmp for error handling". It managed to get to a working wasmtime version with a gross hack.

I'm really excited about this. MicroQuickJS is tiny, full featured, looks robust and comes from excellent pedigree. I think this makes for a very solid new entrant in the quest for a robust sandbox.

Update: I had Claude Code build tools.simonwillison.net/microquickjs, an interactive web playground for trying out the WebAssembly build of MicroQuickJS, adapted from my previous QuickJS plaground. My QuickJS page loads 2.28 MB (675 KB transferred). The MicroQuickJS one loads 303 KB (120 KB transferred).

Here are the prompts I used for that.

# 8:53 pm / c, javascript, nodejs, python, sandboxing, ai, webassembly, deno, pyodide, generative-ai, llms, claude-code, fabrice-bellard

Tool MicroQuickJS Code Executor — Execute JavaScript code in a lightweight MicroQuickJS sandbox environment running via WebAssembly, with results displayed directly on the page. The sandbox supports ES5-like JavaScript features and automatically saves your code in the URL for easy sharing and recovery. Choose between optimized and original WebAssembly versions, try built-in examples, and use Ctrl+Enter to quickly run your code.

If this [MicroQuickJS] had been available in 2010, Redis scripting would have been JavaScript and not Lua. Lua was chosen based on the implementation requirements, not on the language ones... (small, fast, ANSI-C). I appreciate certain ideas in Lua, and people love it, but I was never able to like Lua, because it departs from a more Algol-like syntax and semantics without good reasons, for my taste. This creates friction for newcomers. I love friction when it opens new useful ideas and abstractions that are worth it, if you learn SmallTalk or FORTH and for some time you are lost, it's part of how the languages are different. But I think for Lua this is not true enough: it feels like it departs from what people know without good reasons.

Salvatore Sanfilippo, Hacker News comment on MicroQuickJS

# 11:03 pm / salvatore-sanfilippo, lua, redis, javascript

Dec. 24, 2025

Research DeepSeek-OCR on NVIDIA GB10 (ARM64 + CUDA 13.0) — Successfully deployed DeepSeek-OCR on an NVIDIA GB10 (ARM64, sm_121) by upgrading to PyTorch 2.9.0+cu130 so CUDA 13.0 wheels could be used instead of building from source. The repo includes automated scripts (setup.sh, run_ocr.py) that load the 6.3GB safetensors model (~34s) and run GPU inference (~58s for a 3503×1668 image), producing annotated images, markdown/text outputs and bounding boxes with validated multi-column accuracy.
Research SQLite-utils Iterator Support Research — Enhancements to the sqlite-utils library now allow its `insert_all` and `upsert_all` methods to efficiently process Python iterators yielding lists, in addition to the original dict-based input. Detection of the iterator type is automatic and maintains full backward compatibility, streamlining bulk inserts from row-based data sources like CSV streams and reducing memory usage by avoiding dict construction.
Research Claude Code for Web Environment — Running Claude Code on the web offers developers a versatile coding sandbox on Ubuntu 24.04, leveraging a broad toolkit that includes Python 3.11, Node.js 22, Go, Rust, and more, alongside developer utilities (Git, Make) and database clients (SQLite, PostgreSQL).
Research Pyodide Simple Demo — A compact demo shows how to run Python scripts inside a WebAssembly sandbox from Node.js using Pyodide: after npm install, launching node server-simple.js executes example-simple.py and writes generated files to the output/ directory. The project demonstrates a minimal server-side integration pattern for Pyodide (https://pyodide.org/) under Node.js (https://nodejs.org/) and is aimed at quick experimentation with sandboxed Python execution.
Research Datasette Plugin Writer Skill — Covering every aspect of Datasette plugin development, this project creates a comprehensive skill set for authors—from bootstrapping with cookiecutter to deploying on GitHub and PyPI. It provides precise guides and working code samples for essential plugin hooks like custom SQL functions, authentication, custom views, and output formats.
Research Absurd-in-SQLite — Durable execution workflows can be implemented using SQLite, as demonstrated by the Absurd-in-SQLite project, which is inspired by Armin Ronacher's Absurd. This project provides a proof-of-concept implementation of durable execution using SQLite, allowing for reliable and long-running workflows that can survive crashes and network failures.
Research Blog Tag Prediction with Scikit-Learn — Automatically assigning meaningful tags to historic, untagged blog posts, this project leverages the Simon Willison blog database and scikit-learn to train and compare multi-label text classification models. Four approaches—TF-IDF + Logistic Regression, Multinomial Naive Bayes, Random Forest, and LinearSVC—were tested on posts’ title and body text using the 158 most frequently used tags.
Research Can BeautifulSoup Use JustHTML as a Parser? — BeautifulSoup 4 can be integrated with JustHTML, a pure Python HTML5 parser, enabling full compliance with the HTML5 parsing algorithm according to the WHATWG specification. By implementing a custom `JustHTMLTreeBuilder`, BeautifulSoup’s parser plugin system can leverage JustHTML for parsing, allowing seamless use of BeautifulSoup’s familiar API and features—like `find_all()` and CSS selectors—while inheriting robust, standards-adherent HTML handling.
Research minijinja vs jinja2 Performance Benchmark — Benchmarking the Python bindings for minijinja (https://github.com/mitsuhiko/minijinja) against Jinja2 (https://palletsprojects.com/p/jinja/) on Python 3.14 and 3.14t measured template render performance using a realistic e-commerce template with inheritance, loops, and ~65KB HTML output. The suite runs 200 iterations per scenario, captures mean/median/std/min/max, and provides reproducible scripts (run_benchmark.sh, benchmark.py) plus matplotlib charts to visualize results.
Research AST-Grep Import Rewriter — Leveraging ast-grep and custom YAML rules, the AST-Grep Import Rewriter offers a structured approach to automatically extract, analyze, and rewrite obfuscated JavaScript import statements across ES6, CommonJS, dynamic imports, and webpack bundles. By parsing source files, it generates mapping templates and applies user-defined mappings, converting unreadable module paths into meaningful names with either regex- or AST-based transformations.
Research SQLite Ripgrep Function — SQLite Ripgrep Function enables fast code and text search inside SQLite queries by integrating the powerful ripgrep search tool as a custom SQL function. It offers both a pure Python implementation and a performant C extension, allowing users to search files within a configurable directory, restrict output with glob patterns (e.g., `*.py`), and enforce time limits to avoid runaway queries.
Research Epsilon Python Wrapper — Epsilon Python Wrapper provides seamless Python bindings to Epsilon, Google's pure Go WebAssembly 2.0 runtime, enabling efficient and dependency-free WASM execution within Python projects. The wrapper exposes a simple API for module instantiation, function calls (with type safety), memory operations, and export inspection, supporting advanced features like SIMD and resource limiting.
Research Apptron Analysis Report — Apptron is a browser-based cloud IDE that hosts a full x86 Linux environment using emulation and WebAssembly, delivering a seamless developer experience directly in the browser. By tightly integrating VS Code, a Linux terminal, and persistent cloud storage via Cloudflare R2, users are able to work on customizable environments without any local setup.
Research Datasette 1.0a20 SQL Permissions System: Architecture Review — A comprehensive architecture review of Datasette's new SQL-based permissions system (introduced in v1.0a20) finds that transitioning from a callback-driven model to SQL query resolution greatly improves scalability for large deployments. The redesigned system efficiently checks access by evaluating compiled permission rules through internal catalog tables, substantially reducing processing overhead compared to the multiplicative N x M callback pattern.
Research GitHub CLI API Proxy Investigation — Proxying GitHub CLI (`gh`) API traffic can be achieved through standard HTTP/HTTPS proxies or via a Unix domain socket, each suited to different use cases and levels of flexibility. The CLI, implemented in Go, natively supports proxy environment variables (`HTTPS_PROXY`, `HTTP_PROXY`, `NO_PROXY`), making integration with existing HTTP proxies seamless and requiring no changes to the CLI configuration.
Research Automatic JavaScript API Tagging for simonw/tools — Efficiently categorizing the 155 HTML tools in simonw/tools by their JavaScript API usage, this project developed an automated pipeline combining Cheerio for HTML parsing and Acorn for JavaScript AST analysis. The solution robustly filters out false positives from comments, strings, and non-code regions, accurately tagging over 60 Web APIs and handling modern ES modules and edge script types.
Research SVG to PNG Conversion Methods in Python — Multiple Python-based approaches for converting SVG files to PNG were benchmarked using the tiger.svg image, evaluating file size, output quality, and ease of installation. Pure Python solutions like CairoSVG and svglib+reportlab offered simple pip-based installs with predictable PNGs, though svglib lacks alpha channel support. Wand (ImageMagick bindings) and ImageMagick CLI yielded the highest quality output (16-bit RGBA) at the cost of larger files and system-level dependencies.
Research OpenAI Codex CLI Sandbox Implementation Analysis — OpenAI Codex CLI's sandbox employs strong, platform-specific isolation to securely constrain the behavior of AI-driven code agents. On macOS, it uses Apple's Seatbelt sandbox with finely tuned dynamic policies, while on Linux, it combines Landlock for strict filesystem controls and seccomp for syscall-based network blocking—ensuring that agents can only write to user-approved directories and have no outgoing network by default.
Research h3o-python — h3o-python delivers efficient Python bindings for the h3o Rust library, enabling fast and convenient access to H3 geospatial indexing from Python. Utilizing PyO3 and packaged with maturin, it allows encoding geographic coordinates into 64-bit H3 cell indexes, decoding indexes, performing neighborhood queries, calculating great-circle distances, and retrieving surface area metrics—all without requiring a separate H3 installation.
Research Wazero Python Bindings — Wazero Python Bindings enable seamless integration of the wazero WebAssembly runtime—written in Go—with Python applications, delivering a zero-dependency solution for running WASM modules natively from Python. The project exposes a clean, Pythonic API for instantiating modules, calling exported WASM functions, and managing resources efficiently with context managers. Performance benchmarks demonstrate rapid execution and minimal overhead between Python and WASM.