<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: Research</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/atom/beats/research/" rel="self"/><id>http://simonwillison.net/</id><updated>2026-05-04T17:52:00+00:00</updated><author><name>Simon Willison</name></author><entry><title>TRE Python binding — ReDoS robustness demo</title><link href="https://github.com/simonw/research/tree/main/tre-python-binding#readme" rel="alternate"/><published>2026-05-04T17:52:00+00:00</published><updated>2026-05-04T17:52:00+00:00</updated><id>https://github.com/simonw/research/tree/main/tre-python-binding#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/tre-python-binding#readme"&gt;TRE Python binding — ReDoS robustness demo&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Demonstrating robust regex performance, this project offers a minimal Python ctypes binding to the TRE regex library, highlighting TRE’s immunity to regular expression denial-of-service (ReDoS) attacks that cripple Python's built-in `re` module. 
Key benchmarks show that TRE processes even notorious "evil" patterns on gigantic inputs (10 million characters) much faster than `re` on tiny ones, and scales linearly with input size instead of exponentially.&lt;/p&gt;</summary><category term="security"/><category term="python"/><category term="regular-expressions"/><category term="c"/><category term="ctypes"/></entry><entry><title>Claude system prompts as a git timeline</title><link href="https://github.com/simonw/research/tree/main/extract-system-prompts#readme" rel="alternate"/><published>2026-04-18T12:17:00+00:00</published><updated>2026-04-18T12:17:00+00:00</updated><id>https://github.com/simonw/research/tree/main/extract-system-prompts#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/extract-system-prompts#readme"&gt;Claude system prompts as a git timeline&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Anthropic's published system prompt history for Claude is transformed into a git-based exploration tool, breaking up the monolithic markdown source into granular files and timestamped commits. 
By structuring extracted prompts per model, family, and revision, researchers can leverage `git log`, `diff`, and `blame` to trace prompt evolution, compare differences, and attribute changes to specific dates—all without manual parsing.&lt;/p&gt;</summary><category term="system-prompts"/><category term="anthropic"/><category term="claude"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Exploring the new `servo` crate</title><link href="https://github.com/simonw/research/tree/main/servo-crate-exploration#readme" rel="alternate"/><published>2026-04-13T15:04:00+00:00</published><updated>2026-04-13T15:04:00+00:00</updated><id>https://github.com/simonw/research/tree/main/servo-crate-exploration#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/servo-crate-exploration#readme"&gt;Exploring the new `servo` crate&lt;/a&gt;&lt;/p&gt;&lt;p&gt;After the April 2026 release of the `servo` v0.1.0 crate (blog post), a concise investigation shows that Servo is now an embeddable browser engine for Rust, with a clear API centered on the `ServoBuilder`, `WebView`, and pixel readback methods. 
A headless CLI (`servo-shot`) successfully renders URLs or HTML files to PNG, building against stable Rust with a robust software-based rendering pipeline.&lt;/p&gt;</summary><category term="research"/><category term="browsers"/><category term="rust"/><category term="webassembly"/><category term="claude-code"/><category term="servo"/></entry><entry><title>QuickJS Python Sandbox — Investigation Report</title><link href="https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme" rel="alternate"/><published>2026-04-12T23:15:00+00:00</published><updated>2026-04-12T23:15:00+00:00</updated><id>https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme"&gt;QuickJS Python Sandbox — Investigation Report&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Exploring the `quickjs` Python package, this project implements an asyncio-compatible JavaScript sandbox with robust resource controls and seamless exposure of both synchronous and asynchronous Python functions (including async httpx fetches) to JavaScript code.&lt;/p&gt;</summary></entry><entry><title>SQLite WAL Mode Across Docker Containers Sharing a Volume</title><link href="https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme" rel="alternate"/><published>2026-04-07T15:41:00+00:00</published><updated>2026-04-07T15:41:00+00:00</updated><id>https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme"&gt;SQLite WAL Mode Across Docker Containers Sharing a Volume&lt;/a&gt;&lt;/p&gt;&lt;p&gt;SQLite’s WAL mode reliably supports concurrent access when two Docker containers share a volume on the same host, due to shared kernel and filesystem semantics. 
The experiment, using Docker Desktop for macOS and a named volume, demonstrated real-time propagation of database changes and effective memory-mapped file sharing by monitoring `.db-shm`.&lt;/p&gt;</summary><category term="docker"/><category term="sqlite"/></entry><entry><title>Can JavaScript Escape a CSP Meta Tag Inside an Iframe?</title><link href="https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme" rel="alternate"/><published>2026-04-03T16:05:00+00:00</published><updated>2026-04-03T16:05:00+00:00</updated><id>https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme"&gt;Can JavaScript Escape a CSP Meta Tag Inside an Iframe?&lt;/a&gt;&lt;/p&gt;&lt;p&gt;JavaScript running inside a `sandbox="allow-scripts"` iframe cannot escape or disable a `&amp;lt;meta http-equiv="Content-Security-Policy"&amp;gt;` tag, even through removal, modification, or document replacement. Extensive testing across Chromium and Firefox confirmed that CSP policies defined via meta tags are enforced at parse time, and persist even when the iframe is navigated to a data: URI.&lt;/p&gt;</summary><category term="iframes"/><category term="security"/><category term="javascript"/><category term="content-security-policy"/><category term="sandboxing"/></entry><entry><title>Starlette 1.0 skill</title><link href="https://github.com/simonw/research/tree/main/starlette-1-skill#readme" rel="alternate"/><published>2026-03-23T00:05:00+00:00</published><updated>2026-03-23T00:05:00+00:00</updated><id>https://github.com/simonw/research/tree/main/starlette-1-skill#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/starlette-1-skill#readme"&gt;Starlette 1.0 skill&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Starlette 1.0 Skill offers a concise guide for building robust web applications with Starlette, a lightweight ASGI framework. 
The accompanying demo showcases a task management app featuring projects, tasks, comments, and labels, illustrating Starlette's flexibility in handling routing, templating (Jinja2), async database operations (aiosqlite), and real-time updates.&lt;/p&gt;</summary><category term="starlette"/></entry><entry><title>PCGamer Article Performance Audit</title><link href="https://github.com/simonw/research/tree/main/pcgamer-audit#readme" rel="alternate"/><published>2026-03-22T22:49:00+00:00</published><updated>2026-03-22T22:49:00+00:00</updated><id>https://github.com/simonw/research/tree/main/pcgamer-audit#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/pcgamer-audit#readme"&gt;PCGamer Article Performance Audit&lt;/a&gt;&lt;/p&gt;&lt;p&gt;A performance audit of the March 2026 PCGamer article on RSS readers reveals severe page bloat, with over 82% of network traffic and transferred bytes traced to ad-tech, tracking, and programmatic advertising scripts. 
Despite the core content consisting of just 10-15 KB of text and a handful of images (~150 KB total), the page triggers over 431 network requests and 5.5 MB of transfer (18.8 MB decoded) within 60 seconds—ballooning to 200+ MB in Firefox due to autoplay video carousels and…&lt;/p&gt;</summary><category term="web-performance"/><category term="rodney"/></entry><entry><title>JavaScript Sandboxing Research</title><link href="https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme" rel="alternate"/><published>2026-03-22T19:53:00+00:00</published><updated>2026-03-22T19:53:00+00:00</updated><id>https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme"&gt;JavaScript Sandboxing Research&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Analyzing current JavaScript sandboxing options for running untrusted code, this research compares core approaches in Node.js (including worker_threads, node:vm, and the Permission Model), prominent npm packages (isolated-vm, vm2), and alternative engines like quickjs-emscripten.&lt;/p&gt;</summary><category term="sandboxing"/><category term="javascript"/><category term="nodejs"/><category term="claude-code"/></entry><entry><title>SQLite Tags Benchmark: Comparing 5 Tagging Strategies</title><link href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme" rel="alternate"/><published>2026-03-20T02:57:00+00:00</published><updated>2026-03-20T02:57:00+00:00</updated><id>https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme"&gt;SQLite Tags Benchmark: Comparing 5 Tagging Strategies&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Benchmarking five tagging strategies in SQLite reveals clear trade-offs between query speed, storage, and implementation 
complexity for workflows involving tags (100,000 rows, 100 tags, average 6.5 tags/row). Indexed approaches—materialized lookup tables on JSON and classic many-to-many tables—easily outperform others, handling single-tag queries in under 1.5 milliseconds, while raw JSON and LIKE-based solutions are much slower.&lt;/p&gt;</summary><category term="json"/><category term="sqlite"/></entry><entry><title>PDF to Image Converter</title><link href="https://github.com/simonw/research/tree/main/pdf-to-image-converter#readme" rel="alternate"/><published>2026-03-19T21:06:00+00:00</published><updated>2026-03-19T21:06:00+00:00</updated><id>https://github.com/simonw/research/tree/main/pdf-to-image-converter#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/pdf-to-image-converter#readme"&gt;PDF to Image Converter&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Leveraging Rust's `pdfium-render` crate and Python's PyO3 bindings, this project enables fast and reliable conversion of PDF pages to JPEG images, packaged as a self-contained Python wheel. The CLI tool and Python library are both built to require no external dependencies, bundling the necessary PDFium binary for ease of installation and cross-platform compatibility.&lt;/p&gt;</summary></entry><entry><title>REXC (rx) JSON Test Suite</title><link href="https://github.com/simonw/research/tree/main/json-test-suite#readme" rel="alternate"/><published>2026-03-19T06:26:00+00:00</published><updated>2026-03-19T06:26:00+00:00</updated><id>https://github.com/simonw/research/tree/main/json-test-suite#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/json-test-suite#readme"&gt;REXC (rx) JSON Test Suite&lt;/a&gt;&lt;/p&gt;&lt;p&gt;REXC (rx) JSON Test Suite provides a comprehensive, language-agnostic test resource for validating implementations of the REXC encoder/decoder. 
It includes a single JSON file with 206 tests covering base64 encoding, zigzag integer transformations, value conversions, roundtrip integrity, and special numeric values, ensuring correctness across platforms.&lt;/p&gt;</summary></entry><entry><title>syntaqlite Python Extension in WebAssembly</title><link href="https://github.com/simonw/research/tree/main/syntaqlite-python-extension#readme" rel="alternate"/><published>2026-03-17T16:51:00+00:00</published><updated>2026-03-17T16:51:00+00:00</updated><id>https://github.com/simonw/research/tree/main/syntaqlite-python-extension#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/syntaqlite-python-extension#readme"&gt;syntaqlite Python Extension in WebAssembly&lt;/a&gt;&lt;/p&gt;&lt;p&gt;syntaqlite-python-extension is a Python C extension module that integrates the syntaqlite Rust/C SQL toolkit, making high-fidelity SQL parsing, formatting, validation, and tokenization available to Python and Pyodide environments. It wraps syntaqlite's native FFI for both desktop and web, linking against static libraries produced by Rust and employing Emscripten for WASM builds.&lt;/p&gt;</summary></entry><entry><title>CSRF Protection Demo: Modern Browser-Based Defenses</title><link href="https://github.com/simonw/research/tree/main/csrf-protection-demo#readme" rel="alternate"/><published>2026-03-14T04:34:00+00:00</published><updated>2026-03-14T04:34:00+00:00</updated><id>https://github.com/simonw/research/tree/main/csrf-protection-demo#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/csrf-protection-demo#readme"&gt;CSRF Protection Demo: Modern Browser-Based Defenses&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Modern browser security now enables robust Cross-Site Request Forgery (CSRF) prevention without requiring tokens. 
This demo project contrasts a vulnerable FastAPI bank app with a protected version, showcasing how browser-sent headers like `Sec-Fetch-Site` and `Origin` empower servers to automatically reject cross-origin POST requests.&lt;/p&gt;</summary></entry><entry><title>v86 exploration</title><link href="https://github.com/simonw/research/tree/main/v86-exploration#readme" rel="alternate"/><published>2026-03-10T15:55:00+00:00</published><updated>2026-03-10T15:55:00+00:00</updated><id>https://github.com/simonw/research/tree/main/v86-exploration#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/v86-exploration#readme"&gt;v86 exploration&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Exploring the v86 Linux Emulator (see v86 Linux Emulator tool), this project evaluates a browser-based Buildroot 2024.05.2 x86 environment with a constrained 39 MB RAM, featuring BusyBox utilities, Lua 5.4.6 scripting, and core text-processing tools. Although it boasts comprehensive shell utilities, file management tools, and basic network utilities (curl, wget, links), actual internet access is unavailable due to the lack of a configured network relay.&lt;/p&gt;</summary></entry></feed>