<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: Research</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/atom/beats/research/" rel="self"/><id>http://simonwillison.net/</id><updated>2026-06-13T23:05:00+00:00</updated><author><name>Simon Willison</name></author><entry><title>Mapping SQLite result columns back to their source `table.column`</title><link href="https://github.com/simonw/research/tree/main/sqlite-column-provenance#readme" rel="alternate"/><published>2026-06-13T23:05:00+00:00</published><updated>2026-06-13T23:05:00+00:00</updated><id>https://github.com/simonw/research/tree/main/sqlite-column-provenance#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/sqlite-column-provenance#readme"&gt;Mapping SQLite result columns back to their source `table.column`&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Determining the source `table.column` for each result column in arbitrary SQLite queries is feasible because SQLite computes this internally and exposes it via its column-metadata API when compiled with `SQLITE_ENABLE_COLUMN_METADATA`. While Python’s standard `sqlite3` module doesn’t surface this information, robust methods exist: using the third-party `apsw` library provides direct access with `cursor.description_full`, or a pure-stdlib ctypes bridge (`column_provenance.py`) can retrieve the…&lt;/p&gt;</summary><category term="python"/><category term="sqlite"/><category term="datasette"/></entry><entry><title>Running untrusted queries: Datasette/SQLite vs psycopg/PostgreSQL</title><link href="https://github.com/simonw/research/tree/main/postgresql-time-limits-readonly#readme" rel="alternate"/><published>2026-06-11T04:17:00+00:00</published><updated>2026-06-11T04:17:00+00:00</updated><id>https://github.com/simonw/research/tree/main/postgresql-time-limits-readonly#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/postgresql-time-limits-readonly#readme"&gt;Running untrusted queries: Datasette/SQLite vs psycopg/PostgreSQL&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Exploring how untrusted SQL queries are safely run in Datasette (using SQLite) and whether similar protections can be applied with psycopg and PostgreSQL, this project shows that both can provide robust safeguards against data corruption and resource exhaustion. Datasette leverages hard read-only file modes and a VM progress handler for query timeouts, while PostgreSQL's privilege system enforces SELECT-only access and its `statement_timeout` cancels resource-intensive or sleeping queries.&lt;/p&gt;</summary></entry><entry><title>Can DuckDB run untrusted SQL as safely as Datasette runs SQLite?</title><link href="https://github.com/simonw/research/tree/main/datasette-duckdb-safety#readme" rel="alternate"/><published>2026-06-10T23:41:00+00:00</published><updated>2026-06-10T23:41:00+00:00</updated><id>https://github.com/simonw/research/tree/main/datasette-duckdb-safety#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/datasette-duckdb-safety#readme"&gt;Can DuckDB run untrusted SQL as safely as Datasette runs SQLite?&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Investigating the security of running untrusted SQL in DuckDB compared to Datasette with SQLite, this project establishes that DuckDB can be sandboxed to match—and sometimes exceed—the safety of SQLite, but requires more than its basic `read_only=True` option. Datasette achieves safe SQL exposure by using engine-level read-only connections and opcode-based time limits in SQLite, which inherently prevents unauthorized file or network access.&lt;/p&gt;</summary></entry><entry><title>Running Python ASGI apps in the browser via Pyodide + a service worker</title><link href="https://github.com/simonw/research/tree/main/pyodide-asgi-browser#readme" rel="alternate"/><published>2026-05-30T15:34:00+00:00</published><updated>2026-05-30T15:34:00+00:00</updated><id>https://github.com/simonw/research/tree/main/pyodide-asgi-browser#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/pyodide-asgi-browser#readme"&gt;Running Python ASGI apps in the browser via Pyodide + a service worker&lt;/a&gt;&lt;/p&gt;&lt;p&gt;By running Python ASGI web applications entirely in the browser using Pyodide and a dedicated service worker, this project intercepts all same-origin requests under `/app/` and executes them against the Python app via the ASGI protocol—removing the need for a backend server except for static files. The mechanism is demonstrated with both a FastAPI demo and the full Datasette app, confirming its generality across ASGI apps.&lt;/p&gt;</summary><category term="javascript"/><category term="python"/><category term="datasette"/><category term="asgi"/><category term="webassembly"/><category term="service-workers"/><category term="pyodide"/><category term="datasette-lite"/><category term="claude-code"/></entry><entry><title>pydantic-monty investigation</title><link href="https://github.com/simonw/research/tree/main/monty-investigation#readme" rel="alternate"/><published>2026-05-22T22:41:00+00:00</published><updated>2026-05-22T22:41:00+00:00</updated><id>https://github.com/simonw/research/tree/main/monty-investigation#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/monty-investigation#readme"&gt;pydantic-monty investigation&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Reviewing `pydantic-monty` reveals it as a fast, minimal Python interpreter designed for controlled sandboxed execution, primarily useful when transforming data, branching, looping, and interacting with a select set of trusted host tools or a virtual filesystem. The interpreter purposefully omits large portions of CPython’s functionality, with clear boundaries: unsupported features and missing resources generally fail cleanly as structured errors rather than escaping into the host runtime.&lt;/p&gt;</summary><category term="python"/><category term="sandboxing"/><category term="pydantic"/></entry><entry><title>TRE Python binding — ReDoS robustness demo</title><link href="https://github.com/simonw/research/tree/main/tre-python-binding#readme" rel="alternate"/><published>2026-05-04T17:52:00+00:00</published><updated>2026-05-04T17:52:00+00:00</updated><id>https://github.com/simonw/research/tree/main/tre-python-binding#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/tre-python-binding#readme"&gt;TRE Python binding — ReDoS robustness demo&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Demonstrating robust regex performance, this project offers a minimal Python ctypes binding to the TRE regex library, highlighting TRE’s immunity to regular expression denial-of-service (ReDoS) attacks that cripple Python's built-in `re` module. Key benchmarks show that TRE processes even notorious "evil" patterns on gigantic inputs (10 million characters) much faster than `re` on tiny ones, and scales linearly with input size instead of exponentially.&lt;/p&gt;</summary><category term="c"/><category term="ctypes"/><category term="python"/><category term="regular-expressions"/><category term="security"/></entry><entry><title>Claude system prompts as a git timeline</title><link href="https://github.com/simonw/research/tree/main/extract-system-prompts#readme" rel="alternate"/><published>2026-04-18T12:17:00+00:00</published><updated>2026-04-18T12:17:00+00:00</updated><id>https://github.com/simonw/research/tree/main/extract-system-prompts#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/extract-system-prompts#readme"&gt;Claude system prompts as a git timeline&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Anthropic's published system prompt history for Claude is transformed into a git-based exploration tool, breaking up the monolithic markdown source into granular files and timestamped commits. By structuring extracted prompts per model, family, and revision, researchers can leverage `git log`, `diff`, and `blame` to trace prompt evolution, compare differences, and attribute changes to specific dates—all without manual parsing.&lt;/p&gt;</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="system-prompts"/></entry><entry><title>Exploring the new `servo` crate</title><link href="https://github.com/simonw/research/tree/main/servo-crate-exploration#readme" rel="alternate"/><published>2026-04-13T15:04:00+00:00</published><updated>2026-04-13T15:04:00+00:00</updated><id>https://github.com/simonw/research/tree/main/servo-crate-exploration#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/servo-crate-exploration#readme"&gt;Exploring the new `servo` crate&lt;/a&gt;&lt;/p&gt;&lt;p&gt;After the April 2026 release of the `servo` v0.1.0 crate (blog post), a concise investigation shows that Servo is now an embeddable browser engine for Rust, with a clear API centered on the `ServoBuilder`, `WebView`, and pixel readback methods. A headless CLI (`servo-shot`) successfully renders URLs or HTML files to PNG, building against stable Rust with a robust software-based rendering pipeline.&lt;/p&gt;</summary><category term="browsers"/><category term="research"/><category term="rust"/><category term="webassembly"/><category term="servo"/><category term="claude-code"/></entry><entry><title>QuickJS Python Sandbox — Investigation Report</title><link href="https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme" rel="alternate"/><published>2026-04-12T23:15:00+00:00</published><updated>2026-04-12T23:15:00+00:00</updated><id>https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/quickjs-async-sandbox#readme"&gt;QuickJS Python Sandbox — Investigation Report&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Exploring the `quickjs` Python package, this project implements an asyncio-compatible JavaScript sandbox with robust resource controls and seamless exposure of both synchronous and asynchronous Python functions (including async httpx fetches) to JavaScript code.&lt;/p&gt;</summary></entry><entry><title>SQLite WAL Mode Across Docker Containers Sharing a Volume</title><link href="https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme" rel="alternate"/><published>2026-04-07T15:41:00+00:00</published><updated>2026-04-07T15:41:00+00:00</updated><id>https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/sqlite-wal-docker-containers#readme"&gt;SQLite WAL Mode Across Docker Containers Sharing a Volume&lt;/a&gt;&lt;/p&gt;&lt;p&gt;SQLite’s WAL mode reliably supports concurrent access when two Docker containers share a volume on the same host, due to shared kernel and filesystem semantics. The experiment, using Docker Desktop for macOS and a named volume, demonstrated real-time propagation of database changes and effective memory-mapped file sharing by monitoring `.db-shm`.&lt;/p&gt;</summary><category term="sqlite"/><category term="docker"/></entry><entry><title>Can JavaScript Escape a CSP Meta Tag Inside an Iframe?</title><link href="https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme" rel="alternate"/><published>2026-04-03T16:05:00+00:00</published><updated>2026-04-03T16:05:00+00:00</updated><id>https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/test-csp-iframe-escape#readme"&gt;Can JavaScript Escape a CSP Meta Tag Inside an Iframe?&lt;/a&gt;&lt;/p&gt;&lt;p&gt;JavaScript running inside a `sandbox="allow-scripts"` iframe cannot escape or disable a `&lt;meta http-equiv="Content-Security-Policy"&gt;` tag, even through removal, modification, or document replacement. Extensive testing across Chromium and Firefox confirmed that CSP policies defined via meta tags are enforced at parse time, and persist even when the iframe is navigated to a data: URI.&lt;/p&gt;</summary><category term="iframes"/><category term="javascript"/><category term="sandboxing"/><category term="security"/><category term="content-security-policy"/></entry><entry><title>Starlette 1.0 skill</title><link href="https://github.com/simonw/research/tree/main/starlette-1-skill#readme" rel="alternate"/><published>2026-03-23T00:05:00+00:00</published><updated>2026-03-23T00:05:00+00:00</updated><id>https://github.com/simonw/research/tree/main/starlette-1-skill#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/starlette-1-skill#readme"&gt;Starlette 1.0 skill&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Starlette 1.0 Skill offers a concise guide for building robust web applications with Starlette, a lightweight ASGI framework. The accompanying demo showcases a task management app featuring projects, tasks, comments, and labels, illustrating Starlette's flexibility in handling routing, templating (Jinja2), async database operations (aiosqlite), and real-time updates.&lt;/p&gt;</summary><category term="starlette"/></entry><entry><title>PCGamer Article Performance Audit</title><link href="https://github.com/simonw/research/tree/main/pcgamer-audit#readme" rel="alternate"/><published>2026-03-22T22:49:00+00:00</published><updated>2026-03-22T22:49:00+00:00</updated><id>https://github.com/simonw/research/tree/main/pcgamer-audit#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/pcgamer-audit#readme"&gt;PCGamer Article Performance Audit&lt;/a&gt;&lt;/p&gt;&lt;p&gt;A performance audit of the March 2026 PCGamer article on RSS readers reveals severe page bloat, with over 82% of network traffic and transferred bytes traced to ad-tech, tracking, and programmatic advertising scripts. Despite the core content consisting of just 10-15 KB of text and a handful of images (~150 KB total), the page triggers over 431 network requests and 5.5 MB of transfer (18.8 MB decoded) within 60 seconds—ballooning to 200+ MB in Firefox due to autoplay video carousels and…&lt;/p&gt;</summary><category term="web-performance"/><category term="rodney"/></entry><entry><title>JavaScript Sandboxing Research</title><link href="https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme" rel="alternate"/><published>2026-03-22T19:53:00+00:00</published><updated>2026-03-22T19:53:00+00:00</updated><id>https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/javascript-sandboxing-research#readme"&gt;JavaScript Sandboxing Research&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Analyzing current JavaScript sandboxing options for running untrusted code, this research compares core approaches in Node.js (including worker_threads, node:vm, and the Permission Model), prominent npm packages (isolated-vm, vm2), and alternative engines like quickjs-emscripten.&lt;/p&gt;</summary><category term="javascript"/><category term="nodejs"/><category term="sandboxing"/><category term="claude-code"/></entry><entry><title>SQLite Tags Benchmark: Comparing 5 Tagging Strategies</title><link href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme" rel="alternate"/><published>2026-03-20T02:57:00+00:00</published><updated>2026-03-20T02:57:00+00:00</updated><id>https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme</id><summary type="html">&lt;p&gt;&lt;a href="https://github.com/simonw/research/tree/main/sqlite-tags-benchmark#readme"&gt;SQLite Tags Benchmark: Comparing 5 Tagging Strategies&lt;/a&gt;&lt;/p&gt;&lt;p&gt;Benchmarking five tagging strategies in SQLite reveals clear trade-offs between query speed, storage, and implementation complexity for workflows involving tags (100,000 rows, 100 tags, average 6.5 tags/row). Indexed approaches—materialized lookup tables on JSON and classic many-to-many tables—easily outperform others, handling single-tag queries in under 1.5 milliseconds, while raw JSON and LIKE-based solutions are much slower.&lt;/p&gt;</summary><category term="json"/><category term="sqlite"/></entry></feed>