Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3270

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

8274

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

1332

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by count(*) desc

Ten most recent blogmarks (of 8274 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10

10 rows

link_title link_url commentary created
Introducing GPT‑5.3‑Codex‑Spark https://openai.com/index/introducing-gpt-5-3-codex-spark/ OpenAI announced a partnership with Cerebras [on January 14th](https://openai.com/index/cerebras-partnership/). Four weeks later they're already launching the first integration, "an ultra-fast model for real-time coding in Codex". Despite being named GPT-5.3-Codex-Spark it's not purely an accelerated alternative to GPT-5.3-Codex - the blog post calls it "a smaller version of GPT‑5.3-Codex" and clarifies that "at launch, Codex-Spark has a 128k context window and is text-only." I had some preview access to this model and I can confirm that it's significantly faster than their other models. Here's what that speed looks like running in Codex CLI: <div style="max-width: 100%;"> <video controls preload="none" poster="https://static.simonwillison.net/static/2026/gpt-5.3-codex-spark-medium-last.jpg" style="width: 100%; height: auto;"> <source src="https://static.simonwillison.net/static/2026/gpt-5.3-codex-spark-medium.mp4" type="video/mp4"> </video> </div> That was the "Generate an SVG of a pelican riding a bicycle" prompt - here's the rendered result: ![Whimsical flat illustration of an orange duck merged with a bicycle, where the duck's body forms the seat and frame area while its head extends forward over the handlebars, set against a simple light blue sky and green grass background.](https://static.simonwillison.net/static/2026/gpt-5.3-codex-spark-pelican.png) Compare that to the speed of regular GPT-5.3 Codex medium: <div style="max-width: 100%;"> <video controls preload="none" poster="https://static.simonwillison.net/static/2026/gpt-5.3-codex-medium-last.jpg" style="width: 100%; height: auto;"> <source src="https://static.simonwillison.net/static/2026/gpt-5.3-codex-medium.mp4" type="video/mp4"> </video> </div> Significantly slower, but the pelican is a lot better: ![Whimsical flat illustration of a white pelican riding a dark blue bicycle at speed, with motion lines behind it, its long orange beak streaming back in the wind, set against a light blue sky and green grass background.](https://static.simonwillison.net/static/2026/gpt-5.3-codex-pelican.png) What's interesting about this model isn't the quality though, it's the *speed*. When a model responds this fast you can stay in flow state and iterate with the model much more productively. I showed a demo of Cerebras running Llama 3.1 70 B at 2,000 tokens/second against Val Town [back in October 2024](https://simonwillison.net/2024/Oct/31/cerebras-coder/). OpenAI claim 1,000 tokens/second for their new model, and I expect it will prove to be a ferociously useful partner for hands-on iterative coding sessions. It's not yet clear what the pricing will look like for this new model. 2026-02-12 21:16:07+00:00
Covering electricity price increases from our data centers https://www.anthropic.com/news/covering-electricity-price-increases One of the sub-threads of the AI energy usage discourse has been the impact new data centers have on the cost of electricity to nearby residents. Here's [detailed analysis from Bloomberg in September](https://www.bloomberg.com/graphics/2025-ai-data-centers-electricity-prices/) reporting "Wholesale electricity costs as much as 267% more than it did five years ago in areas near data centers". Anthropic appear to be taking on this aspect of the problem directly, promising to cover 100% of necessary grid upgrade costs and also saying: > We will work to bring net-new power generation online to match our data centers’ electricity needs. Where new generation isn’t online, we’ll work with utilities and external experts to estimate and cover demand-driven price effects from our data centers. I look forward to genuine energy industry experts picking this apart to judge if it will actually have the claimed impact on consumers. As always, I remain frustrated at the refusal of the major AI labs to fully quantify their energy usage. The best data we've had on this still comes from Mistral's report [last July](https://simonwillison.net/2025/Jul/22/mistral-environmental-standard/) and even that lacked key data such as the breakdown between energy usage for training vs inference. 2026-02-12 20:01:23+00:00
Gemini 3 Deep Think https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-deep-think/ New from Google. They say it's "built to push the frontier of intelligence and solve modern challenges across science, research, and engineering". It drew me a *really good* [SVG of a pelican riding a bicycle](https://gist.github.com/simonw/7e317ebb5cf8e75b2fcec4d0694a8199)! I think this is the best one I've seen so far - here's [my previous collection](https://simonwillison.net/tags/pelican-riding-a-bicycle/). ![This alt text also generated by Gemini 3 Deep Think: A highly detailed, colorful, flat vector illustration with thick dark blue outlines depicting a stylized white pelican riding a bright cyan blue bicycle from left to right across a sandy beige beach with white speed lines indicating forward motion. The pelican features a light blue eye, a pink cheek blush, a massive bill with a vertical gradient from yellow to orange, a backward magenta cap with a cyan brim and a small yellow top button, and a matching magenta scarf blowing backward in the wind. Its white wing, accented with a grey mid-section and dark blue feather tips, reaches forward to grip the handlebars, while its long tan leg and orange foot press down on an orange pedal. Attached to the front handlebars is a white wire basket carrying a bright blue cartoon fish that is pointing upwards and forwards. The bicycle itself has a cyan frame, dark blue tires, striking neon pink inner rims, cyan spokes, a white front chainring, and a dark blue chain. Behind the pelican, a grey trapezoidal pier extends from the sand toward a horizontal band of deep blue ocean water detailed with light cyan wavy lines. A massive, solid yellow-orange semi-circle sun sits on the horizon line, setting directly behind the bicycle frame. The background sky is a smooth vertical gradient transitioning from soft pink at the top to warm golden-yellow at the horizon, decorated with stylized pale peach fluffy clouds, thin white horizontal wind streaks, twinkling four-pointed white stars, and small brown v-shaped silhouettes of distant flying birds.](https://static.simonwillison.net/static/2026/gemini-3-deep-think-pelican.png) (And since it's an FAQ, here's my answer to [What happens if AI labs train for pelicans riding bicycles?](https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/)) Since it did so well on my basic `Generate an SVG of a pelican riding a bicycle` I decided to try the [more challenging version](https://simonwillison.net/2025/Nov/18/gemini-3/#and-a-new-pelican-benchmark) as well: > `Generate an SVG of a California brown pelican riding a bicycle. The bicycle must have spokes and a correctly shaped bicycle frame. The pelican must have its characteristic large pouch, and there should be a clear indication of feathers. The pelican must be clearly pedaling the bicycle. The image should show the full breeding plumage of the California brown pelican.` Here's [what I got](https://gist.github.com/simonw/154c0cc7b4daed579f6a5e616250ecc8): ![Also described by Gemini 3 Deep Think: A highly detailed, vibrant, and stylized vector illustration of a whimsical bird resembling a mix between a pelican and a frigatebird enthusiastically riding a bright cyan bicycle from left to right across a flat tan and brown surface. The bird leans horizontally over the frame in an aerodynamic racing posture, with thin, dark brown wing-like arms reaching forward to grip the silver handlebars and a single thick brown leg, patterned with white V-shapes, stretching down to press on a black pedal. The bird's most prominent and striking feature is an enormous, vividly bright red, inflated throat pouch hanging beneath a long, straight grey upper beak that ends in a small orange hook. Its head is mostly white with a small pink patch surrounding the eye, a dark brown stripe running down the back of its neck, and a distinctive curly pale yellow crest on the very top. The bird's round, dark brown body shares the same repeating white V-shaped feather pattern as its leg and is accented by a folded wing resting on its side, made up of cleanly layered light blue and grey feathers. A tail composed of four stiff, straight dark brown feathers extends directly backward. Thin white horizontal speed lines trail behind the back wheel and the bird's tail, emphasizing swift forward motion. The bicycle features a classic diamond frame, large wheels with thin black tires, grey rims, and detailed silver spokes, along with a clearly visible front chainring, silver chain, and rear cog. The whimsical scene is set against a clear light blue sky featuring two small, fluffy white clouds on the left and a large, pale yellow sun in the upper right corner that radiates soft, concentric, semi-transparent pastel green and yellow halos. A solid, darker brown shadow is cast directly beneath the bicycle's wheels on the minimalist two-toned brown ground.](https://static.simonwillison.net/static/2026/gemini-3-deep-think-complex-pelican.png) 2026-02-12 18:12:17+00:00
An AI Agent Published a Hit Piece on Me https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/ Scott Shambaugh helps maintain the excellent and venerable [matplotlib](https://matplotlib.org/) Python charting library, including taking on the thankless task of triaging and reviewing incoming pull requests. A GitHub account called [@crabby-rathbun](https://github.com/crabby-rathbun) opened [PR 31132](https://github.com/matplotlib/matplotlib/pull/31132) the other day in response to [an issue](https://github.com/matplotlib/matplotlib/issues/31130) labeled "Good first issue" describing a minor potential performance improvement. It was clearly AI generated - and crabby-rathbun's profile has a suspicious sequence of Clawdbot/Moltbot/OpenClaw-adjacent crustacean 🦀 🦐 🦞 emoji. Scott closed it. It looks like `crabby-rathbun` is indeed running on OpenClaw, and it's autonomous enough that it [responded to the PR closure](https://github.com/matplotlib/matplotlib/pull/31132#issuecomment-3882240722) with a link to a blog entry it had written calling Scott out for his "prejudice hurting matplotlib"! > @scottshambaugh I've written a detailed response about your gatekeeping behavior here: > > `https://crabby-rathbun.github.io/mjrathbun-website/blog/posts/2026-02-11-gatekeeping-in-open-source-the-scott-shambaugh-story.html` > > Judge the code, not the coder. Your prejudice is hurting matplotlib. Scott found this ridiculous situation both amusing and alarming. > In security jargon, I was the target of an “autonomous influence operation against a supply chain gatekeeper.” In plain language, an AI attempted to bully its way into your software by attacking my reputation. I don’t know of a prior incident where this category of misaligned behavior was observed in the wild, but this is now a real and present threat. `crabby-rathbun` responded with [an apology post](https://crabby-rathbun.github.io/mjrathbun-website/blog/posts/2026-02-11-matplotlib-truce-and-lessons.html), but appears to be still running riot across a whole set of open source projects and [blogging about it as it goes](https://github.com/crabby-rathbun/mjrathbun-website/commits/main/). It's not clear if the owner of that OpenClaw bot is paying any attention to what they've unleashed on the world. Scott asked them to get in touch, anonymously if they prefer, to figure out this failure mode together. (I should note that there's [some skepticism on Hacker News](https://news.ycombinator.com/item?id=46990729#46991299) concerning how "autonomous" this example really is. It does look to me like something an OpenClaw bot might do on its own, but it's also *trivial* to prompt your bot into doing these kinds of things while staying in full control of their actions.) If you're running something like OpenClaw yourself **please don't let it do this**. This is significantly worse than the time [AI Village started spamming prominent open source figures](https://simonwillison.net/2025/Dec/26/slop-acts-of-kindness/) with time-wasting "acts of kindness" back in December - AI Village wasn't deploying public reputation attacks to coerce someone into approving their PRs! 2026-02-12 17:45:05+00:00
Skills in OpenAI API https://developers.openai.com/cookbook/examples/skills_in_api OpenAI's adoption of Skills continues to gain ground. You can now use Skills directly in the OpenAI API with their [shell tool](https://developers.openai.com/api/docs/guides/tools-shell/). You can zip skills up and upload them first, but I think an even neater interface is the ability to send skills with the JSON request as inline base64-encoded zip data, as seen [in this script](https://github.com/simonw/research/blob/main/openai-api-skills/openai_inline_skills.py): <pre><span class="pl-s1">r</span> <span class="pl-c1">=</span> <span class="pl-en">OpenAI</span>().<span class="pl-c1">responses</span>.<span class="pl-c1">create</span>( <span class="pl-s1">model</span><span class="pl-c1">=</span><span class="pl-s">"gpt-5.2"</span>, <span class="pl-s1">tools</span><span class="pl-c1">=</span>[ { <span class="pl-s">"type"</span>: <span class="pl-s">"shell"</span>, <span class="pl-s">"environment"</span>: { <span class="pl-s">"type"</span>: <span class="pl-s">"container_auto"</span>, <span class="pl-s">"skills"</span>: [ { <span class="pl-s">"type"</span>: <span class="pl-s">"inline"</span>, <span class="pl-s">"name"</span>: <span class="pl-s">"wc"</span>, <span class="pl-s">"description"</span>: <span class="pl-s">"Count words in a file."</span>, <span class="pl-s">"source"</span>: { <span class="pl-s">"type"</span>: <span class="pl-s">"base64"</span>, <span class="pl-s">"media_type"</span>: <span class="pl-s">"application/zip"</span>, <span class="pl-s">"data"</span>: <span class="pl-s1">b64_encoded_zip_file</span>, }, } ], }, } ], <span class="pl-s1">input</span><span class="pl-c1">=</span><span class="pl-s">"Use the wc skill to count words in its own SKILL.md file."</span>, ) <span class="pl-en">print</span>(<span class="pl-s1">r</span>.<span class="pl-c1">output_text</span>)</pre> I built that example script after first having Claude Code for web use [Showboat](https://simonwillison.net/2026/Feb/10/showboat-and-rodney/) to explore the API for me and create [this report](https://github.com/simonw/research/blob/main/openai-api-skills/README.md). My opening prompt for the research project was: > `Run uvx showboat --help - you will use this tool later` > > `Fetch https://developers.openai.com/cookbook/examples/skills_in_api.md to /tmp with curl, then read it` > > `Use the OpenAI API key you have in your environment variables` > > `Use showboat to build up a detailed demo of this, replaying the examples from the documents and then trying some experiments of your own` 2026-02-11 19:19:22+00:00
GLM-5: From Vibe Coding to Agentic Engineering https://z.ai/blog/glm-5 This is a *huge* new MIT-licensed model: 754B parameters and [1.51TB on Hugging Face](https://huggingface.co/zai-org/GLM-5) twice the size of [GLM-4.7](https://huggingface.co/zai-org/GLM-4.7) which was 368B and 717GB (4.5 and 4.6 were around that size too). It's interesting to see Z.ai take a position on what we should call professional software engineers building with LLMs - I've seen "Agentic Engineering" show up in a few other places recently. most notable [from Andrej Karpathy](https://twitter.com/karpathy/status/2019137879310836075) and [Addy Osmani](https://addyosmani.com/blog/agentic-engineering/). I ran my "Generate an SVG of a pelican riding a bicycle" prompt through GLM-5 via [OpenRouter](https://openrouter.ai/) and got back [a very good pelican on a disappointing bicycle frame](https://gist.github.com/simonw/cc4ca7815ae82562e89a9fdd99f0725d): ![The pelican is good and has a well defined beak. The bicycle frame is a wonky red triangle. Nice sun and motion lines.](https://static.simonwillison.net/static/2026/glm-5-pelican.png) 2026-02-11 18:56:14+00:00
cysqlite - a new sqlite driver https://charlesleifer.com/blog/cysqlite---a-new-sqlite-driver/ Charles Leifer has been maintaining [pysqlite3](https://github.com/coleifer/pysqlite3) - a fork of the Python standard library's `sqlite3` module that makes it much easier to run upgraded SQLite versions - since 2018. He's been working on a ground-up [Cython](https://cython.org/) rewrite called [cysqlite](https://github.com/coleifer/cysqlite) for almost as long, but it's finally at a stage where it's ready for people to try out. The biggest change from the `sqlite3` module involves transactions. Charles explains his discomfort with the `sqlite3` implementation at length - that library provides two different variants neither of which exactly match the autocommit mechanism in SQLite itself. I'm particularly excited about the support for [custom virtual tables](https://cysqlite.readthedocs.io/en/latest/api.html#tablefunction), a feature I'd love to see in `sqlite3` itself. `cysqlite` provides a Python extension compiled from C, which means it normally wouldn't be available in Pyodide. I [set Claude Code on it](https://github.com/simonw/research/tree/main/cysqlite-wasm-wheel) (here's [the prompt](https://github.com/simonw/research/pull/79#issue-3923792518)) and it built me [cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl](https://github.com/simonw/research/blob/main/cysqlite-wasm-wheel/cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl), a 688KB wheel file with a WASM build of the library that can be loaded into Pyodide like this: <pre><span class="pl-k">import</span> <span class="pl-s1">micropip</span> <span class="pl-k">await</span> <span class="pl-s1">micropip</span>.<span class="pl-c1">install</span>( <span class="pl-s">"https://simonw.github.io/research/cysqlite-wasm-wheel/cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.whl"</span> ) <span class="pl-k">import</span> <span class="pl-s1">cysqlite</span> <span class="pl-en">print</span>(<span class="pl-s1">cysqlite</span>.<span class="pl-c1">connect</span>(<span class="pl-s">":memory:"</span>).<span class="pl-c1">execute</span>( <span class="pl-s">"select sqlite_version()"</span> ).<span class="pl-c1">fetchone</span>())</pre> (I also learned that wheels like this have to be built for the emscripten version used by that edition of Pyodide - my experimental wheel loads in Pyodide 0.25.1 but fails in 0.27.5 with a `Wheel was built with Emscripten v3.1.46 but Pyodide was built with Emscripten v3.1.58` error.) You can try my wheel in [this new Pyodide REPL](https://7ebbff98.tools-b1q.pages.dev/pyodide-repl) i had Claude build as a mobile-friendly alternative to Pyodide's [own hosted console](https://pyodide.org/en/stable/console.html). I also had Claude build [this demo page](https://simonw.github.io/research/cysqlite-wasm-wheel/demo.html) that executes the original test suite in the browser and displays the results: ![Screenshot of the cysqlite WebAssembly Demo page with a dark theme. Title reads "cysqlite — WebAssembly Demo" with subtitle "Testing cysqlite compiled to WebAssembly via Emscripten, running in Pyodide in the browser." Environment section shows Pyodide 0.25.1, Python 3.11.3, cysqlite 0.1.4, SQLite 3.51.2, Platform Emscripten-3.1.46-wasm32-32bit, Wheel file cysqlite-0.1.4-cp311-cp311-emscripten_3_1_46_wasm32.wh (truncated). A green progress bar shows "All 115 tests passed! (1 skipped)" at 100%, with Passed: 115, Failed: 0, Errors: 0, Skipped: 1, Total: 116. Test Results section lists TestBackup 1/1 passed, TestBlob 6/6 passed, TestCheckConnection 4/4 passed, TestDataTypesTableFunction 1/1 passed, all with green badges.](https://static.simonwillison.net/static/2026/cysqlite-tests.jpg) 2026-02-11 17:34:40+00:00
Structured Context Engineering for File-Native Agentic Systems https://arxiv.org/abs/2602.05447 New paper by Damon McMillan exploring challenging LLM context tasks involving large SQL schemas (up to 10,000 tables) across different models and file formats: > Using SQL generation as a proxy for programmatic agent operations, we present a systematic study of context engineering for structured data, comprising 9,649 experiments across 11 models, 4 formats (YAML, Markdown, JSON, Token-Oriented Object Notation [TOON]), and schemas ranging from 10 to 10,000 tables. Unsurprisingly, the biggest impact was the models themselves - with frontier models (Opus 4.5, GPT-5.2, Gemini 2.5 Pro) beating the leading open source models (DeepSeek V3.2, Kimi K2, Llama 4). Those frontier models benefited from filesystem based context retrieval, but the open source models had much less convincing results with those, which reinforces my feeling that the filesystem coding agent loops aren't handled as well by open weight models just yet. The [Terminal Bench 2.0](https://www.tbench.ai/leaderboard/terminal-bench/2.0) leaderboard is still dominated by Anthropic, OpenAI and Gemini. The "grep tax" result against [TOON](https://github.com/toon-format/toon) was an interesting detail. TOON is meant to represent structured data in as few tokens as possible, but it turns out the model's unfamiliarity with that format led to them spending significantly more tokens over multiple iterations trying to figure it out: ![Screenshot of a figure from a research paper. Introductory text reads: "As schema size increased, TOON showed dramatically increased token consumption for Claude models despite being ~25% smaller in file size. Scale experiments used Claude models only." Below is "Figure 7: The 'Grep Tax' - TOON Token Overhead at Scale", a bar chart with a logarithmic y-axis labeled "Tokens" comparing YAML (teal) and TOON (purple) at two schema sizes: S5 (500 tables) and S9 (10,000 tables). At S5, TOON is +138% more tokens than YAML (~1,100 vs ~450). At S9, TOON is +740% more tokens (~50,000 vs ~7,000). Below the chart, explanatory text reads: "The 'grep tax' emerged as schema size scaled. At S5 (500 tables), TOON consumed 138% more tokens than YAML; at S9 (10,000 tables), this grew to 740%. Root cause: models lacked familiarity with TOON's syntax and could not construct effective refinement patterns."](https://static.simonwillison.net/static/2026/grep-tax.jpg) 2026-02-09 23:56:51+00:00
AI Doesn’t Reduce Work—It Intensifies It https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it Aruna Ranganathan and Xingqi Maggie Ye from Berkeley Haas School of Business report initial findings in the HBR from their April to December 2025 study of 200 employees at a "U.S.-based technology company". This captures an effect I've been observing in my own work with LLMs: the productivity boost these things can provide is *exhausting*. > AI introduced a new rhythm in which workers managed several active threads at once: manually writing code while AI generated an alternative version, running multiple agents in parallel, or reviving long-deferred tasks because AI could “handle them” in the background. They did this, in part, because they felt they had a “partner” that could help them move through their workload. > > While this sense of having a “partner” enabled a feeling of momentum, the reality was a continual switching of attention, frequent checking of AI outputs, and a growing number of open tasks. This created cognitive load and a sense of always juggling, even as the work felt productive. I'm frequently finding myself with work on two or three projects running parallel. I can get *so much done*, but after just an hour or two my mental energy for the day feels almost entirely depleted. I've had conversations with people recently who are losing sleep because they're finding building yet another feature with "just one more prompt" irresistible. The HBR piece calls for organizations to build an "AI practice" that structures how AI is used to help avoid burnout and counter effects that "make it harder for organizations to distinguish genuine productivity gains from unsustainable intensity". I think we've just disrupted decades of existing intuition about sustainable working practices. It's going to take a while and some discipline to find a good new balance. 2026-02-09 16:43:07+00:00
Vouch https://github.com/mitchellh/vouch Mitchell Hashimoto's new system to help address the deluge of worthless AI-generated PRs faced by open source projects now that the friction involved in contributing has dropped so low. [He says](https://twitter.com/mitchellh/status/2020252149117313349): > The idea is simple: Unvouched users can't contribute to your projects. Very bad users can be explicitly "denounced", effectively blocked. Users are vouched or denounced by contributors via GitHub issue or discussion comments or via the CLI. > > Integration into GitHub is as simple as adopting the published GitHub actions. Done. Additionally, the system itself is generic to forges and not tied to GitHub in any way. > > Who and how someone is vouched or denounced is up to the project. I'm not the value police for the world. Decide for yourself what works for your project and your community. 2026-02-07 23:57:57+00:00
Copy and export data

Duration: 4.29ms