Simon Willison’s Weblog

Subscribe

May 2026

112 posts: 9 entries, 20 links, 19 quotes, 5 notes, 59 beats

May 26, 2026

I cannot believe I'm saying this, but getting the literal Pope to canonize your product's specific technical limitations as a spiritual treatise is the single greatest act of vendor lobbying I have ever seen.

Corey Quinn, on Anthropic co-founder Christopher Olah's influence on Magnifica Humanitas

# 2:28 am / ai, anthropic, ai-ethics, corey-quinn

A lot of the emails I get from founders are now written in a hard-hitting journalistic style. I know they're written by AI, because no founder ever wrote this way before. And once you realize something is written by AI, it's hard not to ignore it.

I have never knowingly finished reading an email signed by a human but written by AI. It feels like being lied to, and who would stand for that?

[...] It makes me think less of the author. It means they can't write well unaided (or feel they can't), and that they're trying to trick me.

It's not impressive to use AI to write stuff for you; any teenager can do that.

Paul Graham

# 3:02 pm / paul-graham, writing, ai, generative-ai, llms, ai-misuse

Microsoft Copilot Cowork Exfiltrates Files (via) The biggest challenge in designing agentic systems continues to be preventing them from enabling attackers to exfiltrate data.

In this case Microsoft Copilot Cowork (yes, that's a real product name) was allowing agents to send emails to the user's own inbox without approval... but those messages were then displayed in a way that could leak data to an attacker via rendered images:

Because these messages can contain external images that trigger network requests to external websites, data can be exfiltrated when a user opens a compromised message sent by the agent.

Since OneDrive can create pre-authenticated download links, a successful prompt injection could cause those links to be leaked, allowing files to be downloaded by the attacker.

# 3:36 pm / microsoft, security, ai, prompt-injection, generative-ai, llms, exfiltration-attacks, lethal-trifecta

The pressure (via) Daniel Stenberg on the unprecedented level of pressure the curl team are facing right now thanks to the deluge of (credible) AI-assisted security issues being reported.

The rate of incoming security reports is 4-5 times higher than it was in 2024 and double the speed of 2025 -- meaning that on average we now get more than one report per day. The quality is way higher than ever before. The reports are typically very detailed and long. [...]

For the first time in my life, my wife voiced concerns about my work hours and my imbalanced work/life situation. I work more than I’ve done before, but the flood keeps coming. [...]

This is a never-before seen or experienced pressure on the curl project and its security team members. An avalanche of high priority work that trumps all other things in the project that is primarily mental because we certainly could ignore them all if we wanted, but we feel a responsibility, we have a conscience and we are proud about our work.

The good news is that curl is a very solid piece of software, so the vulnerabilities people are finding tend not to be of high severity:

What is also a good trend: almost no one finds terrible vulnerabilities. All vulnerabilities found the last few years in curl have all been deemed severity LOW or MEDIUM. I'm not saying there won't be any more HIGH ever, but at least they are rare. The most recent severity high curl CVE was published in October 2023.

# 11:48 pm / curl, security, ai, generative-ai, llms, daniel-stenberg, ai-ethics, ai-security-research

May 27, 2026

PICARD: Data, shields up

DATA: Brilliant! Shields can reduce damage we sustain. Not immunity. Not hubris. Just prudence. It's not precaution—it's strategy.

[camera shakes]

WORF: HULL BREACHES ON NINE DECKS

DATA: Here's what happened: you told me to raise shields, and I didn't

Kyle Ferrana, @KyleTrainEmoji

# 6:41 am / ai, llms, coding-agents, ai-misuse

I think Anthropic and OpenAI have found product-market fit

Anthropic are strongly rumored to be about to have their first profitable quarter. Stories are circulating of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market fit.

[... 1,931 words]

sqlite AGENTS.md (via) SQLite gained an AGENTS.md file five days ago - but it's not intended for their own development, it's presumably aimed at people who are pointing agents at the SQLite codebase. It includes:

SQLite does not accept pull requests without prior agreement and/or accompanying legal paperwork that places the pull request in the public domain. However, the human SQLite developers will review a concise and well-written pull request as a proof-of-concept prior to reimplementing the changes themselves.

SQLite does not accept agentic code. However the project will accept agentic bug reports that include a reproducible test case. Patches or pull requests demonstrating a possible fix, for documentation purposes, are welcomed.

The most recent commit to that file removed "(currently)" from "SQLite does not (currently) accept agentic code", with the commit message "Strengthen the statement about not accepting agentic code".

Meanwhile the SQLite forum was being flooded with so many AI-generated bug reports - of varying quality - that they've now split those off into a new SQLite Bug Forum. D. Richard Hipp is resolving issues on there with a flurry of commits to the codebase.

# 11:44 pm / sqlite, ai, d-richard-hipp, generative-ai, llms, coding-agents, ai-security-research

May 28, 2026

Sighting 11:17 AM — Brewer's Blackbird, in Monterey Bay National Marine Sanctuary, CA, US, CA
Brewer's Blackbird
Brewer's Blackbird

A slightly customized Markdown rendering tool with special treatment for fenced code SVG blocks - it both renders the image and provides a tab for switching to the code view.

You can paste in Markdown or give it a URL to a CORS-enabled Markdown file or Gist. Here's an example where it loads a Markdown file full of LLM pelican logs for Opus 4.8.

  • New model: Claude Opus 4.8 (claude-opus-4.8).
  • New -o fast 1 option for fast mode, for organizations with that feature enabled on their account.
  • Default max_tokens for each model now defaults to that model's maximum output rather than 8,192. #72

See also my notes on Opus 4.8 - I used this new release of llm-anthropic to generate the pelicans.

Claude Opus 4.8: “a modest but tangible improvement”

Visit Claude Opus 4.8: "a modest but tangible improvement"

Anthropic shipped Claude Opus 4.8 today. My favourite thing about it is this note in the release announcement:

[... 983 words]

May 29, 2026

The most interesting thing about Anthropic's $65B Series H announcement is this line (emphasis mine):

Since our Series G in February, adoption has continued to grow across global enterprise customers, and our run-rate revenue crossed $47 billion earlier this month.

Anthropic have made a bit of a habit of sharing their "run-rate revenue" in this kind of announcement, which is an annualized projection of their current revenue - typically calculated by taking the most recent month and multiplying by 12. Update: here's a leaked description of their run-rate formula.

Earlier this year:

I had Claude Opus 4.8 make me this chart using Matplotlib (Claude: "a data line chart is more straightforward matplotlib work—not really a design piece"):

Line chart titled "Run-rate revenue" with y-axis "Run-rate revenue ($bn)" from $0bn to $50bn, showing four data points rising sharply: Dec 31 2025 $9bn, Feb 12 2026 $14bn, Apr 1 2026 $30bn, May 7 2026 $47bn.

Back in April Axios CEO Jim VandeHei wrote that he could not find "any company — in any industry, in any era — that has scaled organic revenue this quickly at this level as Anthropic" - and that was when they were at a paltry $30 billion.

(Also in Axios today is an anonymously sourced note that "An AI consultant tells Axios one of their clients recently spent half a billion dollars in a single month after failing to put usage limits on Claude licenses for employees" - times that by 12 and you get an extra $6 billion in annualized run-rate!)

Ed Zitron was extremely skeptical of that $30 billion number - I wonder if his skepticism will update for the new $47 billion figure.

I've seen a few people dismiss this as untrustworthy, because the numbers come from Anthropic. That doesn't hold up: these numbers were included in announcements of their fundraises, and lying to investors who just put in $65 billion would be securities fraud. They're even less likely to lie given that the real numbers will no doubt come out in their S-1 when they file for their IPO.

# 1:23 am / anthropic, ai

Another significant alpha release, with two new headline features.

Datasette now offers users with the necessary permissions the ability to both execute write queries against their database and to save stored queries (renamed from "canned queries") both privately and for use by other members of their Datasette instance.

There's more detail in SQL write queries and stored queries in Datasette 1.0a31 on the Datasette blog, which now has three posts introducing new features since the blog launched two weeks ago.

Here's an animated demo from the blog post showing how the new execute query interface lets people get started with templated insert/update/delete queries from tables they have permission to edit:

The user starts on the data database page, selects actions and "Execute write SQL", then selects the insert document template on the next page and executes it with a title of "My document!". Also demonstrates that a create table statement cannot be executed because the user does not have create-table permission.

May 30, 2026

Datasette Lite is my version of Datasette that runs entirely in the browser using Pyodide in WebAssembly.

When I first built it four years ago I used Web Workers and code that intercepts navigation operations and fetches the generated HTML by running the Python app.

This worked, but had the disadvantage that any JavaScript in <script> tags would not be executed - breaking some Datasette functionality and a whole lot of Datasette plugins.

This morning I set Claude Opus 4.8 the task (in Claude Code for web) of figuring out how to run Python ASGI apps in Pyodide using Service Workers instead, and it seems to work! Here's a basic ASGI FastCGI demo and here's a demo that runs Datasette 1.0a31.

I'm still getting my head around exactly how it works, but once I've done that I plan to upgrade Datasette Lite itself.

My take on AI is, essentially, everybody who’s against it is too against it and everybody who’s for it is too for it.

Daniel Jalkut, via John Gruber

# 5:29 pm / john-gruber, ai

I Am Retiring from Tech to Live Offline (via) I've seen a lot of posts on forums from people threatening to quit their careers over AI. This is not one of those: Chad Whitacre is taking concrete steps, starting with this typewritten, scanned letter

I'm retiring from tech. Well, "retiring" is euphemistic. I'm stepping away from tech, and that includes Open Source. [...]

AI was the last straw. Have you heard of that island off India where the indigenous population kills any outsiders fool-hardy enough to land? They are doing the rest of us a favor by preserving a way of life we may need again someday, or at the very least should not want to see completely extinguished. A reminder. Never forget your roots. Here in Pennsylvania we have the Amish performing a similar function. Significantly less hostile, though still set apart, they bear witness to what was normal for all of us a couple short centuries ago: horse and buggy, wood stoves and lanterns. My intent is to be AI Amish, which means Internet Amish. Not 1780, but 1980. Neo-Amish. I'm fine driving a car and flipping a lightswitch, by which I mean that they don't make me into something I hate, which AI and [struck through: social media] [handwritten above: doomscrolling] do.

I'll admit that at first I wasn't entirely sure if this was serious. Then I found this earlier post by Chad from Feb 19 2026, Spitting Out the Agentic Kool-Aid:

I figured I’d better taste the Kool-Aid in order to form an opinion, so I dove into Claude Code with Opus 4.5 on a side project. I spent three 12+ hour days with it. I was intoxicated. My family was weirded out. [...]

It weirded me out too, when I unplugged for a long weekend. Something felt off. It was like I had another “person” in my head, sharing my inner monologue—but the “person” was a computer system owned by a budding megacorp.

[...] I am now also committing myself to disembarking from the titantic of technological accelerationism.

All efforts to address the problems of invasive technology are worthwhile, even those that are only partially effective. For my part, I have started trying to return more fully to a pre-screen, analog life.

It's accompanied by a video version of the essay which I found touching and sincere.

Chad has been trying to solve the open source sustainability problem for years - I talked with him about this at PyCon 2025 in Cleveland. That's a very tough nut to crack, and the disruption caused by AI looks to be making it even harder.

I'm glad that the Open Source Endowment will continue without him. I'm very much going to miss his online voice.

# 7:39 pm / open-source, ai, generative-ai, llms, chad-whitacre, ai-ethics, deep-blue

How we contain Claude across products. A complaint I often have about sandboxing products is that they are rarely thoroughly documented, and in the absence of detailed documentation it's hard to know how much I can trust them.

Anthropic just published a fantastic overview of how their various sandbox techniques work across Claude.ai, Claude Code, and Cowork.

We constrain where and how an agent can act with process sandboxes, VMs, filesystem boundaries, and egress controls. The goal is to set a hard boundary on what an agent can reach. For example, if credentials never enter the sandbox, they can't be exfiltrated, regardless of whether the cause is a user, a model finding a “creative” path, or an attacker.

Claude.ai uses gVisor. Claude Code, run locally, uses Seatbelt on macOS and Bubblewrap on Linux. Claude Cowork runs a full VM (Apple's Virtualization framework on macOS, HCS on Windows).

There's a lot in here, including some interesting stories of risks they missed such as the api.anthropic.com/v1/files exfiltration vector covered here previously.

This reminded me it's time I took another look at Anthropic's open source srt (Anthropic Sandbox Runtime) tool - it's mature enough now that I'm ready to give it a proper go.

# 9:36 pm / sandboxing, security, ai, generative-ai, llms, anthropic, claude, claude-code

Sighting 4:00 PM — California Sea Lion, in Pillar Point Harbor, CA, US
California Sea Lion
California Sea Lion

May 31, 2026

Anthropic defines “run-rate revenue” in two parts. Use the last 28 days of sales ⁠from customers charged on a consumption basis and multiply it by 13. Then, multiply the monthly subscription take by 12, ​and add the two together.

Karen Kwok for Reuters Breakingviews, citing "a person familiar with the matter"

# 1:48 am / ai, anthropic

The solution might be cancelling my AI subscription (via) I find this post by David Wilson very relatable. David lists 16+ projects he's spun up with AI tooling, and concludes:

I didn't mean to build most of these things. Usually the Claude session started with something like "write a quick script for X", and one hour later the result is not a quick script for X, nor in the usual case is my problem solved, whatever the original itch happened to be.

On that last point, this technology is horrific for attention. It's a thermonuclear ADHD amplifier and I have seen the same effect in every single one of my adult friends. Folk running 3 screens simultaneously working on totally unrelated "projects" they have little hope of maintaining, and such little commitment to the outcome that the time is obviously wasted.

This is a very real problem. I'm finding that coding agents can take me from a vague idea to a working solution, one with tests and documentation and that looks like a carefully considered project evolved over the course of many weeks... in less than an hour.

Even if the code is rock solid, there's a limit to how many projects like that I can sensibly care for - and if they're instantly abandoned, what value was there from creating them in the first place?

David doesn't think this is sustainable at all:

I have no idea how to manage AI at present except by curtailing use, because a tool producing a cheap reward with minimal input and no friction can only be a liability, and achieving that realisation is probably the only real contribution of AI to date.

I'm hopeful that the critical skill to develop here is discipline. That’s not great news for me: I’ve been trying to figure that one out for decades!

Interestingly, the Hacker News thread has gathered a number of comments from people with ADHD who are finding agents help them achieve the focus they've been missing:

  • "... for me (also ADHD) it's kind of the opposite. I'm finishing side projects for the first time ever because I can actually get them working before I get bored of them"
  • "As someone with ADHD I feel like AI is a salve for my mind. I used to listen to intense EDM while working. Now I sit in silence and talk to my agents. I maintain inbox zero. I absorb and comment across all relevant projects, even outside my team. I literally feel like I have a support team for the first time."
  • "For those of us prone to hyperfocus, working with AI can provide the kinds of stimulation we crave. I can hardly remember a time when I've felt more engaged with my work, more productive, and more badass."

# 4:31 pm / productivity, ai, generative-ai, llms, coding-agents, ai-misuse

Sighting 2:52 PM – 2:55 PM — California Brown Pelican, Pacific Harbor Seal, in Monterey Bay National Marine Sanctuary, CA, US, CA
California Brown Pelican
California Brown Pelican
California Brown Pelican
California Brown Pelican
Pacific Harbor Seal
Pacific Harbor Seal
California Brown Pelican
California Brown Pelican

A minor bugfix release. Fixes a bug with INSERT ... RETURNING queries via the new /db/-/execute-write endpoint and a bunch of base_url issues which showed up when I was experimenting with Service Workers yesterday.