Simon Willison’s Weblog

On ai 2030 ai-ethics 303 agentic-engineering 51 python 1250 anthropic 283 ...

 

Entries Links Quotes Notes Guides Elsewhere

May 21, 2026

Datasette Agent

Visit Datasette Agent

We just announced the first release of Datasette Agent, a new extensible AI assistant for Datasette. I’ve been working on my LLM Python library for just over three years now, and Datasette Agent represents the moment that LLM and Datasette finally come together. I’m really excited about it!

[... 659 words]

A Datasette Agent plugin for running commands in a Fly Sprites sandbox.

  • "View SQL query" buttons below rendered charts.
  • "View SQL query" buttons for both visible tables and collapsed SQL result tool calls.
  • Don't display empty reasoning chunks
  • Improved handling of truncated responses - table still displays to the user even if the SQL results were truncated when showing the agent.

See Datasette Agent, an extensible AI assistant for Datasette.

May 20, 2026

We have the ability to use compute resources to support our proprietary AI applications (such as Grok 5, which is currently being trained at COLOSSUS II), while also providing access to select compute capacity to third-party customers. For example, in May 2026, we entered into Cloud Services Agreements with Anthropic PBC (“Anthropic”), an AI research and development public benefit corporation, with respect to access to compute capacity across COLOSSUS and COLOSSUS II. Pursuant to these agreements, the customer has agreed to pay us $1.25 billion per month through May 2029, with capacity ramping in May and June 2026 at a reduced fee. The agreements may be terminated by either party upon 90 days’ notice.

SpaceX S-1, highlights mine

# 10:26 pm / anthropic, grok, generative-ai, ai, llms

How fast is 10 tokens per second really? (via) Neat little HTML app by Mike Veerman (source code here) which simulates LLM token output speeds from 5/second to 800/second.

Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like.

# 5:57 pm / ai, generative-ai, llms

Sighting 9:20 AM – 10:00 AM — Common Loon, Canada Goose, Striped Shore Crab, California Brown Pelican, California Sea Lion, in Monterey Bay National Marine Sanctuary, CA, US, CA
Canada Goose
Canada Goose
Common Loon
Common Loon
Striped Shore Crab
Striped Shore Crab
California Brown Pelican
California Brown Pelican
California Brown Pelican
California Brown Pelican
California Sea Lion
California Sea Lion
Common Loon
Common Loon

It's hard to find much to write about Google I/O this year because I have a policy of not writing about anything that I can't try out myself, and a lot of the big announcements are "coming soon".

I actually prefer to write about things that are in general availability, because I've had instances in the past where the previews didn't match what was released to the general public later on.

Aside from Gemini 3.5 Flash the most interesting announcement looks to be Google's upcoming OpenClaw competitor Gemini Spark, described as "your personal AI agent" which can "connect natively with your favorite Google apps like Gmail, Calendar, Drive, Docs, Sheets, Slides, YouTube, and Google Maps". The FAQ for that also includes this confusing detail:

What Gemini model does Gemini Spark run on?

Gemini Spark runs on Gemini 3.5 Flash and Antigravity.

The antigravity.google website currently lists Antigravity as a desktop app, a CLI agent tool (written in Go), the Antigravity SDK (an open source Python wrapper around a bundled closed source Go binary), and the original Antigravity IDE (a VS Code fork).

I guess Gemini Spark, the user-facing hosted agent product, might be running on that Go binary, but I'm not sure why that's worth mentioning in the FAQ!

Naturally I went looking for notes on how Gemini Spark intends to handle the risk of prompt injection. The best information I could find on that was in the Everything Google Cloud customers need to know coming out of Google I/O post aimed at enterprise customers, which includes:

Spark operates in a fully managed, secure runtime on Google Cloud, meaning you get enterprise-grade security without ever having to manage the underlying infrastructure. Every task executes in a fresh, strictly isolated, ephemeral VM to help ensure data never overlaps between sessions. To protect your enterprise, all traffic routes through our secure Agent Gateway that enforces Data Loss Prevention (DLP) policies, while user credentials remain fully encrypted and are never exposed directly to the agent.

Given how many people are going to be piping very sensitive data through Gemini Spark in the near future I hope they've made this bullet-proof, or this could be a top candidate for the agent security challenger disaster that we still haven't seen.

Also of note: in Transitioning Gemini CLI to Antigravity CLI Google announce that the open source Gemini CLI tool (Apache 2.0 licensed TypeScript) will stop working with their AI subscription plans on June 18th, replaced by the new closed source Antigravity CLI.

# 3:32 pm / gemini, google, generative-ai, ai, google-io, llms, prompt-injection

  • More color! Bar and waffle charts without a color column are shaded by magnitude with a sequential color scheme; color columns holding text values use the observable10 categorical scheme. #2
  • Now checks execute-sql permission before running the query to find the column names.
  • Charts now display interactive tooltips.
  • Fixed a bug where waffleY charts were not described to the agent.
Sighting 5:28 PM — Surf Scoter, in Monterey Bay National Marine Sanctuary, CA, US, CA
Surf Scoter
Surf Scoter

May 19, 2026

See also my notes on Gemini 3.5 Flash, and the pelican I drew using this upgrade to the plugin.

Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

Visit Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

Today at Google I/O, Google released Gemini 3.5 Flash. This one skipped the -preview modifier and went straight to general availability, and Google appear to be using it for a whole lot of their key products:

[... 610 words]

  • Compatible with llm>=0.32a0 alpha - adds the ability to stream reasoning tokens.
  • Fix for bug where llm_prompt_context() hook did not fully collect chains of responses. #7

The last six months in LLMs in five minutes

Visit The last six months in LLMs in five minutes

I put together these annotated slides from my five minute lightning talk at PyCon US 2026, using the latest iteration of my annotated presentation tool.

[... 2,061 words]

May 18, 2026

Sighting 7:51 AM – 8:13 AM — Glaucous-winged Gull, Brown Pelican, Snowy Egret, Canada Goose, in Los Angeles River, CA, US
Glaucous-winged Gull
Glaucous-winged Gull
Glaucous-winged Gull
Glaucous-winged Gull
Brown Pelican
Brown Pelican
Snowy Egret
Snowy Egret
Snowy Egret
Snowy Egret
Canada Goose
Canada Goose
Canada Goose
Canada Goose

I'm heading home from PyCon US today so I went on a last morning walk to try and spot a pelican. I saw one! Didn't get a great photo of that, but I did see some goslings down by the swan boat lake.

May 17, 2026

GDS weighs in on the NHS’s decision to retreat from Open Source. Terence Eden continues his coverage of the NHS' poorly considered decision to close down access to their open source repositories in response to vulnerabilities reported to them as part of Project Glasswing.

Now the Government Digital Service have joined the conversation with AI, open code and vulnerability risk in the public sector, published May 14th. Their key recommendation:

Keep open by default. Making everything private adds additional delivery and policy costs, and can reduce reuse and scrutiny. Openness should remain the default posture, with closure used sparingly and deliberately.

While they don't mention the NHS by name, Terence speaks the language of the civil service and interprets this as a major escalation:

Within the UK's Civil Service you occasionally hear the expression "being invited to a meeting without biscuits". It implies a rather frosty discussion without any of the polite niceties of a normal meeting. In general though, even when people have severe disagreements, it is rare for tempers to fray. It is even rarer for those internal disagreements to spill over into public.

# 3:59 pm / open-source, security, ai, generative-ai, llms, gov-uk, terence-eden, ai-ethics, ai-security-research

May 16, 2026

In preparation for a lightning talk I'm giving at PyCon US this afternoon I decided to figure out how many names OpenClaw has actually had since that first commit back in November.

Thanks to this first_line_history.py tool (code here) the answer, according to the Git history of the OpenClaw README, is:

Warelay → CLAWDIS → CLAWDBOT → Clawdbot → Moltbot →🦞 OpenClaw

Or in detail (the output from the tool):

2025-11-24T11:23:15+01:00 16dfc1a # Warelay — WhatsApp Relay CLI (Twilio)
2025-11-24T11:41:37+01:00 d4153da # 📡 Warelay — WhatsApp Relay CLI (Twilio)
2025-11-24T17:47:57+01:00 343ef9b # 📡 warelay — WhatsApp Relay CLI (Twilio)
2025-11-25T04:44:10+01:00 14b3c6f # 📡 warelay — WhatsApp Relay CLI
2025-11-25T12:48:40+01:00 4814021 # 📡 warelay — Send, receive, and auto-reply on WhatsApp—Twilio-backed or QR-linked.
2025-11-25T13:50:18+01:00 d51a3e9 # warelay 📡 - Send, receive, and auto-reply on WhatsApp via Twilio or QR-linked WhatsApp Web; webhook setup in one command
2025-11-25T13:51:13+01:00 4d2a8a8 # 📡 warelay — Send, receive, and auto-reply on WhatsApp—Twilio-backed or QR-linked.
2025-11-25T14:52:43+01:00 1ef7f4d # 📡 warelay — Send, receive, and auto-reply on WhatsApp.
2025-12-03T15:45:32+00:00 a27ee23 # 🦞 CLAWDIS — WhatsApp Gateway for AI Agents
2025-12-08T12:43:13+01:00 17fa2f4 # 🦞 CLAWDIS — WhatsApp & Telegram Gateway for AI Agents
2025-12-19T18:41:17+01:00 7710439 # 🦞 CLAWDIS — Personal AI Assistant
2026-01-04T14:32:47+00:00 246adaa # 🦞 CLAWDBOT — Personal AI Assistant
2026-01-10T05:14:09+01:00 cdb915d # 🦞 Clawdbot — Personal AI Assistant
2026-01-27T13:37:47-05:00 3fe4b25 # 🦞 Moltbot — Personal AI Assistant
2026-01-30T03:15:10+01:00 9a71607 # 🦞 OpenClaw — Personal AI Assistant

# 8:23 pm / openclaw, git, tools

[...] in the last 10 years I’ve learned to really love and respect CSS as a technology.

So I decided years ago that I wanted to react to “CSS is hard” by getting better at CSS and taking it seriously as a technology, instead of devaluing it. Doing that changed everything for me: I learned that so many of my frustrations (“centering is impossible”) had been addressed in CSS a long time ago, and that also what “centering” means is not always straightforward and it makes sense that there are many ways to do it. CSS is hard because it’s solving a hard problem!

Julia Evans, Moving away from Tailwind, and learning to structure my CSS

# 4:45 pm / css, julia-evans

May 15, 2026

Part of the infrastructure I use for publishing my iNaturalist sightings on my blog. I've been running this in production for a few weeks now, inspiring some iterations on how it works, so I decided to ship a 0.1 release.

You can see an example of the output in this JSON file.

Sighting 7:42 AM – 7:47 AM — Western Gull, Rock Pigeon, in Los Angeles Area (custom), CA, US
Western Gull
Western Gull
Rock Pigeon
Rock Pigeon

I went for a bird walk in the morning before PyCon, and we spotted a local seagull enjoying a Starbucks.

Claude helped me build this tool for creating QR codes, for both text/URLs and for connecting to WiFi networks.

Screenshot of a QR code generator web form. Heading "QR code generator" with subtitle "Create a scannable code for a URL, text, or WiFi network." A segmented toggle shows "URL / text" and "WiFi" with WiFi selected. Below are fields: "Network name (SSID)" with placeholder "My WiFi"; "Password" with placeholder "Password" and a blue "Show" link; "Security" dropdown set to "WPA / WPA2 / WPA3 (most common)"; an unchecked "Hidden" checkbox; helper text "Not sure? Leave it on WPA / WPA2 / WPA3 — that covers almost every home WiFi network." Below that: "Style" dropdown set to "Square", an unchecked "Border" checkbox, "Size" dropdown set to "Medium", and a "Color" swatch showing black. At the bottom is a blue "Generate QR code" button.

This plugin works in conjunction with datasette-llm and datasette-llm-accountant to let you configure a per-user (or global) spending limit for LLM usage inside of Datasette. Configuration looks something like this:

plugins:
  datasette-llm-limits:
    limits:
      per-user-daily:
        scope: actor
        window: rolling-24h
        amount_usd: 1.00
  • Tool availability can now be attached to a required_permission. The default background agent tools now require the new datasette-agent-background permission. #10

May 14, 2026

This Mitchell Hashimoto quote about Bun migrating from Zig to Rust reminded me of a similar conversation I had at a conference last week.

I was talking to someone who worked for a medium sized technology company with a pair of legacy/legendary iPhone and Android apps.

They told me they had just completed a coding-agent driven rewrite of both apps to React Native.

I asked why they chose that, given that coding agents presumably drive down the cost of maintaining separate iPhone and Android apps.

They said that React Native has improved a lot over the past few years and covered everything their apps needed to do.

And... if it turned out to be the wrong decision, they could just port back to native in the future.

Like Mitchell said:

Programming languages used to be LOCK IN, and they're increasingly not so.

# 10:53 pm / react, coding-agents, ai-assisted-programming, generative-ai, ai, llms

[...] On the interesting side is how fungible programming languages are nowadays. Programming languages used to be LOCK IN, and they're increasingly not so. You think the Bun rewrite in Rust is good for Rust? Bun has shown they can be in probably any language they want in roughly a week or two. Rust is expendable. Its useful until its not then it can be thrown out. That's interesting!

Mitchell Hashimoto, on Bun porting from Zig to Rust

# 10:31 pm / zig, ai, mitchell-hashimoto, llms, rust, generative-ai, agentic-engineering, bun

  • Now uses the execute-sql permission when deciding which tables to list to the user. #8

The datasette.io site was being hammered by poorly-behaved crawlers, so I had Codex (GPT-5.5 xhigh) build a configurable rate limiting plugin to block IPs that were hammering specific areas of the site too quickly.

Here's the production configuration I'm using on that site for the new plugin:

  datasette-ip-rate-limit:
    header: Fly-Client-IP
    max_keys: 10000
    exempt_paths:
    - "/static/*"
    - "/-/turnstile*"
    rules:
    - name: demo-databases
      paths:
      - "/global-power-plants/*"
      - "/legislators/*"
      window_seconds: 60
      max_requests: 60
      block_seconds: 20
Sighting 5:20 PM — American Crow, in Stanford University, CA, US
American Crow
American Crow
American Crow
American Crow

May 13, 2026

Welcome to the Datasette blog. We have a bunch of neat Datasette announcements in the pipeline so we decided it was time the project grew an official blog.

I built this using OpenAI Codex desktop, which turns out to have the Markdown session transcript export feature I've always wanted. Here's the session that built the blog. See also issue 179.

# 11:59 pm / ai, datasette, generative-ai, llms, ai-assisted-programming, codex

Sighting 10:41 AM – 10:44 AM — Surf Scoter, Western Gull, in Monterey Bay National Marine Sanctuary, CA, US, CA
Surf Scoter
Surf Scoter
Western Gull
Western Gull

“11 AI agents” is meaningless as a phrase.

If I said “I have 11 spreadsheets” or “I have 11 browser tabs” to do my work, it means about the same thing.

Boris Mann

# 4:15 pm / ai-agents, ai, agent-definitions

An experiment that shows that you can load an app in a CSP-protected sandboxed iframe (see previous note) and have a custom fetch() that intercepts CSP errors and passes them up to the parent window... which can then prompt the user to add that domain to an allow-list and then refresh the page.

Screenshot of a web tool titled "CSP Allow-list Experiment" with buttons Reset sample, Clear allow-list, Refresh preview. Left panel shows HTML source code starting with <!doctype html>. Right panel shows Preview with CSP header default-src 'none'; script-src 'unsafe-inline'; style-s... and heading "Sandbox fetch test". A modal dialog from tools.simonwillison.net is overlaid reading: "The sandbox tried to connect to: https://api.inaturalist.org   Add this origin to the CSP connect-src allow-list and refresh the page?" with an unchecked checkbox "Don't allow tools.simonwillison.net to prompt you again" and Cancel and OK buttons. Below is "Messages from sandbox" showing fetch-catch blocked https://api.inaturalist.org/v1/observations?per... connect-src · https://api.inaturalist.org. At the bottom left is "Allowed fetch() origins" with an input field containing https://api.github.com, an Add button, and a tag https://api.github.com x.

I built this one with GPT-5.5 xhigh running in the Codex desktop app.

Sighting 5:15 PM – 5:21 PM — Cedar Waxwing, California Scrub-Jay, in San Mateo County, CA, US
Cedar Waxwing
Cedar Waxwing
Cedar Waxwing
Cedar Waxwing
California Scrub-Jay
California Scrub-Jay

May 12, 2026

  • New TokenRestrictions.abbreviated(datasette) utility method for creating "_r" dictionaries. #2695
  • Table headers and column options are now visible even if a table contains zero rows. #2701
  • Fixed bug with display of column actions dialog on Mobile Safari. #2708
  • Fixed bug where tests could crash with a segfault due to a race condition between Datasette.close() and Database.close(). #2709

That segfault bug was gnarly. I added a mechanism to Datasette recently that would automatically close connections at the end of each test, but it turned out that introduced a race condition where an in-flight query could sometimes be executing in a thread against a connection while it was being closed. I ended up solving that by having Codex CLI (with GPT-5.5 xhigh) create a minimal Dockerfile that recreated the bug.

Now, if your CEO has never heard the phrase Ralph Loop, oh man, you are less than 30 days away from your next promotion. I'm not even exaggerating. Walk into his office, close the door, and say, hey chief, been experimenting with something. It's called Ralph Loops. And I think it could change literally everything. And he's gonna say, what's a Ralph loop? And you will say, give me $18,000 worth of API credits and I'll show you. Now you won't actually do anything, because you can't do anything. Because nobody can, because nobody knows what they're doing. But by the time he figures that out, you'll have a new title, and equity bump. [...]

Talk about automation constantly. Nothing arouses the slumbering capitalists than the mention of automation. Drop names too, bro. Like talk about specific team members you can automate out of existence. Be like, yo, I automated Gary, bro. Tag Gary in the message. Tag him in Slack in a very public channel. Be like, yo, I just automated @Gary. His function has been Ralph Looped. And tag your CEO in the same message. You think you're getting laid off after that?

Mo Bitar, The Unethical Guide to Surviving AI Layoffs, TikTok

# 10:59 pm / ai-ethics, tiktok, careers, ai

Highlights

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe