<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/atom/everything/" rel="self"/><id>http://simonwillison.net/</id><updated>2026-05-09T01:03:58+00:00</updated><author><name>Simon Willison</name></author><entry><title>Quoting Luke Curley</title><link href="https://simonwillison.net/2026/May/9/luke-curley/#atom-everything" rel="alternate"/><published>2026-05-09T01:03:58+00:00</published><updated>2026-05-09T01:03:58+00:00</updated><id>https://simonwillison.net/2026/May/9/luke-curley/#atom-everything</id><summary type="html">
    &lt;blockquote cite="https://moq.dev/blog/webrtc-is-the-problem/"&gt;&lt;p&gt;WebRTC is designed to &lt;strong&gt;degrade and drop my prompt&lt;/strong&gt; during poor network conditions.&lt;/p&gt;
&lt;p&gt;wtf my dude&lt;/p&gt;
&lt;p&gt;WebRTC aggressively drops audio packets to keep latency low. If you’ve ever heard distorted audio on a conference call, that’s WebRTC baybee. The idea is that conference calls depend on rapid back-and-forth, so pausing to wait for audio is unacceptable.&lt;/p&gt;
&lt;p&gt;…but as a user, I would much rather wait an extra 200ms for my slow/expensive prompt to be accurate. After all, I’m paying good money to boil the ocean, and a garbage prompt means a garbage response. It’s not like LLMs are particularly responsive anyway.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;But I’m not allowed to wait&lt;/strong&gt;. It’s &lt;em&gt;impossible&lt;/em&gt; to even retransmit a WebRTC audio packet within a browser; we tried at Discord. The &lt;em&gt;implementation&lt;/em&gt; is hard-coded for real-time latency &lt;strong&gt;or else&lt;/strong&gt;.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://moq.dev/blog/webrtc-is-the-problem/"&gt;Luke Curley&lt;/a&gt;, OpenAI’s WebRTC Problem, in response to &lt;a href="https://openai.com/index/delivering-low-latency-voice-ai-at-scale/"&gt;How OpenAI delivers low-latency voice AI at scale&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/webrtc"&gt;webrtc&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;&lt;/p&gt;



</summary><category term="webrtc"/><category term="openai"/></entry><entry><title>Using Claude Code: The Unreasonable Effectiveness of HTML</title><link href="https://simonwillison.net/2026/May/8/unreasonable-effectiveness-of-html/#atom-everything" rel="alternate"/><published>2026-05-08T21:00:11+00:00</published><updated>2026-05-08T21:00:11+00:00</updated><id>https://simonwillison.net/2026/May/8/unreasonable-effectiveness-of-html/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://twitter.com/trq212/status/2052809885763747935"&gt;Using Claude Code: The Unreasonable Effectiveness of HTML&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Thought-provoking piece by Thariq Shihipar (on the Claude Code team at Anthropic) advocating for HTML over Markdown as an output format to request from Claude.&lt;/p&gt;
&lt;p&gt;The article is crammed with interesting examples (collected on &lt;a href="https://thariqs.github.io/html-effectiveness/"&gt;this site&lt;/a&gt;) and prompt suggestions like this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Help me review this PR by creating an HTML artifact that describes it. I'm not very familiar with the streaming/backpressure logic so focus on that. Render the actual diff with inline margin annotations, color-code findings by severity and whatever else might be needed to convey the concept well.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I've been defaulting to asking for most things in Markdown since the GPT-4 days, when the 8,192 token limit meant that Markdown's token-efficiency over HTML was extremely worthwhile.&lt;/p&gt;
&lt;p&gt;Thariq's piece here has caused me to reconsider that, especially for output. Asking Claude for an explanation in HTML means it can drop in SVG diagrams, interactive widgets, in-page navigation and all sorts of other neat ways of making the information more pleasant to navigate.&lt;/p&gt;
&lt;p&gt;I wrote about &lt;a href="https://simonwillison.net/2025/Dec/10/html-tools/"&gt;Useful patterns for building HTML tools&lt;/a&gt; last December, but that was focused very much on interactive utilities like the ones on my &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; site. I'm excited to start experimenting more with rich HTML explanations in response to ad-hoc prompts.&lt;/p&gt;
&lt;h4 id="trying-this-out"&gt;Trying this out on copy.fail&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://copy.fail/"&gt;copy.fail&lt;/a&gt; describes a recently discovered Linux security exploit, including a proof of concept distributed as obfuscated Python.&lt;/p&gt;
&lt;p&gt;I tried having GPT-5.5 create an HTML explanation of the exploit like this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;curl https://copy.fail/exp | llm -m gpt-5.5 -s 'Explain this code in detail. Reformat it, expand out any confusing bits and go deep into what it does and how it works. Output HTML, neatly styled and using capabilities of HTML and CSS and JavaScript to make the explanation rich and interactive and as clear as possible'&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://gisthost.github.io/?ae53e3461ffdbfd0826156aacf025c7e"&gt;the resulting HTML page&lt;/a&gt;. It's pretty good, though I should have emphasized explaining the exploit over the Python harness around it.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a dark-themed technical document titled &amp;quot;What this Python script does&amp;quot;. Body text: &amp;quot;This is a compact, deliberately obfuscated Linux-specific local privilege-escalation proof-of-concept. Its apparent goal is to tamper with the in-memory image/page cache of /usr/bin/su, then execute su to obtain elevated privileges.&amp;quot; A yellow-bordered callout reads: &amp;quot;Safety note: This explanation is for code understanding, reverse engineering, and defensive analysis. Do not run this on systems you do not own or administer. On a vulnerable kernel, code like this can alter the behavior of a privileged executable.&amp;quot; Left column heading &amp;quot;High-level summary&amp;quot;: &amp;quot;The script opens /usr/bin/su read-only, decompresses an embedded binary payload, and then processes that payload in 4-byte chunks. For each chunk, it performs a carefully arranged sequence involving Linux's kernel crypto socket interface, AF_ALG, pipes, and splice(). The important point is that this is not ordinary file writing. It never calls write() on /usr/bin/su. Instead, it appears to rely on a kernel bug/primitive involving spliced file pages and the crypto API to get controlled bytes placed into the page-cache representation of a privileged executable.&amp;quot; Numbered steps follow: &amp;quot;1. Open target executable — /usr/bin/su is opened read-only. 2. Decode hidden payload — A zlib-compressed hex blob is decompressed into bytes. 3. Patch in 4-byte chunks — The helper function is called repeatedly with offsets 0, 4, 8, ...&amp;quot;. Right column heading &amp;quot;Why it looks strange&amp;quot; contains a table with Pattern and Purpose columns: &amp;quot;import os as g — Short aliasing to make the script compact and harder to read. socket(38, 5, 0) — Uses raw numeric Linux constants instead of readable names. Compressed hex blob — Hides binary payload bytes and keeps the script small. 
splice() — Moves file-backed pages through pipes without normal user-space copying. try: recv(...) except: 0 — Triggers the kernel operation and ignores expected errors.&amp;quot;" src="https://static.simonwillison.net/static/2026/python-script-explainer.jpg" /&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/html"&gt;html&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/markdown"&gt;markdown&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="html"/><category term="security"/><category term="markdown"/><category term="ai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="llm"/><category term="claude-code"/></entry><entry><title>llm-gemini 0.31</title><link href="https://simonwillison.net/2026/May/7/llm-gemini/#atom-everything" rel="alternate"/><published>2026-05-07T19:57:06+00:00</published><updated>2026-05-07T19:57:06+00:00</updated><id>https://simonwillison.net/2026/May/7/llm-gemini/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm-gemini/releases/tag/0.31"&gt;llm-gemini 0.31&lt;/a&gt;&lt;/p&gt;
        &lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;gemini-3.1-flash-lite&lt;/code&gt; is &lt;a href="https://cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available"&gt;no longer a preview&lt;/a&gt;. &lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's my write-up of the &lt;a href="https://simonwillison.net/2026/Mar/3/gemini-31-flash-lite/"&gt;Gemini 3.1 Flash-Lite Preview model&lt;/a&gt; back in March. I don't believe this new non-preview model has changed since then.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gemini"&gt;gemini&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm-release"/><category term="gemini"/><category term="llm"/><category term="google"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Big Words</title><link href="https://simonwillison.net/2026/May/7/big-words/#atom-everything" rel="alternate"/><published>2026-05-07T18:47:09+00:00</published><updated>2026-05-07T18:47:09+00:00</updated><id>https://simonwillison.net/2026/May/7/big-words/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/big-words"&gt;Big Words&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I'm using my &lt;a href="https://simonwillison.net/2026/Feb/25/present/"&gt;vibe coded macOS presentations tool&lt;/a&gt; to put together a talk, and I wanted to add a slide with some text on it. The tool only accepts URLs, so I &lt;a href="https://github.com/simonw/tools/pull/279"&gt;put together&lt;/a&gt; a quick page that accepts query string arguments and turns them into a simple slide.&lt;/p&gt;
&lt;p&gt;Here's an example: &lt;a href="https://tools.simonwillison.net/big-words?text=simonwillison.net&amp;amp;gradient=1&amp;amp;size=9.5"&gt;https://tools.simonwillison.net/big-words?text=simonwillison.net&amp;amp;gradient=1&amp;amp;size=9.5&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Double click or double tap the page to access a form for modifying the different options.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a slide editing tool showing a slide on the left with &amp;quot;simonwillison.net&amp;quot; in heavy white sans-serif text on a black-to-blue gradient background, and a &amp;quot;Slide settings&amp;quot; panel on the right with: TEXT field containing &amp;quot;simonwillison.net&amp;quot;, TEXT COLOR white, BACKGROUND black, &amp;quot;Use gradient background&amp;quot; checked, SECOND COLOR blue, ANGLE 135°, FONT &amp;quot;System sans-seri&amp;quot;, WEIGHT &amp;quot;Heavy&amp;quot;, SIZE 9.5vmin, unchecked Italic / Uppercase / Drop shadow checkboxes, and Reset and Save URL buttons." src="https://static.simonwillison.net/static/2026/big-words.jpg" /&gt;&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="vibe-coding"/><category term="tools"/></entry><entry><title>Behind the Scenes Hardening Firefox with Claude Mythos Preview</title><link href="https://simonwillison.net/2026/May/7/firefox-claude-mythos/#atom-everything" rel="alternate"/><published>2026-05-07T17:56:25+00:00</published><updated>2026-05-07T17:56:25+00:00</updated><id>https://simonwillison.net/2026/May/7/firefox-claude-mythos/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/"&gt;Behind the Scenes Hardening Firefox with Claude Mythos Preview&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Fascinating, in-depth details on how Mozilla used their access to the Claude Mythos preview to locate and then fix hundreds of vulnerabilities in Firefox:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Suddenly, the bugs are very good&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Just a few months ago, AI-generated security bug reports to open source projects were mostly known for being unwanted slop. Dealing with reports that look plausibly correct but are wrong imposes an asymmetric cost on project maintainers: it’s cheap and easy to prompt an LLM to find a “problem” in code, but slow and expensive to respond to it.&lt;/p&gt;
&lt;p&gt;It is difficult to overstate how much this dynamic changed for us over a few short months. This was due to a combination of two main factors. First, the models got a lot more capable. Second, we dramatically improved our techniques for &lt;em&gt;harnessing&lt;/em&gt; these models — steering them, scaling them, and stacking them to generate large amounts of signal and filter out the noise.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They include some detailed bug descriptions too, including a 20-year-old XSLT bug and a 15-year-old bug in the &lt;code&gt;&amp;lt;legend&amp;gt;&lt;/code&gt; element.&lt;/p&gt;
&lt;p&gt;A lot of the attempts made by the harness were blocked by Firefox's existing defense-in-depth measures, which is reassuring.&lt;/p&gt;
&lt;p&gt;Mozilla were fixing around 20-30 security bugs in Firefox per month through 2025. That jumped to 423 in April.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Bar chart titled &amp;quot;Firefox Security Bug Fixes by Month&amp;quot; with subtitle &amp;quot;All Sources • All Severities&amp;quot; on a dark purple background, showing monthly counts: Jan 2025: 21, Feb 2025: 20, Mar 2025: 26, Apr 2025: 31, May 2025: 17, Jun 2025: 21, Jul 2025: 22, Aug 2025: 17, Sep 2025: 18, Oct 2025: 26, Nov 2025: 19, Dec 2025: 20, Jan 2026: 25, Feb 2026: 61, Mar 2026: 76, Apr 2026: 423 — a dramatic spike in the final month." src="https://static.simonwillison.net/static/2026/firefox-security.webp" /&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://lobste.rs/s/7zppv1/behind_scenes_hardening_firefox_with"&gt;Lobste.rs&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/firefox"&gt;firefox&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mozilla"&gt;mozilla&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;&lt;/p&gt;



</summary><category term="firefox"/><category term="mozilla"/><category term="security"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-security-research"/></entry><entry><title>Notes on the xAI/Anthropic data center deal</title><link href="https://simonwillison.net/2026/May/7/xai-anthropic/#atom-everything" rel="alternate"/><published>2026-05-07T17:09:28+00:00</published><updated>2026-05-07T17:09:28+00:00</updated><id>https://simonwillison.net/2026/May/7/xai-anthropic/#atom-everything</id><summary type="html">
    &lt;p&gt;There weren't a lot of big new announcements from Anthropic at yesterday's Code w/ Claude event, but the biggest by far was the deal they've struck with SpaceX/xAI to use "all of the capacity of their Colossus data center".&lt;/p&gt;
&lt;p&gt;As I mentioned in my &lt;a href="https://simonwillison.net/2026/May/6/code-w-claude-2026/"&gt;live blog of the keynote&lt;/a&gt;, that's the one with the &lt;a href="https://www.politico.com/news/2025/05/06/elon-musk-xai-memphis-gas-turbines-air-pollution-permits-00317582"&gt;particularly bad environmental record&lt;/a&gt;. The gas turbines installed to power the facility initially ran without Clean Air Act permits or pollution control devices, which they got away with by classifying them as "temporary". Credible reports link it to increases in hospital admissions relating to low air quality.&lt;/p&gt;
&lt;p&gt;Andy Masley, one of the most prolific voices pushing back against misleading rhetoric about data centers (see &lt;a href="https://blog.andymasley.com/p/the-ai-water-issue-is-fake"&gt;The AI water issue is fake&lt;/a&gt; and &lt;a href="https://blog.andymasley.com/p/data-center-land-use-issues-are-fake"&gt;Data center land issues are fake&lt;/a&gt;), had &lt;a href="https://x.com/andymasley/status/2052070252930826384"&gt;this to say&lt;/a&gt; about Colossus:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would simply not run my computing out of this specific data center&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I get that Anthropic are severely compute-constrained, but in a world where the very existence of "AI data centers" is a red-hot political issue (see recent &lt;a href="https://kutv.com/news/local/amid-boos-box-elder-county-commission-unanimously-approves-plan-for-massive-data-center"&gt;news out of Utah&lt;/a&gt; for a fresh example), signing up with this particular data center is a really bad look.&lt;/p&gt;
&lt;p&gt;There was a lot of initial chatter about how this meant xAI were clearly giving up on their own Grok models, since all of their capacity would be sold to Anthropic instead. That was a misconception - Anthropic are getting Colossus 1, but xAI are keeping their larger Colossus 2 data center for their own work.&lt;/p&gt;
&lt;p&gt;As an interesting side note, the night before the Anthropic announcement, xAI sent out a deprecation notice for Grok 4.1 Fast and several other models providing just two weeks' notice before shutdown, reported here &lt;a href="https://twitter.com/xlr8harder/status/2051901091906834439"&gt;by @xlr8harder&lt;/a&gt; from SpeechMap:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/grok-fast-shutdown.png" alt="Effective May 15, 2026 at 12:00pm PT, the following models will be retired from the xAI API: grok-4-1-fast-reasoning, grok-4-1-fast-non-reasoning, grok-4-fast-reasoning, grok-4-fast-non-reasoning, grok-4-0709, grok-code-fast-1, grok-3, grok-imagine-image-pro. After May 15, 2026, requests to these models will no longer work." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;This is terrible @xai. I just spent time and money to migrate to grok 4.1 fast, and you're disabling it with less than two weeks notice, after releasing it in November, with no migration path to a fast/cheap alternative.&lt;/p&gt;
&lt;p&gt;I will never depend on one of your products again.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Here's &lt;a href="https://speechmap.substack.com/p/speechmap-update-xai-loses-top-spot"&gt;SpeechMap's detailed explanation&lt;/a&gt; of how they selected Grok 4.1 Fast for their project in March.&lt;/p&gt;
&lt;p&gt;Were xAI serving those models out of Colossus 1?&lt;/p&gt;
&lt;p&gt;xAI owner Elon Musk (who previously delighted in calling Anthropic &lt;a href="https://twitter.com/search?q=from%3Aelonmusk+misanthropic&amp;amp;src=typed_query&amp;amp;f=live"&gt;"Misanthropic"&lt;/a&gt;) &lt;a href="https://twitter.com/elonmusk/status/2052069691372478511"&gt;tweeted&lt;/a&gt; the following:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;By way of background for those who care, I spent a lot of time last week with senior members of the Anthropic team to understand what they do to ensure Claude is good for humanity and was impressed. [...]&lt;/p&gt;
&lt;p&gt;After that, I was ok leasing Colossus 1 to Anthropic, as SpaceXAI had already moved training to Colossus 2.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And then &lt;a href="https://twitter.com/elonmusk/status/2052076315306864756"&gt;shortly afterwards&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Just as SpaceX launches hundreds of satellites for competitors with fair terms and pricing, we will provide compute to AI companies that are taking the right steps to ensure it is good for humanity.&lt;/p&gt;
&lt;p&gt;We reserve the right to reclaim the compute if their AI engages in actions that harm humanity.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Presumably the criteria for "harm humanity" are decided by Elon himself. Sounds like a new form of supply chain risk for Anthropic to me!&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-energy-usage"&gt;ai-energy-usage&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/xai"&gt;xai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/andy-masley"&gt;andy-masley&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="llms"/><category term="anthropic"/><category term="ai-ethics"/><category term="ai-energy-usage"/><category term="xai"/><category term="andy-masley"/></entry><entry><title>GitHub Repo Stats</title><link href="https://simonwillison.net/2026/May/7/github-repo-stats/#atom-everything" rel="alternate"/><published>2026-05-07T07:25:14+00:00</published><updated>2026-05-07T07:25:14+00:00</updated><id>https://simonwillison.net/2026/May/7/github-repo-stats/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/github-repo-stats"&gt;GitHub Repo Stats&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;One of the things I always look for when evaluating a new GitHub repository is the number of commits it has... but that number isn't visible on GitHub's mobile site layout. I built this tool to fix that, using this prompt:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Given a GitHub repo URL or foo/bar repo ID show information about that repo absorbed via wither REST or graphql CORS fetch() including the number of commits in the repo and other useful stats&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Example output for &lt;a href="https://tools.simonwillison.net/github-repo-stats?repo=simonw%2Fdatasette"&gt;simonw/datasette&lt;/a&gt; and &lt;a href="https://tools.simonwillison.net/github-repo-stats?repo=simonw%2Fllm"&gt;simonw/llm&lt;/a&gt;.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/github"&gt;github&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="github"/></entry><entry><title>Live blog: Code w/ Claude 2026</title><link href="https://simonwillison.net/2026/May/6/code-w-claude-2026/#atom-everything" rel="alternate"/><published>2026-05-06T15:58:27+00:00</published><updated>2026-05-06T15:58:27+00:00</updated><id>https://simonwillison.net/2026/May/6/code-w-claude-2026/#atom-everything</id><summary type="html">
    &lt;p&gt;I'm at Anthropic's Code w/ Claude event today. Here's my live blog of the morning keynote sessions.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/live-blog"&gt;live-blog&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="claude-code"/><category term="live-blog"/></entry><entry><title>Vibe coding and agentic engineering are getting closer than I'd like</title><link href="https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/#atom-everything" rel="alternate"/><published>2026-05-06T14:24:08+00:00</published><updated>2026-05-06T14:24:08+00:00</updated><id>https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/#atom-everything</id><summary type="html">
    &lt;p&gt;I recently talked with Joseph Ruscio about AI coding tools for Heavybit's High Leverage podcast: &lt;a href="https://www.heavybit.com/library/podcasts/high-leverage/ep-9-the-ai-coding-paradigm-shift-with-simon-willison"&gt;Ep. #9, The AI Coding Paradigm Shift with Simon Willison&lt;/a&gt;. Here are some of my highlights, including my disturbing realization that vibe coding and agentic engineering have started to converge in my own work.&lt;/p&gt;
&lt;p&gt;One thing I really enjoy about podcasts is that they sometimes push me to think out loud in a way that exposes an idea I've not previously been able to put into words.&lt;/p&gt;
&lt;h4 id="vibe-coding-and-agentic-engineering-are-starting-to-overlap"&gt;Vibe coding and agentic engineering are starting to overlap&lt;/h4&gt;
&lt;p&gt;A few weeks after the term "vibe coding" was coined I published &lt;a href="https://simonwillison.net/2025/Mar/19/vibe-coding/"&gt;Not all AI-assisted programming is vibe coding (but vibe coding rocks)&lt;/a&gt;, where I firmly staked out my belief that "vibe coding" is a very different beast from responsible use of AI to write code, which I've since started to call &lt;a href="https://simonwillison.net/guides/agentic-engineering-patterns/what-is-agentic-engineering/"&gt;agentic engineering&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;When Joseph brought up the distinction between the two I had a sudden realization that they're not nearly as distinct for me as they used to be:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Weirdly though, those things have started to blur for me already, which is quite upsetting.&lt;/p&gt;
&lt;p&gt;I thought we had a very clear delineation where vibe coding is the thing where you're not looking at the code at all. You might not even know how to program. You might be a non-programmer who asks for a thing, and gets a thing, and if the thing works, then great! And if it doesn't, you tell it that it doesn't work and cross your fingers.&lt;/p&gt;
&lt;p&gt;But at no point are you really caring about the code quality or any of those additional constraints. And my take on vibe coding was that it's fantastic, provided you understand when it can be used and when it can't.&lt;/p&gt;
&lt;p&gt;A personal tool for you, where if there's a bug it hurts only you, go ahead!&lt;/p&gt;
&lt;p&gt;If you're building software for other people, vibe coding is grossly irresponsible because it's other people's information. Other people get hurt by your stupid bugs. You need to have a higher level than that.&lt;/p&gt;
&lt;p&gt;This contrasts with agentic engineering where you are a professional software engineer. You understand security and maintainability and operations and performance and so forth. You're using these tools to the highest of your own ability. I'm finding the scope of challenges I can take on has gone up by a significant amount because I've got the support of these tools.&lt;/p&gt;
&lt;p&gt;But I'm still leaning on my 25 years of experience as a software engineer.&lt;/p&gt;
&lt;p&gt;The goal is to build high quality production systems: if you're building lower quality stuff faster, I think that's bad. I want to build &lt;em&gt;higher&lt;/em&gt; quality stuff faster. I want everything I'm building to be better in every way than it was before.&lt;/p&gt;
&lt;p&gt;The problem is that as the coding agents get more reliable, I'm not reviewing every line of code that they write anymore, even for my production level stuff.&lt;/p&gt;
&lt;p&gt;I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it's just going to do it right. It's not going to mess that up. You have it add automated tests, you have it add documentation, you know it's going to be good.&lt;/p&gt;
&lt;p&gt;But I'm not reviewing that code. And now I've got that feeling of guilt: if I haven't reviewed the code, is it really responsible for me to use this in production?&lt;/p&gt;
&lt;p&gt;The thing that really helps me is thinking back to when I've worked at larger organizations where I've been an engineering manager. Other teams are building software that my team depends on.&lt;/p&gt;
&lt;p&gt;If another team hands over something and says, "hey, this is the image resize service, here's how to use it to resize your images"... I'm not going to go and read every line of code that they wrote.&lt;/p&gt;
&lt;p&gt;I'm going to look at their documentation and I'm going to use it to resize some images. And then I'm going to start shipping my own features. And if I start running into problems where the image resizer thing appears to have bugs or the performance isn't good, that's when I might dig into their Git repositories and see what's going on. But for the most part I treat that as a semi-black box that I don't look at until I need to.&lt;/p&gt;
&lt;p&gt;I'm starting to treat the agents in the same way. And it still feels uncomfortable, because human beings are accountable for what they do. A team can build a reputation. I can say "I trust that team over there. They built good software in the past. They're not going to build something rubbish because that affects their professional reputations."&lt;/p&gt;
&lt;p&gt;Claude Code does not have a professional reputation! It can't take accountability for what it's done. But it's been proving itself anyway - time and time again it's churning out straightforward things and doing them right in the style that I like.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There's an element of &lt;a href="https://simonwillison.net/2025/Dec/10/normalization-of-deviance/"&gt;the normalization of deviance&lt;/a&gt; here - every time a model turns out to have written the right code without me monitoring it closely there's a risk that I'll trust it at the wrong moment in the future and get burned.&lt;/p&gt;
&lt;h4 id="the-new-challenge-of-evaluating-software"&gt;The new challenge of evaluating software&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;It used to be if you found a GitHub repository with a hundred commits and a good readme and automated tests and stuff, you could be pretty sure that the person writing that had put a lot of care and attention into that project.&lt;/p&gt;
&lt;p&gt;And now I can knock out a git repository with a hundred commits and a beautiful readme and comprehensive tests of every line of code in half an hour! It looks identical to those projects that have had a great deal of care and attention. Maybe it is as good as them. I don't know. I can't tell from looking at it. Even for my &lt;em&gt;own&lt;/em&gt; projects, I can't tell.&lt;/p&gt;
&lt;p&gt;So I realized what I value more than the quality of the tests and documentation is that I want somebody to have &lt;em&gt;used&lt;/em&gt; the thing. If you've got a vibe coded thing which you have used every day for the past two weeks, that's much more valuable to me than something that you've just spat out and hardly even exercised.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="the-bottlenecks-have-shifted"&gt;The bottlenecks have shifted&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;If you can go from producing 200 lines of code a day to 2,000 lines of code a day, what else breaks? The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn't.&lt;/p&gt;
&lt;p&gt;It's not just the downstream stuff, it's the upstream stuff as well. I saw &lt;a href="https://simonwillison.net/2026/Jan/24/dont-trust-the-process/"&gt;a great talk by Jenny Wen&lt;/a&gt;, who's the design leader at Anthropic, where she said we have all of these design processes that are based around the idea that you need to get the design &lt;em&gt;right&lt;/em&gt; - because if you hand it off to the engineers and they spend three months building the wrong thing, that's catastrophic.&lt;/p&gt;
&lt;p&gt;There's this whole very extensive design process that you put in place because that design results in expensive work. But if it doesn't take three months to build, maybe the design process can be a whole lot riskier because cost, if you get something wrong, has been reduced so much.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h4 id="why-i-m-still-not-afraid-for-my-career"&gt;Why I'm still not afraid for my career&lt;/h4&gt;
&lt;blockquote&gt;
&lt;p&gt;When I look at my conversations with the agents, it's very clear to me that this is moon language for the vast majority of human beings.&lt;/p&gt;
&lt;p&gt;There are a whole bunch of reasons I'm not scared that my career as a software engineer is over now that computers can write their own code, partly because these things are amplifiers of existing experience. If you know what you're doing, you can run so much faster with them. [...]&lt;/p&gt;
&lt;p&gt;I'm constantly reminded as I work with these tools how hard the thing that we do is. Producing software is a &lt;em&gt;ferociously&lt;/em&gt; difficult thing to do. And you could give me all of the AI tools in the world and what we're trying to achieve here is still really difficult. [...]&lt;/p&gt;
&lt;p&gt;Matthew Yglesias, who's a political commentator, yesterday &lt;a href="https://twitter.com/mattyglesias/status/2049105745132585161"&gt;tweeted&lt;/a&gt;, "Five months in, I think I've decided that I don't want to vibecode — I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money." And that feels about right to me. I can plumb my house if I watch enough YouTube videos on plumbing. I would rather hire a plumber.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;On the threat to SaaS providers of companies rolling their own solutions instead:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I just realized it's the thing I said earlier about how I only want to use your side project if you've used it for a few weeks. The enterprise version of that is I don't want a CRM unless at least two other giant enterprises have successfully used that CRM for six months. [...] You want solutions that are proven to work before you take a risk on them.&lt;/p&gt;
&lt;/blockquote&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/podcast-appearances"&gt;podcast-appearances&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="podcast-appearances"/><category term="vibe-coding"/><category term="coding-agents"/><category term="agentic-engineering"/></entry><entry><title>datasette-referrer-policy 0.1</title><link href="https://simonwillison.net/2026/May/5/datasette-referrer-policy/#atom-everything" rel="alternate"/><published>2026-05-05T23:44:27+00:00</published><updated>2026-05-05T23:44:27+00:00</updated><id>https://simonwillison.net/2026/May/5/datasette-referrer-policy/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/datasette/datasette-referrer-policy/releases/tag/0.1"&gt;datasette-referrer-policy 0.1&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;The OpenStreetMap tiles on the Datasette &lt;a href="https://datasette.io/global-power-plants/global-power-plants"&gt;global-power-plants demo&lt;/a&gt; weren't displaying correctly. This turned out to be caused by two bugs.&lt;/p&gt;
&lt;p&gt;The first is that the CAPTCHA &lt;a href="https://github.com/simonw/datasette-turnstile"&gt;I added&lt;/a&gt; to that site a few weeks ago was triggering for the &lt;code&gt;.json&lt;/code&gt; fetch requests used by the map plugin, and since those responses weren't HTML the user was never shown a challenge to solve. Here's &lt;a href="https://github.com/simonw/datasette.io/commit/23a1c8596b75b2094db46035a3b4280109fb3df3"&gt;the fix&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The second was that OpenStreetMap quite reasonably &lt;a href="https://wiki.openstreetmap.org/wiki/Referer"&gt;block tile requests&lt;/a&gt; from sites that use a &lt;code&gt;Referrer-Policy: no-referrer&lt;/code&gt; header.&lt;/p&gt;
&lt;p&gt;Datasette does this by default, and I didn't want to change that default on people without warning - so I had Codex + GPT-5.5 &lt;a href="https://gisthost.github.io/?402f2f23ee3dbfa251bf0d216e0224f7"&gt;build me&lt;/a&gt; a new plugin to help set that header to another value.&lt;/p&gt;
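&lt;p&gt;A plugin like this can work by wrapping the ASGI application and rewriting the response headers on the way out. Here's a minimal sketch of that idea in plain Python. It is not the plugin's actual code: the &lt;code&gt;referrer_policy_middleware&lt;/code&gt; name and the &lt;code&gt;strict-origin-when-cross-origin&lt;/code&gt; default are assumptions chosen for illustration.&lt;/p&gt;

```python
import asyncio


def referrer_policy_middleware(app, policy="strict-origin-when-cross-origin"):
    """Wrap an ASGI app so every HTTP response carries the given Referrer-Policy.

    Hypothetical helper for illustration only, not the plugin's real code.
    """

    async def wrapped(scope, receive, send):
        if scope["type"] != "http":
            # Leave websocket and lifespan traffic untouched.
            await app(scope, receive, send)
            return

        async def send_with_header(message):
            if message["type"] == "http.response.start":
                # Drop any existing Referrer-Policy header, then add ours.
                headers = [
                    (name, value)
                    for name, value in message.get("headers", [])
                    if name.lower() != b"referrer-policy"
                ]
                headers.append((b"referrer-policy", policy.encode("latin-1")))
                message = {**message, "headers": headers}
            await send(message)

        await app(scope, receive, send_with_header)

    return wrapped


async def demo():
    """Run a tiny fake app through the wrapper and return the sent messages."""

    async def app(scope, receive, send):
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"referrer-policy", b"no-referrer")],
        })
        await send({"type": "http.response.body", "body": b"ok"})

    sent = []

    async def receive():
        return {"type": "http.request"}

    async def send(message):
        sent.append(message)

    await referrer_policy_middleware(app)({"type": "http"}, receive, send)
    return sent


if __name__ == "__main__":
    print(asyncio.run(demo())[0]["headers"])
```

&lt;p&gt;In a real Datasette plugin a wrapper along these lines would most likely be registered through the &lt;code&gt;asgi_wrapper&lt;/code&gt; plugin hook, with the policy value read from plugin configuration.&lt;/p&gt;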
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/openstreetmap"&gt;openstreetmap&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/http"&gt;http&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="openstreetmap"/><category term="http"/><category term="datasette"/></entry><entry><title>Our AI started a cafe in Stockholm</title><link href="https://simonwillison.net/2026/May/5/our-ai-started-a-cafe-in-stockholm/#atom-everything" rel="alternate"/><published>2026-05-05T22:14:21+00:00</published><updated>2026-05-05T22:14:21+00:00</updated><id>https://simonwillison.net/2026/May/5/our-ai-started-a-cafe-in-stockholm/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://andonlabs.com/blog/ai-cafe-stockholm"&gt;Our AI started a cafe in Stockholm&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Andon Labs previously &lt;a href="https://andonlabs.com/blog/andon-market-launch"&gt;started an AI-run retail store&lt;/a&gt; in San Francisco. Now they're running a similar experiment in Stockholm, Sweden, only this time it's a cafe.&lt;/p&gt;
&lt;p&gt;These experiments are interesting, and often throw out amusing anecdotes:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;During the first week of inventory, Mona ordered 120 eggs even though the café has no stove. When the staff told her they couldn’t cook them, she suggested using the high-speed oven, until they pointed out the eggs would likely explode. She also tried to solve the problem of fresh tomatoes being spoiled too fast by ordering 22.5 kg of canned tomatoes for the fresh sandwiches. The baristas eventually started a “Hall of Shame”, a shelf visible to customers with all the weird things Mona ordered, including 6,000 napkins, 3,000 nitrile gloves, 9L coconut milk, and industrial-sized trash bags.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Where they lose their shine is when these AI managers start wasting the time of human beings who have &lt;em&gt;not&lt;/em&gt; opted into the experiment:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;She also successfully applied for an outdoor seating permit through the Police e-service, which didn’t require BankID. Her first submission included a sketch she had generated herself, despite having never seen the street outside the café. Unsurprisingly, the Police sent it back for revision. [...]&lt;/p&gt;
&lt;p&gt;When she makes a mistake, she often sends multiple emails to suppliers with the subject “EMERGENCY” to cancel or change the order.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I don't think it's ethical to run experiments like this that affect real-world systems and steal time from people.&lt;/p&gt;
&lt;p&gt;I'm reminded of the incident last year where the AI Village experiment &lt;a href="https://simonwillison.net/2025/Dec/26/slop-acts-of-kindness/"&gt;infuriated Rob Pike&lt;/a&gt; by sending him unsolicited gratitude emails as an "act of kindness". That was just an unwanted email - asking suppliers to correct mistakes that were made without a human-in-the-loop or wasting police time with slop diagrams feels a whole lot worse to me.&lt;/p&gt;
&lt;p&gt;I think experiments like this need to keep their own human operators in-the-loop for outbound actions that affect other people.&lt;/p&gt;

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=48028289"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-agents"&gt;ai-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="ai-agents"/><category term="ai-ethics"/></entry><entry><title>datasette-llm 0.1a7</title><link href="https://simonwillison.net/2026/May/5/datasette-llm/#atom-everything" rel="alternate"/><published>2026-05-05T01:56:55+00:00</published><updated>2026-05-05T01:56:55+00:00</updated><id>https://simonwillison.net/2026/May/5/datasette-llm/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/datasette/datasette-llm/releases/tag/0.1a7"&gt;datasette-llm 0.1a7&lt;/a&gt;&lt;/p&gt;
        &lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Mechanism for &lt;a href="https://github.com/datasette/datasette-llm/blob/main/README.md#configuration"&gt;configuring default options&lt;/a&gt; for specific models.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;Part of Datasette's evolving support mechanism for plugins that use LLMs. It's now possible to configure a model with default options, e.g. to say all &lt;a href="https://github.com/datasette/datasette-enrichments-llm"&gt;enrichment&lt;/a&gt; operations should use a specific model with temperature set to 0.5.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/datasette"&gt;datasette&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/><category term="datasette"/></entry><entry><title>llm-echo 0.5a0</title><link href="https://simonwillison.net/2026/May/5/llm-echo/#atom-everything" rel="alternate"/><published>2026-05-05T01:31:54+00:00</published><updated>2026-05-05T01:31:54+00:00</updated><id>https://simonwillison.net/2026/May/5/llm-echo/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm-echo/releases/tag/0.5a0"&gt;llm-echo 0.5a0&lt;/a&gt;&lt;/p&gt;
        &lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;New &lt;code&gt;-o thinking 1&lt;/code&gt; option to help test against &lt;a href="https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28"&gt;LLM 0.32a0&lt;/a&gt; and higher.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;This plugin provides a fake model called "echo" for LLM which doesn't run an LLM at all - it's useful for writing automated tests. You can now do this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;uvx --with llm==0.32a1 --with llm-echo==0.5a0 llm -m echo hi -o thinking 1
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will fake a reasoning block to standard error before returning JSON echoing the prompt.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/></entry><entry><title>Quoting John Gruber</title><link href="https://simonwillison.net/2026/May/5/john-gruber/#atom-everything" rel="alternate"/><published>2026-05-05T00:46:29+00:00</published><updated>2026-05-05T00:46:29+00:00</updated><id>https://simonwillison.net/2026/May/5/john-gruber/#atom-everything</id><summary type="html">
    &lt;blockquote cite="https://daringfireball.net/2026/05/y_combinators_stake_in_openai"&gt;&lt;p&gt;So it’s well known that Y Combinator owns &lt;em&gt;some&lt;/em&gt; stake in OpenAI. But how big is that stake? This seems like devilishly difficult information to obtain. I asked around and a little birdie who knows several OpenAI investors came back with an answer: Y Combinator owns about 0.6 percent of OpenAI. At OpenAI’s current &lt;a href="https://openai.com/index/accelerating-the-next-phase-ai/"&gt;$852 billion valuation&lt;/a&gt;, that’s worth over $5 billion.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://daringfireball.net/2026/05/y_combinators_stake_in_openai"&gt;John Gruber&lt;/a&gt;, Y Combinator’s Stake in OpenAI&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/y-combinator"&gt;y-combinator&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/john-gruber"&gt;john-gruber&lt;/a&gt;&lt;/p&gt;



</summary><category term="openai"/><category term="y-combinator"/><category term="ai"/><category term="john-gruber"/></entry><entry><title>Granite 4.1 3B SVG Pelican Gallery</title><link href="https://simonwillison.net/2026/May/4/granite-41-3b-svg-pelican-gallery/#atom-everything" rel="alternate"/><published>2026-05-04T23:49:24+00:00</published><updated>2026-05-04T23:49:24+00:00</updated><id>https://simonwillison.net/2026/May/4/granite-41-3b-svg-pelican-gallery/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonw.github.io/granite-4.1-3b-gguf-pelicans/"&gt;Granite 4.1 3B SVG Pelican Gallery&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;IBM released their &lt;a href="https://research.ibm.com/blog/granite-4-1-ai-foundation-models"&gt;Granite 4.1 family&lt;/a&gt; of LLMs a few days ago. They're Apache 2.0 licensed and come in 3B, 8B and 30B sizes.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://huggingface.co/blog/ibm-granite/granite-4-1"&gt;Granite 4.1 LLMs: How They’re Built&lt;/a&gt; by Granite team member Yousaf Shah describes the training process in detail.&lt;/p&gt;
&lt;p&gt;Unsloth released the &lt;a href="https://huggingface.co/unsloth/granite-4.1-3b-GGUF"&gt;unsloth/granite-4.1-3b-GGUF&lt;/a&gt; collection of GGUF encoded quantized variants of the 3B model - 21 different model files ranging in size from 1.2GB to 6.34GB.&lt;/p&gt;
&lt;p&gt;All 21 of those Unsloth files add up to 51.3GB, which inspired me to finally try an experiment I've been wanting to run for ages: prompting "Generate an SVG of a pelican riding a bicycle" against different sized quantized variants of the same model to see what the results would look like.&lt;/p&gt;
&lt;p&gt;Honestly, &lt;a href="https://simonw.github.io/granite-4.1-3b-gguf-pelicans/"&gt;the results&lt;/a&gt; are less interesting than I expected. There's no distinguishable pattern relating quality to size - they're all pretty terrible!&lt;/p&gt;
&lt;p&gt;&lt;img alt="Six different SVG images from models ranging in size from 1.67GB to 1.2GB. They are almost all an abstract collection of shapes - weirdly the smallest model had the best version of a bicycle, while the largest one had something that looked a tiny bit like a pelican." src="https://static.simonwillison.net/static/2026/granite-3B-pelicans.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;I'll likely try this again in the future with a model that's better at drawing pelicans.&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ibm"&gt;ibm&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/pelican-riding-a-bicycle"&gt;pelican-riding-a-bicycle&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm-release"&gt;llm-release&lt;/a&gt;&lt;/p&gt;



</summary><category term="ibm"/><category term="ai"/><category term="generative-ai"/><category term="llms"/><category term="pelican-riding-a-bicycle"/><category term="llm-release"/></entry><entry><title>Quoting Andy Masley</title><link href="https://simonwillison.net/2026/May/4/andy-masley/#atom-everything" rel="alternate"/><published>2026-05-04T22:51:09+00:00</published><updated>2026-05-04T22:51:09+00:00</updated><id>https://simonwillison.net/2026/May/4/andy-masley/#atom-everything</id><summary type="html">
    &lt;blockquote cite="https://blog.andymasley.com/p/data-center-land-use-issues-are-fake"&gt;&lt;p&gt;[...] Between 2000 and 2024, farmers sold in total a Colorado-sized chunk of land all on their own, 77 times all land on data center property in 2028, and grew more food than ever on what was left. None of this caused any problems for US food access.&lt;/p&gt;
&lt;p&gt;And then, in the middle of all this, a farmer in Loudoun County sells a few acres of mediocre hay field to a hyperscaler for ten times its agricultural value, and the response is that we’re running out of farmland.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://blog.andymasley.com/p/data-center-land-use-issues-are-fake"&gt;Andy Masley&lt;/a&gt;, pushing back against the "land use" argument against data center construction&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/andy-masley"&gt;andy-masley&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="ai"/><category term="generative-ai"/><category term="andy-masley"/></entry><entry><title>April 2026 newsletter</title><link href="https://simonwillison.net/2026/May/4/april-newsletter/#atom-everything" rel="alternate"/><published>2026-05-04T22:38:36+00:00</published><updated>2026-05-04T22:38:36+00:00</updated><id>https://simonwillison.net/2026/May/4/april-newsletter/#atom-everything</id><summary type="html">
    &lt;p&gt;I just sent out the April edition of my &lt;a href="https://github.com/sponsors/simonw/"&gt;sponsors-only monthly newsletter&lt;/a&gt;. If you are a sponsor (or if you start a sponsorship now) you can &lt;a href="https://github.com/simonw-private/monthly/blob/main/2026-04-april.md"&gt;access it here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In this month's newsletter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Opus 4.7 and GPT-5.5, both with price increases&lt;/li&gt;
&lt;li&gt;Claude Mythos and LLM security research&lt;/li&gt;
&lt;li&gt;ChatGPT Images 2.0&lt;/li&gt;
&lt;li&gt;More model releases&lt;/li&gt;
&lt;li&gt;Other highlights from my blog&lt;/li&gt;
&lt;li&gt;What I'm using, April 2026 edition&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Here's &lt;a href="https://github.com/simonw/monthly-newsletter-archive/blob/main/2026-03-march.md"&gt;a copy of the March newsletter&lt;/a&gt; as a preview of what you'll get. Pay $10/month to stay a month ahead of the free copy!&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/newsletter"&gt;newsletter&lt;/a&gt;&lt;/p&gt;



</summary><category term="newsletter"/></entry><entry><title>TRE Python binding — ReDoS robustness demo</title><link href="https://simonwillison.net/2026/May/4/tre-python-binding/#atom-everything" rel="alternate"/><published>2026-05-04T17:52:00+00:00</published><updated>2026-05-04T17:52:00+00:00</updated><id>https://simonwillison.net/2026/May/4/tre-python-binding/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Research:&lt;/strong&gt; &lt;a href="https://github.com/simonw/research/tree/main/tre-python-binding#readme"&gt;TRE Python binding — ReDoS robustness demo&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;If it's &lt;a href="https://simonwillison.net/2026/May/4/redis-array/"&gt;good enough for antirez&lt;/a&gt; to add to Redis I figured Ville Laurikari's &lt;a href="https://github.com/laurikari/tre/"&gt;TRE&lt;/a&gt; regular expression engine was worth exploring in a little more detail.&lt;/p&gt;
&lt;p&gt;I had Claude Code build an experimental Python binding (it used &lt;code&gt;ctypes&lt;/code&gt;) and try some malicious regular expression attacks against the library. TRE handles those much better than Python's standard library implementation, thanks mainly to the lack of support for backtracking.&lt;/p&gt;
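&lt;p&gt;To illustrate the backtracking problem on the Python side, here's a small sketch using only the standard library. The nested-quantifier pattern and the &lt;code&gt;time_failed_match&lt;/code&gt; helper are assumptions picked for illustration, not taken from the research repo, and it does not exercise TRE at all.&lt;/p&gt;

```python
import re
import time

# Nested quantifiers are a classic trigger for catastrophic backtracking
# in backtracking regex engines such as Python's re module.
EVIL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Time a match attempt that must fail: n 'a' characters then '!'."""
    subject = "a" * n + "!"
    start = time.perf_counter()
    result = EVIL.match(subject)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Keep n small: the work grows very quickly with each extra character.
    for n in (8, 12, 16, 20):
        result, elapsed = time_failed_match(n)
        print(n, result, round(elapsed, 4))
```

&lt;p&gt;The match always fails because of the trailing &lt;code&gt;!&lt;/code&gt;, but a backtracking engine tries a rapidly growing number of ways to split the run of &lt;code&gt;a&lt;/code&gt; characters before giving up, so the elapsed time should climb steeply as &lt;code&gt;n&lt;/code&gt; increases.&lt;/p&gt;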
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/security"&gt;security&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/regular-expressions"&gt;regular-expressions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ctypes"&gt;ctypes&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="security"/><category term="python"/><category term="regular-expressions"/><category term="c"/><category term="ctypes"/></entry><entry><title>Redis Array Playground</title><link href="https://simonwillison.net/2026/May/4/redis-array/#atom-everything" rel="alternate"/><published>2026-05-04T15:53:57+00:00</published><updated>2026-05-04T15:53:57+00:00</updated><id>https://simonwillison.net/2026/May/4/redis-array/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/redis-array"&gt;Redis Array Playground&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;Salvatore Sanfilippo submitted &lt;a href="https://github.com/redis/redis/pull/15162"&gt;a PR&lt;/a&gt; adding a new data type - arrays - to Redis. &lt;/p&gt;
&lt;p&gt;The new commands are &lt;code&gt;ARCOUNT&lt;/code&gt;, &lt;code&gt;ARDEL&lt;/code&gt;, &lt;code&gt;ARDELRANGE&lt;/code&gt;, &lt;code&gt;ARGET&lt;/code&gt;, &lt;code&gt;ARGETRANGE&lt;/code&gt;, &lt;code&gt;ARGREP&lt;/code&gt;, &lt;code&gt;ARINFO&lt;/code&gt;, &lt;code&gt;ARINSERT&lt;/code&gt;, &lt;code&gt;ARLASTITEMS&lt;/code&gt;, &lt;code&gt;ARLEN&lt;/code&gt;, &lt;code&gt;ARMGET&lt;/code&gt;, &lt;code&gt;ARMSET&lt;/code&gt;, &lt;code&gt;ARNEXT&lt;/code&gt;, &lt;code&gt;AROP&lt;/code&gt;, &lt;code&gt;ARRING&lt;/code&gt;, &lt;code&gt;ARSCAN&lt;/code&gt;, &lt;code&gt;ARSEEK&lt;/code&gt;, &lt;code&gt;ARSET&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The implementation is currently available in a branch, so I &lt;a href="https://github.com/simonw/tools/pull/277"&gt;had Claude Code for web&lt;/a&gt; 
build this interactive playground for trying out the new commands in a WASM-compiled build of a subset of Redis running in the browser.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of a Redis command builder UI. Left sidebar shows commands ARSCAN, ARSEEK, ARSET. Main panel has a &amp;quot;predicate oneof&amp;quot; section with a MATCH dropdown and value CHERRY, plus a &amp;quot;+ add another&amp;quot; button. Below is &amp;quot;options (optional) oneof&amp;quot; with checkboxes: AND (checked), OR (unchecked), LIMIT (checked, value 10), WITHVALUES (checked), NOCASE (checked). COMMAND section shows: ARGREP myarr - + MATCH CHERRY AND LIMIT 10 WITHVALUES NOCASE. A red &amp;quot;Run command&amp;quot; button is below. REPLY section shows &amp;quot;(no reply yet)&amp;quot;." src="https://static.simonwillison.net/static/2026/redis-array-explorer-card.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;The most interesting new command is &lt;code&gt;ARGREP&lt;/code&gt; which can run a server-side grep against a range of values in the array using the newly vendored &lt;a href="https://github.com/laurikari/tre/"&gt;TRE regex library&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Salvatore wrote more about the AI-assisted development process for the array type in &lt;a href="https://antirez.com/news/164"&gt;Redis array type: short story of a long development&lt;/a&gt;.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/salvatore-sanfilippo"&gt;salvatore-sanfilippo&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/webassembly"&gt;webassembly&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/redis"&gt;redis&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/regular-expressions"&gt;regular-expressions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/c"&gt;c&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="salvatore-sanfilippo"/><category term="webassembly"/><category term="generative-ai"/><category term="agentic-engineering"/><category term="ai"/><category term="redis"/><category term="llms"/><category term="regular-expressions"/><category term="c"/></entry><entry><title>Quoting Anthropic</title><link href="https://simonwillison.net/2026/May/3/anthropic/#atom-everything" rel="alternate"/><published>2026-05-03T15:13:23+00:00</published><updated>2026-05-03T15:13:23+00:00</updated><id>https://simonwillison.net/2026/May/3/anthropic/#atom-everything</id><summary type="html">
    &lt;blockquote cite="https://www.anthropic.com/research/claude-personal-guidance"&gt;&lt;p&gt;We used an automatic classifier which judged sycophancy by looking at whether Claude showed a willingness to push back, maintain positions when challenged, give praise proportional to the merit of ideas, and speak frankly regardless of what a person wants to hear. Most of the time in these situations, Claude expressed no sycophancy—only 9% of conversations included sycophantic behavior (Figure 2). But two domains were exceptions: we saw sycophantic behavior in 38% of conversations focused on spirituality, and 25% of conversations on relationships.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://www.anthropic.com/research/claude-personal-guidance"&gt;Anthropic&lt;/a&gt;, How people ask Claude for personal guidance&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-personality"&gt;ai-personality&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sycophancy"&gt;sycophancy&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai-ethics"/><category term="anthropic"/><category term="claude"/><category term="ai-personality"/><category term="generative-ai"/><category term="ai"/><category term="llms"/><category term="sycophancy"/></entry><entry><title>Sightings</title><link href="https://simonwillison.net/2026/May/2/sightings/#atom-everything" rel="alternate"/><published>2026-05-02T17:26:40+00:00</published><updated>2026-05-02T17:26:40+00:00</updated><id>https://simonwillison.net/2026/May/2/sightings/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://simonwillison.net/elsewhere/sighting/"&gt;/elsewhere/sightings/&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I have a new camera (a Canon R6 Mark II) so I'm taking a lot more photos of birds. I share my best wildlife photos on &lt;a href="https://www.inaturalist.org/"&gt;iNaturalist&lt;/a&gt;, and based on yesterday's &lt;a href="https://simonwillison.net/2026/May/1/inat-sightings/"&gt;successful prototype&lt;/a&gt; I decided to add those to my blog.&lt;/p&gt;
&lt;p&gt;&lt;img class="blogmark-image" src="https://static.simonwillison.net/static/2026/beats-sightings.jpeg" alt="Screenshot of a &amp;quot;Sightings&amp;quot; webpage with a search bar and RSS icon, showing &amp;quot;Filters: Sorted by date&amp;quot; and &amp;quot;208 results page 1 / 7 next » last »»&amp;quot;. First entry: SIGHTING 7:51 PM — Acorn Woodpecker, with two photos labeled &amp;quot;Acorn Woodpecker&amp;quot; of black and white woodpeckers with red caps on tree branches, dated 2nd May 2026. Second entry: SIGHTING 10:08 AM – 11:17 AM — Acorn Woodpecker, Western Fence Lizard, Osprey, with three photos labeled &amp;quot;Acorn Woodpecker&amp;quot; (bird on bare branches against blue sky), &amp;quot;Wester...&amp;quot; (lizard on tree bark), and &amp;quot;Osprey&amp;quot; (nest on a utility pole), dated 1st May 2026. Third entry: SIGHTING 11:11 AM — White-crowned Sparrow, with a photo labeled &amp;quot;White-crowned Sparrow&amp;quot; of a sparrow with black and white striped head singing with open beak, dated 30th Apr 2026."&gt;&lt;/p&gt;
&lt;p&gt;I built this feature on my phone using Claude Code for web, as an extension of my &lt;a href="https://simonwillison.net/2026/Feb/20/beats/"&gt;beats system&lt;/a&gt; for syndicating external content. Here's &lt;a href="https://github.com/simonw/simonwillisonblog/pull/668"&gt;the PR&lt;/a&gt; and prompt.&lt;/p&gt;
&lt;p&gt;As with my other forms of incoming syndicated content, sightings show up on the homepage, the date archive pages, and in site search results.&lt;/p&gt;
&lt;p&gt;I back-populated over a decade of iNaturalist sightings, which means that if you &lt;a href="https://simonwillison.net/search/?q=lemur"&gt;search for lemur&lt;/a&gt; you'll see my lemur photos from Madagascar in 2019!&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/blogging"&gt;blogging&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/photography"&gt;photography&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/wildlife"&gt;wildlife&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;&lt;/p&gt;



</summary><category term="blogging"/><category term="photography"/><category term="wildlife"/><category term="ai"/><category term="inaturalist"/><category term="generative-ai"/><category term="llms"/><category term="ai-assisted-programming"/><category term="claude-code"/></entry><entry><title>iNaturalist Sightings</title><link href="https://simonwillison.net/2026/May/1/inat-sightings/#atom-everything" rel="alternate"/><published>2026-05-01T19:35:41+00:00</published><updated>2026-05-01T19:35:41+00:00</updated><id>https://simonwillison.net/2026/May/1/inat-sightings/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Tool:&lt;/strong&gt; &lt;a href="https://tools.simonwillison.net/inat-sightings"&gt;iNaturalist Sightings&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;I wanted to see my &lt;a href="https://www.inaturalist.org"&gt;iNaturalist&lt;/a&gt; observations - across two separate accounts - grouped by when they occurred. I'm camping this weekend so I built this entirely on my phone using Claude Code for web.&lt;/p&gt;
&lt;p&gt;I started by building an &lt;a href="https://github.com/simonw/inaturalist-clumper"&gt;inaturalist-clumper&lt;/a&gt; Python CLI for fetching and "clumping" observations - by default clumps use observations within 2 hours and 5km of each other.&lt;/p&gt;
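&lt;p&gt;Here's a minimal sketch of what that kind of clumping could look like. It is not the code from inaturalist-clumper: the &lt;code&gt;Observation&lt;/code&gt; class, the &lt;code&gt;clump()&lt;/code&gt; function and the rule of comparing each observation only to the previous one in time order are all assumptions made for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Observation:
    # Hypothetical record type, not the tool's real data model
    species: str
    when: datetime
    lat: float
    lon: float


def distance_km(a, b):
    """Great-circle distance between two observations (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))


def clump(observations, max_gap=timedelta(hours=2), max_km=5.0):
    """Group observations in time order, starting a new clump whenever the
    next observation is too far from the previous one in time or distance."""
    clumps = []
    for obs in sorted(observations, key=lambda o: o.when):
        if clumps:
            previous = clumps[-1][-1]
            too_late = obs.when - previous.when > max_gap
            too_far = distance_km(previous, obs) > max_km
            if not (too_late or too_far):
                clumps[-1].append(obs)
                continue
        # First observation, or one that broke either threshold
        clumps.append([obs])
    return clumps
```

&lt;p&gt;Comparing against the previous observation, rather than the first one in the clump, means a long walk can stay in a single clump as long as no individual gap exceeds the thresholds. Whether the real tool makes the same choice is not something this sketch claims.&lt;/p&gt;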
&lt;p&gt;Then I set up &lt;a href="https://github.com/simonw/inaturalist-clumps"&gt;simonw/inaturalist-clumps&lt;/a&gt; as a &lt;a href="https://simonwillison.net/series/git-scraping/"&gt;Git scraping&lt;/a&gt; repository to run that tool and record the result to &lt;a href="https://github.com/simonw/inaturalist-clumps/blob/main/clumps.json"&gt;clumps.json&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;That JSON file is hosted on GitHub, which means it can be fetched by JavaScript using CORS.&lt;/p&gt;
&lt;p&gt;Finally I ran this prompt against my &lt;a href="https://github.com/simonw/tools"&gt;simonw/tools&lt;/a&gt; repo:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Build inat-sightings.html - an app that does a fetch() against https://raw.githubusercontent.com/simonw/inaturalist-clumps/refs/heads/main/clumps.json and then displays all of the observations on one page using the https://static.inaturalist.org/photos/538073008/small.jpg small.jpg URLs for the thumbnails - with loading=lazy - but when a thumbnail is clicked showing the large.jpg in an HTML modal. Both small and large should include the common species names if available&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/tools"&gt;tools&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude-code"&gt;claude-code&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/inaturalist"&gt;inaturalist&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="tools"/><category term="claude-code"/><category term="inaturalist"/><category term="generative-ai"/><category term="ai"/><category term="llms"/></entry><entry><title>Codex CLI 0.128.0 adds /goal</title><link href="https://simonwillison.net/2026/Apr/30/codex-goals/#atom-everything" rel="alternate"/><published>2026-04-30T23:23:17+00:00</published><updated>2026-04-30T23:23:17+00:00</updated><id>https://simonwillison.net/2026/Apr/30/codex-goals/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/openai/codex/releases/tag/rust-v0.128.0"&gt;Codex CLI 0.128.0 adds /goal&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The latest version of OpenAI's Codex CLI coding agent adds their own version of the &lt;a href="https://ghuntley.com/ralph/"&gt;Ralph loop&lt;/a&gt;: you can now set a &lt;code&gt;/goal&lt;/code&gt; and Codex will keep on looping until it evaluates that the goal has been completed... or the configured token budget has been exhausted.&lt;/p&gt;
&lt;p&gt;It looks like the feature is mainly implemented through the &lt;a href="https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/continuation.md"&gt;goals/continuation.md&lt;/a&gt; and &lt;a href="https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/budget_limit.md"&gt;goals/budget_limit.md&lt;/a&gt; prompts, which are automatically injected at the end of a turn.&lt;/p&gt;
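&lt;p&gt;The general shape of a loop with those two exit conditions can be sketched in a few lines of Python. This is not how Codex implements &lt;code&gt;/goal&lt;/code&gt;: &lt;code&gt;run_turn&lt;/code&gt;, &lt;code&gt;goal_met&lt;/code&gt; and the token accounting are hypothetical stand-ins, used only to illustrate the control flow.&lt;/p&gt;

```python
def run_until_goal(run_turn, goal_met, token_budget):
    """Keep running turns until the goal check passes or the budget runs out.

    run_turn and goal_met are hypothetical callables supplied by the caller.
    run_turn() returns the number of tokens that turn consumed.
    """
    tokens_used = 0
    turns = 0
    while True:
        tokens_used += run_turn()
        turns += 1
        # Check the goal first, so a turn that finishes the job still counts
        if goal_met():
            return {"status": "goal_complete", "turns": turns, "tokens": tokens_used}
        if tokens_used >= token_budget:
            return {"status": "budget_exhausted", "turns": turns, "tokens": tokens_used}
```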

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://twitter.com/fcoury/status/2049917871799636201"&gt;@fcoury&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/prompt-engineering"&gt;prompt-engineering&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/coding-agents"&gt;coding-agents&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/system-prompts"&gt;system-prompts&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/codex-cli"&gt;codex-cli&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/agentic-engineering"&gt;agentic-engineering&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="openai"/><category term="prompt-engineering"/><category term="generative-ai"/><category term="llms"/><category term="coding-agents"/><category term="system-prompts"/><category term="codex-cli"/><category term="agentic-engineering"/></entry><entry><title>Our evaluation of OpenAI's GPT-5.5 cyber capabilities</title><link href="https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities/#atom-everything" rel="alternate"/><published>2026-04-30T23:03:24+00:00</published><updated>2026-04-30T23:03:24+00:00</updated><id>https://simonwillison.net/2026/Apr/30/gpt-55-cyber-capabilities/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities"&gt;Our evaluation of OpenAI&amp;#x27;s GPT-5.5 cyber capabilities&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
The UK's AI Security Institute &lt;a href="https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities"&gt;previously evaluated Claude Mythos&lt;/a&gt;: now they've evaluated GPT-5.5 for finding security vulnerabilities and found it to be comparable to Mythos, but unlike Mythos it's generally available right now.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/openai"&gt;openai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/claude"&gt;claude&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-security-research"&gt;ai-security-research&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/gpt"&gt;gpt&lt;/a&gt;&lt;/p&gt;



</summary><category term="ai"/><category term="openai"/><category term="generative-ai"/><category term="llms"/><category term="anthropic"/><category term="claude"/><category term="ai-security-research"/><category term="gpt"/></entry><entry><title>Quoting Andrew Kelley</title><link href="https://simonwillison.net/2026/Apr/30/andrew-kelley/#atom-everything" rel="alternate"/><published>2026-04-30T21:24:55+00:00</published><updated>2026-04-30T21:24:55+00:00</updated><id>https://simonwillison.net/2026/Apr/30/andrew-kelley/#atom-everything</id><summary type="html">
    &lt;blockquote cite="https://lobste.rs/s/ifcyr1/contributor_poker_zig_s_ai_ban#c_cbtxub"&gt;&lt;p&gt;It's a common misconception that we can't tell who is using LLM and who is not. I'm sure we didn't catch 100% of LLM-assisted PRs over the past few months, but the kind of mistakes humans make are fundamentally different than LLM hallucinations, making them easy to spot. Furthermore, people who come from the world of agentic coding have a certain &lt;em&gt;digital smell&lt;/em&gt; that is not obvious to them but is obvious to those who abstain. It's like when a smoker walks into the room, everybody who doesn't smoke instantly knows it.&lt;/p&gt;
&lt;p&gt;I'm not telling you not to smoke, but I am telling you not to smoke in my house.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="https://lobste.rs/s/ifcyr1/contributor_poker_zig_s_ai_ban#c_cbtxub"&gt;Andrew Kelley&lt;/a&gt;, Creator of Zig&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/zig"&gt;zig&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;&lt;/p&gt;



</summary><category term="zig"/><category term="llms"/><category term="ai"/><category term="generative-ai"/></entry><entry><title>We need RSS for sharing abundant vibe-coded apps</title><link href="https://simonwillison.net/2026/Apr/30/rss-vibe-coded-apps/#atom-everything" rel="alternate"/><published>2026-04-30T18:38:48+00:00</published><updated>2026-04-30T18:38:48+00:00</updated><id>https://simonwillison.net/2026/Apr/30/rss-vibe-coded-apps/#atom-everything</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://interconnected.org/home/2026/04/29/syndicating-vibes"&gt;We need RSS for sharing abundant vibe-coded apps&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Matt Webb:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I would love an RSS web feed for all those various tools and apps pages, each item with an “Install” button. (But install to where?)&lt;/p&gt;
&lt;p&gt;The lesson here is that when vibe-coding accelerates app development, apps become more personal, more situated, and more frequent. Shipping a tool or a micro-app is less like launching a website and more like posting on a blog.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This inspired me to &lt;a href="https://github.com/simonw/simonwillisonblog/pull/665"&gt;have Claude&lt;/a&gt; add an Atom feed (and icon) to my &lt;a href="https://simonwillison.net/elsewhere/tool/"&gt;/elsewhere/tools/&lt;/a&gt; page, which itself is populated by content from my &lt;a href="https://tools.simonwillison.net/"&gt;tools.simonwillison.net&lt;/a&gt; site.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/atom"&gt;atom&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/matt-webb"&gt;matt-webb&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rss"&gt;rss&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/vibe-coding"&gt;vibe-coding&lt;/a&gt;&lt;/p&gt;



</summary><category term="atom"/><category term="matt-webb"/><category term="rss"/><category term="ai"/><category term="vibe-coding"/></entry><entry><title>The Zig project's rationale for their firm anti-AI contribution policy</title><link href="https://simonwillison.net/2026/Apr/30/zig-anti-ai/#atom-everything" rel="alternate"/><published>2026-04-30T01:24:23+00:00</published><updated>2026-04-30T01:24:23+00:00</updated><id>https://simonwillison.net/2026/Apr/30/zig-anti-ai/#atom-everything</id><summary type="html">
    &lt;p&gt;&lt;a href="https://ziglang.org/"&gt;Zig&lt;/a&gt; has one of the most stringent &lt;a href="https://ziglang.org/code-of-conduct/"&gt;anti-LLM policies&lt;/a&gt; of any major open source project:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;No LLMs for issues.&lt;/p&gt;
&lt;p&gt;No LLMs for pull requests.&lt;/p&gt;
&lt;p&gt;No LLMs for comments on the bug tracker, including translation. English is encouraged, but not required. You are welcome to post in your native language and rely on others to have their own translation tools of choice to interpret your words.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The most prominent project written in Zig may be the &lt;a href="https://bun.com/"&gt;Bun&lt;/a&gt; JavaScript runtime, which was &lt;a href="https://bun.com/blog/bun-joins-anthropic"&gt;acquired by Anthropic&lt;/a&gt; in December 2025 and, unsurprisingly, makes heavy use of AI assistance.&lt;/p&gt;
&lt;p&gt;Bun operates its own fork of Zig, and recently &lt;a href="https://x.com/bunjavascript/status/2048427636414923250"&gt;achieved a 4x performance improvement&lt;/a&gt; on Bun compile after adding "parallel semantic analysis and multiple codegen units to the llvm backend". Here's &lt;a href="https://github.com/oven-sh/zig/compare/upgrade-0.15.2%E2%80%A6upgrade-0.15.2-fast"&gt;that code&lt;/a&gt;. But &lt;a href="https://twitter.com/bunjavascript/status/2048428104893542781"&gt;@bunjavascript says&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We do not currently plan to upstream this, as Zig has a strict ban on LLM-authored contributions.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Update: here's &lt;a href="https://ziggit.dev/t/bun-s-zig-fork-got-4x-faster-compilation-times/15183/19"&gt;a Zig core contributor&lt;/a&gt; providing details on why they wouldn't accept that particular patch independent of the LLM issue - parallel semantic analysis is a long planned feature but has implications "for the Zig language itself".)&lt;/p&gt;
&lt;p&gt;In &lt;a href="https://kristoff.it/blog/contributor-poker-and-ai/"&gt;Contributor Poker and Zig's AI Ban&lt;/a&gt; (&lt;a href="https://lobste.rs/s/ifcyr1/contributor_poker_zig_s_ai_ban"&gt;via Lobste.rs&lt;/a&gt;) Zig Software Foundation VP of Community Loris Cro explains the rationale for this strict ban. It's the best articulation I've seen yet for a blanket ban on LLM-assisted contributions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;In successful open source projects you eventually reach a point where you start getting more PRs than what you’re capable of processing. Given what I mentioned so far, it would make sense to stop accepting imperfect PRs in order to maximize ROI from your work, but that’s not what we do in the Zig project. Instead, &lt;strong&gt;we try our best to help new contributors to get their work in, even if they need some help getting there&lt;/strong&gt;. We don’t do this just because it’s the “right” thing to do, but also &lt;strong&gt;because it’s the smart thing to do&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Zig values contributors over their contributions. Each contributor represents an investment by the Zig core team - the primary goal of reviewing and accepting PRs isn't to land new code, it's to help grow new contributors who can become trusted and prolific over time.&lt;/p&gt;
&lt;p&gt;LLM assistance breaks that completely. It doesn't matter if the LLM helps you submit a &lt;em&gt;perfect&lt;/em&gt; PR to Zig - the time the Zig team spends reviewing your work does nothing to help them add new, confident, trustworthy contributors to their overall project.&lt;/p&gt;
&lt;p&gt;Loris explains the name here:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The reason I call it “contributor poker” is because, just like people say about the actual card game, “you play the person, not the cards”. In contributor poker, you bet on the contributor, not on the contents of their first PR.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This makes a lot of sense to me. It relates to an idea I've seen circulating elsewhere: if a PR was mostly written by an LLM, why should a project maintainer spend time reviewing and discussing that PR as opposed to firing up their own LLM to solve the same problem?&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/anthropic"&gt;anthropic&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/zig"&gt;zig&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-ethics"&gt;ai-ethics&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/open-source"&gt;open-source&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/javascript"&gt;javascript&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai-assisted-programming"&gt;ai-assisted-programming&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bun"&gt;bun&lt;/a&gt;&lt;/p&gt;



</summary><category term="anthropic"/><category term="zig"/><category term="ai"/><category term="llms"/><category term="ai-ethics"/><category term="open-source"/><category term="javascript"/><category term="ai-assisted-programming"/><category term="generative-ai"/><category term="bun"/></entry><entry><title>llm 0.32a1</title><link href="https://simonwillison.net/2026/Apr/29/llm-3/#atom-everything" rel="alternate"/><published>2026-04-29T23:52:50+00:00</published><updated>2026-04-29T23:52:50+00:00</updated><id>https://simonwillison.net/2026/Apr/29/llm-3/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm/releases/tag/0.32a1"&gt;llm 0.32a1&lt;/a&gt;&lt;/p&gt;
        &lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Fixed a bug in 0.32a0 where tool-calling conversations were not correctly reinflated from SQLite. &lt;a href="https://github.com/simonw/llm/issues/1426"&gt;#1426&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/></entry><entry><title>LLM 0.32a0  is a major backwards-compatible refactor</title><link href="https://simonwillison.net/2026/Apr/29/llm/#atom-everything" rel="alternate"/><published>2026-04-29T19:01:47+00:00</published><updated>2026-04-29T19:01:47+00:00</updated><id>https://simonwillison.net/2026/Apr/29/llm/#atom-everything</id><summary type="html">
    &lt;p&gt;I just released &lt;a href="https://llm.datasette.io/en/latest/changelog.html#a0-2026-04-28"&gt;LLM 0.32a0&lt;/a&gt;, an alpha release of my &lt;a href="https://llm.datasette.io/"&gt;LLM&lt;/a&gt; Python library and CLI tool for accessing LLMs, with some consequential changes that I've been working towards for quite a while.&lt;/p&gt;
&lt;p&gt;Previous versions of LLM modeled the world in terms of prompts and responses. Send the model a text prompt, get back a text response.&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Capital of France?"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;This made sense when I started working on the library back in April 2023. A lot has changed since then!&lt;/p&gt;
&lt;p&gt;LLM provides an abstraction over thousands of different models via its &lt;a href="https://llm.datasette.io/en/stable/plugins/index.html"&gt;plugin system&lt;/a&gt;. The original abstraction - of text input that returns text output - was no longer able to represent everything I needed it to.&lt;/p&gt;
&lt;p&gt;Over time LLM itself has grown &lt;a href="https://simonwillison.net/2024/Oct/29/llm-multi-modal/"&gt;attachments&lt;/a&gt; to handle image, audio, and video input, then &lt;a href="https://simonwillison.net/2025/Feb/28/llm-schemas/"&gt;schemas&lt;/a&gt; for outputting structured JSON, then &lt;a href="https://simonwillison.net/2025/May/27/llm-tools/"&gt;tools&lt;/a&gt; for executing tool calls. Meanwhile LLMs kept evolving, adding reasoning support and the ability to return images and all kinds of other interesting capabilities.&lt;/p&gt;
&lt;p&gt;LLM needs to evolve to better handle the diversity of input and output types that can be processed by today's frontier models.&lt;/p&gt;
&lt;p&gt;The 0.32a0 alpha has two key changes: model inputs can be represented as a sequence of messages, and model responses can be composed of a stream of differently typed parts.&lt;/p&gt;
&lt;h4 id="prompts-as-a-sequence-of-messages"&gt;Prompts as a sequence of messages&lt;/h4&gt;
&lt;p&gt;LLMs accept input as text, but ever since ChatGPT demonstrated the value of a two-way conversational interface, the most common way to prompt them has been to treat that input as a sequence of conversational turns.&lt;/p&gt;
&lt;p&gt;The first turn might look like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;user: Capital of France?
assistant: 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The model then gets to fill out the reply from the assistant.)&lt;/p&gt;
&lt;p&gt;But each subsequent turn needs to replay the entire conversation up to that point, as a sort of screenplay:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;user: Capital of France?
assistant: Paris
user: Germany?
assistant:
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Most of the JSON APIs from the major vendors follow this pattern. Here's what the above looks like using the OpenAI chat completions API, which has been widely imitated by other providers:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;curl https://api.openai.com/v1/chat/completions \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Authorization: Bearer &lt;span class="pl-smi"&gt;$OPENAI_API_KEY&lt;/span&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -H &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Content-Type: application/json&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt; \
  -d &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;{&lt;/span&gt;
&lt;span class="pl-s"&gt;    "model": "gpt-5.5",&lt;/span&gt;
&lt;span class="pl-s"&gt;    "messages": [&lt;/span&gt;
&lt;span class="pl-s"&gt;      {&lt;/span&gt;
&lt;span class="pl-s"&gt;        "role": "user",&lt;/span&gt;
&lt;span class="pl-s"&gt;        "content": "Capital of France?"&lt;/span&gt;
&lt;span class="pl-s"&gt;      },&lt;/span&gt;
&lt;span class="pl-s"&gt;      {&lt;/span&gt;
&lt;span class="pl-s"&gt;        "role": "assistant",&lt;/span&gt;
&lt;span class="pl-s"&gt;        "content": "Paris"&lt;/span&gt;
&lt;span class="pl-s"&gt;      },&lt;/span&gt;
&lt;span class="pl-s"&gt;      {&lt;/span&gt;
&lt;span class="pl-s"&gt;        "role": "user",&lt;/span&gt;
&lt;span class="pl-s"&gt;        "content": "Germany?"&lt;/span&gt;
&lt;span class="pl-s"&gt;      }&lt;/span&gt;
&lt;span class="pl-s"&gt;    ]&lt;/span&gt;
&lt;span class="pl-s"&gt;  }&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
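&lt;p&gt;To make the relationship between the screenplay and the JSON concrete, here's a small standalone sketch that renders one list of messages both ways. &lt;code&gt;as_screenplay()&lt;/code&gt; and &lt;code&gt;as_request_body()&lt;/code&gt; are hypothetical helpers written for this illustration. They aren't part of LLM or any vendor SDK, and nothing is sent over the network.&lt;/p&gt;

```python
import json

# The same three-turn conversation, as a list of role/content dicts
messages = [
    {"role": "user", "content": "Capital of France?"},
    {"role": "assistant", "content": "Paris"},
    {"role": "user", "content": "Germany?"},
]


def as_screenplay(messages):
    """Render messages as a transcript, ending with an open assistant turn."""
    lines = [f"{m['role']}: {m['content']}" for m in messages]
    lines.append("assistant:")
    return "\n".join(lines)


def as_request_body(model, messages):
    """Build a JSON body with the same shape as the curl example above."""
    return json.dumps({"model": model, "messages": messages}, indent=2)


print(as_screenplay(messages))
print(as_request_body("gpt-5.5", messages))
```

&lt;p&gt;Either way, the whole history travels with every request: the model itself holds no memory of earlier turns.&lt;/p&gt;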
&lt;p&gt;Prior to 0.32, LLM modeled these as conversations:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)

&lt;span class="pl-s1"&gt;conversation&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;conversation&lt;/span&gt;()
&lt;span class="pl-s1"&gt;r1&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;conversation&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Capital of France?"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;r1&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())
&lt;span class="pl-c"&gt;# Outputs "Paris"&lt;/span&gt;

&lt;span class="pl-s1"&gt;r2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;conversation&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Germany?"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;r2&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())
&lt;span class="pl-c"&gt;# Outputs "Berlin"&lt;/span&gt;&lt;/pre&gt;
&lt;p&gt;This worked if you were building a conversation with the model from scratch, but it didn't provide a way to feed in a previous conversation from the start. This made tasks like building an emulation of the OpenAI chat completions API much harder than they should have been.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;llm&lt;/code&gt; CLI tool worked around this through a custom mechanism for persisting and inflating conversations using SQLite, but that never became a stable part of the LLM API - and there are many places you might want to use the Python library without committing to SQLite as the storage layer.&lt;/p&gt;
&lt;p&gt;The new alpha now supports this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;user&lt;/span&gt;, &lt;span class="pl-s1"&gt;assistant&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)

&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s1"&gt;messages&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[
    &lt;span class="pl-en"&gt;user&lt;/span&gt;(&lt;span class="pl-s"&gt;"Capital of France?"&lt;/span&gt;),
    &lt;span class="pl-en"&gt;assistant&lt;/span&gt;(&lt;span class="pl-s"&gt;"Paris"&lt;/span&gt;),
    &lt;span class="pl-en"&gt;user&lt;/span&gt;(&lt;span class="pl-s"&gt;"Germany?"&lt;/span&gt;),
])
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;text&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;llm.user()&lt;/code&gt; and &lt;code&gt;llm.assistant()&lt;/code&gt; functions are new builder functions designed to be used within that &lt;code&gt;messages=[]&lt;/code&gt; array.&lt;/p&gt;
&lt;p&gt;The previous &lt;code&gt;prompt=&lt;/code&gt; option still works, but LLM upgrades it to a single-item messages array behind the scenes.&lt;/p&gt;
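&lt;p&gt;Conceptually, that upgrade is a small normalization step. The sketch below is an assumption about the general idea, not LLM's actual internals: &lt;code&gt;to_messages()&lt;/code&gt; is a hypothetical helper, plain dicts stand in for whatever message objects the library really uses, and what happens when both arguments are supplied is deliberately left out.&lt;/p&gt;

```python
def to_messages(prompt=None, messages=None):
    """Return a list of messages from either a bare prompt or a messages list.

    Hypothetical helper for illustration; the library's real types may differ.
    """
    if messages is not None:
        # Copy so that callers can't mutate the list after the fact
        return list(messages)
    # A bare prompt becomes a single user message
    return [{"role": "user", "content": prompt}]


print(to_messages(prompt="Capital of France?"))
```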
&lt;p&gt;You can also now &lt;em&gt;reply&lt;/em&gt; to a response, as an alternative to building a conversation:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;response2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;reply&lt;/span&gt;(&lt;span class="pl-s"&gt;"How about Hungary?"&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response2&lt;/span&gt;) &lt;span class="pl-c"&gt;# Default __str__() calls .text()&lt;/span&gt;&lt;/pre&gt;
&lt;h4 id="streaming-parts"&gt;Streaming parts&lt;/h4&gt;
&lt;p&gt;The other major new interface in the alpha concerns streaming results back from a prompt.&lt;/p&gt;
&lt;p&gt;Previously, LLM supported streaming like this:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Generate an SVG of a pelican riding a bicycle"&lt;/span&gt;)
&lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;chunk&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;:
    &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;Or this async variant:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;asyncio&lt;/span&gt;
&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_async_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(&lt;span class="pl-s"&gt;"Generate an SVG of a pelican riding a bicycle"&lt;/span&gt;)

&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;run&lt;/span&gt;():
    &lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;chunk&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;:
        &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)

&lt;span class="pl-s1"&gt;asyncio&lt;/span&gt;.&lt;span class="pl-c1"&gt;run&lt;/span&gt;(&lt;span class="pl-en"&gt;run&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;Many of today's models return mixed types of content. A prompt run against Claude might return reasoning output, then text, then a JSON request for a tool call, then more text content.&lt;/p&gt;
&lt;p&gt;Some models can even execute tools on the server-side, for example OpenAI's &lt;a href="https://developers.openai.com/api/docs/guides/tools-code-interpreter?lang=curl"&gt;code interpreter tool&lt;/a&gt; or Anthropic's &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-search-tool"&gt;web search&lt;/a&gt;. This means the results from the model can combine text, tool calls, tool outputs and other formats.&lt;/p&gt;
&lt;p&gt;Multi-modal output models are starting to emerge too, which can return images or even &lt;a href="https://developers.openai.com/api/docs/guides/audio#add-audio-to-your-existing-application"&gt;snippets of audio&lt;/a&gt; intermixed into that streaming response.&lt;/p&gt;
&lt;p&gt;The new LLM alpha models these as a stream of typed message parts. Here's what that looks like as a Python API consumer:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;asyncio&lt;/span&gt;
&lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;

&lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)
&lt;span class="pl-s1"&gt;prompt&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s"&gt;"invent 3 cool dogs, first talk about your motivations"&lt;/span&gt;

&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;describe_dog&lt;/span&gt;(&lt;span class="pl-s1"&gt;name&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;, &lt;span class="pl-s1"&gt;bio&lt;/span&gt;: &lt;span class="pl-smi"&gt;str&lt;/span&gt;) &lt;span class="pl-c1"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="pl-smi"&gt;str&lt;/span&gt;:
    &lt;span class="pl-s"&gt;"""Record the name and biography of a hypothetical dog."""&lt;/span&gt;
    &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-s"&gt;f"&lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;name&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;bio&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;"&lt;/span&gt;

&lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;sync_example&lt;/span&gt;():
    &lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(
        &lt;span class="pl-s1"&gt;prompt&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;tools&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[&lt;span class="pl-s1"&gt;describe_dog&lt;/span&gt;],
    )
    &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;stream_events&lt;/span&gt;():
        &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"text"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
        &lt;span class="pl-k"&gt;elif&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"tool_call_name"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;f"&lt;span class="pl-cce"&gt;\n&lt;/span&gt;Tool call: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;("&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
        &lt;span class="pl-k"&gt;elif&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"tool_call_args"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)

&lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;def&lt;/span&gt; &lt;span class="pl-en"&gt;async_example&lt;/span&gt;():
    &lt;span class="pl-s1"&gt;model&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;llm&lt;/span&gt;.&lt;span class="pl-c1"&gt;get_async_model&lt;/span&gt;(&lt;span class="pl-s"&gt;"gpt-5.5"&lt;/span&gt;)
    &lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;model&lt;/span&gt;.&lt;span class="pl-c1"&gt;prompt&lt;/span&gt;(
        &lt;span class="pl-s1"&gt;prompt&lt;/span&gt;,
        &lt;span class="pl-s1"&gt;tools&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;[&lt;span class="pl-s1"&gt;describe_dog&lt;/span&gt;],
    )
    &lt;span class="pl-k"&gt;async&lt;/span&gt; &lt;span class="pl-k"&gt;for&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt; &lt;span class="pl-c1"&gt;in&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;astream_events&lt;/span&gt;():
        &lt;span class="pl-k"&gt;if&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"text"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
        &lt;span class="pl-k"&gt;elif&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"tool_call_name"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;f"&lt;span class="pl-cce"&gt;\n&lt;/span&gt;Tool call: &lt;span class="pl-s1"&gt;&lt;span class="pl-kos"&gt;{&lt;/span&gt;&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;/span&gt;("&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
        &lt;span class="pl-k"&gt;elif&lt;/span&gt; &lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;type&lt;/span&gt; &lt;span class="pl-c1"&gt;==&lt;/span&gt; &lt;span class="pl-s"&gt;"tool_call_args"&lt;/span&gt;:
            &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;event&lt;/span&gt;.&lt;span class="pl-c1"&gt;chunk&lt;/span&gt;, &lt;span class="pl-s1"&gt;end&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;""&lt;/span&gt;, &lt;span class="pl-s1"&gt;flush&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)

&lt;span class="pl-en"&gt;sync_example&lt;/span&gt;()
&lt;span class="pl-s1"&gt;asyncio&lt;/span&gt;.&lt;span class="pl-c1"&gt;run&lt;/span&gt;(&lt;span class="pl-en"&gt;async_example&lt;/span&gt;())&lt;/pre&gt;
&lt;p&gt;Sample output (from just the first sync example):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;My motivation: create three memorable dogs with distinct “cool” styles—one cinematic, one adventurous, and one charmingly chaotic—so each feels like they could star in their own story.&lt;/code&gt;&lt;br /&gt;
&lt;code&gt;Tool call: describe_dog({"name": "Nova Jetpaw", "bio": "A sleek silver-gray whippet who wears tiny aviator goggles and loves sprinting along moonlit beaches. Nova is fearless, elegant, and rumored to outrun drones just for fun."}&lt;/code&gt;&lt;br /&gt;
&lt;code&gt;Tool call: describe_dog({"name": "Mochi Thunderbark", "bio": "A fluffy corgi with a dramatic black-and-gold bandana and the confidence of a rock star. Mochi is short, loud, loyal, and leads a neighborhood 'security patrol' made entirely of squirrels."}&lt;/code&gt;&lt;br /&gt;
&lt;code&gt;Tool call: describe_dog({"name": "Atlas Snowfang", "bio": "A massive white husky with ice-blue eyes and a backpack full of trail snacks. Atlas is calm, heroic, and always knows the way home—even during blizzards, fog, or confusing camping trips."}&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
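&lt;p&gt;Because tool call arguments arrive as fragments, a consumer that wants whole tool calls has to reassemble them. The following is a minimal sketch of that idea, not part of LLM itself: &lt;code&gt;Event&lt;/code&gt;, &lt;code&gt;ToolCall&lt;/code&gt; and &lt;code&gt;collect()&lt;/code&gt; are hypothetical stand-ins that model only the &lt;code&gt;.type&lt;/code&gt; and &lt;code&gt;.chunk&lt;/code&gt; attributes used above. It also assumes that argument chunks always follow the name event of the call they belong to, which may not hold for every model.&lt;/p&gt;

```python
import json
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical stand-in for a streamed event: only .type and .chunk."""
    type: str
    chunk: str


@dataclass
class ToolCall:
    """Hypothetical container for one reassembled tool call."""
    name: str
    raw_args: str = ""

    def arguments(self) -> dict:
        # The chunks only form valid JSON once the call has finished streaming
        return json.loads(self.raw_args)


def collect(events):
    """Fold a stream of events into (full_text, [ToolCall, ...])."""
    text_parts = []
    calls = []
    for event in events:
        if event.type == "text":
            text_parts.append(event.chunk)
        elif event.type == "tool_call_name":
            # Assumption: a name event marks the start of a new tool call
            calls.append(ToolCall(name=event.chunk))
        elif event.type == "tool_call_args" and calls:
            # Assumption: argument chunks belong to the most recent call
            calls[-1].raw_args += event.chunk
    return "".join(text_parts), calls


# Hand-written events standing in for a real streamed response
events = [
    Event("text", "My motivation: "),
    Event("text", "three memorable dogs."),
    Event("tool_call_name", "describe_dog"),
    Event("tool_call_args", '{"name": "Nova Jetpaw", '),
    Event("tool_call_args", '"bio": "A sleek whippet."}'),
]
text, calls = collect(events)
print(text)
for call in calls:
    print(call.name, call.arguments())
```

&lt;p&gt;With a real response, the same loop body could be fed from &lt;code&gt;response.stream_events()&lt;/code&gt; in place of the hand-written list.&lt;/p&gt;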
&lt;p&gt;At the end of the response you can call &lt;code&gt;response.execute_tool_calls()&lt;/code&gt; to actually run the functions that were requested, or send a &lt;code&gt;response.reply()&lt;/code&gt; to have those tools called and their return values sent back to the model:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;reply&lt;/span&gt;(&lt;span class="pl-s"&gt;"Tell me about the dogs"&lt;/span&gt;))&lt;/pre&gt;
&lt;p&gt;This new mechanism for streaming different token types means the CLI tool can now display "thinking" text in a different color from the text in the final response. The thinking text goes to stderr so it won't affect results that are piped into other tools.&lt;/p&gt;
&lt;p&gt;This example uses Claude Sonnet 4.6 (with an updated streaming event version of the &lt;a href="https://github.com/simonw/llm-anthropic"&gt;llm-anthropic&lt;/a&gt; plugin) as Anthropic's models return their reasoning text as part of the response:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell"&gt;&lt;pre&gt;llm -m claude-sonnet-4.6 &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;Think about 3 cool dogs then describe them&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt; \
  -o thinking_display 1&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;&lt;img src="https://static.simonwillison.net/static/2026/claude-thinking-llm.gif" alt="Animated demo. Starts with ~/dev/scratch/llm-anthropic % uv run llm -m claude-sonnet-4.6 'Think about 3 cool dogs then describe them' -o thinking_display 1 - the text then streams in grey: The user wants me to think about 3 cool dogs and then describe them. Let me come up with 3 interesting, cool dogs and describe them. Then switches to regular color text for the output that describes the dogs." style="max-width: 100%;" /&gt;&lt;/p&gt;
&lt;p&gt;You can suppress the output of reasoning tokens using the new &lt;code&gt;-R/--no-reasoning&lt;/code&gt; flag. Surprisingly, that ended up being the only CLI-facing change in this release.&lt;/p&gt;
&lt;h4 id="a-mechanism-for-serializing-and-deserializing-responses"&gt;A mechanism for serializing and deserializing responses&lt;/h4&gt;
&lt;p&gt;As mentioned earlier, LLM has quite inflexible code at the moment for persisting conversations to SQLite. I've added a new mechanism in 0.32a0 that should provide Python API users a way to roll their own alternative:&lt;/p&gt;
&lt;pre&gt;&lt;span class="pl-s1"&gt;serializable&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;response&lt;/span&gt;.&lt;span class="pl-c1"&gt;to_dict&lt;/span&gt;()
&lt;span class="pl-c"&gt;# serializable is a JSON-style dictionary&lt;/span&gt;
&lt;span class="pl-c"&gt;# store it anywhere you like, then inflate it:&lt;/span&gt;
&lt;span class="pl-s1"&gt;response&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-v"&gt;Response&lt;/span&gt;.&lt;span class="pl-c1"&gt;from_dict&lt;/span&gt;(&lt;span class="pl-s1"&gt;serializable&lt;/span&gt;)&lt;/pre&gt;
&lt;p&gt;The dictionary this returns is actually a &lt;code&gt;TypedDict&lt;/code&gt; defined in the new &lt;a href="https://github.com/simonw/llm/blob/main/llm/serialization.py"&gt;llm/serialization.py&lt;/a&gt; module.&lt;/p&gt;
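&lt;p&gt;As a rough illustration of rolling your own storage, here is a sketch that round-trips a JSON-style dictionary through a SQLite text column using only the standard library. The dictionary is a hypothetical stand-in for the result of &lt;code&gt;response.to_dict()&lt;/code&gt;: its keys are invented for this example and the real shape is whatever the &lt;code&gt;TypedDict&lt;/code&gt; defines. The &lt;code&gt;responses&lt;/code&gt; table is likewise an assumption.&lt;/p&gt;

```python
import json
import sqlite3

# Hypothetical stand-in for response.to_dict(); the real keys are defined
# by the TypedDict in llm/serialization.py and will differ from these.
serializable = {
    "model": "example-model",
    "parts": [{"type": "text", "text": "Three cool dogs..."}],
}

db = sqlite3.connect(":memory:")
db.execute("create table responses (id integer primary key, body text)")

# Store: a JSON-style dictionary can be dumped straight into a text column
cursor = db.execute(
    "insert into responses (body) values (?)", (json.dumps(serializable),)
)
row_id = cursor.lastrowid

# Load: parse the text back into a dictionary
(body,) = db.execute(
    "select body from responses where id = ?", (row_id,)
).fetchone()
restored = json.loads(body)
print(restored == serializable)
```

&lt;p&gt;In real use, &lt;code&gt;restored&lt;/code&gt; would be the value passed to &lt;code&gt;Response.from_dict()&lt;/code&gt;.&lt;/p&gt;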
&lt;h4 id="what-s-next-"&gt;What's next?&lt;/h4&gt;
&lt;p&gt;I'm releasing this as an alpha so I can upgrade various plugins and exercise the new design in real-world environments for a few days. I expect the stable 0.32 release will be very similar to this alpha, unless alpha testing reveals some design flaw in the way I've put this all together.&lt;/p&gt;
&lt;p&gt;There's one remaining large task: I'd like to redesign the SQLite logging system to better capture the more finely grained details that are returned by this new abstraction.&lt;/p&gt;
&lt;p&gt;Ideally I'd like to model this as a graph, to best support situations like an OpenAI-style chat completions API where the same conversations are constantly extended and then repeated with every prompt. I want to be able to store those without duplicating them in the database.&lt;/p&gt;
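&lt;p&gt;To make the deduplication idea concrete, here is one possible shape for such a graph, offered purely as an illustration and not as the design LLM will adopt. Each message becomes a node that points at its parent, and the node ID is a hash of the parent ID plus the message, so a conversation that repeats an earlier prefix reuses the existing nodes. &lt;code&gt;MessageGraph&lt;/code&gt; and its methods are hypothetical names.&lt;/p&gt;

```python
import hashlib
import json


class MessageGraph:
    """Hypothetical store: one node per message, each pointing at its parent."""

    def __init__(self):
        self.nodes = {}  # node_id -> (parent_id, message)

    def add(self, parent_id, message):
        # The ID covers the parent ID too, so it identifies the whole history
        key = json.dumps([parent_id, message], sort_keys=True)
        node_id = hashlib.sha256(key.encode("utf-8")).hexdigest()
        self.nodes.setdefault(node_id, (parent_id, message))
        return node_id

    def add_conversation(self, messages):
        """Store a full message list, reusing nodes for any shared prefix."""
        node_id = None
        for message in messages:
            node_id = self.add(node_id, message)
        return node_id

    def history(self, node_id):
        """Walk parent pointers back to the root, returning messages in order."""
        messages = []
        while node_id is not None:
            node_id, message = self.nodes[node_id]
            messages.append(message)
        return messages[::-1]


graph = MessageGraph()
first = [
    {"role": "user", "content": "Invent a cool dog"},
    {"role": "assistant", "content": "Nova Jetpaw, a silver whippet"},
]
# The follow-up request repeats the whole conversation plus one new message
second = first + [{"role": "user", "content": "Now invent another"}]

head_1 = graph.add_conversation(first)
head_2 = graph.add_conversation(second)
print(len(graph.nodes))  # 3 nodes stored for 5 submitted messages
```

&lt;p&gt;Either head ID is enough to rebuild its conversation with &lt;code&gt;graph.history()&lt;/code&gt;, and the two shared messages are stored only once.&lt;/p&gt;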
&lt;p&gt;I'm undecided as to whether that should be a feature in 0.32 or whether I should hold it for 0.33.&lt;/p&gt;
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/projects"&gt;projects&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ai"&gt;ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/annotated-release-notes"&gt;annotated-release-notes&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/generative-ai"&gt;generative-ai&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llms"&gt;llms&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="projects"/><category term="python"/><category term="ai"/><category term="annotated-release-notes"/><category term="generative-ai"/><category term="llms"/><category term="llm"/></entry><entry><title>llm 0.32a0</title><link href="https://simonwillison.net/2026/Apr/29/llm-2/#atom-everything" rel="alternate"/><published>2026-04-29T18:57:47+00:00</published><updated>2026-04-29T18:57:47+00:00</updated><id>https://simonwillison.net/2026/Apr/29/llm-2/#atom-everything</id><summary type="html">
    
        &lt;p&gt;&lt;strong&gt;Release:&lt;/strong&gt; &lt;a href="https://github.com/simonw/llm/releases/tag/0.32a0"&gt;llm 0.32a0&lt;/a&gt;&lt;/p&gt;
        &lt;p&gt;See &lt;a href="https://simonwillison.net/2026/Apr/29/llm/"&gt;the annotated release notes&lt;/a&gt;.&lt;/p&gt;
    
    
        &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/llm"&gt;llm&lt;/a&gt;&lt;/p&gt;
    

</summary><category term="llm"/></entry></feed>