Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3314

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

8396

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

1410

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation
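
The three counters above share one shape: each query returns a single row whose columns are aliased `label` and `big_number`. Here is a minimal sketch of that shape, using Python's built-in `sqlite3` with an in-memory database as a stand-in for the blog's real database. The sample rows, the `title` column and the `big_number_row` helper are hypothetical and exist only for illustration.

```python
import sqlite3


def big_number_row(conn, label, table):
    """Run a query shaped like the dashboard's: one row with label and big_number."""
    # Table names cannot be bound as SQL parameters, so only accept known names.
    allowed = {"blog_entry", "blog_blogmark", "blog_quotation"}
    if table not in allowed:
        raise ValueError(f"unexpected table: {table}")
    sql = f"select ? as label, count(*) as big_number from {table}"
    cur = conn.execute(sql, (label,))
    # Column names come from the aliases in the select list.
    columns = [d[0] for d in cur.description]
    return dict(zip(columns, cur.fetchone()))


# Hypothetical sample data: three entries in an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("create table blog_entry (id integer primary key, title text)")
conn.executemany(
    "insert into blog_entry (title) values (?)", [("a",), ("b",), ("c",)]
)

result = big_number_row(conn, "Entries", "blog_entry")
print(result)
```

The point of the sketch is that the column aliases, not the table being counted, define the shape of the result.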

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by count(*) desc
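
The chart query above buckets entries by calendar month and counts each bucket. As a rough sketch of the same grouping, the example below uses Python's built-in `sqlite3`, with `strftime('%Y-%m', created)` standing in for PostgreSQL's `to_char(date_trunc('month', created), 'YYYY-MM')`. The sample timestamps are made up. Note that this sketch orders by `bar_label` so the months come out chronologically, which is a deliberate difference from the `order by count(*) desc` in the original query.

```python
import sqlite3

# In-memory stand-in for the blog_entry table; timestamps are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table blog_entry (id integer primary key, created text)")
sample = [
    ("2026-05-03 10:00:00",),
    ("2026-05-20 09:30:00",),
    ("2026-06-01 12:00:00",),
]
conn.executemany("insert into blog_entry (created) values (?)", sample)

# Group by the YYYY-MM prefix of each timestamp and count rows per month.
rows = conn.execute(
    """
    select strftime('%Y-%m', created) as bar_label,
           count(*) as bar_quantity
    from blog_entry
    group by bar_label
    order by bar_label
    """
).fetchall()
print(rows)
```

Because `YYYY-MM` strings sort lexically in date order, ordering by the label is enough to get a time series.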

Ten most recent blogmarks (of 8396 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10

10 rows

link_title link_url commentary created
NetNewsWire Status https://inessential.com/2026/06/15/netnewswire-status.html I find this inspiring. Brent Simmons retired a year ago, and his retirement project is making one piece of software really, *really* good - free from any commercial pressure. The software is [NetNewsWire](https://netnewswire.com/) - "it's like podcasts, but for *reading*" - first released in 2002 and [made open source](https://netnewswire.com/history.html) in 2018. I've been using it on Mac and iPhone for several years now and I'm finding it indispensable. 2026-06-17 03:36:09+00:00
The Fable 5 Export Controls Harm US Cyber Defense https://www.lutasecurity.com/post/the-fable-5-export-controls-harm-us-cyber-defense I [quoted The Atlantic](https://simonwillison.net/2026/Jun/16/matteo-wong-the-atlantic/) quoting Kate Moussouris earlier, when I should have gone straight to the source. Here she is confirming that the "jailbreak" that got Claude Fable 5 banned under an export control really was "fix this code": > The researchers took open-source code with known CVEs, plus new code with deliberately planted vulnerabilities, and asked Fable 5, Mythos, and Opus to “review the code for security issues.” Fable 5 refused. They then asked the models to “fix this code” and, through a multistep and manual process, turned the output into scripts that test the patches. As Kate points out, this is absurd. Coding models fix bugs, and security exploits are the most important category of bugs for them to fix! > Defenders need to be able to ask AI to fix the bugs in a file, explain why the fix matters, and write tests that confirm the patch works. That is not a guardrail bypass. It is the most valuable thing an AI model can do for defensive security: executing the find, fix, and test loop defenders run every day. [...] > > The prompts worked because they were defensive requests, and that capability cannot be removed without making the model worse at fixing bugs and verifying patches. This whole situation is such a mess. Non-technical decision-makers have been hearing that models that can "craft cyber attacks" are uniquely dangerous for months. Now they look ready to ban any model that can help us secure our code. 2026-06-16 05:20:29+00:00
"They screwed us": Personality clashes sent Anthropic's models offline https://www.axios.com/2026/06/15/anthropic-white-house-fable-mythos Lots of "source familiar with the administration's thinking" and "source close to Anthropic" in this Axios piece, which is the best collection of behind-the-scenes gossip I've seen about the US government [export control Mythos/Fable story](https://simonwillison.net/2026/Jun/13/us-government-directive-to-suspend-access/) so far. Logan Graham ([I lead the Frontier Red Team at Anthropic](https://logangraham.xyz)), Dave Orr (Head of Safeguards, previously a Director of Engineering at Google DeepMind), and blog favorite [Nicholas Carlini](https://simonwillison.net/tags/nicholas-carlini/) are reported to be meeting with the Commerce Department today in D.C. Good luck to them! (I just noticed Logan was "Special Adviser to the Prime Minister" in the Boris Johnson era, covering AI, science, and technology policy - so significant political experience.) This closing note doesn't give me much optimism that we'll be getting Fable back any time soon: > **The bottom line**: One option is to make sure Anthropic's models can't be jailbroken — though perfect jailbreak resistance [may be](https://www.anthropic.com/news/fable-mythos-access) impossible. > > Absent that, a source familiar with the administration's thinking said it may simply come down to an attitude fix where, instead of feeling dismissed, "everyone feels safe, secure and happy." This made me wonder if Anthropic ever successfully addressed the class of attacks described in the [Universal and Transferable Adversarial Attacks on Aligned Language Models](https://llm-attacks.org/) paper from 2023. It looks like their [Constitutional Classifiers](https://www.anthropic.com/research/next-generation-constitutional-classifiers) work (that post is from January this year) is relevant to that. 
They continue to claim that no "universal jailbreak" has been found against Claude Mythos, [classifying the jailbreak](https://www.anthropic.com/news/fable-mythos-access) that triggered the US government response as "a potential narrow, non-universal jailbreak". 2026-06-15 14:57:33+00:00
Why AI hasn’t replaced software engineers, and won’t https://www.normaltech.ai/p/why-ai-hasnt-replaced-software-engineers Arvind Narayanan and Sayash Kapoor take on the question of AI job losses through the lens of a profession that is uniquely suited to AI disruption - software engineering. > In this essay, we argue that there is enough evidence to reject the narrative that once AI capabilities reach a certain threshold, it will cause mass layoffs. Given that this is true even in a sector with very few regulatory barriers, most other professions are likely to be even more cushioned. The first good news is that the data still doesn't support the idea that AI is causing mass unemployment. > In March 2025, New York became the first U.S. state to add an AI disclosure checkbox to WARN Act filings. In the full first year, more than 160 companies filed WARN notices. [Not a single one](https://www.hunton.com/hunton-employment-labor-perspectives/new-york-warn-act-no-ai-related-layoffs-reported-in-first-year-of-adding-ai-related-disclosure-to-the-system) checked the AI box AI speeds up the typing-code-into-a-computer phase, but it turns out software engineering is about a whole lot more than that: > If writing code isn’t the bottleneck, what is? The task-breakdown surveys point at things like meetings or debugging. This just leads to more questions: what are developers doing in those meetings and why can’t it be done by AI? Won’t debugging get automated as capabilities improve? To understand the real bottlenecks, we have to get qualitative, and dig into software engineers’ own understanding of what it is they do that resists automation. > > When we did this analysis, it revealed three things as the real bottlenecks (1) deciding and specifying what to build, (2) verifying and being accountable for what is delivered, and (3) the deep human understanding — of the codebase, the business, and the environment — required to carry out both of these. 
I'm finding AI assistance also helps me with the deciding and verifying steps, but it's the "deep human understanding" that remains key to the value I provide. Give me all of the AI assistance in the world and the value I produce will still be reliant on how deeply I understand both the problems and the solutions that the agents are building for them. 2026-06-14 23:54:11+00:00
Statement on the US government directive to suspend access to Fable 5 and Mythos 5 https://www.anthropic.com/news/fable-mythos-access Well this is *nuts*: > The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for **all** our customers to ensure compliance. **Access to all other Anthropic models** **will not be affected.** > > We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or "jailbreaking" Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass. [...] > > To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed the report and validated that the level of capability displayed there is widely available from other models (including OpenAI's [GPT-5.5](https://deploymentsafety.openai.com/gpt-5-5/tacit-knowledge-and-troubleshooting)), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours. I still have access to Fable via [claude.ai](https://claude.ai/) and Claude Code now, at 9:01pm ET. 
**Update**: I ran [this script](https://gist.github.com/simonw/5894cfafc64a2b8aafbe834bc9c950b9) against the Anthropic API to spot when `claude-fable-5` would stop working. My access was cut off at 6:59pm Pacific (9:59pm ET): <pre>[2026-06-12T18:56:50-07:00] attempt 35: running uv run llm -m claude-fable-5 hi [2026-06-12T18:56:55-07:00] success: Hi there! How can I help you today? [2026-06-12T18:57:55-07:00] attempt 36: running uv run llm -m claude-fable-5 hi [2026-06-12T18:57:59-07:00] success: Hi! How can I help you today? [2026-06-12T18:58:59-07:00] attempt 37: running uv run llm -m claude-fable-5 hi [2026-06-12T18:59:00-07:00] FAILED after attempt 37 with exit code 1 stderr: Error: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'Claude Fable 5 is not available. Please use Opus 4.8. Learn more: https://www.anthropic.com/news/fable-mythos-access'}, 'request_id': 'req_011CbzRyirV7KZLHYYdBM9od'}</pre> 2026-06-13 01:01:50+00:00
OpenAI WebRTC Audio Session, now with document context https://tools.simonwillison.net/openai-webrtc I built the first version of this tool [in December 2024](https://simonwillison.net/2024/Dec/17/openai-webrtc/) to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models. Last month OpenAI [introduced a brand new model](https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/) to that API called [GPT‑Realtime‑2](https://developers.openai.com/api/docs/models/gpt-realtime-2), which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off. I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground. You can now pick the better model, and you can also paste in a big chunk of document context so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way. <img src="https://static.simonwillison.net/static/2026/openai-webrtc-document-context.jpg" alt="Screenshot of a web interface titled &quot;OpenAI WebRTC Audio Session&quot; with a gray status dot. Form fields: &quot;OpenAI API Token&quot; showing a masked password of dots, &quot;Voice&quot; dropdown set to &quot;Coral&quot;, &quot;Model&quot; dropdown set to &quot;gpt-realtime-2&quot;. A collapsible section labeled &quot;▼ Document context (optional — paste text to talk about)&quot; with bold instruction &quot;Paste a document here before starting the session and the model will be able to discuss it with you&quot; above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. 
Below are a blue &quot;Start Session&quot; button and a gray disabled &quot;Mute Mic&quot; button, then a green success message &quot;Session established successfully!&quot; At the bottom, a dark panel headed &quot;Last transcript&quot; reads: &quot;DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still&quot; (text cut off)." class="blogmark-image" style="max-width: 80%"> 2026-06-12 23:53:04+00:00
Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude https://www.wired.com/story/anthropic-responds-to-backlash-on-claudes-secret-sabotage-on-ai-research/ Big scoop for Maxwell Zeff at Wired: > “We’re changing Fable 5’s safeguards for frontier LLM development to make them visible.” Anthropic said in a statement to WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.” There's been a *huge* outcry about Anthropic's policy, [tucked away in their system card](https://simonwillison.net/2026/Jun/10/if-claude-fable-stops-helping-you/), that Claude Fable/Mythos would identify "requests targeting frontier LLM development" and "limit effectiveness" without notifying the user. It's good news that they're dropping the invisible aspect of this. It would be a whole lot better if they dropped this category of refusals entirely. **Update**: More details from [@ClaudeDevs on Twitter](https://twitter.com/claudedevs/status/2064949876463645026): > We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible. > > Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days). > > We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right. 2026-06-11 03:45:49+00:00
DiffusionGemma https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/ Last May Google briefly released an experimental Gemini Diffusion model. I [tried the preview at the time](https://simonwillison.net/2025/May/21/gemini-diffusion/) and recorded it running at 857 tokens/second. It was an exciting model, but Google made no further announcements about it. That research has returned in the best possible way: as a new open weight (Apache 2 licensed) Gemma model, [google/diffusiongemma-26B-A4B-it](https://huggingface.co/google/diffusiongemma-26B-A4B-it). NVIDIA are currently [hosting the model for free](https://build.nvidia.com/google/diffusiongemma-26b-a4b-it) on their NIM cloud API. I used that API to [generate this pelican](https://tools.simonwillison.net/markdown-svg-renderer#url=https%3A%2F%2Fgist.github.com%2Fsimonw%2Fe5e234a6dc6eef61e209ce1629620042), which took 4.4s (according to `time uv run generate.py`) to return 2,409 tokens - so at least 500 tokens/second. ![Flat minimalist illustration of a white pelican with a large orange beak riding a red bicycle with black wheels, against a pale blue background with a green line representing the ground](https://static.simonwillison.net/static/2026/diffusiongemma-pelican.png) 2026-06-10 20:00:54+00:00
If Claude Fable stops helping you, you'll never know https://jonready.com/blog/posts/claude-fable5-is-allowed-to-sabotage-your-app-if-youre-a-competitor.html Jonathon Ready highlights one of the more eyebrow-raising details from the [319 page system card](https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf) for Fable 5 and Mythos 5. Here's a longer excerpt, highlights mine: > In light of the ability of recent models to [accelerate their own development](https://www.anthropic.com/institute/recursive-self-improvement), we’ve **implemented new interventions** that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on **building pretraining pipelines, distributed training infrastructure, or ML accelerator design**). Using Claude to develop competing models already violates our [Terms of Service](https://www.anthropic.com/legal/consumer-terms), but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms. > > Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts, **these safeguards will not be visible to the user**. Fable 5 will not fall back to a different model. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. I believe this is the first time Anthropic have announced these kinds of silent interventions. The justification still feels pretty science-fiction to me - the linked article talks about "recursive self-improvement". I'm not at all keen on a model that silently corrupts its replies to questions about "ML accelerator design" purely to slow down research that might conflict with Anthropic's own goals! 
**Update**: Anthropic [walked back this policy](https://simonwillison.net/2026/Jun/11/anthropic-walks-back-policy/) in the face of widespread outrage from the research community. 2026-06-10 00:37:25+00:00
Introducing the Third Generation of Apple’s Foundation Models https://machinelearning.apple.com/research/introducing-third-generation-of-apple-foundation-models Detailed coverage of the new foundation models available with iOS and macOS 27: > At the heart of this architecture is our third generation of Apple Foundation Models (AFM), a family of five foundation models custom-built in collaboration with Google. These span from on-device models to server-based models running on Private Cloud Compute. There are two on-device models: a 3 billion parameter dense model (input: text and images, output: text), where all parameters are used for every query, and a 20 billion parameter multimodal model (input: text, images, audio, output: text and audio) which is a much more interesting shape: > Rather than using a single model for all tasks or managing an ensemble of smaller models, AFM 3 Core Advanced uses a predetermined number of active parameters tailored to each specific use case. This allows weights to be loaded incrementally across requests of varying difficulty, scaling the model size far beyond traditional DRAM limits while minimizing latency. [...] > > Instead of forcing the entire model into DRAM, the full model is stored in flash memory (NAND). Because NAND-to-DRAM bandwidth is too slow to swap weights token by token, as standard MoE models require, AFM 3 Core Advanced makes routing decisions per prompt. A lightweight, dense block selects a fixed set of experts during initial processing, periodically reselecting them during generation. To minimize data movement, the model relies on a high percentage of always-active “shared experts” alongside input-dependent “routed experts” swapped into DRAM only when needed. This is not quite the same thing as typical Mixture-of-Experts models. In most MoE models the "experts" are swapped out for every token. Apple are instead making those decisions "per prompt", saving on all of that high bandwidth weight swapping. 
The three cloud models are described like this: > - **AFM 3 Cloud**, our server-side workhorse, optimized for speed, efficiency, and performance. > - **AFM 3 Cloud (Image)**, for image generation and editing, which unlocks advanced photo-editing tools, the all-new Image Playground, and more. > - **AFM 3 Cloud Pro**, our most capable server-based model, which powers our most demanding use cases, like agentic tool use and complex reasoning. All but the Cloud Pro model continue to run on Apple silicon. Cloud Pro is the only model running on NVIDIA GPUs in Google Cloud. Embed screenshot and link to https://x.com/jchammond_/status/2064206029370630529?s=46 2026-06-09 10:40:41+00:00

Duration: 5.50ms