Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3298

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

8352

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

1378

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by bar_label

Ten most recent blogmarks (of 8352 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10

10 rows

Columns: link_title, link_url, commentary, created

GPT-5.5 prompting guide
https://developers.openai.com/api/docs/guides/prompt-guidance?model=gpt-5.5

Now that GPT-5.5 is [available in the API](https://developers.openai.com/api/docs/models/gpt-5.5), OpenAI have released a wealth of useful tips on how best to prompt the new model. Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible response:

> `Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences.`

I've already noticed their Codex app doing this, and it does make longer running tasks feel less like the model has crashed.

OpenAI suggest running the following in Codex to upgrade your existing code using advice embedded in their `openai-docs` skill:

> `$openai-docs migrate this project to gpt-5.5`

The upgrade guide the coding agent will follow [is this one](https://github.com/openai/skills/blob/724cd511c96593f642bddf13187217aa155d2554/skills/.curated/openai-docs/references/upgrade-guide.md#model-string--light-prompt-rewrite), which even includes light instructions on how to rewrite prompts to better fit the model.

Also relevant is the [Using GPT-5.5 guide](https://developers.openai.com/api/docs/guides/latest-model), which opens with this warning:

> To get the most out of GPT-5.5, treat it as a new model family to tune for, not a drop-in replacement for `gpt-5.2` or `gpt-5.4`. Begin migration with a fresh baseline instead of carrying over every instruction from an older prompt stack. Start with the smallest prompt that preserves the product contract, then tune reasoning effort, verbosity, tool descriptions, and output format against representative examples.

Interesting to see OpenAI recommend starting from scratch rather than trusting that existing prompts optimized for previous models will continue to work effectively with GPT-5.5.

2026-04-25 04:13:36+00:00

The people do not yearn for automation
https://www.theverge.com/podcast/917029/software-brain-ai-backlash-databases-automation

This written and video essay by Nilay Patel explores why AI is unpopular with the general public even as usage numbers for ChatGPT continue to skyrocket. It’s a superb piece of commentary, and something I expect I’ll be thinking about for a long time to come.

Nilay’s core idea is that people afflicted with “software brain” - who see the world as something to be automated as much as possible, and attempt to model everything in terms of information flows and data - are becoming detached from everyone else.

> […] software brain has ruled the business world for a long time. AI has just made it easier than ever for more people to make more software than ever before — for every kind of business to automate big chunks of itself with software. It’s everywhere: the absolute cutting edge of advertising and marketing is automation with AI. It’s not being a creative.
>
> But: not everything is a business. Not everything is a loop! The entire human experience cannot be captured in a database. *That’s* the limit of software brain. That’s why people hate AI. It *flattens* them.
>
> Regular people don’t see the opportunity to write code as an opportunity at *all*. The people do not yearn for automation.

I’m a full-on smart home sicko; the lights and shades and climate controls of my house are automated in dozens of ways. But huge companies like Apple, Google and Amazon have struggled for over a decade now to make regular people care about smart home automation at all. And they just don’t.

2026-04-24 22:38:49+00:00

russellromney/honker
https://github.com/russellromney/honker

"Postgres NOTIFY/LISTEN semantics" for SQLite, implemented as a Rust SQLite extension and various language bindings to help make use of it.

The design of this looks very solid. It lets you write Python code for queues that looks like this:

```python
import honker

db = honker.open("app.db")
emails = db.queue("emails")
emails.enqueue({"to": "alice@example.com"})

# Consume (in a worker process)
async for job in emails.claim("worker-1"):
    send(job.payload)
    job.ack()
```

And Kafka-style durable streams like this:

```python
stream = db.stream("user-events")

with db.transaction() as tx:
    tx.execute("UPDATE users SET name=? WHERE id=?", [name, uid])
    stream.publish({"user_id": uid, "change": "name"}, tx=tx)

async for event in stream.subscribe(consumer="dashboard"):
    await push_to_browser(event)
```

It also adds 20+ custom SQL functions including these two:

```sql
SELECT notify('orders', '{"id":42}');
SELECT honker_stream_read_since('orders', 0, 1000);
```

The extension requires WAL mode, and workers can poll the `.db-wal` file with a stat call every 1ms to get as close to real-time as possible without the expense of running a full SQL query.

honker implements the **transactional outbox pattern**, which ensures items are only queued if a transaction successfully commits. My favorite explanation of that pattern remains [Transactionally Staged Job Drains in Postgres](https://brandur.org/job-drain) by Brandur Leach. It's great to see a new implementation of that pattern for SQLite.

2026-04-24 01:50:07+00:00

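To make the outbox idea concrete, here's a minimal sketch of the general pattern in plain Python and sqlite3 - the `outbox` table and the claim scheme are invented for illustration, not honker's actual schema:

```python
import sqlite3

conn = sqlite3.connect("app.db")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE IF NOT EXISTS outbox (
        id INTEGER PRIMARY KEY, payload TEXT, claimed_by TEXT
    );
""")

# The state change and the queued job commit (or roll back) together:
# if this transaction fails, the job row never becomes visible.
with conn:
    conn.execute("UPDATE users SET name = ? WHERE id = ?", ("Alice", 1))
    conn.execute("INSERT INTO outbox (payload) VALUES (?)", ('{"user_id": 1}',))

# A worker claims the oldest unclaimed job (RETURNING needs SQLite 3.35+).
with conn:
    job = conn.execute(
        """UPDATE outbox SET claimed_by = ?
           WHERE id = (SELECT min(id) FROM outbox WHERE claimed_by IS NULL)
           RETURNING id, payload""",
        ("worker-1",),
    ).fetchone()
```
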
An update on recent Claude Code quality reports
https://www.anthropic.com/engineering/april-23-postmortem

It turns out the high volume of complaints that Claude Code was providing worse quality results over the past two months was grounded in real problems. The models themselves were not to blame, but three separate issues in the Claude Code harness caused complex but material problems which directly affected users.

Anthropic's postmortem describes these in detail. This one in particular stood out to me:

> On March 26, we shipped a change to clear Claude's older thinking from sessions that had been idle for over an hour, to reduce latency when users resumed those sessions. A bug caused this to keep happening every turn for the rest of the session instead of just once, which made Claude seem forgetful and repetitive.

I *frequently* have Claude Code sessions which I leave for an hour (or often a day or longer) before returning to them. Right now I have 11 of those (according to `ps aux | grep 'claude '`) and that's after closing down dozens more the other day. I estimate I spend more time prompting in these "stale" sessions than in sessions I've recently started!

If you're building agentic systems it's worth reading this article in detail - the kinds of bugs that affect harnesses are deeply complicated, even if you put aside the inherent non-deterministic nature of the models themselves.

2026-04-24 01:31:25+00:00

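The shape of that "clear older thinking" bug is worth internalizing if you build harnesses: a cleanup meant to fire once on resume instead fires on every turn. A contrived Python guess at the bug class - not Anthropic's actual code:

```python
from dataclasses import dataclass, field

IDLE_LIMIT = 60 * 60  # one hour, in seconds

@dataclass
class Session:
    last_active: float
    thinking_blocks: list = field(default_factory=list)

    def drop_older_thinking(self):
        self.thinking_blocks.clear()

def on_turn(session: Session, now: float):
    # Intended behavior: trim old thinking once, when a stale session resumes.
    if now - session.last_active > IDLE_LIMIT:
        session.drop_older_thinking()
    # The bug class: forget to refresh last_active and this branch fires on
    # every subsequent turn too, so the model keeps losing context and
    # appears forgetful and repetitive for the rest of the session.
    session.last_active = now  # the one-line refresh that prevents that
```
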
Serving the For You feed
https://atproto.com/blog/serving-the-for-you-feed

One of Bluesky's most interesting features is that anyone can run their own custom "feed" implementation and make it available to other users - effectively enabling custom algorithms that can use any mechanism they like to recommend posts.

spacecowboy runs the [For You Feed](https://bsky.app/profile/did:plc:3guzzweuqraryl3rdkimjamk/feed/for-you), used by around 72,000 people. This guest post on the AT Protocol blog explains how it works.

The architecture is *fascinating*. The feed is served by a single Go process using SQLite on a "gaming" PC in spacecowboy's living room - 16 cores, 96GB of RAM and 4TB of attached NVMe storage.

Recommendations are based on likes: what else are the people who like the same things as you liking on the platform? That Go server consumes the Bluesky firehose and stores the relevant details in SQLite, keeping the last 90 days of relevant data, which currently uses around 419GB of SQLite storage.

Public internet traffic is handled by a $7/month VPS on OVH, which talks to the living room server via Tailscale. Total cost is now $30/month: $20 in electricity, $7 in VPS and $3 for the two domain names.

spacecowboy estimates that the existing system could handle all ~1 million daily active Bluesky users if they were to switch to the cheapest algorithm they have found to work.

2026-04-24 01:08:17+00:00

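The post has the full details, but that like-overlap idea translates naturally to SQL. Here's a rough sketch against a hypothetical `likes(user_id, post_id)` table - my guess at the approach, not spacecowboy's actual queries:

```python
import sqlite3

conn = sqlite3.connect("firehose.db")
# Hypothetical schema: likes(user_id TEXT, post_id TEXT)
recommendations = conn.execute(
    """
    WITH neighbors AS (
        -- accounts whose likes overlap most with mine
        SELECT l2.user_id, count(*) AS overlap
        FROM likes l1
        JOIN likes l2 ON l2.post_id = l1.post_id
        WHERE l1.user_id = :me AND l2.user_id != :me
        GROUP BY l2.user_id
        ORDER BY overlap DESC
        LIMIT 1000
    )
    -- score posts by how many overlapping likers also liked them,
    -- excluding posts I have already liked myself
    SELECT l.post_id, sum(n.overlap) AS score
    FROM neighbors n
    JOIN likes l ON l.user_id = n.user_id
    WHERE l.post_id NOT IN (SELECT post_id FROM likes WHERE user_id = :me)
    GROUP BY l.post_id
    ORDER BY score DESC
    LIMIT 50
    """,
    {"me": "did:plc:example"},
).fetchall()
```
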
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
https://qwen.ai/blog?id=qwen3.6-27b

Big claims from Qwen about their latest open weight model:

> Qwen3.6-27B delivers flagship-level agentic coding performance, surpassing the previous-generation open-source flagship Qwen3.5-397B-A17B (397B total / 17B active MoE) across all major coding benchmarks.

On Hugging Face [Qwen3.5-397B-A17B](https://huggingface.co/Qwen/Qwen3.5-397B-A17B/tree/main) is 807GB, this new [Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B/tree/main) is 55.6GB.

I tried it out with the 16.8GB Unsloth [Qwen3.6-27B-GGUF:Q4_K_M](https://huggingface.co/unsloth/Qwen3.6-27B-GGUF) quantized version and `llama-server` using this recipe by [benob on Hacker News](https://news.ycombinator.com/item?id=47863217#47865140), after first installing `llama-server` using `brew install llama.cpp`:

```
llama-server \
  -hf unsloth/Qwen3.6-27B-GGUF:Q4_K_M \
  --no-mmproj \
  --fit on \
  -np 1 \
  -c 65536 \
  --cache-ram 4096 -ctxcp 2 \
  --jinja \
  --temp 0.6 \
  --top-p 0.95 \
  --top-k 20 \
  --min-p 0.0 \
  --presence-penalty 0.0 \
  --repeat-penalty 1.0 \
  --reasoning on \
  --chat-template-kwargs '{"preserve_thinking": true}'
```

On first run that saved the ~17GB model to `~/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-GGUF`.

Here's [the transcript](https://gist.github.com/simonw/4d99d730c840df594096366db1d27281) for "Generate an SVG of a pelican riding a bicycle". This is an *outstanding* result for a 16.8GB local model:

![Bicycle has spokes, a chain and a correctly shaped frame. Handlebars are a bit detached. Pelican has wing on the handlebars, weirdly bent legs that touch the pedals and a good bill. Background details are pleasant - semi-transparent clouds, birds, grass, sun.](https://static.simonwillison.net/static/2026/Qwen3.6-27B-GGUF-Q4_K_M.png)

Performance numbers reported by `llama-server`:

- Reading: 20 tokens, 0.4s, 54.32 tokens/s
- Generation: 4,444 tokens, 2min 53s, 25.57 tokens/s

For good measure, here's [Generate an SVG of a NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER](https://gist.github.com/simonw/95735fe5e76e6fdf1753e6dcce360699) (run previously [with GLM-5.1](https://simonwillison.net/2026/Apr/7/glm-51/)):

![Digital illustration in a neon Tron-inspired style of a grey cat-like creature wearing cyan visor goggles riding a glowing cyan futuristic motorcycle through a dark cityscape at night, with its long tail trailing behind, silhouetted buildings with yellow-lit windows in the background, and a glowing magenta moon on the right.](https://static.simonwillison.net/static/2026/qwen3.6-27b-possum.jpg)

That one took 6,575 tokens, 4min 25s, 24.74 t/s.

2026-04-22 16:45:23+00:00

Changes to GitHub Copilot Individual plans
https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/

On the same day as Claude Code's temporary will-they-won't-they $100/month kerfuffle (for the moment, [they won't](https://simonwillison.net/2026/Apr/22/claude-code-confusion/#they-reversed-it)), here's the latest on GitHub Copilot pricing.

Unlike Anthropic, GitHub put up an official announcement about their changes, which include tightening usage limits, pausing signups for individual plans (!), restricting Claude Opus 4.7 to the more expensive $39/month "Pro+" plan, and dropping the previous Opus models entirely. The key paragraph:

> Agentic workflows have fundamentally changed Copilot’s compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support. As Copilot’s agentic capabilities have expanded rapidly, agents are doing more work, and more customers are hitting usage limits designed to maintain service reliability.

It's easy to forget that just six months ago heavy LLM users were burning an order of magnitude fewer tokens. Coding agents consume a *lot* of compute.

Copilot was also unique (I believe) among agents in charging per-request, not per-token. (*Correction: Windsurf also operated a credit system like this which they [abandoned last month](https://windsurf.com/blog/windsurf-pricing-plans)*.) This means that single agentic requests which burn more tokens cut directly into their margins. The most recent pricing scheme addresses that with token-based usage limits on a per-session and weekly basis.

My one problem with this announcement is that it doesn't clearly state *which* product called "GitHub Copilot" is affected by these changes. Last month in [How many products does Microsoft have named 'Copilot'? I mapped every one](https://teybannerman.com/strategy/2026/03/31/how-many-microsoft-copilot-are-there.html) Tey Bannerman identified 75 products that share the Copilot brand, 15 of which have "GitHub Copilot" in the title. Judging by the linked [GitHub Copilot plans page](https://github.com/features/copilot/plans) this covers Copilot CLI, Copilot cloud agent and code review (features on [GitHub.com](https://github.com/) itself), and the Copilot IDE features available in VS Code, Zed, JetBrains and more.

2026-04-22 03:30:02+00:00

scosman/pelicans_riding_bicycles
https://github.com/scosman/pelicans_riding_bicycles

I firmly approve of Steve Cosman's efforts to pollute the training set of pelicans riding bicycles.

![The heading says "Pelican Riding a Bicycle #1" - the image is a bear on a snowboard](https://static.simonwillison.net/static/2026/pelican-poison-bear.jpg)

(To be fair, most of the examples [I've published](https://simonwillison.net/tags/pelican-riding-a-bicycle/) count as poisoning too.)

2026-04-21 15:54:43+00:00

Claude Token Counter, now with model comparisons
https://tools.simonwillison.net/claude-token-counter

I [upgraded](https://github.com/simonw/tools/pull/269) my Claude Token Counter tool to add the ability to run the same count against different models in order to compare them.

As far as I can tell Claude Opus 4.7 is the first model to change the tokenizer, so it's only worth running comparisons between 4.7 and 4.6. The Claude [token counting API](https://platform.claude.com/docs/en/build-with-claude/token-counting) accepts any Claude model ID though so I've included options for all four of the notable current models (Opus 4.7 and 4.6, Sonnet 4.6, and Haiku 4.5).

In the Opus 4.7 announcement [Anthropic said](https://www.anthropic.com/news/claude-opus-4-7#migrating-from-opus-46-to-opus-47):

> Opus 4.7 uses an updated tokenizer that improves how the model processes text. The tradeoff is that the same input can map to more tokens—roughly 1.0–1.35× depending on the content type.

I pasted the [Opus 4.7 system prompt](https://github.com/simonw/research/blob/2cf912666ba08ef0c00a1b51ee07c9a8e64579ef/extract-system-prompts/claude-opus-4-7.md?plain=1) into the token counting tool and found that the Opus 4.7 tokenizer used 1.46x the number of tokens as Opus 4.6.

![Screenshot of a token comparison tool. Models to compare: claude-opus-4-7 (checked), claude-opus-4-6 (checked), claude-opus-4-5, claude-sonnet-4-6, claude-haiku-4-5. Note: "These models share the same tokenizer". Blue "Count Tokens" button. Results table — Model | Tokens | vs. lowest. claude-opus-4-7: 7,335 tokens, 1.46x (yellow badge). claude-opus-4-6: 5,039 tokens, 1.00x (green badge).](https://static.simonwillison.net/static/2026/claude-token-count.jpg)

Opus 4.7 uses the same pricing as Opus 4.6 - $5 per million input tokens and $25 per million output tokens - but this token inflation means we can expect it to be around 40% more expensive.

The token counter tool also accepts images. Opus 4.7 has improved image support, described like this:

> Opus 4.7 has better vision for high-resolution images: it can accept images up to 2,576 pixels on the long edge (~3.75 megapixels), more than three times as many as prior Claude models.

I tried counting tokens for a 3456x2234 pixel 3.7MB PNG and got an even bigger increase in token counts - 3.01x the number of tokens for 4.7 compared to 4.6:

![Same UI, this time with an uploaded screenshot PNG image. claude-opus-4-7: 4,744 tokens, 3.01x (yellow badge). claude-opus-4-6: 1,578 tokens, 1.00x (green badge).](https://static.simonwillison.net/static/2026/claude-token-count-image.jpg)

**Update**: That 3x increase for images is *entirely* due to Opus 4.7 being able to handle higher resolutions. I tried that again with a 682x318 pixel image and it took 314 tokens with Opus 4.7 and 310 with Opus 4.6, so effectively the same cost.

**Update 2**: I tried a 15MB, 30 page text-heavy PDF and Opus 4.7 reported 60,934 tokens while 4.6 reported 56,482 - that's a 1.08x multiplier, significantly lower than the multiplier I got for raw text.

2026-04-20 00:50:45+00:00

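You can run the same comparison yourself with the Anthropic Python SDK's `client.messages.count_tokens()` method. A minimal sketch - the model IDs are the ones shown in the screenshot above, and the filename is a placeholder:

```python
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

# Placeholder filename - paste in whatever text you want to compare
text = open("claude-opus-4-7-system-prompt.md").read()

counts = {}
for model in ("claude-opus-4-7", "claude-opus-4-6"):
    response = client.messages.count_tokens(
        model=model,
        messages=[{"role": "user", "content": text}],
    )
    counts[model] = response.input_tokens

# Report each model's count relative to the cheapest tokenizer
baseline = min(counts.values())
for model, tokens in counts.items():
    print(f"{model}: {tokens:,} tokens ({tokens / baseline:.2f}x)")
```
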
Headless everything for personal AI
https://interconnected.org/home/2026/04/18/headless

Matt Webb thinks **headless** services are about to become much more common:

> Why? Because using personal AIs is a better experience for users than using services directly (honestly); and headless services are quicker and more dependable for the personal AIs than having them click round a GUI with a bot-controlled mouse.

Evidently [Marc Benioff thinks so too](https://twitter.com/benioff/status/2044981547267395620):

> Welcome Salesforce Headless 360: No Browser Required! Our API is the UI. Entire Salesforce & Agentforce & Slack platforms are now exposed as APIs, MCP, & CLI. All AI agents can access data, workflows, and tasks directly in Slack, Voice, or anywhere else with Salesforce Headless.

If this model does take off it's going to play havoc with existing per-head SaaS pricing schemes.

I'm reminded of the early 2010s era when every online service was launching APIs. Brandur Leach reminisces about that time in [The Second Wave of the API-first Economy](https://brandur.org/second-wave-api-first), and predicts that APIs are ready to make a comeback:

> Suddenly, an API is no longer a liability, but a major saleable vector to give users what they want: a way into the services they use and pay for so that an agent can carry out work on their behalf. Especially given a field of relatively undifferentiated products, in the near future the availability of an API might just be the crucial deciding factor that leads to one choice winning the field.

2026-04-19 21:46:38+00:00