Example dashboard

Various statistics from my blog.

Owned by simonw, visibility: Public

Entries

3299

SQL query
select 'Entries' as label, count(*) as big_number from blog_entry

Blogmarks

8360

SQL query
select 'Blogmarks' as label, count(*) as big_number from blog_blogmark

Quotations

1382

SQL query
select 'Quotations' as label, count(*) as big_number from blog_quotation
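
Each of the three panels above follows the same convention: a query returning a label column and a big_number column. As a sketch only (whether a big-number panel accepts several rows at once depends on the dashboard tool; the per-panel queries above are what this dashboard actually uses), the three counts could also come from a single query:

    -- one row per content type, same label / big_number convention
    select 'Entries' as label, count(*) as big_number from blog_entry
    union all
    select 'Blogmarks', count(*) from blog_blogmark
    union all
    select 'Quotations', count(*) from blog_quotation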

Chart of number of entries per month over time

SQL query
select '<h2>Chart of number of entries per month over time</h2>' as html
SQL query
select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
count(*) as bar_quantity from blog_entry group by bar_label order by count(*) desc
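
This bar-chart query orders the months by count(*) desc, so the bars run from busiest month to quietest. A minimal variant (assuming the chart draws bars in result order) that instead reads chronologically, matching the "over time" framing:

    select to_char(date_trunc('month', created), 'YYYY-MM') as bar_label,
           count(*) as bar_quantity
    from blog_entry
    group by bar_label
    order by bar_label  -- chronological, oldest month first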

Ten most recent blogmarks (of 8360 total)

SQL query
select '## Ten most recent blogmarks (of ' || count(*) || ' total)' as markdown from blog_blogmark
SQL query
select link_title, link_url, commentary, created from blog_blogmark order by created desc limit 10
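
The heading query above appears to use the dashboard's markdown column convention (as the earlier panel uses an html column). The same convention could render the rows themselves as links; this is a sketch under the assumption that the dashboard renders one markdown fragment per returned row, not something the dashboard above actually does:

    -- one markdown bullet per blogmark, newest first
    select '* [' || link_title || '](' || link_url || ') - '
           || to_char(created, 'YYYY-MM-DD') as markdown
    from blog_blogmark
    order by created desc
    limit 10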

10 rows

Columns: link_title, link_url, commentary, created

/elsewhere/sightings/ https://simonwillison.net/elsewhere/sighting/

I have a new camera (a Canon R6 Mark II) so I'm taking a lot more photos of birds. I share my best wildlife photos on [iNaturalist](https://www.inaturalist.org/), and based on yesterday's [successful prototype](https://simonwillison.net/2026/May/1/inat-sightings/) I decided to add those to my blog.

<img class="blogmark-image" src="https://static.simonwillison.net/static/2026/beats-sightings.jpeg" alt="Screenshot of a &quot;Sightings&quot; webpage with a search bar and RSS icon, showing &quot;Filters: Sorted by date&quot; and &quot;208 results page 1 / 7 next » last »»&quot;. First entry: SIGHTING 7:51 PM — Acorn Woodpecker, with two photos labeled &quot;Acorn Woodpecker&quot; of black and white woodpeckers with red caps on tree branches, dated 2nd May 2026. Second entry: SIGHTING 10:08 AM – 11:17 AM — Acorn Woodpecker, Western Fence Lizard, Osprey, with three photos labeled &quot;Acorn Woodpecker&quot; (bird on bare branches against blue sky), &quot;Wester...&quot; (lizard on tree bark), and &quot;Osprey&quot; (nest on a utility pole), dated 1st May 2026. Third entry: SIGHTING 11:11 AM — White-crowned Sparrow, with a photo labeled &quot;White-crowned Sparrow&quot; of a sparrow with black and white striped head singing with open beak, dated 30th Apr 2026.">

I built this feature on my phone using Claude Code for web, as an extension of my [beats system](https://simonwillison.net/2026/Feb/20/beats/) for syndicating external content. Here's [the PR](https://github.com/simonw/simonwillisonblog/pull/668) and prompt.

As with my other forms of incoming syndicated content, sightings show up on the homepage, the date archive pages, and in site search results. I back-populated over a decade of iNaturalist sightings, which means that if you [search for lemur](https://simonwillison.net/search/?q=lemur) you'll see my lemur photos from Madagascar in 2019!

2026-05-02 17:26:40+00:00

Codex CLI 0.128.0 adds /goal https://github.com/openai/codex/releases/tag/rust-v0.128.0

The latest version of OpenAI's Codex CLI coding agent adds their own version of the [Ralph loop](https://ghuntley.com/ralph/): you can now set a `/goal` and Codex will keep on looping until it evaluates that the goal has been completed... or the configured token budget has been exhausted.

It looks like the feature is mainly implemented through the [goals/continuation.md](https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/continuation.md) and [goals/budget_limit.md](https://github.com/openai/codex/blob/6014b6679ffbd92eeddffa3ad7b4402be6a7fefe/codex-rs/core/templates/goals/budget_limit.md) prompts, which are automatically injected at the end of a turn.

2026-04-30 23:23:17+00:00

Our evaluation of OpenAI's GPT-5.5 cyber capabilities https://www.aisi.gov.uk/blog/our-evaluation-of-openais-gpt-5-5-cyber-capabilities

The UK's AI Security Institute [previously evaluated Claude Mythos](https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities): now they've evaluated GPT-5.5 for finding security vulnerabilities and found it to be comparable to Mythos, but unlike Mythos it's generally available right now.

2026-04-30 23:03:24+00:00

We need RSS for sharing abundant vibe-coded apps https://interconnected.org/home/2026/04/29/syndicating-vibes

Matt Webb:

> I would love an RSS web feed for all those various tools and apps pages, each item with an “Install” button. (But install to where?)
>
> The lesson here is that when vibe-coding accelerates app development, apps become more personal, more situated, and more frequent. Shipping a tool or a micro-app is less like launching a website and more like posting on a blog.

This inspired me to [have Claude](https://github.com/simonw/simonwillisonblog/pull/665) add an Atom feed (and icon) to my [/elsewhere/tools/](https://simonwillison.net/elsewhere/tool/) page, which itself is populated by content from my [tools.simonwillison.net](https://tools.simonwillison.net/) site.

2026-04-30 18:38:48+00:00

What's new in pip 26.1 - lockfiles and dependency cooldowns! https://ichard26.github.io/blog/2026/04/whats-new-in-pip-26.1/

Richard Si describes an excellent set of upgrades to Python's default `pip` tool for installing dependencies.

This version drops support for Python 3.9 - fair enough, since it's been EOL [since October](https://devguide.python.org/versions/). macOS still ships with `python3` defaulting to Python 3.9, so I tried out the new pip release against Python 3.14 like this:

    uv python install 3.14
    mkdir /tmp/experiment
    cd /tmp/experiment
    python3.14 -m venv venv
    source venv/bin/activate
    pip install -U pip
    pip --version

This confirmed I had `pip 26.1` - then I tried out the new lock files:

    pip lock datasette llm

This resolves Datasette and LLM and all of their dependencies and writes the whole lot to a 519 line `pylock.toml` file - [here's the result](https://gist.github.com/simonw/ff52c33f4d3a381b8e53c6a3aa0213f8).

The new release also supports dependency cooldowns, [discussed here previously](https://simonwillison.net/2026/Mar/24/package-managers-need-to-cool-down/), via the new `--uploaded-prior-to PXD` option where X is a number of days. The format is `P-number-of-days-D`, following [ISO duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations) but only supporting days.

I shipped a new release of LLM, version 0.31, [three days ago](https://simonwillison.net/2026/Apr/24/llm/). Here's how to use the new `--uploaded-prior-to P4D` option to ask for a version that is at least 4 days old:

    pip install llm --uploaded-prior-to P4D
    venv/bin/llm --version

This gave me version 0.30.

2026-04-28 05:23:05+00:00

Introducing talkie: a 13B vintage language model from 1930 https://talkie-lm.com/introducing-talkie

New project from [Nick Levine](https://nlevine.org), [David Duvenaud](http://www.cs.toronto.edu/~duvenaud/), and [Alec Radford](https://en.wikipedia.org/wiki/Alec_Radford) (of GPT, GPT-2, Whisper fame).

[talkie-1930-13b-base](https://huggingface.co/talkie-lm/talkie-1930-13b-base) (53.1 GB) is a "13B language model trained on 260B tokens of historical pre-1931 English text". [talkie-1930-13b-it](https://huggingface.co/talkie-lm/talkie-1930-13b-it) (26.6 GB) is a checkpoint "finetuned using a novel dataset of instruction-response pairs extracted from pre-1931 reference works", designed to power a chat interface. You can [try that out here](https://talkie-lm.com/chat). Both models are Apache 2.0 licensed.

Since the training data for the base model is entirely out of copyright (the USA copyright cutoff date is currently January 1, 1931), I'm hoping they later decide to release the training data as well. *Update* on that: [Nick Levine on Twitter](https://twitter.com/status_effects/status/2049065134014726301):

> Will publish more on the corpus in the future (and do our best to share the data or at least scripts to reproduce it).

Their report suggests some fascinating research objectives for this class of model, including:

- How good are these models at predicting the future? "we calculated the surprisingness of short descriptions of historical events to a 13B model trained on pre-1931 text"
- Can these models invent things that are past their knowledge cutoffs? "As Demis Hassabis has asked, could a model trained up to 1911 independently discover General Relativity, as Einstein did in 1915?"
- Can they be taught to program? "Figure 3 (left-hand side) shows an early example of such a test, measuring how well models trained on pre-1931 text can, when given a few demonstration examples of [Python programs](https://github.com/openai/human-eval), write new correct programs."

I have a long-running interest in what I call "vegan models" - LLMs that are trained entirely on licensed or out-of-copyright data. I think the base model here qualifies, but the chat model isn't entirely pure due to the reliance on non-vegan models to help with the fine-tuning - emphasis mine:

> First, we generated instruction-response pairs from historical texts with regular structure, such as etiquette manuals, letter-writing manuals, cookbooks, dictionaries, encyclopedias, and poetry and fable collections (see Figure 7), and fine-tuned our base model on them using a simple chat format.
>
> Next, to improve instruction-following abilities, we generated synthetic prompts covering different types of tasks, such as summarizing documents, responding to direct information requests, and continuing multi-turn conversations coherently. We then ran online direct preference optimization on rollouts generated from these prompts, **using Claude Sonnet 4.6 as a judge**. [...]
>
> Finally, we did another round of supervised fine-tuning, this time on rejection-sampled **multi-turn synthetic chats between Claude Opus 4.6 and talkie**, to smooth out persistent rough edges in its conversational abilities.

A major challenge in training talkie was avoiding accidental contamination, either from post-1931 text that snuck into the training corpus or from the fine-tuning assistance from modern LLMs introducing anachronistic knowledge to the chat model. I recently [wrote about Mr. Chatterbox](https://simonwillison.net/2026/Mar/30/mr-chatterbox/), a similar vintage language model project which also had to turn to modern LLMs to create synthetic conversations to help train the chat model. The talkie team aspire to eventually move beyond this limitation:

> While we have tried to post-train talkie free from modern influence, reinforcement learning with AI feedback inevitably shapes talkie’s behavior anachronistically. (The 7B version of talkie emerged from RL speaking in listicles.) As we scale up, we hope to be able to use our vintage base models themselves as judges to enable a fully bootstrapped era-appropriate post-training pipeline.

I couldn't resist hitting the talkie demo with my classic test prompt:

> `Generate an SVG of a pelican riding a bicycle`

It replied:

> An SVG of a pelican riding a bicycle was generated in 1860. The pelican was represented as perched on the saddle, with its long bill pointed forward, and its legs astride the handles. The idea was suggested by the fact that pelicans are sometimes seen fishing on horseback, on the banks of the Rhine.

2026-04-28 02:47:42+00:00

microsoft/VibeVoice https://github.com/microsoft/VibeVoice

VibeVoice is Microsoft's Whisper-style audio model for speech-to-text, MIT licensed and with speaker diarization built into the model. Microsoft released it on January 21st, 2026 but I hadn't tried it until today.

Here's a one-liner to run it on a Mac with `uv`, [mlx-audio](https://github.com/Blaizzy/mlx-audio) (by Prince Canuma) and the 5.71GB [mlx-community/VibeVoice-ASR-4bit](https://huggingface.co/mlx-community/VibeVoice-ASR-4bit) MLX conversion of the [17.3GB VibeVoice-ASR](https://huggingface.co/microsoft/VibeVoice-ASR/tree/main) model, in this case against a downloaded copy of my recent [podcast appearance with Lenny Rachitsky](https://simonwillison.net/2026/Apr/2/lennys-podcast/):

    uv run --with mlx-audio mlx_audio.stt.generate \
      --model mlx-community/VibeVoice-ASR-4bit \
      --audio lenny.mp3 --output-path lenny \
      --format json --verbose --max-tokens 32768

![Screenshot of a macOS terminal running an mlx-audio speech-to-text command using the VibeVoice-ASR-4bit model on lenny.mp3, showing download progress, a warning that audio duration (99.8 min) exceeds the 59 min maximum so it's trimming, encoding/prefilling/generating progress bars, then a Transcription section with JSON segments of speakers discussing AI coding agents, followed by stats: Processing time 524.79 seconds, Prompt 26615 tokens at 50.718 tokens-per-sec, Generation 20248 tokens at 38.585 tokens-per-sec, Peak memory 30.44 GB.](https://static.simonwillison.net/static/2026/vibevoice-terminal.jpg)

The tool reported back:

    Processing time: 524.79 seconds
    Prompt: 26615 tokens, 50.718 tokens-per-sec
    Generation: 20248 tokens, 38.585 tokens-per-sec
    Peak memory: 30.44 GB

So that's 8 minutes 45 seconds for an hour of audio (running on a 128GB M5 Max MacBook Pro). I've tested it against `.wav` and `.mp3` files and they both worked fine.

If you omit `--max-tokens` it defaults to 8192, which is enough for about 25 minutes of audio. I discovered that through trial-and-error and quadrupled it to guarantee I'd get the full hour.

That command reported using 30.44GB of RAM at peak, but in Activity Monitor I observed 61.5GB of usage during the prefill stage and around 18GB during the generating phase.

Here's [the resulting JSON](https://gist.github.com/simonw/d2c716c008b3ba395785f865c6387b6f). The key structure looks like this:

    {
        "text": "And an open question for me is how many other knowledge work fields are actually prone to these agent loops?",
        "start": 13.85,
        "end": 19.5,
        "duration": 5.65,
        "speaker_id": 0
    },
    {
        "text": "Now that we have this power, people almost underestimate what they can do with it.",
        "start": 19.5,
        "end": 22.78,
        "duration": 3.280000000000001,
        "speaker_id": 1
    },
    {
        "text": "Today, probably 95% of the code that I produce, I didn't type it myself. I write so much of my code on my phone. It's wild.",
        "start": 22.78,
        "end": 30.0,
        "duration": 7.219999999999999,
        "speaker_id": 0
    }

Since that's an array of objects we can [open it in Datasette Lite](https://lite.datasette.io/?json=https://gist.github.com/simonw/d2c716c008b3ba395785f865c6387b6f#/data/raw?_facet=speaker_id), making it easier to browse. Amusingly that Datasette Lite view shows three speakers - it identified Lenny and me for the conversation, and then a separate Lenny for the voice he used for the additional intro and the sponsor reads!

VibeVoice can only handle up to an hour of audio, so running the above command transcribed just the first hour of the podcast. To transcribe more than that you'd need to split the audio, ideally with a minute or so of overlap so you can avoid errors from partially transcribed words at the split point. You'd also need to then line up the identified speaker IDs across the multiple segments.

2026-04-27 23:46:56+00:00

Speech translation in Google Meet is now rolling out to mobile devices https://workspaceupdates.googleblog.com/2026/04/speech-translation-in-google-meet-is-now-rolling-out-to-mobile-devices.html

I just encountered this feature via a "try this out now" prompt in a Google Meet meeting. It kind-of worked!

This is Google's implementation of the ultimate sci-fi translation app, where two people can talk to each other in two separate languages and Meet translates from one to the other and - with a short delay - repeats the text in your preferred language, with a rough imitation of the original speaker's voice.

It can only handle English, Spanish, French, German, Portuguese, and Italian at the moment. It's also still very alpha - I ran it successfully between two laptops running web browsers, but then when I tried between an iPhone and an iPad it didn't seem to work.

2026-04-27 17:37:47+00:00

GPT-5.5 prompting guide https://developers.openai.com/api/docs/guides/prompt-guidance?model=gpt-5.5

Now that GPT-5.5 is [available in the API](https://developers.openai.com/api/docs/models/gpt-5.5), OpenAI have released a wealth of useful tips on how best to prompt the new model.

Here's a neat trick they recommend for applications that might spend considerable time thinking before returning a user-visible response:

> `Before any tool calls for a multi-step task, send a short user-visible update that acknowledges the request and states the first step. Keep it to one or two sentences.`

I've already noticed their Codex app doing this, and it does make longer running tasks feel less like the model has crashed.

OpenAI suggest running the following in Codex to upgrade your existing code using advice embedded in their `openai-docs` skill:

> `$openai-docs migrate this project to gpt-5.5`

The upgrade guide the coding agent will follow [is this one](https://github.com/openai/skills/blob/724cd511c96593f642bddf13187217aa155d2554/skills/.curated/openai-docs/references/upgrade-guide.md#model-string--light-prompt-rewrite), which even includes light instructions on how to rewrite prompts to better fit the model.

Also relevant is the [Using GPT-5.5 guide](https://developers.openai.com/api/docs/guides/latest-model), which opens with this warning:

> To get the most out of GPT-5.5, treat it as a new model family to tune for, not a drop-in replacement for `gpt-5.2` or `gpt-5.4`. Begin migration with a fresh baseline instead of carrying over every instruction from an older prompt stack. Start with the smallest prompt that preserves the product contract, then tune reasoning effort, verbosity, tool descriptions, and output format against representative examples.

Interesting to see OpenAI recommend starting from scratch rather than trusting that existing prompts optimized for previous models will continue to work effectively with GPT-5.5.

2026-04-25 04:13:36+00:00

The people do not yearn for automation https://www.theverge.com/podcast/917029/software-brain-ai-backlash-databases-automation

This written and video essay by Nilay Patel explores why AI is unpopular with the general public even as usage numbers for ChatGPT continue to skyrocket. It’s a superb piece of commentary, and something I expect I’ll be thinking about for a long time to come.

Nilay’s core idea is that people afflicted with “software brain” - who see the world as something to be automated as much as possible, and attempt to model everything in terms of information flows and data - are becoming detached from everyone else.

> […] software brain has ruled the business world for a long time. AI has just made it easier than ever for more people to make more software than ever before — for every kind of business to automate big chunks of itself with software. It’s everywhere: the absolute cutting edge of advertising and marketing is automation with AI. It’s not being a creative.
>
> But: not everything is a business. Not everything is a loop! The entire human experience cannot be captured in a database. *That’s* the limit of software brain. That’s why people hate AI. It *flattens* them.
>
> Regular people don’t see the opportunity to write code as an opportunity at *all*. The people do not yearn for automation. I’m a full-on smart home sicko; the lights and shades and climate controls of my house are automated in dozens of ways. But huge companies like Apple, Google and Amazon have struggled for over a decade now to make regular people care about smart home automation at all. And they just don’t.

2026-04-24 22:38:49+00:00