Simon Willison’s Weblog


GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52

17th March 2026

OpenAI today: Introducing GPT‑5.4 mini and nano. These models join GPT-5.4 which was released two weeks ago.

OpenAI’s self-reported benchmarks show the new 5.4-nano out-performing their previous GPT-5 mini model when run at maximum reasoning effort. The new mini is also 2x faster than the previous mini.

Here’s how the pricing looks—gpt-5.4-nano is notably even cheaper than Google’s Gemini 3.1 Flash-Lite:

Prices in USD per million tokens:

Model                    Input    Cached input    Output
gpt-5.4                  $2.50    $0.25           $15.00
gpt-5.4-mini             $0.75    $0.075          $4.50
gpt-5.4-nano             $0.20    $0.02           $1.25

Other models for comparison
Claude Opus 4.6          $5.00    -               $25.00
Claude Sonnet 4.6        $3.00    -               $15.00
Gemini 3.1 Pro           $2.00    -               $12.00
Claude Haiku 4.5         $1.00    -               $5.00
Gemini 3.1 Flash-Lite    $0.25    -               $1.50

I used GPT-5.4 nano to generate a description of this photo I took at the John M. Mossman Lock Collection:

[Photo: description below]

llm -m gpt-5.4-nano -a IMG_2324.jpeg 'describe image'

Here’s the output:

The image shows the interior of a museum gallery with a long display wall. White-painted brick walls are covered with many framed portraits arranged in neat rows. Below the portraits, there are multiple glass display cases with dark wooden frames and glass tops/fronts, containing various old historical objects and equipment. The room has a polished wooden floor, hanging ceiling light fixtures/cords, and a few visible pipes near the top of the wall. In the foreground, glass cases run along the length of the room, reflecting items from other sections of the gallery.

That took 2,751 input tokens and 112 output tokens, at a cost of 0.069 cents (less than a tenth of a cent). That means describing every single photo in my 76,000 photo collection would cost around $52.44.
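The arithmetic behind that estimate can be sketched in a few lines of Python. This is a rough check rather than an official calculator: the prices are the per-million-token figures from the table above, the token counts are the ones from this single photo, and the function names are made up for illustration. It also assumes every photo uses about the same number of tokens, and that the larger models would count the same tokens for the same image, neither of which is guaranteed.

```python
# Prices in USD per million tokens, copied from the pricing table in this post.
PRICES = {
    "gpt-5.4": {"input": 2.50, "output": 15.00},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

# Token counts reported for the one example photo (assumed typical).
INPUT_TOKENS = 2_751
OUTPUT_TOKENS = 112
PHOTO_COUNT = 76_000


def cost_per_photo(model, input_tokens=INPUT_TOKENS, output_tokens=OUTPUT_TOKENS):
    """Return the USD cost of one description at the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def collection_cost(model, photos=PHOTO_COUNT):
    """Return the USD cost of describing `photos` images with `model`."""
    return cost_per_photo(model) * photos


if __name__ == "__main__":
    for model in PRICES:
        cents = cost_per_photo(model) * 100
        print(f"{model}: {cents:.3f} cents per photo, "
              f"${collection_cost(model):,.2f} for {PHOTO_COUNT:,} photos")
```

For gpt-5.4-nano the unrounded per-photo cost works out to about 0.06902 cents, which gives roughly $52.46 for the whole collection. The $52.44 figure comes from multiplying the rounded 0.069 cents by 76,000.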

I released llm 0.29 with support for the new models.

Then I had OpenAI Codex loop through all five reasoning effort levels and all three models and produce this combined SVG grid of pelicans riding bicycles (generation transcripts here). I do like the gpt-5.4 xhigh one the best, it has a good bicycle (with nice spokes) and the pelican has a fish in its beak!

Described by Claude Opus 4.6: A 5x3 comparison grid of AI-generated cartoon illustrations of a pelican riding a bicycle. Columns are labeled "gpt-5.4-nano", "gpt-5.4-mini", and "gpt-5.4" across the top, and rows are labeled "none", "low", "medium", "high", and "xhigh" down the left side, representing quality/detail settings. In the "none" row, gpt-5.4-nano shows a chaotic white bird with misplaced arrows and tangled wheels on grass, gpt-5.4-mini shows a duck-like brown bird awkwardly straddling a motorcycle-like bike, and gpt-5.4 shows a stiff gray-and-white pelican sitting atop a blue tandem bicycle with extra legs. In the "low" row, nano shows a chubby round white bird pedaling with small feet on grass, mini shows a cleaner white bird riding a blue bicycle with motion lines, and gpt-5.4 shows a pelican with a blue cap riding confidently but with slightly awkward proportions. In the "medium" row, nano regresses to a strange bird standing over bowling balls on ice, mini shows two plump white birds merged onto one yellow-wheeled bicycle, and gpt-5.4 shows a more recognizable gray-and-white pelican on a red bicycle but with tangled extra legs. In the "high" row, nano shows multiple small pelicans crowded around a broken green bicycle on grass with a sun overhead, mini shows a tandem bicycle with two white pelicans and clear blue sky, and gpt-5.4 shows two pelicans stacked on a red tandem bike with the most realistic proportions yet. In the "xhigh" row, nano shows the most detailed scene with a pelican on a detailed bicycle with grass and a large sun but still somewhat jumbled anatomy, mini produces the cleanest single pelican on a yellow-accented bicycle with a light blue sky, and gpt-5.4 shows a well-rendered gray pelican on a teal bicycle with the best overall coherence. 
Generally, quality improves moving right across models and down through quality tiers, though "medium" is inconsistently worse than "low" for some models, and all images maintain a lighthearted cartoon style with pastel skies and simple backgrounds.
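For anyone wanting to reproduce a grid like this, the fifteen cells are just every pairing of the five reasoning effort levels with the three models. The sketch below only enumerates those pairings in the same row order as the grid. It is not the script Codex wrote, and `grid_cells` is a hypothetical helper name. Actually calling each model is left out, since that depends on which client you use.

```python
from itertools import product

# Column labels (models) and row labels (reasoning effort levels) from the grid.
MODELS = ["gpt-5.4-nano", "gpt-5.4-mini", "gpt-5.4"]
EFFORTS = ["none", "low", "medium", "high", "xhigh"]


def grid_cells():
    """Return (effort, model) pairs row by row: efforts down, models across."""
    return list(product(EFFORTS, MODELS))


if __name__ == "__main__":
    for effort, model in grid_cells():
        print(f"{model} at reasoning effort {effort}")
```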

This is GPT-5.4 mini and GPT-5.4 nano, which can describe 76,000 photos for $52 by Simon Willison, posted on 17th March 2026.
