Simon Willison’s Weblog

What happens if AI labs train for pelicans riding bicycles?

13th November 2025

Almost every time I share a new example of an SVG of a pelican riding a bicycle, a variant of this question pops up: how do you know the labs aren’t training for your benchmark?

The strongest argument is that they would get caught. If a model finally comes out that produces an excellent SVG of a pelican riding a bicycle, you can bet I’m going to test it on all manner of creatures riding all sorts of transportation devices. If those are notably worse, it’s going to be pretty obvious what happened.
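As a rough sketch of what that check could look like, here is one way to build a grid of prompts that swap out both the creature and the vehicle. The creature and vehicle lists, the prompt wording, and the helper names (`article`, `build_prompts`) are all assumptions made for illustration. Actually running the prompts against a model and comparing the drawings is left out.

```python
from itertools import product

# Hypothetical lists: the post only names the pelican and the bicycle.
CREATURES = ["pelican", "otter", "walrus", "flamingo"]
VEHICLES = ["bicycle", "skateboard", "scooter", "tricycle"]


def article(noun):
    """Pick "a" or "an" using a simple first-letter rule (enough for these lists)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def build_prompts(creatures, vehicles):
    """Return a dict mapping (creature, vehicle) to a prompt string."""
    prompts = {}
    for creature, vehicle in product(creatures, vehicles):
        # Assumed prompt wording, not taken from the post.
        prompts[(creature, vehicle)] = (
            f"Generate an SVG of {article(creature)} {creature} "
            f"riding {article(vehicle)} {vehicle}"
        )
    return prompts


prompts = build_prompts(CREATURES, VEHICLES)

# Everything except the original benchmark prompt acts as the held-out set.
held_out = {
    pair: text for pair, text in prompts.items() if pair != ("pelican", "bicycle")
}
```

If a model's pelican on a bicycle looks far better than its output for the 15 held-out combinations, that gap is the signal described above.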

A related note here is that, if they are training for my benchmark, that training clearly is not going well! The very best models still produce pelicans on bicycles that look laughably awful. It’s one of the reasons I’ve continued to find the test useful: drawing pelicans is hard! Even getting a bicycle the right shape is a challenge that few models have achieved yet.

My current favorite is still this one from GPT-5. The bicycle has all of the right pieces and the pelican is clearly pedaling it!

Image description: The bicycle is really good, with spokes on the wheels, a correctly shaped frame, and nice pedals. The pelican has a pelican beak and long legs stretching to the pedals.

I should note that OpenAI’s Aidan McLaughlin has specifically denied training for this particular benchmark:

we do not hill climb on svg art

People also ask if they’re training on my published collection. If they are, that would be a big mistake, because a model trained on these examples will produce some very weird-looking pelicans.

Truth be told, I’m playing the long game here. All I’ve ever wanted from life is a genuinely great SVG vector illustration of a pelican riding a bicycle. My dastardly multi-year plan is to trick multiple AI labs into investing vast resources to cheat at my benchmark until I get one.

This is What happens if AI labs train for pelicans riding bicycles? by Simon Willison, posted on 13th November 2025.
