Simon Willison’s Weblog


My notes on gpt2-chatbot. There's a new, unlabeled and undocumented model on the LMSYS Chatbot Arena today called gpt2-chatbot. It's been giving some impressive responses - you can prompt it directly in the Direct Chat tab by selecting it from the big model dropdown menu.

It looks like a stealth new model preview. It's giving answers that are comparable to GPT-4 Turbo and in some cases better - my own experiments lead me to think it may have more "knowledge" baked into it, as ego prompts ("Who is Simon Willison?") and questions about things like lists of speakers at DjangoCon over the years seem to hallucinate less and return more specific details than before.

The lack of transparency here is both entertaining and infuriating. Lots of people are performing a parallel distributed "vibe check" and sharing results with each other, but it's annoying that even the most basic questions (What even IS this thing? Can it do RAG? What's its context length?) remain unanswered so far.

The system prompt appears to be the following - but system prompts just influence how the model behaves, they aren't guaranteed to contain truthful information:

You are ChatGPT, a large language model trained
by OpenAI, based on the GPT-4 architecture.

Knowledge cutoff: 2023-11
Current date: 2024-04-29

Image input capabilities: Enabled
Personality: v2

My best guess is that this is a preview of some kind of OpenAI "GPT 4.5" release. I don't think it's a big enough jump in quality to be a GPT-5.

Update: LMSYS do document their policy on using anonymized model names for tests of unreleased models.

Update May 7th: The model has been confirmed as belonging to OpenAI thanks to an error message that leaked details of the underlying API platform.