Simon Willison’s Weblog

Subscribe

Items tagged openai, claude

Filters: openai × claude × Sorted by date


“The king is dead”—Claude 3 surpasses GPT-4 on Chatbot Arena for the first time. I’m quoted in this piece by Benj Edwards for Ars Technica:

“For the first time, the best available models—Opus for advanced tasks, Haiku for cost and efficiency—are from a vendor that isn’t OpenAI. That’s reassuring—we all benefit from a diversity of top vendors in this space. But GPT-4 is over a year old at this point, and it took that year for anyone else to catch up.” # 27th March 2024, 4:58 pm

Claude and ChatGPT for ad-hoc sidequests

Here is a short, illustrative example of one of the ways in which I use Claude and ChatGPT on a daily basis.

[... 1754 words]

The GPT-4 barrier has finally been broken

Four weeks ago, GPT-4 remained the undisputed champion: consistently at the top of every key benchmark, but more importantly the clear winner in terms of “vibes”. Almost everyone investing serious time exploring LLMs agreed that it was the most capable default model for the majority of tasks—and had been for more than a year.

[... 697 words]

The new Claude 3 model family from Anthropic. Claude 3 is out, and comes in three sizes: Opus (the largest), Sonnet and Haiku.

Claude 3 Opus has self-reported benchmark scores that consistently beat GPT-4. This is a really big deal: in the 12+ months since the GPT-4 release no other model has consistently beat it in this way. It’s exciting to finally see that milestone reached by another research group.

The pricing model here is also really interesting. Prices here are per-million-input-tokens / per-million-output-tokens:

Claude 3 Opus: $15 / $75
Claude 3 Sonnet: $3 / $15
Claude 3 Haiku: $0.25 / $1.25

All three models have a 200,000 length context window and support image input in addition to text.

Compare with today’s OpenAI prices:

GPT-4 Turbo (128K): $10 / $30
GPT-4 8K: $30 / $60
GPT-4 32K: $60 / $120
GPT-3.5 Turbo: $0.50 / $1.50

So Opus pricing is comparable with GPT-4, more than GPT-4 Turbo and significantly cheaper than GPT-4 32K... Sonnet is cheaper than all of the GPT-4 models (including GPT-4 Turbo), and Haiku (which has not yet been released to the Claude API) will be cheaper even than GPT-3.5 Turbo.

It will be interesting to see if OpenAI respond with their own price reductions. # 4th March 2024, 6:34 pm

Catching up on the weird world of LLMs

I gave a talk on Sunday at North Bay Python where I attempted to summarize the last few years of development in the space of LLMs—Large Language Models, the technology behind tools like ChatGPT, Google Bard and Llama 2.

[... 10489 words]

It’s infuriatingly hard to understand how closed models train on their input

One of the most common concerns I see about large language models regards their training data. People are worried that anything they say to ChatGPT could be memorized by it and spat out to other users. People are concerned that anything they store in a private repository on GitHub might be used as training data for future versions of Copilot.

[... 1465 words]

ChatGPT should include inline tips

In OpenAI isn’t doing enough to make ChatGPT’s limitations clear James Vincent argues that OpenAI’s existing warnings about ChatGPT’s confounding ability to convincingly make stuff up are not effective.

[... 1488 words]