Simon Willison’s Weblog

Subscribe

Items tagged anthropic, claude in 2023

Filters: Year: 2023 × anthropic × claude × Sorted by date


Long context prompting for Claude 2.1. Claude 2.1 has a 200,000 token context, enough for around 500 pages of text. Convincing it to answer a question based on a single sentence buried deep within that content can be difficult, but Anthropic found that adding “Assistant: Here is the most relevant sentence in the context:” to the end of the prompt was enough to raise Claude 2.1’s score from 27% to 98% on their evaluation. # 6th December 2023, 11:44 pm

Claude: How to use system prompts. Documentation for the new system prompt support added in Claude 2.1. The design surprises me a little: the system prompt is just the text that comes before the first instance of the text “Human: ...”—but Anthropic promise that instructions in that section of the prompt will be treated differently and followed more closely than any instructions that follow.

This whole page of documentation is giving me some pretty serious prompt injection red flags to be honest. Anthropic’s recommended way of using their models is entirely based around concatenating together strings of text using special delimiter phrases.

I’ll give it points for honesty though. OpenAI use JSON to field different parts of the prompt, but under the hood they’re all concatenated together with special tokens into a single token stream. # 22nd November 2023, 4:31 am

Introducing Claude 2.1. Anthropic’s Claude used to have the longest token context of any of the major models: 100,000 tokens, which is about 300 pages. Then GPT-4 Turbo came out with 128,000 tokens and Claude lost one of its key differentiators.

Claude is back! Version 2.1, announced today, bumps the token limit up to 200,000—and also adds support for OpenAI-style system prompts, a feature I’ve been really missing.

They also announced tool use, but that’s only available for a very limited set of partners to preview at the moment. # 22nd November 2023, 4:28 am

Claude was trained on data up until December 2022, but may know some events into early 2023.

How up-to-date is Claude's training data? # 9th October 2023, 1:25 am

How I make annotated presentations

Giving a talk is a lot of work. I go by a rule of thumb I learned from Damian Conway: a minimum of ten hours of preparation for every one hour spent on stage.

[... 2122 words]

Catching up on the weird world of LLMs

I gave a talk on Sunday at North Bay Python where I attempted to summarize the last few years of development in the space of LLMs—Large Language Models, the technology behind tools like ChatGPT, Google Bard and Llama 2.

[... 10489 words]

claude.ai. Anthropic’s new Claude 2 model is available to use online, and it has a 100k token context window and the ability to upload files to it—I tried uploading a text file with 34,000 tokens in it (according to my ttok CLI tool, counting using the GPT-3.5 tokenizer) and it gave me a workable summary. # 12th July 2023, 4:39 pm

It’s infuriatingly hard to understand how closed models train on their input

One of the most common concerns I see about large language models regards their training data. People are worried that anything they say to ChatGPT could be memorized by it and spat out to other users. People are concerned that anything they store in a private repository on GitHub might be used as training data for future versions of Copilot.

[... 1465 words]