10th July 2024 - Link Blog
Anthropic cookbook: multimodal. I'm currently on the lookout for high quality sources of information about vision LLMs, including prompting tricks for getting the most out of them.
This set of Jupyter notebooks from Anthropic (published four months ago to accompany the original Claude 3 models) is the best I've found so far. Best practices for using vision with Claude includes advice on multi-shot prompting with example, plus this interesting think step-by-step style prompt for improving Claude's ability to count the dogs in an image:
You have perfect vision and pay great attention to detail which makes you an expert at counting objects in images. How many dogs are in this picture? Before providing the answer in
<answer>tags, think step by step in<thinking>tags and analyze every part of the image.
Recent articles
- Datasette Agent - 21st May 2026
- Gemini 3.5 Flash: more expensive, but Google plan to use it for everything - 19th May 2026
- The last six months in LLMs in five minutes - 19th May 2026