Yeah, unfortunately vision prompting has been a tough nut to crack. We've found it's very challenging to improve Claude's actual "vision" through just text prompts, but we can of course improve its reasoning and thought process once it extracts info from an image.
In general, I think vision is still in its early days, although 3.5 Sonnet is noticeably better than older models.
— Alex Albert, Anthropic