Google released a new Gemini 2.5 Computer Use model today, specially designed to help operate a GUI interface by interacting with visible elements using a virtual mouse and keyboard.
I tried the demo hosted by Browserbase at gemini.browserbase.com and was delighted and slightly horrified when it appeared to kick things off by first navigating to Google.com and solving their CAPTCHA in order to run a search!
I wrote a post about it and included this screenshot, but then learned that Browserbase itself has CAPTCHA solving built in and, as shown in this longer video, it was Browserbase that solved the CAPTCHA even while Gemini was thinking about doing so itself.
I deeply regret this error. I've deleted various social media posts about the original entry and linked back to this retraction instead.
Recent articles
- What happens if AI labs train for pelicans riding bicycles? - 13th November 2025
- Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican - 9th November 2025
- Video + notes on upgrading a Datasette plugin for the latest 1.0 alpha, with help from uv and OpenAI Codex CLI - 6th November 2025