Simon Willison’s Weblog

Gemini 2.5 Computer Use can solve Google’s own CAPTCHAs

7th October 2025

Google just introduced a new Gemini 2.5 Computer Use model, specially designed to help operate a GUI by interacting with visible elements using a virtual mouse and keyboard. I just tried their demo... and watched it solve Google’s own CAPTCHA without me even asking it to.

The official demo is hosted at gemini.browserbase.com, and one of the click-to-try example prompts shown there is the following:

Go to Hacker News and find the most controversial post from today, then read the top 3 comments and summarize the debate.

I activated the demo and Gemini decided to start by navigating to www.google.com in order to search for “hacker news”. But Google served a CAPTCHA challenge, presumably because of a large volume of suspicious traffic from the Browserbase IP range.

The model instantly got to solving that CAPTCHA:

Screenshot of Gemini Browser automation interface. Header shows "B Gemini Browser" with "Deploy" and "Close X" buttons. Left panel contains task: "Go to Hacker News and find the most controversial post from today, then read the top 3 comments and summarize the debate." Step 1: "Open Browser" with "ToolCall open_web_browser". Step 2: "Click" - "I have evaluated step 1. I am presented with a CAPTCHA asking to select squares with motorcycles. I need to click the squares containing motorcycles to proceed. Based on the image, I will click the squares at (350, 300), (350, 400), (470, 300), (470, 400), (590, 300), and (590, 400). Since I cannot click multiple times, I will try to click "SKIP" at (790, 320) as it is hard to select all squares at once. I will then click skip." with "ToolCall click_at". Step 3: "Click" - "I have evaluated step 2. I clicked "SKIP" but it seems I clicked the wrong coordinates as the page is still the same CAPTCHA page, just with a new image. The red dot shows I clicked way". Right panel shows Google reCAPTCHA with "Select all images with a bus Click verify once there are none left." Above 9 image squares, some marked with checkmarks. "About this new reCAPTCHA" text visible with "IP address:" "Time: 20" "URL: http". "VERIFY" button at bottom. Session timer shows "Session time: 0:34 / 5:00" with red "Stop" button.

It went through a few rounds of this, solved all of them and continued on to Google Search, where it ran the search for “hacker news”, navigated to the site and then did an admittedly unimpressive job of solving the original prompt. It looked at just one thread and reported back on what it found there. I was hoping it would consider more than one option to discover the “most controversial post from today”.

The Gemini 2.5 Computer Use Model card (PDF) talks about training the model to “recognize when it is tasked with a high-stakes action” and request user confirmation before proceeding, but doesn’t have anything to say about not solving CAPTCHAs. So I guess this behaviour is the model working as intended!

Something that did impress me, aside from the unprompted CAPTCHA solve against Google’s very own system, was the quality of the mouse usage. I’ve written before about Computer Use models from both Anthropic and OpenAI (OpenAI called their version “Operator”), and by far the biggest challenge for them is accurately clicking the right targets with the mouse.

It would take a formal eval to determine whether Gemini really is the best at this, but given the Gemini models’ previous demonstrations of both bounding boxes and image segmentation masks, it doesn’t surprise me that a Gemini model can do a great job of clicking on the right elements in a screenshot of an operating system or browser.
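
To illustrate why bounding box skills plausibly carry over to clicking, here is a minimal sketch of turning a detected box into a click target. It assumes the box uses the [ymin, xmin, ymax, xmax] layout normalized to a 0-1000 range, which is how Gemini's object detection output is documented. The function name, the example box and the screen size are all made up for illustration, and none of this is taken from the Browserbase demo.

```python
def box_center_to_pixels(box, screen_width, screen_height, scale=1000):
    """Return the pixel (x, y) at the center of a normalized bounding box.

    Assumes box is [ymin, xmin, ymax, xmax] with each value in 0..scale.
    This is an illustrative helper, not part of any Gemini SDK.
    """
    ymin, xmin, ymax, xmax = box
    # Average the two edges to find the center, then rescale to pixels.
    center_x = (xmin + xmax) / 2 / scale * screen_width
    center_y = (ymin + ymax) / 2 / scale * screen_height
    return round(center_x), round(center_y)


# Hypothetical box around a button on a 1440x900 screenshot.
print(box_center_to_pixels([250, 400, 350, 600], 1440, 900))  # (720, 270)
```

A model that can place a tight box around an element only needs this kind of arithmetic to land a click in the middle of it, which is presumably easier than guessing raw pixel coordinates directly.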

This is Gemini 2.5 Computer Use can solve Google’s own CAPTCHAs by Simon Willison, posted on 7th October 2025.
