Simon Willison’s Weblog

3 posts tagged “rodney”

Rodney is my browser automation CLI tool, designed for use by coding agents and via Showboat.

2026

Rodney v0.4.0. My Rodney CLI tool for browser automation has attracted quite the flurry of PRs since I announced it last week. Here are the release notes for the just-released v0.4.0:

  • Errors now use exit code 2, which means exit code 1 is just for check failures. #15
  • New rodney assert command for running JavaScript tests, exit code 1 if they fail. #19
  • New directory-scoped sessions with --local/--global flags. #14
  • New reload --hard and clear-cache commands. #17
  • New rodney start --show option to make the browser window visible. Thanks, Antonio Cuni. #13
  • New rodney connect PORT command to debug an already-running Chrome instance. Thanks, Peter Fraenkel. #12
  • New RODNEY_HOME environment variable to support custom state directories. Thanks, Senko Rašić. #11
  • New --insecure flag to ignore certificate errors. Thanks, Jakub Zgoliński. #10
  • Windows support: avoid Setsid on Windows via build-tag helpers. Thanks, adm1neca. #18
  • Tests now run on windows-latest and macos-latest in addition to Linux.

I've been using Showboat to create demos of new features - here those are for rodney assert, rodney reload --hard, rodney exit codes, and rodney start --local.

The rodney assert command is pretty neat: you can now use Rodney to test a web app through multiple steps in a shell script that looks something like this (adapted from the README):

#!/bin/bash
set -euo pipefail

FAIL=0

check() {
    if ! "$@"; then
        echo "FAIL: $*"
        FAIL=1
    fi
}

rodney start
rodney open "https://example.com"
rodney waitstable

# Assert elements exist
check rodney exists "h1"

# Assert key elements are visible
check rodney visible "h1"
check rodney visible "#main-content"

# Assert JS expressions
check rodney assert 'document.title' 'Example Domain'
check rodney assert 'document.querySelectorAll("p").length' '2'

# Assert accessibility requirements
check rodney ax-find --role navigation

rodney stop

if [ "$FAIL" -ne 0 ]; then
    echo "Some checks failed"
    exit 1
fi
echo "All checks passed"

# 17th February 2026, 11:02 pm / browsers, projects, testing, annotated-release-notes, rodney

I'm a very heavy user of Claude Code on the web, Anthropic's excellent but poorly named cloud version of Claude Code where everything runs in a container environment managed by them, greatly reducing the risk of anything bad happening to a computer I care about.

I don't use the web interface at all (hence my dislike of the name) - I access it exclusively through their native iPhone and Mac desktop apps.

Something I particularly appreciate about the desktop app is that it lets you see images that Claude is "viewing" via its Read /path/to/image tool. Here's what that looks like:

Screenshot of a Claude Code session in Claude Desktop. Claude says: The debug page looks good - all items listed with titles and descriptions. Now let me check the nav menu - Analyzed menu image file - Bash uvx rodney open "http://localhost:8765/" 2>&1 && uvx rodney click "details.nav-menu summary" 2>&1 && sleep 0.5 && uvx rodney screenshot /tmp/menu.png 2>&1 Output reads: Datasette: test, Clicked, /tmp/menu.png - then it says Read /tmp/menu.png and reveals a screenshot of the Datasette interface with the nav menu open, showing only "Debug" and "Log out" options. Claude continues: The menu now has just "Debug" and "Log out" — much cleaner. Both pages look good. Let me clean up the server and run the remaining tests.

This means you can get a visual preview of what it's working on while it's working, without waiting for it to push code to GitHub for you to try out yourself later on.

The prompt I used to trigger the above screenshot was:

Run "uvx rodney --help" and then use Rodney to manually test the new pages and menu - look at screenshots from it and check you think they look OK

I designed Rodney to have --help output that provides everything a coding agent needs to know in order to use the tool.

The Claude iPhone app doesn't display opened images yet, so I requested it as a feature just now in a thread on Twitter.

# 16th February 2026, 4:38 pm / projects, ai, generative-ai, llms, ai-assisted-programming, anthropic, claude, coding-agents, claude-code, async-coding-agents, rodney

Introducing Showboat and Rodney, so agents can demo what they’ve built

A key challenge working with coding agents is having them both test what they’ve built and demonstrate that software to you, their supervisor. This goes beyond automated tests—we need artifacts that show their progress and help us see exactly what the agent-produced software is able to do. I’ve just released two new tools aimed at this problem: Showboat and Rodney.

[... 2,023 words]