Simon Willison’s Weblog

Subscribe

aavetis/PRarena. Albert Avetisian runs this repository on GitHub which uses the Github Search API to track the number of PRs that can be credited to a collection of different coding agents. The repo runs this collect_data.py script every three hours using GitHub Actions to collect the data, then updates the PR Arena site with a visual leaderboard.

The result is this neat chart showing adoption of different agents over time, along with their PR success rate:

Line and bar chart showing PR metrics over time from 05/26 to 10/01. The left y-axis shows "Number of PRs" from 0 to 1,800,000, the right y-axis shows "Success Rate (%)" from 0% to 100%, and the x-axis shows "Time" with dates. Five line plots track success percentages: "Copilot Success % (Ready)" and "Copilot Success % (All)" (both blue, top lines around 90-95%), "Codex Success % (Ready)" and "Codex Success % (All)" (both brown/orange, middle lines declining from 80% to 60%), and "Cursor Success % (Ready)" and "Cursor Success % (All)" (both purple, middle lines around 75-85%), "Devin Success % (Ready)" and "Devin Success % (All)" (both teal/green, lower lines around 65%), and "Codegen Success % (Ready)" and "Codegen Success % (All)" (both brown, declining lines). Stacked bar charts show total and merged PRs for each tool: light blue and dark blue for Copilot, light red and dark red for Codex, light purple and dark purple for Cursor, light green and dark green for Devin, and light orange for Codegen. The bars show increasing volumes over time, with the largest bars appearing at 10/01 reaching approximately 1,700,000 total PRs.

I found this today while trying to pull off the exact same trick myself! I got as far as creating the following table before finding Albert's work and abandoning my own project.

Tool Search term Total PRs Merged PRs % merged Earliest
Claude Code is:pr in:body "Generated with Claude Code" 146,000 123,000 84.2% Feb 21st
GitHub Copilot is:pr author:copilot-swe-agent[bot] 247,000 152,000 61.5% March 7th
Codex Cloud is:pr in:body "chatgpt.com" label:codex 1,900,000 1,600,000 84.2% April 23rd

(Those "earliest" links are a little questionable, I tried to filter out false positives and find the oldest one that appeared to really be from the agent in question.)

It looks like OpenAI's Codex Cloud is massively ahead of the competition right now in terms of numbers of PRs both opened and merged on GitHub.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe