Simon Willison’s Weblog

Subscribe

TIL: Rate limiting by IP using Cloudflare's rate limiting rules. My blog started timing out on some requests a few days ago, and it turned out there were misbehaving crawlers that were spidering my /search/ page even though it's restricted by robots.txt.

I run this site behind Cloudflare and it turns out Cloudflare's WAF (Web Application Firewall) has a rate limiting tool that I could use to restrict requests to /search/* by a specific IP to a maximum of 5 every 10 seconds.

Monthly briefing

Sponsor me for $10/month and get a curated email digest of the month's most important LLM developments.

Pay me to send you less!

Sponsor & subscribe