If you're a startup running your own crawlers to gather data for whatever purpose, you should try really hard not to make the world a worse place by driving up costs for the sites you are scraping.
There's really no excuse for crawling Wikipedia ("65% of our most expensive traffic comes from bots") when they offer a comprehensive collection of bulk download options.
Do better!
Recent articles
- Wilson Lin on FastRender: a browser built by thousands of parallel agents - 23rd January 2026
- First impressions of Claude Cowork, Anthropic's general agent - 12th January 2026
- My answers to the questions I posed about porting open source code with LLMs - 11th January 2026