robots.txt Adventure. Interesting notes from crawling 4.6 million robots.txt, including 69 different ways in which the word “disallow” can be mis-spelled.
Recent articles
- Storing times for human events - 27th November 2024
- Ask questions of SQLite databases and CSV/JSON files in your terminal - 25th November 2024
- Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast - 22nd November 2024