
Simon Willison’s Weblog

robots.txt Adventure. Interesting notes from crawling 4.6 million robots.txt files, including 69 different ways in which the word “disallow” can be misspelled.


2 comments

  1. Funny, just yesterday I pieced together a little Django app that handles robots.txt requests and can be managed through the admin interface (currently only with the oldforms-admin/trunk). Thanks for the great link!

    http://code.google.com/p/django-robots/

    Jannis Leidel - 22nd September 2007 09:04

  2. Just to let people coming from search engines know: I updated django-robots (0.2) to address the problems Andrew Wooster found in his robots.txt Adventure, e.g. the correct MIME type and HTTP status codes for the resulting robots.txt, Crawl-delay support, Allow and Disallow rules, and automatic support for the Sitemap contrib app.

    Jannis Leidel - 7th October 2007 13:56
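
For readers unfamiliar with the directives mentioned in the second comment, here is a minimal sketch of a robots.txt that uses Crawl-delay, Allow, Disallow and Sitemap, read back with Python's standard-library `urllib.robotparser`. This is not django-robots code. The file contents, the paths and the `example.com` URLs are made up for illustration, and `site_maps()` assumes Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; every path and URL here is an illustrative assumption.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Allow: /public/
Disallow: /private/

Sitemap: http://example.com/sitemap.xml
"""


def parse_robots(text):
    """Parse robots.txt text and return a configured RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = parse_robots(ROBOTS_TXT)

# Rules are matched by path prefix, in the order they appear in the file.
print(parser.can_fetch("*", "http://example.com/public/page"))
print(parser.can_fetch("*", "http://example.com/private/page"))

# Crawl-delay for the matching user agent, and any Sitemap URLs listed.
print(parser.crawl_delay("*"))
print(parser.site_maps())
```

A misspelled directive such as "Dissallow" is silently ignored by this parser, so a typo there leaves the path crawlable rather than raising an error.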
