
Simon Willison’s Weblog

2 items tagged “robotstxt”

The X-Robots-Tag HTTP header. News to me, but both Google and Yahoo! have supported it since last year. You can add per-page robots exclusion rules in HTTP headers instead of using meta tags, and Google’s version supports unavailable_after, which is handy for content with a known limited shelf life. 0 9th June 2008, 9:21 am
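To make the header idea concrete, here is a minimal Python sketch of building an X-Robots-Tag value. The helper name x_robots_tag and its parameters are my own invention for illustration, and the date layout simply imitates the day/month/year style seen in Google's examples; check the current documentation for which date formats are actually accepted.

```python
from datetime import datetime, timezone


def x_robots_tag(directives=(), unavailable_after=None):
    """Build an X-Robots-Tag header value (hypothetical helper, not a library API).

    directives: iterable of plain directives such as "noindex" or "noarchive".
    unavailable_after: optional timezone-aware datetime after which the page
    should drop out of search results.
    """
    parts = list(directives)
    if unavailable_after is not None:
        if unavailable_after.tzinfo is None:
            raise ValueError("unavailable_after must be timezone-aware")
        # Normalise to UTC and format without a weekday, so the date itself
        # contains no comma (commas separate directives in the header value).
        # Assumption: %b yields English month names (the default C locale).
        utc = unavailable_after.astimezone(timezone.utc)
        parts.append("unavailable_after: " + utc.strftime("%d %b %Y %H:%M:%S GMT"))
    return ", ".join(parts)


# Example: a page that should not be cached and expires at the end of 2008.
expiry = datetime(2008, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
header = ("X-Robots-Tag", x_robots_tag(["noarchive"], unavailable_after=expiry))
```

The resulting tuple can be attached to a response in whatever way your framework sets headers; in Django, for instance, assigning response["X-Robots-Tag"] = header[1] would do it.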

robots.txt Adventure. Interesting notes from crawling 4.6 million robots.txt files, including 69 different ways in which the word “disallow” can be misspelled. 2 22nd September 2007, 12:36 am
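As a rough illustration of how such misspellings might be spotted (this is not the method used in the linked article, just a sketch), Python's difflib can flag field names that are close to, but not exactly, a known robots.txt field. The KNOWN_FIELDS list, the function name, and the 0.8 similarity cutoff are all assumptions chosen for the example.

```python
import difflib

# Assumed set of commonly seen robots.txt field names; not exhaustive.
KNOWN_FIELDS = ["user-agent", "disallow", "allow", "sitemap", "crawl-delay"]


def find_misspelled_fields(robots_txt, cutoff=0.8):
    """Return (line_number, field_as_written, likely_intended_field) tuples."""
    suspects = []
    for lineno, line in enumerate(robots_txt.splitlines(), start=1):
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field in KNOWN_FIELDS:
            continue  # spelled correctly, nothing to report
        # Find the single closest known field, if any is similar enough.
        match = difflib.get_close_matches(field, KNOWN_FIELDS, n=1, cutoff=cutoff)
        if match:
            suspects.append((lineno, field, match[0]))
    return suspects


sample = "User-agent: *\nDissallow: /private/\nDisallow: /tmp/\n"
suspects = find_misspelled_fields(sample)
```

A crawler that wanted to be forgiving could treat each suspect as its likely intended field, though whether to honour a misspelled rule is a policy decision rather than something the file format settles.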
