Simon Willison’s Weblog


More on screen scraping

4th February 2003

In response to yesterday’s screen scraping post, Richard Jones describes a screen scraping technique that uses PyWebPerf, a Python performance measuring tool.

I forgot to mention it in the article, but Snoopy is a PHP web client library which can retrieve content and emulate a browser interacting with forms. I’ve used it for simple screen scraping before, but it still lacks some of the more impressive functionality that WWW::Mechanize demonstrates.
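For anyone unsure what "emulating a browser interacting with forms" involves, here is a rough sketch in Python using only the standard library. It is not Snoopy's or WWW::Mechanize's API; the names `FormFieldParser` and `build_form_request`, the sample HTML and the example.com URL are all made up for illustration. It reads the first form on a page, keeps its existing field values (including hidden ones), overrides the ones you want to fill in, and builds the POST request a browser would send.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormFieldParser(HTMLParser):
    """Collect the action URL and named <input> values of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Inputs without a name (e.g. a bare submit button) are not submitted.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True  # ignore any later forms on the page


def build_form_request(page_url, html, overrides):
    """Build (but do not send) a POST request for the first form in `html`."""
    parser = FormFieldParser()
    parser.feed(html)
    data = {**parser.fields, **overrides}
    body = urlencode(data).encode("ascii")
    # Resolve a relative action against the URL of the page it came from.
    return Request(urljoin(page_url, parser.action or ""), data=body, method="POST")


# Hypothetical page content, standing in for something fetched earlier.
SAMPLE_HTML = """
<form action="/search" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="q" value="">
  <input type="submit" value="Go">
</form>
"""

request = build_form_request(
    "http://example.com/index.html", SAMPLE_HTML, {"q": "screen scraping"}
)
```

Passing `request` to `urllib.request.urlopen` would actually submit the form. A real client library also has to deal with cookies, redirects, selects, checkboxes and textareas, all of which this sketch ignores.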

