Simon Willison’s Weblog

More on screen scraping

In response to yesterday’s screen scraping post, Richard Jones describes a screen scraping technique that uses PyWebPerf, a Python performance-measuring tool.

I forgot to mention it in the article, but Snoopy is a PHP web client library which can retrieve content and emulate a browser interacting with forms. I’ve used it for simple screen scraping before, but it still lacks some of the more impressive functionality that WWW::Mechanize demonstrates.
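To make "emulate a browser interacting with forms" a little more concrete, here is a minimal Python sketch of the core step such a client has to perform: read a form out of a fetched page, keep its hidden fields, fill in a value, and build the request a browser would send. This is not Snoopy's API and not how WWW::Mechanize works internally; `FormParser` and `build_submission` are hypothetical names invented for this illustration, and only the first form on the page and its `<input>` elements are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect the action, method, and <input> fields of the first form on a page."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "get").lower()
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep default values (e.g. hidden tokens) so they are sent back.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_submission(page_url, html, overrides):
    """Return (target_url, method, encoded_data) for the first form in html.

    page_url is used to resolve a relative form action; overrides maps
    field names to the values the "user" typed in.
    """
    parser = FormParser()
    parser.feed(html)
    data = {**parser.fields, **overrides}
    return urljoin(page_url, parser.action), parser.method, urlencode(data)
```

The returned URL and encoded data could then be handed to any HTTP client, for example `urllib.request.urlopen`, to actually submit the form. A real scraping library also has to deal with cookies, redirects, and other field types such as `<select>` and `<textarea>`, which this sketch ignores.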

This is More on screen scraping by Simon Willison, posted on 4th February 2003.


Previously hosted at http://simon.incutio.com/archive/2003/02/04/moreOnScreenScraping