Simon Willison’s Weblog

Entries in Feb

Filters: Type: entry × Month: Feb ×


More on screen scraping

In response to yesterday’s screen scraping post, Richard Jones describes a screen scraping technique that uses PyWebPwerf, a Python performance measuring tool.

[... 80 words]

Vellum on Windows

Via Paul Freeman, detailed instructions for installing Stuart’s Vellum Python blogging system on Windows using either IIS or Apache.

[... 32 words]

Mechanize the web

Via Keith Devens, Screen-scraping with WWW::Mechanize describes how Perl’s WWW::Mechanize module can be used to grab information from sites that require a user login. I’ve always dismissed screen scraping as something of a wasted effort, given the fact that a major rewrite of the scraper is required whenever the target site tweaks its HTML. This article has encouraged me to reconsider—some of the functionality in WWW::Mechanise is fantastic:

[... 262 words]