Simon Willison’s Weblog

Grabbing web pages with Perl and PHP

Web Basics with LWP (via Scott) is an excellent tutorial on Perl’s LWP, a powerful set of modules that make it easy to retrieve content from the web. I’ve been using the excellent Snoopy class for PHP for the same purpose, but I have to admit it isn’t half as comprehensive as LWP. I’ve also written my own simple function, safeGet, for more lightweight tasks: it grabs and returns the contents of a web page, but limits both the size of the page and the maximum time the download can take.
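To illustrate the idea of a size- and time-limited fetch, here is a minimal sketch in Python. It is not the safeGet source, which is written in PHP and not shown here; the names read_limited and safe_get, the default limits, and the choice to silently truncate rather than raise an error are all assumptions made for the example.

```python
import io
import time
import urllib.request


def read_limited(stream, max_bytes, max_seconds, chunk_size=4096):
    """Read from a file-like object, stopping at max_bytes or max_seconds.

    Hypothetical helper: returns whatever was read before a limit was hit.
    """
    deadline = time.monotonic() + max_seconds
    chunks = []
    remaining = max_bytes
    while remaining > 0 and time.monotonic() < deadline:
        # Never ask for more than the bytes still allowed
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # end of the response body
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def safe_get(url, max_bytes=100_000, max_seconds=10.0):
    """Fetch a URL, capping both the download size and the time spent."""
    # The timeout here covers connecting and each blocking socket read
    with urllib.request.urlopen(url, timeout=max_seconds) as response:
        return read_limited(response, max_bytes, max_seconds)


# Offline demonstration: an in-memory stream stands in for a response
demo = read_limited(io.BytesIO(b"x" * 10_000), max_bytes=100, max_seconds=5)
print(len(demo))  # prints 100
```

The deadline is only checked between chunks, so a single stalled read can still block for up to the socket timeout; the time limit in this sketch is approximate rather than exact.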

This is Grabbing web pages with Perl and PHP by Simon Willison, posted on 1st September 2002.


Previously hosted at http://simon.incutio.com/archive/2002/09/01/grabbingWebPagesWithPerlAndPHP