
Simon Willison’s Weblog

Stylesheet parsing gets complicated

Craig Saila points to the SearchEngineWatch Webpage Size Checker. It’s a nice tool, but it doesn’t appear to take the size of linked style sheets into account. I was playing around with the idea of a web page cache written in Python over Christmas and hit the same kind of problem. Finding linked stylesheets using Python’s HTML parser wasn’t too difficult (and could be achieved equally well using a regular expression), but things get a lot hairier when you start to take @import statements and CSS-defined background images and custom bullet images into account. Again, I imagine a solution could be hacked out with regular expressions, but a nicer method would be some kind of CSS parser (the Python standard library has yet to include one). Maybe another project for a rainy day...
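For the curious, here is a rough sketch of the "hacked out with regular expressions" approach, assuming a modern Python 3 and its html.parser module. The names StylesheetLinkFinder, find_linked_stylesheets and find_css_references are made up for illustration, and the two patterns are deliberately naive: they know nothing about CSS comments, escapes or relative URL resolution, which is exactly where a real CSS parser would earn its keep.

```python
import re
from html.parser import HTMLParser


class StylesheetLinkFinder(HTMLParser):
    """Collect href values from <link rel="stylesheet"> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "alternate stylesheet"
        rel = (attr.get("rel") or "").lower().split()
        if "stylesheet" in rel and attr.get("href"):
            self.hrefs.append(attr["href"])


# Rough patterns, not a CSS parser: comments and escapes are ignored.
IMPORT_RE = re.compile(
    r"""@import\s+(?:url\(\s*)?['"]?([^'"\)\s;]+)""", re.IGNORECASE)
URL_RE = re.compile(r"""url\(\s*['"]?([^'"\)\s]+)""", re.IGNORECASE)


def find_linked_stylesheets(html):
    """Return the hrefs of stylesheets linked from an HTML document."""
    finder = StylesheetLinkFinder()
    finder.feed(html)
    return finder.hrefs


def find_css_references(css):
    """Return (imported stylesheets, other url() references such as images)."""
    imports = IMPORT_RE.findall(css)
    others = [u for u in URL_RE.findall(css) if u not in imports]
    return imports, others


if __name__ == "__main__":
    page = '<head><link rel="stylesheet" href="main.css"></head>'
    sheet = '@import "print.css";\nbody { background: url(bg.png); }'
    print(find_linked_stylesheets(page))
    print(find_css_references(sheet))
```

To weigh a whole page you would still have to fetch each stylesheet, resolve its URLs against the stylesheet's own location, and repeat the process for every imported sheet.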

This is Stylesheet parsing gets complicated by Simon Willison, posted on 20th January 2003.



1 comment

  1. Have you tried Q42's? http://www.q42.nl/research/ It purports to weigh all linked and imported resources, and it certainly seems to, although I haven't tested it too thoroughly. In some cases it'll return out-of-domain errors (e.g. on Wired).

    francois - 21st January 2003 11:51


Previously hosted at http://simon.incutio.com/archive/2003/01/20/stylesheetParsingGetsComplicat
