Stylesheet parsing gets complicated
20th January 2003
Craig Saila points to the SearchEngineWatch Webpage Size Checker. It’s a nice tool, but it doesn’t appear to take the size of linked style sheets into account. I was playing around with the idea of a web page cache written in Python over Christmas and I hit the same kind of problem. Finding linked stylesheets using Python’s HTML parser wasn’t too difficult (and could be achieved equally well using a regular expression), but things get a lot hairier when you start to take @import statements and CSS-defined background images and custom bullet images into account. Again, I imagine a solution could be hacked out with regular expressions, but a nicer method would be some kind of CSS parser (the Python standard library has yet to include one). Maybe another project for a rainy day...
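For illustration, here is a rough sketch of the hacked-out approach described above, not the code from the cache experiment itself. It assumes Python 3's html.parser for the linked stylesheets and a couple of deliberately simple regular expressions for @import and url(...) references. The names StylesheetLinkFinder, find_linked_stylesheets and find_css_references are hypothetical, and the patterns ignore CSS escapes, data: URIs and other edge cases that a real CSS parser would handle.

```python
import re
from html.parser import HTMLParser


class StylesheetLinkFinder(HTMLParser):
    """Collects href values from <link rel="stylesheet"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, e.g. "alternate stylesheet"
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "stylesheet" in rel_tokens and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_linked_stylesheets(html):
    """Return the hrefs of every stylesheet linked from an HTML document."""
    finder = StylesheetLinkFinder()
    finder.feed(html)
    return finder.hrefs


# Simplified patterns (an assumption, not a full CSS grammar).
CSS_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
CSS_IMPORT = re.compile(r"""@import\s+(?:url\(\s*)?["']?([^"')\s;]+)""", re.IGNORECASE)
CSS_URL = re.compile(r"""url\(\s*["']?([^"')\s]+)["']?\s*\)""", re.IGNORECASE)


def find_css_references(css):
    """Return (imports, other_urls) referenced by a block of CSS text."""
    css = CSS_COMMENT.sub("", css)  # drop comments so commented-out rules are skipped
    imports = CSS_IMPORT.findall(css)
    # Anything in url(...) that is not an @import target, such as
    # background images and custom bullet images.
    other_urls = [u for u in CSS_URL.findall(css) if u not in imports]
    return imports, other_urls
```

To size a whole page you would fetch each linked stylesheet, run find_css_references on it, and repeat for every imported sheet, resolving relative URLs against the stylesheet's own location. That recursion is where the regular expression approach starts to creak.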