Simon Willison’s Weblog

3 items tagged “beautifulsoup”

Recovering missing content from the Internet Archive

When I restored my blog last weekend I used the most recent SQL backup of my blog’s database from back in 2010. I thought it had all of my content from before I started my 7 year hiatus, but in watching the 404 logs I started seeing the occasional hit to something that really should have been there but wasn’t. Turns out the SQL backup I was working from was missing some content.

[... 636 words]

How to Make a US County Thematic Map Using Free Tools. This is the trick I’ve been using to generate choropleths at the Guardian for the past year: figure out the preferred colours for a set of data in a Python script and then rewrite an SVG file to colour in the areas. I use ElementTree rather than BeautifulSoup but the technique is exactly the same. The best thing about SVG is that our graphics department can export them directly out of Illustrator, with named layers and paths automatically becoming SVG ID attributes. Bonus tip: sometimes you don’t have to rewrite the SVG XML at all, instead you can generate CSS to colour areas by ID selector and inject it in to the top of the file. # 12th November 2009, 10:49 am

soupselect. My simple extension to BeautifulSoup that allows you to grab elements using CSS selectors; should be useful for parsing microformats. # 28th February 2007, 1:47 pm