Simon Willison’s Weblog

Friday, 3rd April 2009

UK Guardian Data + ManyEyes = ISAF Troops Contribution Story. The story includes a heat map showing which countries are contributing the most troops to Afghanistan.

# 2:44 pm / afghanistan, datastore, guardian, heatmap, manyeyes, military, visualisation

Automating PowerPoint with Python. Useful tutorial on using ActivePython’s win32com module to automate PowerPoint. The example code pulls in the top 50 banks by assets from the Guardian Data Store and generates a treemap using PowerPoint’s shape drawing primitives.
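A rough sketch of the treemap idea, not the tutorial's own code: the simplest layout slices a rectangle into strips whose areas are proportional to each value. The function name `slice_layout`, the placeholder labels and values, and the 720 x 540 point slide size are all assumptions made for illustration.

```python
from typing import List, Tuple

# (label, left, top, width, height), all in the same units as the slide
Rect = Tuple[str, float, float, float, float]


def slice_layout(
    items: List[Tuple[str, float]],
    left: float,
    top: float,
    width: float,
    height: float,
) -> List[Rect]:
    """Split one rectangle into strips with areas proportional to each value.

    This is a single-level "slice" layout, the simplest kind of treemap.
    It is not the squarified algorithm and may differ from the tutorial.
    """
    total = sum(value for _, value in items)
    rects: List[Rect] = []
    offset = 0.0
    for label, value in items:
        share = value / total
        if width >= height:
            # Wide rectangle: cut vertical strips from left to right.
            rects.append((label, left + offset, top, width * share, height))
            offset += width * share
        else:
            # Tall rectangle: cut horizontal strips from top to bottom.
            rects.append((label, left, top + offset, width, height * share))
            offset += height * share
    return rects


# Placeholder data, not real bank figures.
banks = [("Bank A", 50.0), ("Bank B", 30.0), ("Bank C", 20.0)]
for rect in slice_layout(banks, 0, 0, 720, 540):
    print(rect)
```

On Windows with pywin32 installed, each tuple could plausibly be drawn on a slide with PowerPoint's `Shapes.AddShape(1, left, top, width, height)`, where `1` is the `msoShapeRectangle` constant.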

# 3:13 pm / activepython, datastore, guardian, powerpoint, python, treemap, visualisation

Introducing Digg’s IDDB Infrastructure. IDDB is Digg’s new infrastructure component for sharding data across multiple databases, with support for both MySQL and memcachedb. “The DiggBar and URL minifying service is powered by a 16 machine IDDB cluster, which includes 8 write masters in the index and 8 MySQL storage nodes.”
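For readers new to sharding, here is a generic sketch of the basic idea: hash each key to pick one of several storage nodes. This is not Digg's implementation, and IDDB's index layer may work quite differently. The names `shard_for` and `ShardedStore` are hypothetical, and plain dicts stand in for the MySQL storage nodes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number in range(num_shards), stably across runs."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store; each dict stands in for one storage node."""

    def __init__(self, num_shards: int = 8) -> None:
        self.nodes = [{} for _ in range(num_shards)]

    def _node(self, key: str) -> dict:
        # Every read and write for a key goes to the same node.
        return self.nodes[shard_for(key, len(self.nodes))]

    def set(self, key: str, value: str) -> None:
        self._node(key)[key] = value

    def get(self, key: str):
        return self._node(key).get(key)


store = ShardedStore(num_shards=8)
store.set("abc123", "http://example.com/a-long-url")
print(store.get("abc123"), "lives on node", shard_for("abc123", 8))
```

A fixed modulo like this is easy to reason about, but changing the number of nodes remaps most keys, which is one reason real systems often keep a lookup index instead.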

# 8:42 pm / databases, digg, iddb, joe-stump, memcachedb, mysql, scaling, sharding

TinyURL—Archiveteam. Excellent: the Internet Archive are crawling TinyURL (and hopefully other URL shortening services as well). The wiki page was created back in January. UPDATE from comments: Archiveteam are a separate organisation from the Internet Archive.

# 11:11 pm / archive, archiveteam, internet-archive, tinyurl