Simon Willison’s Weblog


September 2009

Sept. 1, 2009

Mac OS X 10.6 Snow Leopard: the Ars Technica review. The essential review: 23 pages of information-dense but readable goodness. Pretty much everything I know about Mac OS X internals I learnt from reading John Siracusa’s reviews, and this one is particularly juicy when it gets to Grand Central Dispatch and blocks (aka closures) in C and Objective-C.

# 7:05 pm / objectivec, c, closures, blocks, osx, apple, john-siracusa, grandcentraldispatch, snowleopard

Sept. 3, 2009

incident report for 8/28/2009. Various sites were down for a while last week; here the Apache Infrastructure Team provide a detailed description of what happened (a security breach on a minor server, which provided non-privileged SSH access to mirror servers via an SSH key used for backups) and how they are responding. Useful for neophyte sysadmins like myself.

# 8:56 am / sysadmin, apache, security

On Influenza A (H1N1). “It’s humbling that I could be killed by 3.2kbytes of genetic data. Then again, with 850 Mbytes of data in my genome, there’s bound to be an exploit or two.”

# 9:25 am / h1n1, influenza, bunniehuang

And so it goes, around again. Charles Miller on Java, pointing out that if you don’t have closures and first-class functions you end up having to add band-aid solutions and special case syntactic sugar. Python’s lack of multi-line lambdas leads to a similar (though less pronounced) effect.

# 9:46 am / java, python, charles-miller, programming-languages

Chris Heathcote: loca london. Chris’s new guide to exhibitions in London is presented as an enormous (5100px wide) page with horizontal and vertical scrollbars—as Chris points out, this interface may be a bit clumsy with a mouse but it works wonderfully well on touchpads and touchscreens.

# 6:28 pm / design, chris-heathcote, london, crawlbar, horizontal

Ravelry. Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet.

# 6:50 pm / scaling, ravelry, tim-bray, caseyforbes, tokyocabinet, tokyotyrant, rails, mysql, sphinx-search, memcached, nginx, haproxy, passenger

Sept. 4, 2009

So’s your facet: Faceted global search for Mozilla Thunderbird. Yes! This is the kind of innovation I’ve been hoping would show up in e-mail clients for years. Faceting is a really natural fit for e-mail.

# 10:29 am / faceting, search, email, thunderbird

Sept. 6, 2009

Automating web site deployment at Barcamp Brighton. I’m determined to start using Fabric and proper deployment scripts for my personal projects.

# 2:16 pm / fabric, deployment, gareth-rushgrove, barcampbrighton, barcamp

Petabytes on a budget: How to build cheap cloud storage. Explains how Backblaze can operate an unlimited backup service for five dollars a month—their custom storage hardware stores 67 terabytes for $7,867.

# 9:27 pm / backblaze, backup, storage

(via) A Twisted/Python powered comet API for pushing out Subversion commits, built for Apache Foundation projects.

# 9:50 pm / twisted, python, subversion, svnpubsub, comet

Sept. 7, 2009

Debugging Django in Production Revisited. Eric Holscher expands his show-technical-errors-to-superusers middleware to only show them to users in the group named “Technical Errors”.

# 5:21 am / django, debugging, python, middleware, ericholscher

Sept. 9, 2009

Looking to the future with Cassandra. Digg are now using Cassandra for their “green badge” (one of your friends has dugg this story) feature; the resulting denormalised dataset weighs in at 3 TB and 76 billion columns.

# 9:26 pm / cassandra, denormalisation, nosql, digg

Why Python Pickle is Insecure. Because pickle is essentially a stack-based interpreter, so you can put os.system on the stack and use it to execute arbitrary commands.
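
A minimal sketch of the idea, not taken from the linked article: a pickled object can name any importable callable plus its arguments, and the unpickler will call it while loading. The `Payload` class is hypothetical, and a harmless `eval("6 * 7")` stands in for something like `os.system`.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle tells the loader what to call."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes callable(*args) on load.
        # A harmless expression stands in for os.system or similar here.
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs eval("6 * 7") and returns the result.
result = pickle.loads(data)
print(result)
```

The loader never needs the `Payload` class at all: the bytes alone are enough to trigger the call, which is why unpickling data from an untrusted source is unsafe.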

# 11:04 pm / python, pickle, security

Londiste Tutorial. Master/slave replication for PostgreSQL, developed and used by Skype.

# 11:06 pm / replication, skype, postgresql, masterslave, londiste

Sept. 10, 2009

RSSCloud Vs. PubSubHubbub: Why The Fat Pings Win. A PubSubHubbub advocate explains the differences between the two proposals: most importantly, PubSubHubbub includes the actual new content with the “fat ping” whereas RSSCloud just notifies you that you should poll the RSS feed, leading to a potential thundering herd. I’m still hoping one of those specs will detail a way in which they can be used for scalable regular WebHook-style notifications without any feed infrastructure at all.
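
A toy simulation of the difference, under my own assumptions: `Feed`, `Subscriber` and the two `notify_*` functions are hypothetical names, and nothing here follows either spec's actual wire format. It only counts how many times the publisher's feed gets fetched per update.

```python
class Feed:
    """Hypothetical publisher that counts how often its feed is fetched."""

    def __init__(self):
        self.entries = []
        self.fetch_count = 0

    def publish(self, entry):
        self.entries.append(entry)

    def fetch(self):
        self.fetch_count += 1
        return list(self.entries)


class Subscriber:
    def __init__(self):
        self.seen = []

    def on_light_ping(self, feed):
        # Only told "something changed", so it has to poll the feed itself.
        self.seen = feed.fetch()

    def on_fat_ping(self, entries):
        # The notification already carries the content; no fetch needed.
        self.seen = list(entries)


def notify_light(feed, subscribers):
    for subscriber in subscribers:
        subscriber.on_light_ping(feed)


def notify_fat(feed, subscribers):
    entries = feed.fetch()  # the hub fetches once on behalf of everyone
    for subscriber in subscribers:
        subscriber.on_fat_ping(entries)


light_feed, fat_feed = Feed(), Feed()
light_subs = [Subscriber() for _ in range(1000)]
fat_subs = [Subscriber() for _ in range(1000)]

light_feed.publish("new post")
notify_light(light_feed, light_subs)

fat_feed.publish("new post")
notify_fat(fat_feed, fat_subs)

print(light_feed.fetch_count, fat_feed.fetch_count)
```

With a thousand subscribers the light ping turns one update into a thousand fetches against the publisher, while the fat ping costs it a single fetch.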

# 3:49 pm / pubsubhubbub, rsscloud, webhooks, dogpile

OpenStreetMap: QuadTiles. Fascinating explanation of a proposal for replacing lat, lon pairs in the OpenStreetMap database with a QuadTile-based addressing system.
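
A rough sketch of how such an address can be computed, assuming 16 bits per axis interleaved into a single 32-bit integer. The function names and the exact scaling are my assumptions, not necessarily what the proposal specifies.

```python
def interleave(x, y, bits=16):
    """Interleave the bits of x and y, most significant first (x bit, then y bit)."""
    tile = 0
    for i in range(bits - 1, -1, -1):
        tile = (tile << 1) | ((x >> i) & 1)
        tile = (tile << 1) | ((y >> i) & 1)
    return tile


def quadtile(lat, lon, bits=16):
    """Map a lat/lon pair to one integer address (assumed scaling)."""
    scale = 1 << bits
    # Scale each coordinate to an integer grid cell, clamping the upper edge.
    x = min(int((lon + 180.0) / 360.0 * scale), scale - 1)
    y = min(int((lat + 90.0) / 180.0 * scale), scale - 1)
    return interleave(x, y, bits)


print(quadtile(51.5, -0.12))
```

Each pair of bits picks one quadrant of the previous square, so points in the same small square share the same leading bits and end up close together in a single indexed integer column.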

# 3:54 pm / quadtiles, openstreetmap, algorithms, gis

Tornado Web Server (via) An extremely exciting addition to the Python web landscape, Tornado is the open sourced version of FriendFeed’s custom web stack. It’s a non-blocking (epoll) Python web server designed for handling thousands of simultaneous connections, perfect for building Comet applications. The web framework is cosmetically similar to web.py or App Engine’s webapp but has decorators for writing asynchronous request handlers. The template language uses Django-style syntax but allows you to use full Python expressions. FriendFeed have benchmarked it handling 8,000 requests a second running as four load-balanced processes on a 4 core server.

# 9:32 pm / friendfeed, tornado, python, comet, webpy, webapp, appengine, django, epoll, brettaylor

Thousands of people have come together to demand justice for Alan Turing and recognition of the appalling way he was treated. While Turing was dealt with under the law of the time and we can’t put the clock back, his treatment was of course utterly unfair and I am pleased to have the chance to say how deeply sorry I and we all are for what happened to him.

Gordon Brown

# 11:39 pm / alan-turing, gordonbrown, homophobia

Sept. 11, 2009

Announcing Heechee. “Heechee is a transparent mercurial-as-subversion gateway”—you can use it to allow subversion clients to check out a mercurial repository, meaning svn:externals can work against projects hosted by mercurial. It’s very young code but I’ve already seen it out-perform regular subversion for checkout speed.

# 2:16 am / subversion, heechee, andrew-godwin, mercurial

We experimented with different async DB approaches, but settled on synchronous at FriendFeed because generally if our DB queries were backlogging our requests, our backends couldn't scale to the load anyway. Things that were slow enough were abstracted to separate backend services which we fetched asynchronously via the async HTTP module.

Bret Taylor

# 5:31 pm / async, brettaylor, friendfeed, http, tornado

Sept. 13, 2009

The Guardian 1000 Novels Everyone Must Read in FluidDB. Nicholas J. Radcliffe loaded the Guardian’s list of 1000 novels into FluidDB, where the ability for users to add their own ratings-style metadata makes it an ideal dataset for exploring the capabilities of the platform.

# 11:48 pm / guardian, fluiddb, nicholasjradcliffe

Effective A/B Testing. Impressively comprehensive presentation on A/B testing, from theory to practice to statistical analysis of the results.

# 11:49 pm / ab-testing, buckettesting, statistics, ben-tilly

Sept. 21, 2009

Developing for the iPhone at the moment is like picking up dimes in front of a bulldozer.

Tim Bray

# 5:30 pm / apple, iphone, sharecropping, tim-bray

Welcome to Django Dose. Launched at DjangoCon, a new Django community site designed to be a successor to TWiD, still with (shorter) podcasts but also featuring more news, articles and screencasts.

# 6:21 pm / djangocon, djangodose, django, community, twid, podcasts, screencasts

There was this clamour in the past to get companies to open source their products. This has stopped, because all the software that got open source sucked. It's just not very interesting to have a closed source program get open sourced. It doesn't help anyone, because the way closed source software is created in a very different way than open source software. The result is a software base that just does not engage people in a way to make it a valid piece of software for further development.

Ian Bicking

# 6:22 pm / closedsource, ian-bicking, open-source

Fabric factory. Promising looking continuous integration server written in Django, which uses Fabric scripts to define actions.

# 6:35 pm / django, fabric, python, continuous-integration, testing, fabricfactory

Years ago, Alex Russell told me that Django ought to be collecting CLAs. I said "yeah, whatever" and ignored him. And thus have spent more than a year gathering CLAs to get DSF's paperwork in order. Sigh.

Jacob Kaplan-Moss

# 6:35 pm / alex-russell, clas, django, jacob-kaplan-moss, legal

django-debug-toolbar. The new panel styling for the Django debug toolbar is really slick—here’s a neatly produced screencast demonstrating it (with Gypsy Jazz accompaniment).

# 6:36 pm / django, debugging, django-debug-toolbar, screencasts, gypsyjazz
