Simon Willison’s Weblog

Items in Jun, 2009


Using Mongo for Real-Time Analytics. MongoDB supports an “upsert” query, which when combined with the $inc operator can cause counter fields to be incremented if they exist and created otherwise. This makes it a great fit for real-time analytics applications (one increment per page view), something that regular relational databases aren’t particularly good at. # 30th June 2009, 7:28 pm
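
A minimal sketch of the upsert-plus-$inc behaviour described above, modelled in plain Python rather than with a MongoDB driver. The `upsert_inc` helper and the dict standing in for a collection are hypothetical names for illustration only; a real application would send the equivalent update to the server.

```python
def upsert_inc(collection, key, field, amount=1):
    """Increment collection[key][field], creating the document and field if missing.

    Models what an upsert combined with $inc does. `collection` is a plain
    dict standing in for a MongoDB collection (an assumption for illustration).
    """
    doc = collection.setdefault(key, {})       # upsert: create the document if absent
    doc[field] = doc.get(field, 0) + amount    # $inc: a missing field counts up from 0
    return doc


page_views = {}
for url in ["/", "/about", "/", "/"]:
    upsert_inc(page_views, url, "hits")        # one increment per page view
```

The point of the pattern is that the caller never checks whether a counter exists first: the first view of a page and the thousandth view use the same single operation.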

MongoDB. Lots of discussions about this at EuroPython today—it’s a document database, very similar to CouchDB but significantly faster and suggested for production use. Best of all, trying it out on OS X is as easy as extracting the tarball and running “bin/mongod --dbpath /tmp/test-mongo-db run”. # 30th June 2009, 7:13 pm

Firefox 3.5 for developers. It’s out today, and the feature list is huge. Highlights include HTML 5 drag ’n’ drop, audio and video elements, offline resources, downloadable fonts, text-shadow, CSS transforms with -moz-transform, localStorage, geolocation, web workers, trackpad swipe events, native JSON, cross-site HTTP requests, text API for canvas, defer attribute for the script element and TraceMonkey for better JS performance! # 30th June 2009, 6:08 pm

cache-money. A “write-through caching library for ActiveRecord”, maintained by Nick Kallen from Twitter. Queries hit memcached first, and caches are automatically kept up-to-date when objects are created, updated and deleted. Only some queries are supported—joins and comparisons won’t hit the cache, for example. # 28th June 2009, 3:17 pm
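
The write-through idea can be sketched in a few lines of Python. This is not cache-money's API (which is a Ruby library); the `WriteThroughStore` class and its two dicts, standing in for the database and memcached, are assumptions made for illustration.

```python
class WriteThroughStore:
    """Toy write-through cache: every write updates the backing store and the cache."""

    def __init__(self):
        self.db = {}       # stands in for the database table
        self.cache = {}    # stands in for memcached
        self.db_reads = 0  # counts how often a read fell through to the database

    def save(self, key, record):
        self.db[key] = record
        self.cache[key] = record      # keep the cache current on create and update

    def delete(self, key):
        self.db.pop(key, None)
        self.cache.pop(key, None)     # keep the cache current on delete

    def get(self, key):
        if key in self.cache:         # reads hit the cache first
            return self.cache[key]
        self.db_reads += 1
        record = self.db.get(key)
        if record is not None:
            self.cache[key] = record  # populate the cache after a miss
        return record


store = WriteThroughStore()
store.save(1, {"name": "alice"})
first = store.get(1)
store.save(1, {"name": "bob"})
second = store.get(1)
```

Because writes update the cache directly, the second read sees the new record without any explicit invalidation step and without touching the database.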

Twitter, an Evolving Architecture. The most detailed write-up of Twitter’s current architecture I’ve seen, explaining the four layers of cache (all memcached) used by the Twitter API. # 28th June 2009, 3:09 pm

BashReduce. Map/Reduce in Bash is no longer a joke project (if it ever was)—Richard Crowley is extending it and using it for analysis at OpenDNS. # 28th June 2009, 3:03 pm

What’s New In Python 3.1. Lots of stuff, but the best bits are an ordered dictionary type (congrats, Armin), a Counter class for counting unique items in an iterable (I do this on an almost daily basis) and a bunch of performance improvements including a rewrite of the Python 3.0 IO system in C. # 28th June 2009, 3:02 pm
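
A short sketch of the two new collection types, using only the standard library. The sample data is made up for illustration.

```python
from collections import Counter, OrderedDict

words = ["spam", "eggs", "spam", "ham", "spam", "eggs"]
counts = Counter(words)              # tally each unique item in the iterable
top_two = counts.most_common(2)      # the two most frequent items with their counts

ordered = OrderedDict()
ordered["first"] = 1
ordered["second"] = 2
ordered["third"] = 3
keys = list(ordered)                 # keys come back in insertion order
```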

The Resource Expert Droid. Like the HTML Validator but for your server’s HTTP headers—extremely useful. # 25th June 2009, 10:06 am

Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment. Michael Andersen from the Nieman Journalism Lab interviewed me about the MP expenses crowdsourcing site. # 24th June 2009, 3:31 pm

Test-Driven Heresy. Tim Bray advocates TDD for maintenance development, but argues that it may not be as useful during the exploratory, greenfield development phase of a project. # 24th June 2009, 11:03 am

Software engineers today are about 200-400% more productive than software engineers were 10 years ago because of open source software, better programming tools, common libraries, easier access to information, better education, and other factors. This means that one engineer today can do what 3-5 people did in 1999!

Auren Hoffman # 24th June 2009, 11 am

To Sprite Or Not To Sprite. CSS sprite images are decompressed to full bitmaps by browsers before they are rendered, so sprite files with large numbers of pixels will dramatically increase the memory footprint of your site. # 24th June 2009, 10:33 am
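
A rough back-of-the-envelope sketch of why pixel count matters more than file size. The 4 bytes per pixel figure (RGBA) and the 2000 x 2000 sprite sheet are assumptions for illustration; actual browser memory use may differ.

```python
def decoded_size_bytes(width, height, bytes_per_pixel=4):
    """Memory for a fully decoded bitmap; 4 bytes per pixel (RGBA) is an assumption."""
    return width * height * bytes_per_pixel


# Hypothetical sprite sheet: possibly small on disk, but 2000 x 2000 pixels once decoded.
sprite_bytes = decoded_size_bytes(2000, 2000)
sprite_mib = sprite_bytes / (1024 * 1024)
```

Under these assumptions the sheet occupies about 15 MiB decoded, however well it compresses as a file, and mostly empty regions cost as much as the icons themselves.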

You can buy an iPod nano on Apple, Best Buy, etc. for about $149. Amazon sells it for $134. That’s probably cost price. It turns out that Amazon can sell almost everything at cost price and still make a profit because of volume. It’s all down to the Negative Operating Cycle. Amazon turns over its inventory every 20 days whereas Best Buy takes 74 days. Standard retail term payments take 45 days. So Best Buy is in debt between day 45 and day 74. Amazon, on the other hand, are sitting on cash between day 20 and day 45. In that time, they can invest that money. That’s where their profit comes from.

Jared Spool, via Jeremy Keith # 22nd June 2009, 5:13 pm
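
The arithmetic in the quote can be sketched as follows, using only the three figures it gives (20 days, 74 days and 45 days). The `cash_gap_days` helper is a hypothetical name, and the model ignores everything except inventory turnover and supplier payment terms.

```python
def cash_gap_days(inventory_days, payment_terms_days):
    """Days between paying the supplier and selling the stock.

    Positive: the retailer has paid before the goods sell (financing needed).
    Negative: the goods sell before the supplier is paid (cash on hand).
    Ignores time to collect from customers, a simplifying assumption.
    """
    return inventory_days - payment_terms_days


PAYMENT_TERMS = 45  # standard retail payment terms, from the quote

best_buy_gap = cash_gap_days(74, PAYMENT_TERMS)
amazon_gap = cash_gap_days(20, PAYMENT_TERMS)
```

Best Buy's gap comes out at 29 days of carrying unsold, already-paid-for stock, while Amazon's is minus 25: roughly 25 days of holding the customer's money before the supplier's bill is due.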

Google asked people in Times Square: “What is a browser?”. Stuff like this makes me despair for creating a secure web: what chance do people have of surfing safely if they don’t understand browsers, web sites, operating systems, DNS, URLs, SSL, certificates... # 20th June 2009, 1:25 am

The breakneck race to build an application to crowdsource MPs’ expenses. Charles Arthur wrote up a very nice piece on the development effort behind the Guardian’s crowdsourcing expenses app. # 19th June 2009, 10:16 pm

Towards a Standard for Django Session Messages. I completely agree that Django’s user.message_set (which I helped design) is unfit for purpose, but I don’t think sessions are the right solution for messages sent to users. A signed cookie containing either the full message or a key referencing the message body on the server is a much more generally useful solution as it avoids the need for a round trip to a persistent store entirely. # 19th June 2009, 9:57 pm
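
A minimal sketch of the signed cookie idea using Python's standard `hmac` module. This is not Django's implementation; the `sign_message` and `read_message` helpers, the `payload.signature` layout and the `SECRET_KEY` value are all assumptions for illustration.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; keep it out of source control


def sign_message(message, key=SECRET_KEY):
    """Return 'payload.signature', suitable for storing as a cookie value."""
    payload = base64.urlsafe_b64encode(message.encode("utf-8")).decode("ascii")
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return payload + "." + signature


def read_message(cookie_value, key=SECRET_KEY):
    """Return the message if the signature checks out, otherwise None."""
    payload, sep, signature = cookie_value.rpartition(".")
    if not sep:
        return None  # no separator, so this cannot be one of our cookies
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return base64.urlsafe_b64decode(payload.encode("ascii")).decode("utf-8")


cookie = sign_message("Profile saved")
```

Signing only proves the server wrote the value; it does not hide it, so the user can still read the message. That is usually fine for a notice that is about to be displayed to them anyway.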

Unimpressed by NodeIterator. John Resig, one of the most talented API designers I’ve ever come across, posts some well-earned criticism of the document.createNodeIterator DOM traversal API. # 19th June 2009, 9:53 pm

Investigate your MP’s expenses. Launched today, this is the project that has been keeping me ultra-busy for the past week—we’re crowdsourcing the analysis of the 700,000+ scanned MP expenses documents released this morning. It’s the Guardian’s first live Django-powered application, and also the first time we’ve hosted something on EC2. # 18th June 2009, 11:16 pm

C64 Twitter client. Awesome. # 17th June 2009, 9:14 am

Jython 2.5.0 Final is out! It’s been a long time coming—congratulations to the team. # 16th June 2009, 11:21 pm

SWFUpload jQuery Plugin. Nice looking plugin around an invisible Flash shim that provides multiple file uploads and client-side progress indicators. # 16th June 2009, 11:46 am

Opera Unite. Opera’s big announcement: a developer preview (“labs release”) of their new web-server-in-your-browser feature, Unite. Includes an Opera-hosted proxy to help break through your firewall. The web server can be customised using server-side JavaScript running in an Opera Widget. # 16th June 2009, 11 am

Facebook Usernames and OpenID

Today’s launch of Facebook Usernames provides an obvious and exciting opportunity for Facebook to become an OpenID provider. Facebook have clearly demonstrated their interest in becoming the key online identity for their users, and the new usernames feature is their acknowledgement that URL-based identities are an important component of that, no doubt driven in part by Twitter making usernames trendy again.

[... 760 words]

And that is why, in 2009, when developing in Microsoft .NET 3.5 for ASP.NET MVC 1.0 on a Windows 7 system, you cannot include /com\d(\..*)?, /lpt\d(\..*)?, /con(\..*)?, /aux(\..*)?, /prn(\..*)?, or /nul(\..*)? in any of your routes.

Benjamin Pollack # 12th June 2009, 11:48 pm
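
The six patterns in the quote can be folded into a single check. This is a sketch only: the `has_reserved_segment` helper is hypothetical, and both the case-insensitive matching and the per-segment check are assumptions rather than a description of how ASP.NET actually evaluates routes.

```python
import re

# Combines the six patterns from the quote into one expression. Case-insensitive
# matching is an assumption, based on Windows device names ignoring case.
RESERVED_SEGMENT = re.compile(r"^(com\d|lpt\d|con|aux|prn|nul)(\..*)?$", re.IGNORECASE)


def has_reserved_segment(route):
    """True if any '/'-separated segment of the route is a reserved device name."""
    return any(RESERVED_SEGMENT.match(segment) for segment in route.split("/"))
```

Note that the optional extension group means `/lpt1.txt` is caught as well as `/lpt1`, while ordinary words that merely start with a reserved name, such as `/contact`, are not.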

Mr. Penumbra’s Twenty-Four-Hour Book Store. Enormously entertaining short story about data visualisation and creepy San Francisco bookshops by Robin Sloan. # 12th June 2009, 6:07 pm

Dealing with election results data. Alf Eaton loaded the Guardian’s European election results spreadsheet into Google’s new Fusion Tables tool. # 12th June 2009, 6:06 pm

The GIF Pronunciation Page. It’s jiff. Here’s evidence. # 11th June 2009, 10:50 pm

Cryptographic Right Answers. Best practice recommendations for cryptography: “While some people argue that you should never use cryptographic primitives directly and that trying to teach people cryptography just makes them more likely to shoot themselves in their proverbial feet, I come from a proud academic background and am sufficiently optimistic about humankind that I think it’s a good idea to spread some knowledge around.” # 11th June 2009, 10:16 pm

Exactly how well did the BNP do where you live? Guardian journalists spent a day and a half calling round different local authorities to get a proper breakdown of the European election results (which are only officially published in aggregate) and published the results as a spreadsheet on the Datablog. # 11th June 2009, 11:37 am

Exclusive: The Future of Facebook Usernames. I have to admit I was planning to just let Facebook get on with it, assuming that the OpenID provider part would show up of its own accord—but maybe I should write a thoughtful and persuasive essay about it after all. # 11th June 2009, 9:46 am