Simon Willison’s Weblog


Items in May, 2008

Python + Hadoop = Flying Circus Elephant. Last.fm have released Dumbo, a Python module that lets you easily write Hadoop map/reduce tasks using Python and generators. # 31st May 2008, 2:14 pm
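To give a flavour of what map/reduce with generators looks like, here is a minimal plain-Python word count. It is only a sketch: it does not use Dumbo's actual API or Hadoop, the names mapper, reducer and run_local are invented for illustration, and the shuffle step is simulated in memory.

```python
from collections import defaultdict


def mapper(key, value):
    """Yield a (word, 1) pair for every word in one line of input."""
    for word in value.split():
        yield word.lower(), 1


def reducer(key, values):
    """Yield the total count for a single word."""
    yield key, sum(values)


def run_local(lines):
    """Simulate map -> shuffle -> reduce in memory (hypothetical helper)."""
    grouped = defaultdict(list)
    # Map phase: feed each line to the mapper and group its output by key.
    for line_number, line in enumerate(lines):
        for word, count in mapper(line_number, line):
            grouped[word].append(count)
    # Reduce phase: hand each key and its grouped values to the reducer.
    results = {}
    for word in sorted(grouped):
        for key, total in reducer(word, grouped[word]):
            results[key] = total
    return results


counts = run_local(["the quick brown fox", "the lazy dog", "The end"])
print(counts)
```

Because both functions are generators, a mapper can emit any number of pairs per input record without building intermediate lists.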

Obscure bugs revisited: IE, HTTPS and plugins. Filed for future reference: IE breaks mysteriously if you serve it up plugin content (e.g. Flash) over HTTPS with a no-cache header—it deletes the file from cache before the plugin software gets a chance to open it. # 30th May 2008, 9:54 am

Twitter, or Architecture Will Not Save You. Kellan is not an armchair architect. He also doesn’t mention Rails once. Well worth reading. # 29th May 2008, 1:16 am

Google Gears renamed “Gears”. “We want to make it clear that Gears isn’t just a Google thing. We see Gears as a way for everyone to get involved with upgrading the web platform.” Support for Firefox 3 and Safari is being added and Opera are integrating Gears with both their desktop and mobile browsers. # 29th May 2008, 12:38 am

Google Earth in a browser (sort of), Scriptable, a quick peek and poke. Dan Catt on Google’s new browser plugin version of Google Earth... which conveniently exposes a JavaScript API to the browser in the form of the “ge” object, which can then be poked at interactively using Firebug. # 28th May 2008, 11:13 pm

Using Memcache with Google App Engine. Brad Fitzpatrick’s 20% time project. # 28th May 2008, 11:11 pm

OpenID phishing demo (via) A demonstration of the OpenID man-in-the-middle phishing attack. OpenIDs are immune to this particular variant due to the landing page not asking for your password (the phishing site could still provide their own redesigned landing page and hope users don’t notice though). # 28th May 2008, 8:09 am

If we see good usage, we can work with browser vendors to automatically ship these libraries. Then, if they see the URLs that we use, they could auto load the libraries, even special JIT’d ones, from their local system. Thus, no network hit at all!

Dion Almaer # 27th May 2008, 5:58 pm

Google AJAX Libraries API (via) Google are hosting copies of jQuery, Prototype, MooTools and Dojo on their CDN, with a promise to permanently host different versions and an optional JavaScript API to dynamically load the most recent version of a library. I wish they’d stop capitalising Ajax though. # 27th May 2008, 5:56 pm

Tracking Christmas Cheer with Google Charts. Brian Suda’s Google Charts tutorial on 24 ways has proved invaluable for figuring out how to handle grid lines and axis labels, both of which are pretty unintuitive (and not hugely helped by the official documentation). # 26th May 2008, 9:43 pm
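As a rough illustration of the grid line and axis label parameters, here is a sketch that builds a chart URL in Python. It assumes the 2008-era image chart URL and the parameter names as I remember them (cht, chs, chd, chxt, chxl, chg); the helper line_chart_url and the sample data are made up, and the data values are assumed to already be scaled to the 0-100 range.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumed base URL of the 2008-era Google Chart API.
CHART_BASE = "http://chart.apis.google.com/chart"


def line_chart_url(values, x_labels, width=400, height=200):
    """Build a line chart URL with labelled axes and grid lines (hypothetical helper)."""
    params = {
        "cht": "lc",                                      # chart type: line chart
        "chs": f"{width}x{height}",                       # size in pixels
        "chd": "t:" + ",".join(str(v) for v in values),   # text-encoded data, 0-100
        "chxt": "x,y",                                    # axis 0 is x, axis 1 is y
        # Labels are grouped per axis index and separated by pipes.
        "chxl": "0:|" + "|".join(x_labels) + "|1:|0|50|100",
        # Grid step sizes are percentages of the chart width and height.
        "chg": f"{100 / (len(values) - 1):.1f},50",
    }
    return CHART_BASE + "?" + urlencode(params)


url = line_chart_url([10, 40, 35, 80, 60], ["Mon", "Tue", "Wed", "Thu", "Fri"])
query = parse_qs(urlsplit(url).query)
print(url)
```

The unintuitive part is that chxl packs every axis into one parameter, and chg takes step sizes as percentages rather than data values.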

QUnit. The jQuery unit testing framework is now documented and supported as a separate project. # 26th May 2008, 5:31 pm

Craigslist is fighting back. Its latest gimmick is phone verification. Posting in some categories now requires a callback phone call, with a password sent to the user either by voice or as an SMS message. [...] Spammers tried using their own free ringtone sites to get many users to accept the Craigslist verification call, then type in the password from the voice message. Craigslist hasn’t countered that trick yet.

John Nagle # 26th May 2008, 8:40 am

Twitter / MarsPhoenix. NASA’s Mars Phoenix lander, due to land on the planet today, has a Twitter account. Bio: “I dig Mars!”. # 25th May 2008, 7:41 pm

Debugging Django, a slidecast. I used SlideShare’s slidecast tool for the first time to synchronize audio of my Django London User Group talk with the slides. The talk included several live demos which aren’t represented in the slides so it’s a bit piecemeal in places. # 25th May 2008, 2:47 pm

Easy way to reset your sleep cycle: Stop eating (via) New research shows that you can quickly reset your sleep cycle by not eating for 12-16 hours and then using breakfast to flip into another time zone. I get clobbered by jet lag when I fly from the US to Europe; this could be really useful. # 25th May 2008, 2:11 pm

LastGraph 3. Andrew Godwin’s profile visualisation tool, now in its third incarnation. # 25th May 2008, 2:05 pm

Walk, Don’t Run (via) A retrospective look at Grim Fandango (possibly my favourite game of all time) and the fan community that are keeping it alive, nearly a decade after it was first released. # 25th May 2008, 2:04 pm

Richard Feynman and The Connection Machine. Too much great stuff in here to attempt to summarise. # 25th May 2008, 2:01 pm

modwsgi: Debugging Techniques. mod_wsgi is excellent software, and the documentation is equally superb. I used these instructions recently to run the Python debugger inside a running instance of Apache, which helped me track down some import errors that weren’t occurring with Django’s development server. # 25th May 2008, 1:34 pm

On the spot. Did you know Jupiter just grew a third spot? Apparently the spots are storms, and the largest has been raging for several centuries. # 24th May 2008, 6:25 pm

Scoble writes something—6,800 writes are kicked off, 1 for each follower. Michael Arrington replies—another 6,600 writes. Jason Calacanis jumps in—another 6,500 writes. Beyond the 19,900 writes, there’s a lot of additional overhead too. You have to hit a DB to figure out who the 19,900 followers are. [...] And here’s the kicker: that giant processing and delivery effort—possibly a combined 100K disk IOs—was caused by 3 users, each just sending one, tiny, 140 char message. How innocent it all seemed.

Israel L'Heureux # 23rd May 2008, 7:28 pm
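The arithmetic in that quote is easy to check with a toy fan-out-on-write model. This is a sketch only: the follower counts come from the quote, but the deliver function, the in-memory inboxes and the follower IDs are invented, and it ignores the fact that real follower sets overlap.

```python
from collections import defaultdict

# Follower counts as given in the quote.
FOLLOWER_COUNTS = {
    "Scoble": 6800,
    "Michael Arrington": 6600,
    "Jason Calacanis": 6500,
}


def deliver(author, message, follower_ids, inboxes):
    """Fan out on write: copy the message into every follower's inbox.

    Returns the number of writes performed.
    """
    for follower_id in follower_ids:
        inboxes[follower_id].append((author, message))
    return len(follower_ids)


inboxes = defaultdict(list)
total_writes = 0
for author, count in FOLLOWER_COUNTS.items():
    # Hypothetical follower IDs, one distinct ID per follower.
    follower_ids = [f"{author}-follower-{i}" for i in range(count)]
    total_writes += deliver(author, "one tiny message", follower_ids, inboxes)

print(total_writes)  # 19900
```

Three messages produce 19,900 inbox writes, before counting the lookups needed to find the followers in the first place.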

Search Engine Optimization Through Hoax News. Devious new black-hat SEO technique: invent a news story that’s pure link-bait. The recent “13 year old steals dad’s credit card to buy hookers” story was a hoax: it was a pure play for PageRank. # 22nd May 2008, 6:09 pm

On-board vs. Off-board Comet. Useful distinction. On-board comet runs on the same server as the rest of your application; Off-board comet is served from a separate server (generally a subdomain) and a separate stack. If you want to stick with PHP, Rails or Django for the rest of your site off-board comet looks like the way to go. # 22nd May 2008, 5:02 pm

Debugging Django

I gave a talk on Debugging Django applications at Monday’s inaugural meeting of DJUGL, the London Django Users Group. I wanted to talk about something that wasn’t particularly well documented elsewhere, so I pitched the talk as “Bug Driven Development”—what happens when Test Driven Development goes the way of this unfortunate pony.

The slides [... 1759 words]

AOP aspect of JavaScript with Dojo. Fantastic post—concisely explains Aspect Oriented Programming, then shows how Dojo’s dojox.lang.aspect brings AOP to JavaScript, including some really useful built-in aspects for logging, profiling and more. Aspects are like Python decorators on steroids. # 18th May 2008, 10:45 am
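For anyone who hasn't met the Python side of that comparison, here is a minimal logging decorator that behaves like "around" advice: it runs code before and after the wrapped function without touching its body. This is plain Python, not Dojo's dojox.lang.aspect API, and the names logged and add are invented for the example.

```python
import functools


def logged(log):
    """Decorator factory: record each call and its result in `log`."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            log.append(f"before {func.__name__}{args}")        # "before" advice
            result = func(*args, **kwargs)
            log.append(f"after {func.__name__} -> {result!r}")  # "after" advice
            return result
        return wrapper
    return decorator


calls = []


@logged(calls)
def add(a, b):
    return a + b


total = add(2, 3)
print(calls)
```

A decorator has to be applied where the function is defined, which is one reason a full AOP library that can attach advice from the outside is the "on steroids" version.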

GeoNames Commercial Webservices. Wikinear has been loading slowly recently, so I’ve signed up for GeoNames’ very reasonably priced commercial plan, which provides access to better servers at their end. This might speed things up to the point that I can reliably run the site on Google App Engine, which times out aggressively if an external HTTP request takes too long. # 18th May 2008, 10:32 am

Dopplr place googlemaps, with and without Yahoo Geo API bounding box adjustment. Dopplr uses GeoNames for most geo information, but is now mixing in bounding box data from the Yahoo! Geo web service to improve the default zoom level for their maps. The JSON callback API means no server-side code is required on Dopplr’s end. # 17th May 2008, 11:35 pm

A McAfee spokeswoman said the company rates XSS vulnerabilities less severe than SQL injections and other types of security bugs. “Currently, the presence of an XSS vulnerability does not cause a web site to fail HackerSafe certification,” she said. “When McAfee identifies XSS, it notifies its customers and educates them about XSS vulnerabilities.”

Dan Goodin # 17th May 2008, 11:31 pm

Firebug Command Line API. Another thing I didn’t know about Firebug: you can set a breakpoint at the start of a function with “debug(fn)” and log all calls to it with “monitor(fn)”. # 16th May 2008, 12:14 pm

Using Git as a versioned data store in Python. gitshelve supports the same interface as Python’s built-in shelve module but stores things to a versioned Git repository instead of just a pickled dictionary. I’ve been casually wondering what a Git-powered CMS would look like. # 15th May 2008, 3:25 pm
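For reference, this is the standard library shelve interface that gitshelve is described as matching. The sketch uses only Python's built-in shelve module, not gitshelve itself, so the way a Git-backed shelf is actually opened is not shown here; the key name and stored value are arbitrary.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes")

    # Write: a shelf behaves like a dictionary whose values are pickled to disk.
    with shelve.open(path) as db:
        db["greeting"] = {"text": "hello", "revision": 1}

    # Read it back from a fresh handle to show the data was persisted.
    with shelve.open(path) as db:
        stored = db["greeting"]
        keys = sorted(db.keys())

print(stored, keys)
```

Swapping the pickled file for a Git repository keeps this dictionary-style API while adding history for every stored value.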