
Simon Willison’s Weblog

On python, django, security, javascript, openid, ...

 

Recent entries

jQuery style chaining with the Django ORM eight days ago

Django’s ORM is, in my opinion, the unsung gem of the framework. For the subset of SQL that’s used in most web applications it’s very hard to beat. It’s a beautiful piece of API design, and I tip my hat to the people who designed and built it.

Lazy evaluation

If you haven’t spent much time with the ORM, two key features are lazy evaluation and chaining. Consider the following statement:

entries = Entry.objects.all()

Assuming you have created an Entry model of some sort, the above statement will create a Django QuerySet object representing all of the entries in the database. It will not result in the execution of any SQL—QuerySets are lazily evaluated, and are only executed at the last possible moment. The most common situation in which SQL will be executed is when the object is used for iteration:

for entry in entries:
    print entry.title

This usually happens in a template:

<ul>
{% for entry in entries %}
  <li>{{ entry.title }}</li>
{% endfor %}
</ul>

Lazy evaluation works nicely with template fragment caching—even if you pass a QuerySet to a template it won’t be executed if the fragment it is used in can be served from the cache.
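
The caching point can be tried without Django at all. Here is a rough sketch, assuming a hypothetical CountingQuery class standing in for a QuerySet and a plain dict standing in for the fragment cache; neither is part of Django.

```python
class CountingQuery:
    """Hypothetical lazy iterable that counts how often it is actually run."""

    def __init__(self, rows):
        self.rows = rows
        self.executions = 0

    def __iter__(self):
        # Iteration is the only thing that counts as "running the query".
        self.executions += 1
        return iter(self.rows)


fragment_cache = {}  # stand-in for a real cache backend


def render_fragment(key, entries):
    """Return cached markup if present; only iterate `entries` on a miss."""
    if key not in fragment_cache:
        items = "".join("<li>%s</li>" % title for title in entries)
        fragment_cache[key] = "<ul>%s</ul>" % items
    return fragment_cache[key]


entries = CountingQuery(["First", "Second"])
first = render_fragment("entry-list", entries)   # miss: iterates the query
second = render_fragment("entry-list", entries)  # hit: the query is never touched
```

The second call gets its markup from the cache, so passing the lazy object around costs nothing.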

You can modify QuerySets as many times as you like before they are executed:

entries = Entry.objects.all()
today = datetime.date.today()
entries_this_year = entries.filter(
    posted__year = today.year
)
entries_last_year = entries.filter(
    posted__year = today.year - 1
)

Again, no SQL has been executed, but we now have two QuerySets which, when iterated, will produce the desired result.
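
To make the laziness visible, here is a toy model of the idea in plain Python. LazyQuery is a hypothetical class invented for this sketch, not Django's implementation: it records filters and only does any work when iterated.

```python
import datetime


class LazyQuery:
    """Toy stand-in for a QuerySet: records filters, runs them on iteration."""

    def __init__(self, rows, predicates=(), log=None):
        self._rows = rows
        self._predicates = tuple(predicates)
        # Shared list recording each execution, so laziness is observable.
        self.log = log if log is not None else []

    def filter(self, predicate):
        # Return a new object; the original is left untouched.
        return LazyQuery(self._rows, self._predicates + (predicate,), self.log)

    def __iter__(self):
        # The work happens here, at the last possible moment.
        self.log.append("executed")
        return iter(
            [r for r in self._rows if all(p(r) for p in self._predicates)]
        )


rows = [
    {"title": "Old post", "posted": datetime.date(2007, 3, 1)},
    {"title": "New post", "posted": datetime.date(2008, 5, 1)},
]
entries = LazyQuery(rows)
this_year = entries.filter(lambda r: r["posted"].year == 2008)
last_year = entries.filter(lambda r: r["posted"].year == 2007)

nothing_ran_yet = (entries.log == [])
titles = [r["title"] for r in this_year]  # iteration triggers the first execution
```

Because filter() returns a new object each time, one base query can branch into several independent ones.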

Chaining

Chaining comes in when you want to apply multiple modifications to a QuerySet. Here are blog entries from 2006 that weren’t posted in January:

Entry.objects.filter(
    posted__year = 2006
).exclude(posted__month = 1)

And here are entries from that year posted to the category named “Personal”, ordered by title:

Entry.objects.filter(
    posted__year = 2006
).filter(
    category__name = "Personal"
).order_by('title')

The above can also be expressed like this:

Entry.objects.filter(
    posted__year = 2006,
    category__name = "Personal"
).order_by('title')
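
The equivalence of the two forms is easy to check with a toy resolver for double-underscore lookups. This is only a sketch over plain dicts: matches() and filter_rows() are hypothetical helpers, and this is not how Django compiles lookups to SQL.

```python
import datetime


def matches(row, **lookups):
    """Check a dict row against Django-style `field__part` keyword lookups."""
    for key, expected in lookups.items():
        value = row
        for part in key.split("__"):
            # Follow dict keys first, then fall back to attributes (date.year).
            value = value[part] if isinstance(value, dict) else getattr(value, part)
        if value != expected:
            return False
    return True


def filter_rows(rows, **lookups):
    """Keep the rows that satisfy every lookup."""
    return [row for row in rows if matches(row, **lookups)]


rows = [
    {"title": "B", "posted": datetime.date(2006, 2, 1), "category": {"name": "Personal"}},
    {"title": "A", "posted": datetime.date(2006, 7, 9), "category": {"name": "Personal"}},
    {"title": "C", "posted": datetime.date(2006, 3, 3), "category": {"name": "Work"}},
    {"title": "D", "posted": datetime.date(2007, 1, 1), "category": {"name": "Personal"}},
]

# Two chained filters versus one filter with both conditions.
chained = filter_rows(filter_rows(rows, posted__year=2006), category__name="Personal")
combined = filter_rows(rows, posted__year=2006, category__name="Personal")
by_title = sorted(combined, key=lambda row: row["title"])
```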

Chaining in jQuery

The parallels to jQuery are pretty clear. The jQuery API is built around chaining, and the jQuery animation library even uses a form of lazy evaluation to automatically queue up effects to run in sequence:

jQuery('div#message').addClass(
	'borderfade'
).animate({
   'borderWidth': '+10px'
}, 1000).fadeOut();

One of the neatest things about jQuery is the plugin model, which takes advantage of JavaScript’s prototype inheritance and makes it trivially easy to add new chainable methods. If we wanted to package the above dumb effect up as a plugin, we could do so like this:

jQuery.fn.dumbBorderFade = function() {
    return this.addClass(
        'borderfade'
    ).animate({
       'borderWidth': '+10px'
    }, 1000).fadeOut();
};

Now we can apply it to an element like so:

jQuery('div#message').dumbBorderFade();

Custom QuerySet methods in Django

Django supports adding custom methods for accessing the ORM through the ability to implement a custom Manager. In the above examples, Entry.objects is the Manager. The downside of this approach is that methods added to a manager can only be used at the beginning of the chain.

Luckily, Managers also provide a hook for returning a custom QuerySet. This means we can create our own QuerySet subclass and add new methods to it, in a way that’s reminiscent of jQuery:

from django.db import models
from django.db.models.query import QuerySet
import datetime

class EntryQuerySet(QuerySet):
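    def on_date(self, date):
        # Half-open range [date, next_day): includes midnight at the start
        # of the day, and avoids shadowing the built-in next().
        next_day = date + datetime.timedelta(days = 1)
        return self.filter(
            posted__gte = date,
            posted__lt = next_day
        )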

class EntryManager(models.Manager):
    def get_query_set(self):
        return EntryQuerySet(self.model)

class Entry(models.Model):
    ...
    objects = EntryManager()

The above gives us a new method on the QuerySets returned by Entry.objects called on_date(), which lets us filter entries down to those posted on a specific date. Now we can run queries like the following:

Entry.objects.filter(
    category__name = 'Personal'
).on_date(datetime.date(2008, 5, 1))
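
The date arithmetic behind on_date() can be checked without a database. A small sketch, assuming posted holds a datetime and treating "on a date" as a half-open interval (start of day inclusive, start of next day exclusive); on_date_bounds() and posted_on() are hypothetical helpers for illustration only.

```python
import datetime


def on_date_bounds(date):
    """Return (start, end) datetimes covering `date` as a half-open interval."""
    start = datetime.datetime.combine(date, datetime.time.min)
    end = start + datetime.timedelta(days=1)
    return start, end


def posted_on(posted, date):
    """True if the `posted` datetime falls on `date`."""
    start, end = on_date_bounds(date)
    return start <= posted < end


day = datetime.date(2008, 5, 1)
start, end = on_date_bounds(day)
```

Making the lower bound inclusive matters: an entry posted at exactly midnight belongs to that day, not the one before.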

Reducing the boilerplate

This method works fine, but it requires quite a bit of boilerplate code—a QuerySet subclass and a Manager subclass plus the wiring to pull them all together. Wouldn’t it be neat if you could declare the extra QuerySet methods inside the model definition itself?

It turns out you can, and it’s surprisingly easy. Here’s the syntax I came up with:

from django.db.models.query import QuerySet

class Entry(models.Model):
   ...
   objects = QuerySetManager()
   ...
   class QuerySet(QuerySet):
       def on_date(self, date):
           return self.filter(
               ...
           )

Here I’ve made the custom QuerySet class an inner class of the model definition. I’ve also replaced the default manager with a QuerySetManager. All this class does is return the QuerySet inner class for the current model from get_query_set. The implementation looks like this:

class QuerySetManager(models.Manager):
    def get_query_set(self):
        return self.model.QuerySet(self.model)
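
If you want to poke at the pattern without a Django project, here is a framework-free sketch of the same wiring. ToyQuerySet, ToyQuerySetManager and in_year() are all hypothetical names made up for this example, and the manager learns its model through Python's __set_name__ hook (Python 3.6+), which is an assumption of the sketch rather than how Django attaches managers.

```python
class ToyQuerySet:
    """Minimal stand-in for a QuerySet (not the real Django class)."""

    def __init__(self, model, rows=None):
        self.model = model
        self.rows = list(model.rows if rows is None else rows)

    def filter(self, predicate):
        # type(self) keeps the subclass, so custom methods survive chaining.
        return type(self)(self.model, [r for r in self.rows if predicate(r)])

    def __iter__(self):
        return iter(self.rows)


class ToyQuerySetManager:
    """Learns its owning model, then defers to that model's inner QuerySet."""

    def __set_name__(self, owner, name):
        # Called by Python when the class body of `owner` is finished.
        self.model = owner

    def get_query_set(self):
        return self.model.QuerySet(self.model)

    def filter(self, predicate):
        return self.get_query_set().filter(predicate)


class Entry:
    # In-memory rows standing in for a database table.
    rows = [
        {"title": "Hello", "category": "Personal", "year": 2008},
        {"title": "Release notes", "category": "Work", "year": 2008},
        {"title": "Old diary", "category": "Personal", "year": 2006},
    ]
    objects = ToyQuerySetManager()

    class QuerySet(ToyQuerySet):
        def in_year(self, year):
            return self.filter(lambda r: r["year"] == year)


# The custom method is available in the middle or at the end of a chain.
result = Entry.objects.filter(lambda r: r["category"] == "Personal").in_year(2008)
titles = [r["title"] for r in result]
```

The key detail is that filter() builds its result with type(self), so the custom methods are still there after any number of chained calls.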

I’m pretty happy with this; it makes it trivial to add custom QuerySet methods and does so without any monkeypatching or deep reliance on Django ORM internals. I think the ease with which this can be achieved is a testament to the quality of the ORM API.

wikinear.com, OAuth and Fire Eagle one month ago

I’m pleased to announce wikinear.com. It’s a simple site that does just one thing: show you a list of the five Wikipedia pages that are geographically closest to your current location. It’s designed (or not-designed) to be used mainly from mobile phones.

You’ll need a Fire Eagle invitation code to use the site. I had four spare, but my invites are now all accounted for. If you don’t have a Fire Eagle account you’ll have to make do with this screenshot instead.

The idea for the site came from living in Oxford for a year. The city is full of beautiful old historic buildings (many of them colleges), but very few of them are labelled or signposted. With wikinear.com and a GPS hooked up to Fire Eagle, I can pull out my phone and see a list of the closest points of interest, plotted on a handy map.

Under the hood the site combines a number of interesting technologies: OAuth, Fire Eagle, GeoNames and the new Google Static Maps API.

OAuth

OAuth was originally designed to solve a problem with OpenID: in an authentication protocol based on browser redirects, how do you authenticate a desktop or command-line application? As it turned out, the solution to that problem solved a bunch of other problems that are unrelated to OpenID, so OAuth now exists as very much its own thing. In essence, it lets users delegate permission to perform actions on their behalf, without having to hand their regular authentication credentials (e.g. username and password) over to a third-party piece of software.

If you’ve ever used a Flickr application that sends you back to Flickr to ask permission to view your private photos you’ll understand what OAuth does straight away. Before OAuth, sites had to invent their own solutions to this problem—complete with smart security measures, their own UI flow and libraries for developers wishing to access their protected APIs. OAuth provides a ready-made solution, complete with tested libraries in a bunch of languages.

If you want to securely expose your users’ private data via an API, OAuth is a no-brainer. I expect to see a lot more of it over the next year.

Fire Eagle

Launched at ETech a few weeks ago, Fire Eagle is a service with enormous potential. You can watch Tom Coates explain it in ten minutes in this video from the conference, but the short version is that Fire Eagle acts as a location broker. It consists of two key OAuth-protected APIs: one for setting the geographical location of a user, and another for retrieving that location.

This leads to a neat separation of concerns. On the one hand are the applications that attempt to figure out your location—GPS receivers, WiFi maps, mobile phones that triangulate nearby cell towers, or even sites that know where you are because you told them (Dopplr and Upcoming, for example, or the Fire Eagle site itself). On the other hand are the applications that do something useful with your location—from restaurant review sites, traffic alert services, friend finders and ARGs down to trivial applications like wikinear.com.

As a developer, this is really exciting. I can build location-based services without having to solve the much bigger problem of figuring out where my users are. Even better, wikinear.com becomes incrementally more useful every time someone builds a new tool for passing location information to Fire Eagle, without me having to do anything at all.

Obviously privacy is a huge concern when dealing with this kind of data. That’s where the Fire Eagle application itself comes in: it provides a simple suite of tools for users to manage the applications that can access their location. Applications can be permitted to access different levels of accuracy or disabled entirely, and there’s a “Hide” button for disabling all applications at once.

Disclaimer: I worked on an early prototype of Fire Eagle as my last project at Yahoo! before leaving in January 2007, but the product that has launched has changed enormously and is entirely the work of the current Fire Eagle team. wikinear.com is inspired by part of that early prototype.

Wikipedia and GeoNames

Wikipedia has a thriving community of geo-hackers, mainly focused around the Maps, Geographical coordinates and Wikipedia-World wiki projects. Many Wikipedia pages (Brighton, for example) have their co-ordinates in the top-right, added using a bewildering array of macros and markup extensions. You can browse through the huge collection of geotagged pages using this KML-powered Google Maps tool—zoom in and wait a few seconds to load in more markers.

The wonderful GeoNames (also used on djangopeople.net) includes an API for querying Wikipedia by location, based on 610,000 articles extracted from a Wikipedia data dump. This was a huge relief when I found it, as “order by distance from X” is actually pretty tricky to do efficiently; I’ve used expanding bounding box searches in the past but I’d love to hear about more effective solutions.
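
For the curious, an expanding bounding box search looks roughly like this. It is a sketch over an in-memory list rather than a database, the place names and coordinates are made up, and nearest() is a hypothetical helper: it filters with a cheap latitude/longitude box, confirms with an exact great-circle distance, and doubles the radius until it has enough hits.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def lon_gap(a, b):
    """Smallest difference between two longitudes, in degrees."""
    gap = abs(a - b) % 360.0
    return min(gap, 360.0 - gap)


def nearest(places, lat, lon, count=5, start_km=1.0, max_km=1000.0):
    """Return names of up to `count` places closest to (lat, lon), nearest first.

    `places` is a list of (name, lat, lon) tuples.
    """
    radius = start_km
    while radius <= max_km:
        delta = radius / EARTH_RADIUS_KM  # angular radius in radians
        dlat = math.degrees(delta)
        ratio = math.sin(delta) / max(math.cos(math.radians(lat)), 1e-12)
        # If the circle reaches a pole, every longitude is in range.
        dlon = math.degrees(math.asin(ratio)) if ratio < 1 else 180.0
        in_box = [p for p in places
                  if abs(p[1] - lat) <= dlat and lon_gap(p[2], lon) <= dlon]
        # Exact check: box corners lie outside the circle, so discard them.
        hits = sorted((haversine_km(lat, lon, p[1], p[2]), p[0]) for p in in_box)
        hits = [h for h in hits if h[0] <= radius]
        if len(hits) >= count:
            return [name for _, name in hits[:count]]
        radius *= 2
    # Fallback: too few places nearby, so rank everything by distance.
    ranked = sorted(places, key=lambda p: haversine_km(lat, lon, p[1], p[2]))
    return [p[0] for p in ranked[:count]]


places = [
    ("Place A", 51.7534, -1.2540),
    ("Place B", 51.7548, -1.2544),
    ("Place C", 51.7520, -1.2577),
    ("London", 51.5074, -0.1278),
    ("Paris", 48.8566, 2.3522),
]
closest_two = nearest(places, 51.752, -1.258, count=2)
```

In a database the box test becomes a pair of indexed range conditions, which is what makes the approach attractive; the exact distance is then only computed for the handful of rows inside the box.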

Google Static Maps

A long-term criticism of the Google Maps API is that it requires JavaScript to display anything at all—once you’ve committed to using it, you’re going to have trouble implementing unobtrusive scripting (although you can work around the problem to some extent). Yahoo! Maps has long been better in this regard, but their map image API is a bit of a pain to use—you have to do an initial call to get back the URL to an image embedded in an XML file, then extract that URL and send it to the browser.

Launched last month, Google’s Static Maps API is a big improvement. As with Google Charts, you need only construct a URL to the image to have it dynamically generated on the fly. You can also specify markers, and optionally omit the initial latitude/longitude/zoom to indicate that you want a best fit for the markers you are displaying. There’s even a flag for a “mobile optimised” image which I’m using for wikinear.com.
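
Building such a URL is just string assembly. The sketch below shows the shape of it, but note the assumptions: the base URL is a placeholder, and the parameter names (size, center, zoom, markers, maptype) are illustrative guesses that should be checked against the API documentation before use.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, not the real service URL.
BASE_URL = "https://maps.example.com/staticmap"


def static_map_url(markers, size=(320, 240), center=None, zoom=None, mobile=False):
    """Build an image URL for a map showing `markers` (a list of (lat, lon))."""
    params = [("size", "%dx%d" % size)]
    if center is not None and zoom is not None:
        params.append(("center", "%.6f,%.6f" % center))
        params.append(("zoom", str(zoom)))
    # With no centre/zoom, the service is expected to pick a best fit.
    params.append(("markers", "|".join("%.6f,%.6f" % m for m in markers)))
    if mobile:
        # Hypothetical flag name for the "mobile optimised" rendering.
        params.append(("maptype", "mobile"))
    return BASE_URL + "?" + urlencode(params)


url = static_map_url([(51.752, -1.258), (51.7534, -1.254)], mobile=True)
```

The resulting string can be dropped straight into an img tag, with no JavaScript involved.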

Mixing it all together

Excluding templates, the entire application comes in at less than 200 lines of code and took around two hours to build. The only persistence is a couple of cookies for storing Fire Eagle tokens; Django’s database layer isn’t even configured (and user locations aren’t logged anywhere, which is great from a privacy point of view). I suppose it’s a classic mashup: Fire Eagle + OAuth + Wikipedia + GeoNames + Google Static Maps = wikinear.com. Despite its simplicity (or maybe because of it), I think it’s a neat demonstration of the kind of applications Fire Eagle enables.

Django People: OpenID and microformats three months ago

In hindsight, it was a mistake to launch Django People without support for OpenID. It was on the original feature list, but in the end I decided to cut any feature that wasn’t completely essential in order to get the site launched before it drowned in an ocean of “wouldn’t-it-be-cool-ifs”.

I thought that, once launched, the site would see a small amount of activity from a few interested parties and I’d have plenty of time to catch up on the feature backlog. What I didn’t expect was that over 750 people would create profiles within the first 24 hours!

So, I spent a few hours this evening integrating my current development version of django-openid, which thankfully had about 80% of the code needed to integrate with Django’s built-in authentication mechanism already written. Sadly the other 20% is either incomplete or a bit of a mess, but I’ve checked it in to a branch on Google Code for anyone who’s interested.

Anyway, there are a few new features on the site of interest to OpenID users:

  1. When signing up for a new account, you now have the option to start by signing in with an OpenID. If you do this, you’ll be able to complete the signup form without having to pick a password. If your OpenID provider supports simple registration, the name, e-mail address and username fields will be filled in for you.
  2. If you already have an existing account, you can associate one or more OpenIDs with that account. You’ll then be able to use any of them to sign in to the account. Why multiple OpenIDs instead of just one? Two reasons: firstly, it opens the potential for doing interesting things with multiple OpenIDs from different providers in the future; secondly, it gives you a fallback if one of your OpenID providers becomes unavailable.
  3. You can freely add and remove OpenIDs from your associations, with one exception: the site won’t let you delete your last OpenID if your account doesn’t also have a password associated with it, to prevent you from locking yourself out.
  4. While I decided that I didn’t want Django People to become yet another OpenID provider, I do want to give people the ability to use their profile page on the site as an OpenID—so that they can prove that they own it (see my recent post on identity projection). To that end, the new account settings page lets advanced OpenID users set up an openid.server and openid.delegate for their profile page, as described in my blog entry from just over a year ago.
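
The lock-out rule from item 3 boils down to a one-line check. Here is a sketch of it; can_remove_openid() and its arguments are hypothetical and not taken from the site's code.

```python
def can_remove_openid(openids, has_password, target):
    """Return True if removing `target` still leaves a way to sign in.

    `openids` is the list of OpenID URLs associated with the account.
    """
    if target not in openids:
        return False  # nothing to remove
    remaining = [openid for openid in openids if openid != target]
    # Safe if another OpenID is left, or if a password can be used instead.
    return bool(remaining) or has_password


# Example identifiers are placeholders.
only_one = ["http://alice.example.com/"]
two = ["http://alice.example.com/", "http://alice.example.org/"]

blocked = can_remove_openid(only_one, has_password=False, target=only_one[0])
allowed = can_remove_openid(two, has_password=False, target=two[0])
```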

One caveat: the site only supports OpenID 1.1, at least for the moment. I had originally planned to go for OpenID 2.0, but demand was such that I decided to get what I had up and running rather than digging in to the OpenID 2.0 libraries.
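
Since the site speaks OpenID 1.1, the delegation setup mentioned above amounts to two link elements in the head of the profile page. A small sketch of generating them follows; delegation_links() is a hypothetical helper and the URLs are placeholders.

```python
from html import escape


def delegation_links(server_url, delegate_url):
    """Return the two OpenID 1.1 delegation <link> elements as a string."""
    return "\n".join([
        '<link rel="openid.server" href="%s">' % escape(server_url, quote=True),
        '<link rel="openid.delegate" href="%s">' % escape(delegate_url, quote=True),
    ])


markup = delegation_links(
    "https://openid.example.com/server",  # the provider's endpoint
    "https://alice.example.com/",         # the identity being delegated to
)
```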

Microformats

While I was messing around with OpenID, Natalie was updating the site’s templates to clean up the crufty code I’d introduced and add some microformatted goodness. The site now uses hCard where you would expect it (country listing pages, skill listing pages and the new search interface) and the profile pages have been updated with a healthy dose of XFN (just rel=“me”, since there isn’t a relevant microformat for “people who live nearby”) and Rel-Tag. On Jeremy Keith’s suggestion, the profile pages also use hResume—all the more reason to add the Django projects you’ve worked on to your profile’s portfolio.

As usual, post feedback and bug reports as comments on this entry.

Elsewhere

Today

  • Processing.js. John Resig’s outstanding port of the Processing visualisation language to JavaScript and Canvas. Runs amazingly well in Firefox 3. One hell of a hack.

6th May 2008

  • Opera Dragonfly. Opera’s new Firebug-style developer console. Out in alpha and it shows (slow to load and the interactive console leaves a lot to be desired) but still looks incredibly promising, especially the remote debugging tools for working with Opera on phones and games consoles.
  • Unobtrusive JavaScript with jQuery. The online handout for the tutorial I gave this morning at XTech.

5th May 2008

  • What amazes me is how close Ruby 1.9 bytecode and Python 2.5 bytecode are. Some things translate almost directly. [...] And, really, if that’s true (and I vouch that it is truly, truly true,) then how are Python and Ruby still on separate runtimes?

    Why the lucky stiff

  • Sneaking Ruby Through Google App Engine (and Other Strictly Python Places). In a characteristic stroke of genius, _why makes a solid initial attempt at compiling Ruby 1.9 source to Python 2.5 bytecode.

4th May 2008

  • Making Time Machine work with the ReadyNAS. Finally, a decent set of instructions on using a ReadyNAS with Time Machine. The trick is to create a local sparse disk image with a magic name (based on hostname and eth0 MAC address), then move it to the NAS.
  • twistori. Lovely implementation of a neat idea for a Twitter app from Amy Hoy and Thomas Fuchs.

2nd May 2008

  • James B. on Pownce. James Bennett has started using Pownce for sort of medium-format blog entries, longer than a tweet but shorter than a blog essay and delivered with a healthy dose of snark.
  • How one site dealt with SQL injection attack. Horrifying story of developer incompetence from Autoweb: “The contractor had no idea how to find and fix the Web page vulnerability that allowed the SQL injection attack code to execute successfully.”
  • Django Users Group London meetup, 19th of May. The inaugural meeting of DJUGL will be on the 19th of May at the Capital Radio building in Leicester Square, sponsored by GCap Media. Three presentations starting at 7pm (I’ll be giving one of them), then on to the pub. Sign up on EventWax; there are only 70 places.