Feed Sign in with OpenID OpenID

Simon Willison’s Weblog

jQuery style chaining with the Django ORM

Django’s ORM is, in my opinion, the unsung gem of the framework. For the subset of SQL that’s used in most web applications it’s very hard to beat. It’s a beautiful piece of API design, and I tip my hat to the people who designed and built it.

Lazy evaluation

If you haven’t spent much time with the ORM, two key features are lazy evaluation and chaining. Consider the following statement:

entries = Entry.objects.all()

Assuming you have created an Entry model of some sort, the above statement will create a Django QuerySet object representing all of the entries in the database. It will not result in the execution of any SQL—QuerySets are lazily evaluated, and are only executed at the last possible moment. The most common situation in which SQL will be executed is when the object is used for iteration:

for entry in entries:
    print entry.title

This usually happens in a template:

<ul>
{% for entry in entries %}
  <li>{{ entry.title }}</li>
{% endfor %}
</ul>

Lazy evaluation works nicely with template fragment caching—even if you pass a QuerySet to a template it won’t be executed if the fragment it is used in can be served from the cache.

You can modify QuerySets as many times as you like before they are executed:

entries = Entry.objects.all()
today = datetime.date.today()
entries_this_year = entries.filter(
    posted__year = today.year
)
entries_last_year = entries.filter(
    posted__year = today.year - 1
)

Again, no SQL has been executed, but we now have two QuerySets which, when iterated, will produce the desired result.

Chaining

Chaining comes in when you want to apply multiple modifications to a QuerySet. Here are blog entries from 2006 that weren’t posted in January:

Entry.objects.filter(
    posted__year = 2006
).exclude(posted__month = 1)

And here’s entries from that year posted to the category named “Personal”, ordered by title:

Entry.objects.filter(
    posted__year = 2006
).filter(
    category__name = "Personal"
).order_by('title')

The above can also be expressed like this:

Entry.objects.filter(
    posted__year = 2006,
    category__name = "Personal"
).order_by('title')

Chaining in jQuery

The parallels to jQuery are pretty clear. The jQuery API is built around chaining, and the jQuery animation library even uses a form of lazy evaluation to automatically queue up effects to run in sequence:

jQuery('div#message').addClass(
	'borderfade'
).animate({
   'borderWidth': '+10px'
}, 1000).fadeOut();

One of the neatest things about jQuery is the plugin model, which takes advantage of JavaScript’s prototype inheritance and makes it trivially easy to add new chainable methods. If we wanted to package the above dumb effect up as a plugin, we could do so like this:

jQuery.fn.dumbBorderFade = function() {
    return this.addClass(
        'borderfade'
    ).animate({
       'borderWidth': '+10px'
    }, 1000).fadeOut();
};

Now we can apply it to an element like so:

jQuery('div#message').dumbBorderFade();

Custom QuerySet methods in Django

Django supports adding custom methods for accessing the ORM through the ability to implement a custom Manager. In the above examples, Entry.objects is the Manager. The downside of this approach is that methods added to a manager can only be used at the beginning of the chain.

Luckily, Managers also provide a hook for returning a custom QuerySet. This means we can create our own QuerySet subclass and add new methods to it, in a way that’s reminiscent of jQuery:

from django.db import models
from django.db.models.query import QuerySet
import datetime

class EntryQuerySet(QuerySet):
    def on_date(self, date):
        next = date + datetime.timedelta(days = 1)
        return self.filter(
            posted__gt = date,
            posted__lt = next
        )

class EntryManager(models.Manager):
    def get_query_set(self):
        return EntryQuerySet(self.model)

class Entry(models.Model):
    ...
    objects = EntryManager()

The above gives us a new method on the QuerySets returned by Entry.objects called on_date(), which lets us filter entries down to those posted on a specific date. Now we can run queries like the following:

Entry.objects.filter(
    category__name = 'Personal'
).on_date(datetime.date(2008, 5, 1))

Reducing the boilerplate

This method works fine, but it requires quite a bit of boilerplate code—a QuerySet subclass and a Manager subclass plus the wiring to pull them all together. Wouldn’t it be neat if you could declare the extra QuerySet methods inside the model definition itself?

It turns out you can, and it’s surprisingly easy. Here’s the syntax I came up with:

from django.db.models.query import QuerySet

class Entry(models.Model):
   ...
   objects = QuerySetManager()
   ...
   class QuerySet(QuerySet):
       def on_date(self, date):
           return self.filter(
               ...
           )

Here I’ve made the custom QuerySet class an inner class of the model definition. I’ve also replaced the default manager with a QuerySetManager. All this class does is return the QuerySet inner class for the current model from get_query_set. The implementation looks like this:

class QuerySetManager(models.Manager):
    def get_query_set(self):
        return self.model.QuerySet(self.model)

I’m pretty happy with this; it makes it trivial to add custom QuerySet methods and does so without any monkeypatching or deep reliance on Django ORM internals. I think the ease with which this can be achieved is a testament to the quality of the ORM API.

This is jQuery style chaining with the Django ORM by Simon Willison, posted on 1st May 2008.

Tagged , , , , , ,

View blog reactions

Next: Debugging Django

Previous: wikinear.com, OAuth and Fire Eagle

17 comments

  1. Is it intentional to keep the same name (QuerySet) for your own class?

    Does it require the qs-rf merge?

    David, biologeek - 1st May 2008 13:28 - #

  2. Does this approach have some advantages compared to the following snippet: http://www.djangosnippets.org/snippets/562/ (the first comment in the snippet provides also a more general way).

    Thanks

    Francesco - 1st May 2008 15:42 - #

  3. The continued implication by django devs who should know better that SQLAlchemy is Too Hard reminds me of the similar FUD spread by MySQL fanboys against PostgreSQL.

    In both cases, the more powerful product is also surprisingly easy to use on the low end.

    Jonathan Ellis - 1st May 2008 15:48 - #

  4. I would only ask that the "SQLAlchemy is too hard" crowd spend some time with a version of SQLA thats from the past year or so before continuing with this meme...in my personal experience, this opinion is universally based on experiences with extremely old versions, like 0.1 and 0.2, which I took great pains to illustrate in my Pycon talk are ancient history.

    In particular, the SQLA version of the examples above would look extremely similar - for the most part, it's query.join('name').filter_by(subname='foo') instead of query.filter(name__subname='foo').

    Anyway, the Django ORM is orders of magnitude more popular than that of SQLA, and the Queryset merge looks great so theres no need to advertise it at SQLA's expense; it will do quite well on its own.

    Mike Bayer - 1st May 2008 16:37 - #

  5. As for the chaining:
    after qs-refactor you can't chain filters anymore, when they apply to the same table. I had to rewrite code to get it working quick again, because qs-rf joined too many tables and made some junk out of it.
    Before that, I had a manager that pre-filters a queryset and a filter in the view. That seems not possible now. Any ideas?

    Julian - 1st May 2008 16:54 - #

  6. That would be even more convenient if the default behavior of Manager was to take the QuerySet class of Model if defined.

    Grégoire Cachet - 1st May 2008 17:32 - #

  7. I agree, being able to work with an object, add criteria to it and delay execution of the SQL is a great feature to have. I find myself wanting this will ActiveRecord in Rails all the time. It's one of the things I like best about the DataMapper (http://datamapper.org) Ruby ORM framework that is being developed.

    Paul Barry - 1st May 2008 17:57 - #

  8. Paul, if DataMapper can do that, it's news to me!

    JD - 1st May 2008 20:49 - #

  9. If you don't hit the database until you're in a template, how do you handle exceptions and communicate them to both the user (in an vague but friendly way) and the developer (in full form, with backtraces etc., in server logs and so on)?

    John - 1st May 2008 21:00 - #

  10. John: You hit the DB whenever the queryset needs to be evaluated, if that means just iterating over something in a template it won't get evaluated until then, however if you need to do more complex business logic, your DB will be hit in the view

    Alex - 1st May 2008 21:16 - #

  11. Regardless of the complexity of the business logic, my question remains: what do you do about errors when you do run the query from within a template? (For example, say the database is unreachable or throws a lock timeout error or something.) If you never get all the way into the template with an "unexecuted" QuerySet, then there's no problem. But if you ever do that, then I imagine you need some way to handle errors. And in your post you say that "this usually happens in a template."

    John - 1st May 2008 22:15 - #

  12. @John: In Django your view function usually ends up with a call like:

    return render_to_response('some_template', data)

    So you can easily put a try/except around that for basic error handling if you want to override the default behavior of mapping exceptions to a 500 error.

    Also, you can force evaluation of the query earlier by simply using it in a list context.

    I guess in most cases I think if there is an exception evaluating the SQL, it's most likely a bug (filtering by a non-existent column) such that there wouldn't be a good way to handle it in the template OR the view so that I'd just let it pass up the chain and track these down during testing.

    If it is some sort of DB error that you can handle and can recover from in a more graceful manner, then my previous suggestions provide ways to deal with these errors.

    David Avraamides - 1st May 2008 23:06 - #

  13. I was about to post a justification of my opinion that Django's ORM is easier to use, but on further thought I realised that mentioning SQLAlchemy in this post was a mistake. I've now removed that sentence (for the record, it was "While not as full-featured as SQLAlchemy, what it lacks in power it more than makes up for in easy of use."); it added nothing useful to this article, and in fact acted as a distraction from the point I was trying to make.

    I apologise for the poorly supported remark. I would be happy to expand on my reasons for finding Django's ORM easier to use, but I don't think this post is the right place for doing so.

    I'd like to emphasize that I have an enormous amount of respect for the SQLAlchemy project. I mentioned it here only because I often feel that Django's ORM is considered a toy in comparison, which I believe to be unfair. Again, this is a discussion for a different venue.

    Simon Willison - 2nd May 2008 01:06 - #

  14. Can I weigh in with a remark about the "SQL is too hard" crowd? ;-)

    Paul Boddie - 2nd May 2008 13:40 - #

  15. Hi Simon,

    Can the inner QuerySet contain multiple custom methods that can be chained lazily?

    In other words, is it possible to do the following without evaluation until iteration?

    class Entry(models.Model):
    ___...
    ___objects = QuerySetManager()
    ___...
    ___class QuerySet(QuerySet):
    ______def on date(self, date):
    _________return self.filter(
    ____________...
    _________)
    ______def only active(self):
    _________return self.filter(
    ____________...
    _________)

    Entry.objects.on date(date=datetime.date.today()).only active().filter(title starts with='W')

    Thanks, Leif

    Leif - 8th May 2008 20:01 - #

  16. I've run into a problem using this approach.

    I created a model with a QuerySetManager and QuerySet inner class, as described in the post. Then tried to access the new model's manager from a different model using a RelatedManager.

    In other words, I did something like:

    relatedmodel.mynewmodel_set.custom_filter()

    However, I got the following error:

    AttributeError: 'RelatedManager' object has no attribute 'custom_filter'

    Any idea what this is? Am I doing something wrong, or is this a problem with the way models create RelatedManagers?

    Thanks,
    Leif

    Leif - 12th May 2008 22:12 - #

  17. @Leif : I am facing a similar issue. I have defined a custom method that order my QuerySet called : "*sort_by_order*". This work fine when I call it directly like in : Question.objects.filter(survey=survey).sort_by_ord er() but raise an Exception when I call it using the RelatedManager : survey.questions.all().sort_by_order()

    Thank you

    yml - 26th May 2008 11:06 - #

Comments are closed.
A django site