Simon Willison’s Weblog

Cache Machine: Automatic caching for your Django models. This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis.
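
The flush-list idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Cache Machine's actual code: the class name `FlushListCache`, its methods, and the key strings are all hypothetical, and an in-process dict stands in for memcached.

```python
from collections import defaultdict


class FlushListCache:
    """Toy in-process cache (hypothetical; a dict stands in for memcached)."""

    def __init__(self):
        self.entries = {}                    # cache key -> cached value
        self.flush_lists = defaultdict(set)  # object key -> dependent cache keys

    def set(self, cache_key, value, depends_on):
        """Store a value and register it on each dependency's flush list."""
        self.entries[cache_key] = value
        for object_key in depends_on:
            self.flush_lists[object_key].add(cache_key)

    def get(self, cache_key):
        """Return the cached value, or None on a miss."""
        return self.entries.get(cache_key)

    def invalidate(self, object_key):
        """Drop every cache entry that depends on the given object."""
        for cache_key in self.flush_lists.pop(object_key, set()):
            self.entries.pop(cache_key, None)


cache = FlushListCache()
cache.set("query:featured", ["addon:1", "addon:2"],
          depends_on=["addon:1", "addon:2"])
cache.set("query:popular", ["addon:2", "addon:3"],
          depends_on=["addon:2", "addon:3"])

# In a real setup this call would be fired from a model signal hook.
cache.invalidate("addon:1")
```

After the last call, "query:featured" is gone while "query:popular" is still cached. In this toy the flush list is updated in-process; with a plain key-value store the list typically has to be read, modified, and written back, which is presumably where the race condition mentioned above comes from.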

6 comments

  1. I mentioned cache-machine briefly in my blogpost about johnny-cache. It works very differently from Johnny, but is a good kind of different that would work well for many projects.

    The code is short and sweet, too. I had a chat with Jeff about it to try to understand some things, since it actually came out right as the finishing touches were being made on Johnny.

    I hope to read up a bit on cachebot's code to see what it's doing in there, and then write some kind of comprehensive article comparing them. There seems to be a lot more code there, and I've been waiting for the list of caveats/gotchas to be resolved.

    Johnny is a very non-intrusive caching layer; "perfect coherency" is a design goal and over-invalidation is thus an unfortunate but inescapable side-effect. The approach cachemachine takes doesn't really preclude them from being used together effectively. Everyone who thought "Invalidates the whole table on every write?" when Johnny was released should look at cachemachine; its coherency guarantees aren't as strict, but you have a lot more control over where it goes.

    I also think it's interesting that all 3 projects have found the need to write custom backends re-defining the meaning of 0; perhaps this (or a custom singleton value like cachemachine's locmem Infinity) makes sense as a patch on Django's cache backends themselves? Or is it just a sign that pluggable cache backends are working just fine :)

    Jason Moiron - 11th March 2010 22:24 - #

  2. I think there's definitely a strong argument for modifying Django to make it easier to implement QuerySet caching of some sort. If there's a low-level hook that would make it easier to add caching (just like multi-db adds database routers, which make it easy to hook up a multi-db scheme without controlling the fine details), I'd be very much in favour of it going into Django.

    Simon Willison - 11th March 2010 22:33 - #

  3. I'm the author of django-cachebot, and I've looked through cachemachine and a little bit at johnny-cache. Just having looked at cachemachine's source and not actually used it, I didn't see how adding a new row would invalidate the cache, or how it handles reverse relations.

    As for the caveats in cachebot, most of them should be fixed when Django 1.2 comes out. We're using cachebot at work on Django 1.1, so until then that's what I have to develop against.

    All of the projects have good ideas on how to solve this problem (I especially like that johnny-cache is transaction aware, I'll probably have to steal this feature) and it'd be nice if we could merge the best of breed into a single project and propose a merge into django core.

    David Ziegler - 12th March 2010 16:22 - #

  4. @Jason Once addons.mozilla.org isn't running PHP and Python in parallel, I'd like to drop in johnny-cache underneath cache-machine. It'll let us keep lower timeouts without feeling too bad about it.

    @David Cache Machine doesn't do anything when new rows are added. Any queries that would include that object will stay in cache until timeout (or until an object already in the query forces an invalidation). So far, this works fine for me. I'd like to add more fine-grained timeout control so that we can bump up the default timeout and still keep lists refreshing reasonably.

    For reverse relations, cache machine looks in _meta.fields for any ForeignKeys and adds the relation to the object's flush list. This lets invalidation follow foreign keys.

    Since the first release, I've added template fragment caching using the same flush lists as objects (so we retain invalidation). This is a big boost since we don't bother with unpickling objects and running through the template. I've been meaning to cut a new release and write about the performance numbers.

    Jeff Balogh - 12th March 2010 22:29 - #

  5. @Jeff
    As far as I know _meta.fields is only for forward relations. For instance User._meta.fields won't bring up any related fields because it doesn't have any forward relations. You'd have to do User._meta.get_all_related_many_to_many_objects() and User._meta.get_all_related_objects() to get its reverse relations.

    David Ziegler - 13th March 2010 03:34 - #

  6. I think the existence of all these managers* is a good argument that Django's caching framework should include such functionality. It would be nice if all contributors could come together and create a nice hybridized approach that allows you to tune the coherency needs in a fine-grained way.

    * - I (thought I) came up with this idea myself and was considering writing one until I realized so many people had already made one.

    Robert - 13th March 2010 15:47 - #
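
The behaviour Jeff describes in comment 4, where a cached query does not notice newly added rows until its timeout passes, can be sketched as follows. This is a minimal illustration and not code from any of the three projects: `TimedQueryCache`, `get_or_compute`, the fake clock, and the `rows` list standing in for a database table are all assumptions made for the example.

```python
class TimedQueryCache:
    """Hypothetical query cache whose entries expire only by timeout."""

    def __init__(self, timeout, clock):
        self.timeout = timeout
        self.clock = clock   # callable returning the current time in seconds
        self.entries = {}    # cache key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return a live cached value, otherwise compute and store a new one."""
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = compute()
        self.entries[key] = (now + self.timeout, value)
        return value


now = [0.0]                    # fake clock keeps the example deterministic
rows = ["addon-a", "addon-b"]  # stand-in for a database table
cache = TimedQueryCache(timeout=60, clock=lambda: now[0])

first = cache.get_or_compute("all-addons", lambda: list(rows))
rows.append("addon-c")         # a new row is added; nothing is invalidated
stale = cache.get_or_compute("all-addons", lambda: list(rows))
now[0] = 61.0                  # advance the clock past the timeout
fresh = cache.get_or_compute("all-addons", lambda: list(rows))
```

Here `stale` still holds the two original rows, and `fresh` picks up the third only after the timeout. That is the trade-off discussed in the thread: a shorter timeout keeps lists fresher at the cost of more cache misses.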
