Simon Willison’s Weblog

41 items tagged “caching”

nginx proxy-cache-lock (via) Crucially important feature hidden away in the nginx documentation: proxy_cache_lock enables request coalescing, or dog-pile protection: it means that if a hundred simultaneous requests all suffer the same cache miss, only one request is made to the backend and the answer is then sent back to all hundred requests at once. I’ve leaned heavily on this feature in Varnish for years—useful to know that nginx has the same capability. # 14th November 2017, 9:53 pm

Display your events on your own website with Lanyrd Badges. We’ve launched badges for Lanyrd—JavaScript that lets you embed a top bar or a content “splat” showing events you plan to attend, talks you’ve given in the past and other various combinations. I’m quite pleased with the implementation—the badges are configured using classes on a link to your Lanyrd profile, and the badges themselves are served through a combination of Amazon CloudFront for the initial script and a Varnish cache for the badge data itself to keep things nice and snappy. # 13th January 2011, 8:38 pm

Reddit is now running on Cassandra. Migrating their persistent cache over from memcacheDB to Cassandra took one developer just ten days. # 13th March 2010, 12:14 am

Cache Machine: Automatic caching for your Django models. This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis. # 11th March 2010, 7:35 pm

Announcing django-cachebot. The ORM caching space around Django is heating up. django-cachebot is used in production at mingle.com and takes a more low level approach to cache invalidation than Johnny Cache, enabling you to specifically mark the querysets you wish to cache and providing some advanced options for cache invalidation. Unfortunately it currently relies on a patch to Django core to enable its own manager. # 6th March 2010, 12:48 pm

Is johnny-cache for you? “Using Johnny is really adopting a particular caching strategy. This strategy isn’t always a win; it can impact performance negatively”—but for a high percentage of Django sites there’s a very good chance it will be a net bonus. # 2nd March 2010, 11:44 am

jmoiron.net: Johnny Cache. The blog entry announcing Johnny Cache (“a drop-in caching library/framework for Django that will cache all of your querysets forever in a consistent and safe manner”) to the world. # 1st March 2010, 11:48 am

Johnny Cache. Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully. # 28th February 2010, 10:55 pm

Internet Explorer Cookie Internals (FAQ). Grr... IE 6, 7 and 8 don’t support the max-age cookie argument, forcing you to use an explicit expiry date instead. This appears to affect the cache busting cookie pattern, where you set a cookie to expire in 30 seconds for any user who posts content and use the presence of that cookie to skip caches and/or send their queries to a master instead of slave database. If you have to use expires, users with incorrect system clocks may get inconsistent results. Anyone know of a workaround? # 26th February 2010, 12:25 pm

High-end Varnish-tuning. Tuning the Varnish HTTP cache to serve 27K requests/second on a single core 2.2GHz Opteron. # 20th October 2009, 9:25 am

Caching in ASP.NET with the SqlCacheDependency Class. Interesting cache invalidation concept: set up dependencies between cache entries and tables or rows in the database, then use triggers (which I presume are automatically created for you) to clear your cache. # 18th August 2009, 12:15 pm

Memcached 1.4.0 released. The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration. # 17th July 2009, 10:26 pm

Yahoo! proposal to open source “Traffic Server” via the ASF. Traffic Server is a “fast, scalable and extensible HTTP/1.1 compliant caching proxy server” (presumably equivalent to things like Squid and Varnish) originally acquired from Inktomi and developed internally at Yahoo! for the past three years, which has been benchmarked handling 35,000 req/s on a single box. No source code yet but it looks like the release will arrive pretty soon. # 7th July 2009, 12:37 pm

cache-money. A “write-through caching library for ActiveRecord”, maintained by Nick Kallen from Twitter. Queries hit memcached first, and caches are automatically kept up-to-date when objects are created, updated and deleted. Only some queries are supported—joins and comparisons won’t hit the cache, for example. # 28th June 2009, 3:17 pm

Twitter, an Evolving Architecture. The most detailed write-up of Twitter’s current architecture I’ve seen, explaining the four layers of cache (all memcached) used by the Twitter API. # 28th June 2009, 3:09 pm

Django tip: Caching and two-phased template rendering. Neat trick for expensive pages which can be mostly cached with the exception of the “logged in as” bit—run them through the template system twice, caching the intermediary generated template. # 19th May 2009, 1:34 am

mmalone’s django-caching. Mike Malone shares code used by Pownce to add QuerySet level caching to Django. It’s a smart implementation—a CachingQuerySet class inspects the arguments passed to get(), and if they’re just a straight forward exact PK lookup hits memcache for the object before hitting the database. Signals are used to invalidate the cache. # 7th May 2009, 7:36 am

hash_ring 1.2. A Python library for consistent hashing with memcached, using MD5 and the same algorithm as libketama. Exposes an interface that is identical to regular memcache making this a drop-in replacement. # 5th May 2009, 1:45 pm

How search.twitter.com uses Varnish. Includes examples of the configuration options they use. # 2nd March 2009, 5:08 pm

Sharding Counters on Google App Engine. “While the datastore for App Engine scales to support a huge number of entities it is important to note that you can only expect to update any single entity, or entity-group, about five times a second”. This article explains a technique for sharding writes across multiple counters in detail, including a way to keep a memcache counter updated at the same time for faster reads. # 27th January 2009, 8:27 pm

ETags And Modification Times In Django. Part of Malcolm’s series of tutorials on implementing advanced HTTP concepts in Django. # 13th December 2008, 9:49 am

REST, I just don’t get it. Read the comments for some excellent practical reasons to care about REST, including cache management (PUT and DELETE can expire the cache entries for the corresponding GET), the ability to add or move parts of the server API without redeploying client libraries and the idempotency of GET / PUT / DELETE and HEAD (repeated POST operations may have side-effects). # 15th August 2008, 8:20 am

Facelift Image Replacement. Like sIFR but with JavaScript and a PHP text rendering component. I question the need for the JavaScript if you’re already generating the images on the server, but the actual generation script is nicely done—it makes smart use of ImageMagick and caches the generated images. # 5th August 2008, 6:36 pm

quipt (via) Extremely clever idea: Cache JavaScript in window.name (which persists between page views and can hold several MB of data), but use document.referrer to check that an external domain hasn’t loaded the cache with malicious code for an XSS attack. UPDATE: Jesse Ruderman points out a fatal flaw in the comments. # 4th July 2008, 3:49 pm

Velocity: A Distributed In-Memory Cache from Microsoft. I’d been wondering what Microsoft ecosystem developers were using in the absence of memcached. Is Velocity the first Windows platform implementation of this idea? # 6th June 2008, 9:52 pm

Obscure bugs revisited: IE, HTTPS and plugins. Filed for future reference: IE breaks mysteriously if you serve it up plugin content (e.g. Flash) over HTTPS with a no-cache header—it deletes the file from cache before the plugin software gets a chance to open it. # 30th May 2008, 9:54 am

so-you-wanna-see-an-image (via) WordPress.com use Amazon S3 to store images (presumably to save having to create a massive scalable redundant filesystem themselves) but the images are served via a load balanced memcached / varnishd caching system that they control. # 1st May 2008, 10:13 am

hash. Douglas Crockford: “Any HTML tag that accepts a src= or href= attribute should also be allowed to take a hash= attribute”—to protect against file tampering and (more importantly) provide a truly robust caching mechanism. # 30th March 2008, 6:34 pm

Consistent Hashing. Beautifully clear explanation of consistent hashing, a simple technique that allows you to add new caching servers to a cluster without re-hashing your keys and hence invalidating all of your caches. # 18th March 2008, 1 am

Two data streams for a happy website. Useful architectural concept for scaling: keep user-specific and generic data separate from the start, in recognition of their different caching and partitioning constraints. # 4th March 2008, 4:40 am