Simon Willison’s Weblog

32 items tagged “memcached”

What are the best books/tutorials to begin learning about memcached?

There isn’t really enough of memcached to justify a whole book; it’s a pretty straightforward API.

[... 100 words]

What are people’s experiences using Memcached?

That it’s so obviously a good idea (and works so well) that you’d be crazy not to use it. As far as I’m concerned, it’s part of the default stack for any web application.

[... 46 words]

What is the best way to list every key stored in memcached?

Redis might be a better bet for this—it has a “KEYS *” command which can return every key in the dataset, and its GET and SET performance are comparable to memcached.

[... 113 words]

ElasticSearch memcached module. Fascinating idea: the ElasticSearch search server provides an optional memcached protocol plugin, for added performance, which maps simple HTTP operations to memcached commands. GET is mapped to memcached get commands, POST is mapped to set commands. This means you can use any memcached client to communicate with the search server. # 15th May 2010, 10:17 am

Introduction to nginx.conf scripting. Slideshow: hit left arrow to navigate through the slides. The nginx community is officially nuts. Starts out with a simple “Hello world” using the echo module, then rapidly descends down the rabbit hole into array operations, sub-requests, memcached connection pooling and eventually non-blocking Drizzle SQL execution against a sharded cluster, all implemented in the nginx.conf configuration file. # 21st April 2010, 11:40 pm

Cache Machine: Automatic caching for your Django models. This is the third new ORM caching layer for Django I’ve seen in the past month! Cache Machine was developed for zamboni, the port of addons.mozilla.org to Django. Caching is enabled using a model mixin class (to hook up some post_delete hooks) and a custom caching manager. Invalidation works by maintaining a “flush list” of dependent cache entries for each object—this is currently stored in memcached and hence has potential race conditions, but a comment in the source code suggests that this could be solved by moving to redis. # 11th March 2010, 7:35 pm
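
A minimal sketch of the “flush list” idea, not Cache Machine’s actual code: a plain dict stands in for memcached, and the class and method names are made up for illustration. The read-modify-write on the flush list is where the race condition mentioned above would creep in on a real memcached server.

```python
class FlushListCache:
    """Toy flush-list invalidation; self.store stands in for memcached."""

    def __init__(self):
        self.store = {}

    def cache_query(self, query_key, object_ids):
        # Cache the query result (here just the list of object ids).
        self.store[query_key] = list(object_ids)
        for obj_id in object_ids:
            flush_key = f"flush:{obj_id}"
            # Get, modify, set: two servers doing this at once could
            # overwrite each other's additions on a real memcached.
            flush_list = self.store.get(flush_key, set())
            flush_list.add(query_key)
            self.store[flush_key] = flush_list

    def invalidate(self, obj_id):
        # Called when an object is saved or deleted: drop every cached
        # query that depended on it, along with the flush list itself.
        for query_key in self.store.pop(f"flush:{obj_id}", set()):
            self.store.pop(query_key, None)
```

Calling `invalidate(1)` after caching a query that returned object 1 removes that query from the store and leaves unrelated queries alone.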

Johnny Cache. Clever twist on ORM-level caching for Django. Johnny Cache (great name) monkey-patches Django’s QuerySet classes and caches the result of every single SELECT query in memcached with an infinite expiry time. The cache key includes a “generation” ID for each dependent database table, and the generation is changed every single time a table is updated. For apps with infrequent writes, this strategy should work really well—but if a popular table is being updated constantly the cache will be all but useless. Impressively, the system is transaction-aware—cache entries created during a transaction are held in local memory and only pushed to memcached should the transaction complete successfully. # 28th February 2010, 10:55 pm
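
Here is a rough sketch of generation-based cache keys, under loud assumptions: a dict stands in for memcached, the function names are hypothetical, and none of this is Johnny Cache’s real implementation (which patches the ORM and handles transactions).

```python
import hashlib

generation_cache = {}  # stands in for memcached


def generation(table):
    # Current generation counter for a table, starting at 0.
    return generation_cache.setdefault(f"gen:{table}", 0)


def query_key(sql, tables):
    # The key embeds the generation of every table the query depends on,
    # so bumping any one of them makes old entries unreachable.
    gens = ",".join(f"{t}={generation(t)}" for t in sorted(tables))
    return hashlib.md5(f"{gens}|{sql}".encode()).hexdigest()


def cached_select(sql, tables, run_query):
    key = query_key(sql, tables)
    if key not in generation_cache:
        generation_cache[key] = run_query(sql)  # no expiry, as described
    return generation_cache[key]


def table_written(table):
    # Call after any INSERT, UPDATE or DELETE against the table.
    generation_cache[f"gen:{table}"] = generation(table) + 1
```

Nothing is ever deleted explicitly: stale entries simply stop being asked for, and on a real memcached they would eventually fall out through LRU eviction. This is also why a constantly updated table makes the cache useless, since every write changes the key.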

Distributed lock on top of memcached. A simple Python context manager (taking advantage of the with statement) that implements a distributed lock using memcached to store lock state: “memcached_lock can be used to ensure that some global data is only updated by one server”. Redis would work well for this kind of thing as well. # 1st February 2010, 10:15 am
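
A sketch of how such a lock might look, assuming only that the client offers memcached’s atomic add (store the value only if the key is absent) and delete. The FakeMemcached class and the memcached_lock_sketch name are stand-ins so the example runs without a server; this is not the linked library’s code.

```python
import time
from contextlib import contextmanager


class FakeMemcached:
    """In-memory stand-in with add() and delete(), for illustration only."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Like memcached's add: succeed only if the key does not exist.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


@contextmanager
def memcached_lock_sketch(client, name, retries=50, delay=0.01):
    key = f"lock:{name}"
    for _ in range(retries):
        if client.add(key, 1):  # whoever wins the add holds the lock
            break
        time.sleep(delay)
    else:
        raise TimeoutError(f"could not acquire {key}")
    try:
        yield
    finally:
        client.delete(key)  # release the lock even if the body raised
```

Usage would be `with memcached_lock_sketch(client, "global-data"): ...`. Against a real server you would probably also want an expiry time on the lock key, so a crashed holder can’t block everyone forever.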

Crowdsourced document analysis and MP expenses

As you may have heard, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here’s what we built:

[... 2051 words]

dustin’s gomemcached. A memcached server written in Go, an experiment by memcached maintainer Dustin Sallings. # 13th November 2009, 3:13 pm

memcache-top. Useful self-contained perl script for interactively monitoring a group of memcached servers. # 29th October 2009, 8:32 am

How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine. # 21st October 2009, 9:14 pm

Ravelry. Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet. # 3rd September 2009, 6:50 pm

Memcached 1.4.0 released. The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration. # 17th July 2009, 10:26 pm

cache-money. A “write-through caching library for ActiveRecord”, maintained by Nick Kallen from Twitter. Queries hit memcached first, and caches are automatically kept up-to-date when objects are created, updated and deleted. Only some queries are supported—joins and comparisons won’t hit the cache, for example. # 28th June 2009, 3:17 pm

Twitter, an Evolving Architecture. The most detailed write-up of Twitter’s current architecture I’ve seen, explaining the four layers of cache (all memcached) used by the Twitter API. # 28th June 2009, 3:09 pm

hash_ring 1.2. A Python library for consistent hashing with memcached, using MD5 and the same algorithm as libketama. Exposes an interface that is identical to regular memcache making this a drop-in replacement. # 5th May 2009, 1:45 pm
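
To illustrate the general technique, here is a simplified consistent hash ring using MD5. It is a sketch only: the class name, the replicas parameter and the way points are placed are my own assumptions, so it will not map keys to the same servers as hash_ring or libketama.

```python
import bisect
import hashlib


class HashRingSketch:
    def __init__(self, nodes, replicas=100):
        self._ring = {}    # point on the ring -> node name
        self._points = []  # sorted list of points
        for node in nodes:
            # Several virtual points per node spread the load more evenly.
            for i in range(replicas):
                point = self._hash(f"{node}-{i}")
                self._ring[point] = node
                bisect.insort(self._points, point)

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_node(self, key):
        # Walk clockwise to the first point after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[self._points[idx]]
```

The useful property is that removing a server only remaps the keys that lived on that server; every other key stays where it was, which keeps most of the cache warm.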

peeping into memcached. “Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state”; you can then load the resulting data into MySQL using LOAD DATA LOCAL INFILE and analyse it using standard SQL queries. # 20th April 2009, 6:35 pm

Tokyo Cabinet: Beyond Key-Value Store. Useful overview of Yet Another Scalable Key Value Store. Interesting points: multiple backends (hash table, B-Tree, in memory, on disk), a “table” engine which enables more advanced queries, a network server that supports HTTP, memcached or its own binary protocol and the ability to extend the engine with Lua scripts. # 14th February 2009, 11:17 am

Rate limiting with memcached

On Monday, several high profile “celebrity” Twitter accounts started spouting nonsense, the victims of stolen passwords. Wired has the full story—someone ran a dictionary attack against a Twitter staff member, discovered their password and used Twitter’s admin tools to reset the passwords on the accounts they wanted to steal.

[... 910 words]

Scaling memcached at Facebook. Fascinating techie details on how Facebook forked memcache to use UDP and increase performance from 50,000 requests a second to 200,000. Now running on 800 servers with 28 TB of memory, and their code is on GitHub. (They may scale like crazy, but they can’t put their blog entry title in the title element?) # 13th December 2008, 10:08 am

Facebook engineering notes on Scaling Out. Jason Sobel explains a couple of tricks Facebook use to deal with consistency between their California and Virginia data centres. The first is to hijack the MySQL replication stream to include information about memcached records to invalidate; the second is to use Layer 7 load balancers which inspect a “last modification time” cookie and send users to the masters in California if they have updated their profile in the past 20 seconds. # 20th August 2008, 11:51 pm
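
The second trick boils down to a very small routing decision. A sketch, with hypothetical names and return values; the only detail taken from the write-up is the 20 second window.

```python
import time

RECENT_WRITE_WINDOW_SECONDS = 20  # the figure quoted in the write-up


def choose_backend(last_modified_cookie, now=None):
    """Pick a data centre from the "last modification time" cookie.

    last_modified_cookie is a Unix timestamp, or None if the cookie
    is absent. The return values are labels made up for this sketch.
    """
    now = time.time() if now is None else now
    if (last_modified_cookie is not None
            and now - last_modified_cookie < RECENT_WRITE_WINDOW_SECONDS):
        # Replication may not have caught up yet, so read from the masters.
        return "california-masters"
    return "nearest-data-centre"
```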

Velocity: A Distributed In-Memory Cache from Microsoft. I’d been wondering what Microsoft ecosystem developers were using in the absence of memcached. Is Velocity the first Windows platform implementation of this idea? # 6th June 2008, 9:52 pm

App Engine Fan: Efficient Global Counters. Implementing efficient counters in Google App Engine, using shards and/or memcached. # 3rd June 2008, 12:56 am
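
The sharding idea in miniature, as a plain Python sketch rather than App Engine code: a list stands in for the datastore entities, and the class name and shard count are assumptions.

```python
import random


class ShardedCounter:
    def __init__(self, num_shards=20):
        # One slot per shard; on App Engine each would be its own entity.
        self.shards = [0] * num_shards

    def increment(self):
        # Writes land on a random shard, so concurrent writers
        # rarely contend for the same one.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1

    def total(self):
        # Reads sum every shard; this sum is the value you might
        # keep in memcached to avoid recomputing it on every hit.
        return sum(self.shards)
```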

so-you-wanna-see-an-image. WordPress.com use Amazon S3 to store images (presumably to save having to create a massive scalable redundant filesystem themselves) but the images are served via a load balanced memcached / varnishd caching system that they control. # 1st May 2008, 10:13 am

Nginx and Memcached, a 400% boost! Ilya Grigorik wrote up my current favourite nginx trick: you set nginx to check memcached for a cache entry matching the current URL on every hit, then invalidate your cache by pushing a new cache record straight into memcached from your application server. # 11th February 2008, 10:05 pm

RubyForge: Starling. “Starling is a light-weight persistent queue server that speaks the MemCache protocol. It was built to drive Twitter’s backend, and is in production across Twitter’s cluster.” # 11th January 2008, 9:47 pm

NginxMemcachedModule. nginx can be set up to directly serve a URL from memcache if the corresponding cache key is set, and fall back to a backend application server otherwise. Application servers can then write directly to memcache when content needs to be cached or goes stale. # 15th December 2007, 1:59 am
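
The request flow can be sketched in Python, with a dict standing in for memcached and every function name invented for the example. In the real setup the lookup and fallback happen inside nginx, not in application code.

```python
page_cache = {}  # stands in for memcached, keyed by URL


def render_page(url):
    # Placeholder for the (slow) application server.
    return f"<html>page for {url}</html>"


def handle_request(url):
    # What nginx does: serve straight from the cache if the key is set...
    body = page_cache.get(url)
    if body is not None:
        return body, "cache"
    # ...otherwise fall back to the backend application server.
    return render_page(url), "backend"


def publish(url):
    # What the application does when content is created or goes stale:
    # write the fresh page directly into the cache under its URL.
    page_cache[url] = render_page(url)
```

Because the application overwrites the entry instead of deleting it, there is no window where a popular page is missing from the cache.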

A Django Cache Status. Django view to display stats pulled from your memcached server. # 25th August 2007, 2:08 pm

... Facebook has roughly 200 dedicated memcached servers in its production environment, plus a small number of others for development and so on. A few of those 200 are hot spares. They are all 16GB 4-core AMD64 boxes, just because that’s where the price/performance sweet spot is for us right now.

Steve Grimm # 3rd May 2007, 10:36 pm