Simon Willison’s Weblog

Items tagged scaling in 2008

Filters: Year: 2008 × scaling ×

Scaling memcached at Facebook. Fascinating techie details on how Facebook forked memcache to use UDP and increase performance from 50,000 requests a second to 200,000. Now running on 800 servers with 28 TB of memory, and their code is on GitHub. (They may scale like crazy, but they can’t put their blog entry title in the title element?) # 13th December 2008, 10:08 am

Spock Proxy. A MySQL Proxy fork (no Lua) that concentrates solely on sharding, by parsing incoming SQL statements and redirecting them across multiple databases. There are some limitations on the SQL that can be handled (no nested queries, joins across a maximum of two tables) but generally it looks pretty impressive. # 11th December 2008, 9:49 am

Facebook engineering notes on Scaling Out. Jason Sobel explains a couple of tricks Facebook use to deal with consistency between their California and Virginia data centres. The first is to hijack the MySQL replication stream to include information about memcached records to invalidate; the second is to use Layer 7 load balancers which inspect a “last modification time” cookie and send users to the masters in California if they have updated their profile in the past 20 seconds. # 20th August 2008, 11:51 pm

Dark Launches, Gradual Ramps and Isolation: Testing the Scalability of New Features on your Web Site. Smart advice from Dare Obasanjo that extend the “dark launch” idea illustrated by Facebook chat a few weeks ago. # 29th June 2008, 2:22 pm

Dissecting today’s Internet traffic spikes (via) Theo Schlossnagle on how the increasing popularity of interest aggregation services such as Digg and Reddit result in traffic spikes that dwarf the old Slashdot effect, making a the old rules of thumb for capacity planning irrelevant. # 29th June 2008, 2:12 pm

Scoble writes something—6,800 writes are kicked off, 1 for each follower. Michael Arrington replies—another 6,600 writes. Jason Calacanis jumps in—another 6,500 writes. Beyond the 19,900 writes, there’s a lot of additional overhead too. You have to hit a DB to figure out who the 19,900 followers are. [...] And here’s the kicker: that giant processing and delivery effort—possibly a combined 100K disk IOs—was caused by 3 users, each just sending one, tiny, 140 char message. How innocent it all seemed.

Isreal L'Heureux # 23rd May 2008, 7:28 pm

Engineering @ Facebook: Facebook Chat. The new Facebook Chat uses Comet (long polling with a hidden iframe) against a custom web / chat server written in Erlang, designed to handle a launch to all 70 million users at once. It was tested using a “dark launch” period where live pages simulated chat request traffic without showing any visible UI. # 15th May 2008, 7:55 am

Internet Asshattery, Armchair Scaling Experts Edition (via) Leonard says what needs to be said about the most recent case of Twitter scaling flame-bait. # 25th April 2008, 11:19 pm

Google App Engine. Write applications in Python using a WSGI compatible application framework, then host them on Google’s highly scalable infrastructure. The most exciting part is probably the Datastore API, which provides external developers with access to Bigtable for the first time. # 8th April 2008, 7:25 am

Consistent Hashing. Beautifully clear explanation of consistent hashing, a simple technique that allows you to add new caching servers to a cluster without re-hashing your keys and hence invalidating all of your caches. # 18th March 2008, 1 am

The GigaOM Interview: Mark Zuckerberg. Some interesting titbits on Facebook’s architecture. # 11th March 2008, 5:41 am

Two data streams for a happy website. Useful architectural concept for scaling: keep user-specific and generic data separate from the start, in recognition of their different caching and partitioning constraints. # 4th March 2008, 4:40 am