Simon Willison’s Weblog

Subscribe

129 items tagged “scaling”

2008

Internet Asshattery, Armchair Scaling Experts Edition (via) Leonard says what needs to be said about the most recent case of Twitter scaling flame-bait. # 25th April 2008, 11:19 pm

Google App Engine. Write applications in Python using a WSGI compatible application framework, then host them on Google’s highly scalable infrastructure. The most exciting part is probably the Datastore API, which provides external developers with access to Bigtable for the first time. # 8th April 2008, 7:25 am

Consistent Hashing. Beautifully clear explanation of consistent hashing, a simple technique that allows you to add new caching servers to a cluster without re-hashing your keys and hence invalidating all of your caches. # 18th March 2008, 1 am

The GigaOM Interview: Mark Zuckerberg. Some interesting titbits on Facebook’s architecture. # 11th March 2008, 5:41 am

Two data streams for a happy website. Useful architectural concept for scaling: keep user-specific and generic data separate from the start, in recognition of their different caching and partitioning constraints. # 4th March 2008, 4:40 am

2007

Eventually Consistent. Werner Vogels explains the trade-offs involved in building scalable, highly-available data stores such as Amazon’s SimpleDB. # 20th December 2007, 5:59 pm

Techniques for safely consuming external HTTP on demand? I asked this question on programming.reddit.com yesterday and got some really insightful answers, including Joe Stump from Digg describing how Digg Images uses Danga’s Gearman worker queue. # 15th December 2007, 12:29 pm

NginxMemcachedModule. nginx can be set up to directly serve a URL from memcache if the corresponding cache key is set, and fall back to a backend application server otherwise. Application servers can then write directly to memcache when content needs to be cached or goes stale. # 15th December 2007, 1:59 am

What You Need To Know About Amazon SimpleDB. Amazon have finally launched the database component of their web service suite. It fits a bunch of current trends: key/value pairs, schemaless, built on top of Erlang. “Eventual consistency” is an interesting characteristic. # 14th December 2007, 11:21 am

Client Side Load Balancing for Web 2.0 Applications (via) I recall that early versions of Netscape picked a random server from a hard-coded list each time a user clicked the “What’s New” button, back before server-side scaling techniques were well understood. # 5th October 2007, 11:29 pm

Scale rails from one box to three, four and five. Excellent, concise run-down of what it takes to scale a web application. Most of the advice is easily portable to other frameworks. # 30th July 2007, 1:40 pm

High Scalability (via) New blog about building scalable, reliable sites. # 26th July 2007, 8:15 pm

YouTube Scalability Talk. Kyle Cordes’ notes on a Google Tech Talk on scaling YouTube by Cuong Do. # 14th July 2007, 10:26 pm

SlideShare: Webapps scalability. Lots of great presentations on scaling, from Twitter, Digg, Vox, LiveJournal, Last.fm and more. # 4th July 2007, 12:53 am

SELECT * FROM everything, or why databases are awesome. I’m beginning to think that for scalable applications the thinner your ORM is the better—if you even use one at all. # 22nd June 2007, 12:40 am

iLike: Holy cow... 6mm users and growing 300k/day! (via) Facebook platform offers a viral distribution mechanism for free. Downside: you have to double your capacity every few days. # 13th June 2007, 9:02 am

Rapid development serving 500,000 pages/hour (via) Curse Gaming are getting impressive performance out of Django. # 24th May 2007, 4:11 pm

Wikipedia internals (PDF) (via) A gold mine of scaling tips. # 11th May 2007, 11:35 am

... Facebook has roughly 200 dedicated memcached servers in its production environment, plus a small number of others for development and so on. A few of those 200 are hot spares. They are all 16GB 4-core AMD64 boxes, just because that’s where the price/performance sweet spot is for us right now.

Steve Grimm # 3rd May 2007, 10:36 pm

MintCache for Django. Caching scheme for Django that solves the dog-pile effect, where high traffic causes many processes to regenerate stale cached data at the same time. # 2nd May 2007, 8:49 am

The top 10 presentations on scaling websites: twitter, Flickr, Bloglines, Vox and more. I normally avoid linking to “top 10” lists on principle, but this one pulls together some great resources and adds extra context to each one. # 1st May 2007, 1:51 pm

Capacity Planning for LAMP (via) John Allspaw’s MySQL Conf 2007 talk on capacity planning (John is Operations Engineering Manager at Flickr). # 27th April 2007, 8:41 pm

Scaling Twitter (via) Slides from Blaine’s recent talk. # 23rd April 2007, 11:02 am

In the big picture, Twitter did exactly the right thing. They had a good idea and they buckled down and focused on delivering something as cool as possible as fast as possible, and it’s really hard, in early 2007, to beat Rails for that. When all of a sudden there were a few tens of thousands of people using it, then they went to work on the scaling.

Tim Bray # 14th April 2007, 9:13 am

The promise [of J2EE] was that of infinite scalability based on tooling, which assumes that designing scalable systems is a general case problem. I now firmly believe that this is flawed reasoning. Frameworks don’t solve scalability problems, design solves scalability problems.

Ryan Tomayko # 14th April 2007, 2:35 am

Rails and Scaling with Multiple Databases. Ryan Tomayko explains how his team spreads a high traffic Rails application across five separate PostgreSQL databases by giving each client their own schema—similar to how WordPress MU scales. # 14th April 2007, 2:32 am

None of these scaling approaches are as fun and easy as developing for Rails. All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise.

Alex Payne, Twitter # 12th April 2007, 2:51 pm

Scaling Python for High-Load Web Sites. Slides from a talk at PyCon. Be sure to switch to the notes view (Ø in the bottom right)—a really nice overview of scaling up from a CGIs to load balanced, memcached Python application servers. # 4th March 2007, 9:14 pm

Data::ObjectDriver. Benjamin Trott’s Perl ORM, with built in support for both caching and data partitioning. I think this is what Six Apart uses for Vox. # 25th February 2007, 12:43 am

A brief update with some numbers for hardware load-balanced mongrels. 4000 requests/second on 48 mongrels behind a hardware load balancer. # 5th February 2007, 12:38 am