77 items tagged “scaling”
Notes from a production MongoDB deployment. Notes from running MongoDB for 8 months in production, with 664 million documents spread across 72 GB master and slave servers located in two different data centers.
28th February 2010, 11:05 pm
Django Advent: Scaling Django. Mike Malone’s advice on scaling Django applications, including taking advantage of new features in 1.2.
26th February 2010, 7:22 pm
Search Engine Time Machine. Detailed explanation of how ElasticSearch provides high availability, through clever sharding and replication strategies and configurable gateways for long-term persistent storage.
17th February 2010, 10:32 pm
Elastic Search (via) Solr has competition! Like Solr, Elastic Search provides a RESTful JSON HTTP interface to Lucene. The focus here is on distribution, auto-sharding and high availability. It’s even easier to get started with than Solr, partly due to the focus on providing a schema-less document store, but it’s currently missing out on a bunch of useful Solr features (a web interface and faceting are the two that stand out). The high availability features look particularly interesting. UPDATE: I was incorrect, basic faceted queries are already supported.
11th February 2010, 6:33 pm
dogproxy. Another of my experiments with Node.js—this is a very simple HTTP proxy which addresses the dog pile effect (also known as the thundering herd) by watching out for multiple requests for a URL that is currently “in flight” and bundling them together.
3rd February 2010, 1:05 pm
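The bundling idea behind dogproxy can be sketched in a few lines. This is a Python asyncio illustration rather than the Node.js code itself; the `Coalescer` class and the `fetch` callable are hypothetical names invented for the example.

```python
import asyncio


class Coalescer:
    """Share one in-flight fetch among all concurrent requests for a URL."""

    def __init__(self, fetch):
        self._fetch = fetch      # assumed coroutine function: url -> body
        self._in_flight = {}     # url -> asyncio.Task currently fetching it

    async def get(self, url):
        task = self._in_flight.get(url)
        if task is None:
            # First request for this URL: start the real backend fetch.
            task = asyncio.ensure_future(self._fetch(url))
            self._in_flight[url] = task
            # Forget the task once it finishes so later requests fetch afresh.
            task.add_done_callback(lambda _: self._in_flight.pop(url, None))
        # Every caller, first or late, waits on the same task.
        return await task


async def demo():
    calls = []

    async def slow_fetch(url):   # stand-in for a slow backend
        calls.append(url)
        await asyncio.sleep(0.01)
        return "body of " + url

    coalescer = Coalescer(slow_fetch)
    bodies = await asyncio.gather(*(coalescer.get("/popular") for _ in range(5)))
    return bodies, len(calls)
```

Running `asyncio.run(demo())` issues five concurrent requests for the same URL, and the backend stand-in is only called once.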
PostgreSQL 8.5alpha3 now available. “Hot Standby, allowing read-only connections during recovery, provides a built-in master-slave replication solution.” Woohoo!
23rd December 2009, 9:57 am
Django | Multiple Databases. Russell just checked in the final patch developed from Alex Gaynor’s Summer of Code project to add multiple database support to Django. I’d link to the 21,000 line changeset but it crashed our Trac, so here’s the documentation instead.
22nd December 2009, 5:22 pm
PostgreSQL 8.5 alpha 2 is out. “P.S. If you’re wondering about Hot Standby and Synchronous Replication, they’re still under heavy development and still (at this point) expected to be in 8.5.”—Hot Standby is PostgreSQL-speak for MySQL-style master/slave replication for scaling your reads.
28th October 2009, 9:02 am
How We Made GitHub Fast. Detailed overview of the new GitHub architecture. It’s a lot more complicated than I would have expected—lots of moving parts are involved in ensuring they can scale horizontally when they need to. Interesting components include nginx, Unicorn, Rails, DRBD, HAProxy, Redis, Erlang, memcached, SSH, git and a bunch of interesting new open source projects produced by the GitHub team such as BERT/Ernie and ProxyMachine.
21st October 2009, 9:14 pm
MichaelMoore.com in Django. A seriously impressive case study—a complete rebuild from the ground up completed in just five weeks using Django, Solr and Haystack for a high traffic site with a top 10,000 US Alexa ranking.
9th October 2009, 12:38 am
When I worked at Amazon.com we had a deeply-ingrained hatred for all of the SQL databases in our systems. Now, we knew perfectly well how to scale them through partitioning and other means. But making them highly available was another matter. Replication and failover give you basic reliability, but it’s very limited and inflexible compared to a real distributed datastore with master-master replication, partition tolerance, consensus and/or eventual consistency, or other availability-oriented features.
— Matt Brubeck
4th October 2009, 9:50 am
Ravelry. Tim Bray interviews Casey Forbes, the single engineer behind Ravelry, the knitting community that serves 10 million Rails requests a day using just seven physical servers, MySQL, Sphinx, memcached, nginx, haproxy, passenger and Tokyo Cabinet.
3rd September 2009, 6:50 pm
Memcached 1.4.0 released. The big new feature is the (optional) binary protocol, which enables other features such as CAS-everywhere and efficient client-side replication. Maintainer Dustin Sallings has also released some useful sounding EC2 instances which automatically assign nearly all of their RAM to memcached on launch and shouldn’t need any further configuration.
17th July 2009, 10:26 pm
Keyspace. Yet Another Key-Value Store: this one focuses on high availability, with one server in the cluster serving as master (and handling all writes), and the paxos algorithm handling replication and ensuring a new master can be elected should the existing master become unavailable. Clients can choose to make dirty reads against replicated servers or clean reads by talking directly to the master. Underlying storage is BerkeleyDB, and the authors claim 100,000 writes/second. Released under the AGPL.
16th July 2009, 10:30 am
Up and running with Cassandra. Twitter are beginning to use Cassandra, the open source branch of Facebook’s BigTable-like non-relational database. Evan Weaver explains how to get started with it, but warns that it’s not yet a good idea to trust data to it without having a full backup in an unrelated storage engine.
7th July 2009, 11:18 am
uuidd.py. Neat implementation of an ID server from Mike Malone—it serves up incrementing integers over a socket (using Python’s asyncore for fast IO) and records state to a file only after every 10,000 IDs served, so most of the time it’s not reading or writing to disk at all. If the server crashes it doesn’t matter because it can start up again at an integer it’s sure hasn’t been used before.
25th May 2009, 9:34 pm
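One way to get the crash-safety property described for uuidd.py is to reserve IDs in blocks and persist only the upper bound of the current block. The sketch below is not Mike Malone's code: it leaves out the socket server entirely, and `BlockIdAllocator` and its state file format are assumptions made for illustration.

```python
import os


class BlockIdAllocator:
    """Hand out incrementing integers, touching disk once per block of IDs."""

    BLOCK = 10_000

    def __init__(self, path):
        self.path = path
        # The file holds the upper bound of the last reserved block. After a
        # crash we resume from that bound, skipping any IDs that were
        # reserved but never handed out.
        self.next_id = self._read_ceiling()
        self.ceiling = self.next_id   # forces a reservation on first use

    def _read_ceiling(self):
        try:
            with open(self.path) as f:
                return int(f.read().strip() or 0)
        except FileNotFoundError:
            return 0

    def _reserve_block(self):
        self.ceiling = self.next_id + self.BLOCK
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(self.ceiling))
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)    # atomic swap of the state file

    def next(self):
        if self.next_id >= self.ceiling:
            self._reserve_block()
        value = self.next_id
        self.next_id += 1
        return value
```

The trade-off is gaps in the sequence after a restart, which is usually acceptable when the IDs only need to be unique.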
TwitterAlikeExample—redis. Excellent example of how you design a moderately complex system against a scalable key-value store (in this case redis). Most “how to build Twitter” code examples fail to address the hard problem of scaling user inboxes, but this one tackles it head on.
21st May 2009, 11:14 pm
New Features for EC2: Elastic Load Balancing, Auto Scaling, and Amazon CloudWatch. EC2 now fulfils the promise of “magic scaling in the cloud” out of the box—CloudWatch monitors performance of your EC2 instances without needing to install any monitoring software, Auto Scaling allows you to configure “scaling triggers” which start up new instances based on information from CloudWatch, and Elastic Load Balancing balances requests across all available instances.
18th May 2009, 10:07 am
peeping into memcached. “Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state”; you can then load the resulting data into MySQL using LOAD DATA LOCAL INFILE and analyse it using standard SQL queries.
20th April 2009, 6:35 pm
Experiences deploying a large-scale infrastructure in Amazon EC2. “At OpenX we recently completed a large-scale deployment of one of our server farms to Amazon EC2. Here are some lessons learned from that experience.”
10th April 2009, 9:43 am
Introducing Digg’s IDDB Infrastructure. IDDB is Digg’s new infrastructure component for sharding data across multiple databases, with support for both MySQL and memcachedb. “The DiggBar and URL minifying service is powered by a 16 machine IDDB cluster, which includes 8 write masters in the index and 8 MySQL storage nodes.”
3rd April 2009, 8:42 pm
Streams, affordances, Facebook, and rounding errors. I asked Kellan about scaling activity streams the other day. Here he suggests the best technique is not to promise a perfect stream (like Twitter does)—Facebook used to get away with 80% loss of update messages, but their new redesign has changed the contract with their users.
19th March 2009, 2:02 pm
redis (via) An in-memory scalable key/value store but with an important difference: this one lets you perform list and set operations against keys, opening up a whole new set of possibilities for application development. It’s very young but already supports persistence to disk and master-slave replication.
15th March 2009, 1:32 pm
What happened to Hot Standby? Hot Standby (the ability to have read-only replication slaves) has been dropped from PostgreSQL 8.4 and is now scheduled for 8.5. “Making hard decisions to postpone features which aren’t quite ready is how PostgreSQL makes sure that our DBMS is ‘bulletproof’ and that we release close to on-time every year”.
8th March 2009, 9:28 am
Database Sharding at Netlog, with MySQL and PHP. Detailed MySQL sharding case study from Netlog, who serve five billion page requests a month using thousands of shards across more than 80 database servers.
2nd March 2009, 10:22 am
How FriendFeed uses MySQL to store schema-less data. The pain of altering/adding indexes to tables with 250 million rows was killing their ability to try out new features, so they’ve moved to storing pickled Python objects and manually creating the indexes they need as denormalised two column tables. These can be created and dropped much more easily, and are continually populated by an off-line index building process.
27th February 2009, 2:33 pm
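The pattern FriendFeed describes (opaque pickled entities plus disposable two column index tables) can be sketched with SQLite standing in for MySQL. Table names, the `save`, `build_index` and `find` helpers, and the one-shot index rebuild are all assumptions for the example; the real system populates its indexes continually from a background process.

```python
import pickle
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
# One table of opaque blobs: adding a new attribute needs no ALTER TABLE.
db.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body BLOB)")


def save(entity):
    entity_id = entity.setdefault("id", uuid.uuid4().hex)
    db.execute("INSERT OR REPLACE INTO entities VALUES (?, ?)",
               (entity_id, pickle.dumps(entity)))
    return entity_id


def build_index(field):
    """Rebuild a two column index table; cheap to drop and recreate.

    `field` is interpolated into SQL, so it must be a trusted identifier.
    """
    table = "index_" + field
    db.execute(f"DROP TABLE IF EXISTS {table}")
    db.execute(f"CREATE TABLE {table} ({field} TEXT, entity_id TEXT)")
    rows = db.execute("SELECT id, body FROM entities").fetchall()
    for entity_id, body in rows:
        entity = pickle.loads(body)
        if field in entity:
            db.execute(f"INSERT INTO {table} VALUES (?, ?)",
                       (entity[field], entity_id))


def find(field, value):
    rows = db.execute(
        f"SELECT e.body FROM index_{field} i "
        f"JOIN entities e ON e.id = i.entity_id WHERE i.{field} = ?",
        (value,))
    return [pickle.loads(body) for (body,) in rows]
```

For example, after `save({"user": "alice", "title": "hello"})` and `build_index("user")`, calling `find("user", "alice")` returns the stored dictionary.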
Building and Scaling a Startup on Rails: 12 Things We Learned the Hard Way. Lessons learned from Posterous. Some good advice in here, in particular “Memcache later: If you memcache first, you will never feel the pain and never learn how bad your database indexes and Rails queries are”. Also recommends using job queues for offline processing of anything that takes more than 200ms.
23rd February 2009, 8:28 am
Sharding Counters on Google App Engine. “While the datastore for App Engine scales to support a huge number of entities it is important to note that you can only expect to update any single entity, or entity-group, about five times a second”. This article explains a technique for sharding writes across multiple counters in detail, including a way to keep a memcache counter updated at the same time for faster reads.
27th January 2009, 8:27 pm
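The sharded counter technique is easy to model without App Engine. The sketch below uses a plain Python list in place of datastore entities and an integer in place of the memcache counter, so it shows only the shape of the idea, not the App Engine API; `ShardedCounter` is a made-up name.

```python
import random


class ShardedCounter:
    """Spread increments over several shards so no single record is hot."""

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards   # stand-in for one entity per shard
        self.cached_total = 0            # stand-in for the memcache counter

    def increment(self):
        # Each write lands on a randomly chosen shard, so the per-entity
        # update rate is roughly the total rate divided by num_shards.
        index = random.randrange(len(self.shards))
        self.shards[index] += 1
        self.cached_total += 1           # kept in step for fast reads

    def total(self):
        # The authoritative value is the sum over all shards.
        return sum(self.shards)
```

With a limit of about five updates a second per entity, twenty shards would in principle allow on the order of a hundred increments a second, at the cost of reading every shard to get an exact total.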
Project Voldemort. Yet Another “big, distributed, persistent, fault-tolerant hash table”—this time from LinkedIn, released under the Apache 2.0 license. The approach to consistency is interesting—instead of using distributed transactions, they use versioning and “resolve inconsistencies at read time”. It also uses consistent hashing (as seen in libketama) to select servers. The design document has lots more information.
17th January 2009, 7:45 pm
New Gearman Server & Library in C, MySQL UDFs. Gearman, the job queue written for LiveJournal and now used by Digg and Yahoo!, has been rewritten in C. Looks like a good candidate for an easily configured lightweight message queue. Also includes hooks for writing MySQL functions that can interact with queues.
13th January 2009, 4:41 pm
MemcacheDB. A server that speaks the memcache protocol but uses Berkeley DB for reliable persistent storage. Speedy: 20,000 writes/second and 60,000+ reads/second. Includes a full replication mechanism (with custom memcache protocol commands) based on Berkeley DB’s.
5th January 2009, 12:37 pm
Scaling memcached at Facebook. Fascinating techie details on how Facebook forked memcache to use UDP and increase performance from 50,000 requests a second to 200,000. Now running on 800 servers with 28 TB of memory, and their code is on GitHub. (They may scale like crazy, but they can’t put their blog entry title in the title element?)
13th December 2008, 10:08 am
Spock Proxy. A MySQL Proxy fork (no Lua) that concentrates solely on sharding, by parsing incoming SQL statements and redirecting them across multiple databases. There are some limitations on the SQL that can be handled (no nested queries, joins across a maximum of two tables) but generally it looks pretty impressive.
11th December 2008, 9:49 am
Facebook engineering notes on Scaling Out. Jason Sobel explains a couple of tricks Facebook use to deal with consistency between their California and Virginia data centres. The first is to hijack the MySQL replication stream to include information about memcached records to invalidate; the second is to use Layer 7 load balancers which inspect a “last modification time” cookie and send users to the masters in California if they have updated their profile in the past 20 seconds.
20th August 2008, 11:51 pm
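The second trick boils down to a small routing decision. Here it is as a Python function; the cookie format (a Unix timestamp), the function name and the backend labels are assumptions, and the real logic lives in a Layer 7 load balancer rather than application code.

```python
import time

REPLICATION_WINDOW = 20  # seconds, the figure quoted in the entry above


def choose_backend(last_modified_cookie, now=None):
    """Send recent writers to the master so they always see their own update."""
    now = time.time() if now is None else now
    try:
        last_modified = float(last_modified_cookie)
    except (TypeError, ValueError):
        # Missing or unparseable cookie: treat as no recent write.
        return "replica"
    if now - last_modified < REPLICATION_WINDOW:
        return "master"
    return "replica"
```

A user whose cookie says they wrote 10 seconds ago is routed to the master; at 30 seconds they go back to the nearer replica.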
Dark Launches, Gradual Ramps and Isolation: Testing the Scalability of New Features on your Web Site. Smart advice from Dare Obasanjo that extends the “dark launch” idea illustrated by Facebook chat a few weeks ago.
29th June 2008, 2:22 pm
Dissecting today’s Internet traffic spikes (via) Theo Schlossnagle on how the increasing popularity of interest aggregation services such as Digg and Reddit results in traffic spikes that dwarf the old Slashdot effect, making the old rules of thumb for capacity planning irrelevant.
29th June 2008, 2:12 pm
Scoble writes something—6,800 writes are kicked off, 1 for each follower. Michael Arrington replies—another 6,600 writes. Jason Calacanis jumps in—another 6,500 writes. Beyond the 19,900 writes, there’s a lot of additional overhead too. You have to hit a DB to figure out who the 19,900 followers are. [...] And here’s the kicker: that giant processing and delivery effort—possibly a combined 100K disk IOs—was caused by 3 users, each just sending one, tiny, 140 char message. How innocent it all seemed.
— Isreal L'Heureux
23rd May 2008, 7:28 pm
Engineering @ Facebook: Facebook Chat. The new Facebook Chat uses Comet (long polling with a hidden iframe) against a custom web / chat server written in Erlang, designed to handle a launch to all 70 million users at once. It was tested using a “dark launch” period where live pages simulated chat request traffic without showing any visible UI.
15th May 2008, 7:55 am
Internet Asshattery, Armchair Scaling Experts Edition (via) Leonard says what needs to be said about the most recent case of Twitter scaling flame-bait.
25th April 2008, 11:19 pm
Google App Engine. Write applications in Python using a WSGI compatible application framework, then host them on Google’s highly scalable infrastructure. The most exciting part is probably the Datastore API, which provides external developers with access to Bigtable for the first time.
8th April 2008, 7:25 am
Consistent Hashing. Beautifully clear explanation of consistent hashing, a simple technique that allows you to add new caching servers to a cluster without re-hashing your keys and hence invalidating all of your caches.
18th March 2008, 1 am
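A minimal hash ring shows why consistent hashing avoids wholesale cache invalidation. This is a generic sketch, not code from the linked article; the `HashRing` name, the use of SHA-256 and the choice of 100 points per server are all assumptions.

```python
import bisect
import hashlib


def _hash(key):
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers=(), replicas=100):
        self.replicas = replicas   # points placed on the ring per server
        self._points = []          # sorted hash positions
        self._owner = {}           # position -> server name
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # Walk clockwise to the first point after the key, wrapping around.
        i = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

Adding a fourth server to a three server ring only reassigns the keys that now fall just before one of the new server's points. Every other key keeps its old server, so most of the cache stays warm.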
The GigaOM Interview: Mark Zuckerberg. Some interesting titbits on Facebook’s architecture.
11th March 2008, 5:41 am
Two data streams for a happy website. Useful architectural concept for scaling: keep user-specific and generic data separate from the start, in recognition of their different caching and partitioning constraints.
4th March 2008, 4:40 am
Eventually Consistent. Werner Vogels explains the trade-offs involved in building scalable, highly-available data stores such as Amazon’s SimpleDB.
20th December 2007, 5:59 pm
Techniques for safely consuming external HTTP on demand? I asked this question on programming.reddit.com yesterday and got some really insightful answers, including Joe Stump from Digg describing how Digg Images uses Danga’s Gearman worker queue.
15th December 2007, 12:29 pm
NginxMemcachedModule. nginx can be set up to directly serve a URL from memcache if the corresponding cache key is set, and fall back to a backend application server otherwise. Application servers can then write directly to memcache when content needs to be cached or goes stale.
15th December 2007, 1:59 am
What You Need To Know About Amazon SimpleDB. Amazon have finally launched the database component of their web service suite. It fits a bunch of current trends: key/value pairs, schemaless, built on top of Erlang. “Eventual consistency” is an interesting characteristic.
14th December 2007, 11:21 am
Client Side Load Balancing for Web 2.0 Applications (via) I recall that early versions of Netscape picked a random server from a hard-coded list each time a user clicked the “What’s New” button, back before server-side scaling techniques were well understood.
5th October 2007, 11:29 pm
Scale rails from one box to three, four and five. Excellent, concise run-down of what it takes to scale a web application. Most of the advice is easily portable to other frameworks.
30th July 2007, 1:40 pm
High Scalability (via) New blog about building scalable, reliable sites.
26th July 2007, 8:15 pm
YouTube Scalability Talk. Kyle Cordes’ notes on a Google Tech Talk on scaling YouTube by Cuong Do.
14th July 2007, 10:26 pm
SlideShare: Webapps scalability. Lots of great presentations on scaling, from Twitter, Digg, Vox, LiveJournal, Last.fm and more.
4th July 2007, 12:53 am
SELECT * FROM everything, or why databases are awesome. I’m beginning to think that for scalable applications the thinner your ORM is the better—if you even use one at all.
22nd June 2007, 12:40 am
iLike: Holy cow... 6mm users and growing 300k/day! (via) Facebook platform offers a viral distribution mechanism for free. Downside: you have to double your capacity every few days.
13th June 2007, 9:02 am
Rapid development serving 500,000 pages/hour (via) Curse Gaming are getting impressive performance out of Django.
24th May 2007, 4:11 pm
Wikipedia internals (PDF) (via) A gold mine of scaling tips.
11th May 2007, 11:35 am
... Facebook has roughly 200 dedicated memcached servers in its production environment, plus a small number of others for development and so on. A few of those 200 are hot spares. They are all 16GB 4-core AMD64 boxes, just because that’s where the price/performance sweet spot is for us right now.
— Steve Grimm
3rd May 2007, 10:36 pm
MintCache for Django. Caching scheme for Django that solves the dog-pile effect, where high traffic causes many processes to regenerate stale cached data at the same time.
2nd May 2007, 8:49 am
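One common way to avoid the dog-pile is to let a single caller regenerate a stale entry while everyone else keeps receiving the stale value. The sketch below illustrates that general idea with a dictionary in place of memcached; it is not MintCache's actual code, and the class shape, `grace` parameter and injectable `clock` are assumptions.

```python
import time


class StaleWhileRefreshCache:
    def __init__(self, grace=30, clock=time.time):
        self._store = {}     # key -> (value, soft_expiry); memcached stand-in
        self.grace = grace   # seconds other callers keep seeing stale data
        self.clock = clock

    def set(self, key, value, timeout):
        self._store[key] = (value, self.clock() + timeout)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, soft_expiry = entry
        if self.clock() >= soft_expiry:
            # Stale. Push the expiry forward so other callers keep getting
            # the old value, and report a miss to this caller only, which
            # is then expected to regenerate the data and call set().
            self._store[key] = (value, self.clock() + self.grace)
            return None
        return value
```

In this sketch only the first request after expiry sees a miss and does the expensive regeneration. A real deployment would also give the backing store a longer hard timeout so abandoned keys eventually disappear.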
The top 10 presentations on scaling websites: twitter, Flickr, Bloglines, Vox and more. I normally avoid linking to “top 10” lists on principle, but this one pulls together some great resources and adds extra context to each one.
1st May 2007, 1:51 pm
Capacity Planning for LAMP (via) John Allspaw’s MySQL Conf 2007 talk on capacity planning (John is Operations Engineering Manager at Flickr).
27th April 2007, 8:41 pm
Scaling Twitter (via) Slides from Blaine’s recent talk.
23rd April 2007, 11:02 am
In the big picture, Twitter did exactly the right thing. They had a good idea and they buckled down and focused on delivering something as cool as possible as fast as possible, and it’s really hard, in early 2007, to beat Rails for that. When all of a sudden there were a few tens of thousands of people using it, then they went to work on the scaling.
— Tim Bray
14th April 2007, 9:13 am
The promise [of J2EE] was that of infinite scalability based on tooling, which assumes that designing scalable systems is a general case problem. I now firmly believe that this is flawed reasoning. Frameworks don’t solve scalability problems, design solves scalability problems.
— Ryan Tomayko
14th April 2007, 2:35 am
Rails and Scaling with Multiple Databases. Ryan Tomayko explains how his team spreads a high traffic Rails application across five separate PostgreSQL databases by giving each client their own schema—similar to how WordPress MU scales.
14th April 2007, 2:32 am
None of these scaling approaches are as fun and easy as developing for Rails. All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise.
— Alex Payne, Twitter
12th April 2007, 2:51 pm
Scaling Python for High-Load Web Sites. Slides from a talk at PyCon. Be sure to switch to the notes view (Ø in the bottom right): a really nice overview of scaling up from CGIs to load balanced, memcached Python application servers.
4th March 2007, 9:14 pm
Data::ObjectDriver. Benjamin Trott’s Perl ORM, with built in support for both caching and data partitioning. I think this is what Six Apart uses for Vox.
25th February 2007, 12:43 am
A brief update with some numbers for hardware load-balanced mongrels. 4000 requests/second on 48 mongrels behind a hardware load balancer.
5th February 2007, 12:38 am
At some point in the past rolling out an application to 300,000 people was the pinnacle of engineering excellence. Today it means you passed your second round of funding and can move out of your parents garage.
— Joe Gregorio
1st February 2007, 11 am
Inside MySpace.com. Case study of scaling against a network effect. Includes pretty honest coverage of the mistakes made along the way, although the article was put together second hand from conference presentations rather than from interviews.
17th January 2007, 9:18 am
Curse launches with Django platform. Handles 500k visits/hour!
14th December 2006, 3:02 am
The Architecture of Mailinator. 3 million e-mails a day on a 2GHz server with 1GB of RAM.
7th December 2006, 3:11 pm
punupgeek.com on Active Resource. Looks like 37 signals might be looking into scaling across multiple servers using web services.
26th June 2006, 11:12 am
Ruby on Rails and FastCGI: Scaling using processes instead of threads. Relates to the shared-nothing architecture.
12th April 2005, 2:06 pm
Photo Matt: RSS Bandwidth Usage. Matt makes the case for RSS scaling just fine if you’re smart about it.
10th September 2004, 2:48 am
Transcript of Bruce Sterling at Microsoft Corporation (via) Bruce Sterling on scaling up his annual SxSW party. I can’t believe I missed it this year.
22nd May 2004, 8:35 pm
New Technorati Infrastructure beta test! (via) It certainly feels faster.
20th January 2004, 10:36 pm