Entries tagged scaling
What are some strategies for scaling sites & infrastructure so global response times are relatively close to US response times?
You need to run your application in multiple data centers around the world, partitioned such that an incoming HTTP request can be completely serviced by a single data center. Then you use global DNS load balancing to direct users to the data center that is closest to them.[... 185 words]
Cal Henderson’s book Building Scalable Websites offers a good grounding.[... 32 words]
There are a bunch of options for communicating between different languages, but these days the simplest is definitely JSON—it maps directly to common data structures in PHP, Python, Ruby and so on. Treat it as your common interchange format and you can’t go far wrong. It’s very easy to build simple internal web services on top of JSON.[... 109 words]
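As a rough illustration of JSON as the interchange format, here is a minimal Python sketch using only the standard `json` module. The function names and the `user_id` and `tags` fields are hypothetical, made up for this example; a PHP or Ruby service would read the same string with its own JSON library.

```python
import json


def encode_message(user_id, tags):
    """Serialise a small record to a JSON string (field names are illustrative)."""
    # sort_keys gives a stable key order, which makes output easier to compare.
    return json.dumps({"user_id": user_id, "tags": list(tags)}, sort_keys=True)


def decode_message(raw):
    """Parse the JSON string back into plain Python values."""
    data = json.loads(raw)
    return data["user_id"], data["tags"]


if __name__ == "__main__":
    wire = encode_message(7, ["scaling", "json"])
    print(wire)
    print(decode_message(wire))
```

The dict and list here map straight onto a PHP associative array or a Ruby hash, which is what makes JSON convenient between languages.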
Did Mark Zuckerberg have any knowledge on building scalable social networks prior to starting work on Facebook?
I’m going to bet he didn’t have this knowledge, simply because back when he launched Facebook in 2004 almost NO ONE had this knowledge—there simply weren’t enough “web scale” products around for the patterns needed to run them to be widely discussed.[... 143 words]
Scalability: What is the best way to store and serve hundreds of GB of images for a heavy traffic website?
If you’re not going to use a service like S3, your best bet is to run something like MogileFS (which was designed by LiveJournal for handling images) and stick Varnish (a screamingly fast HTTP caching server) in front of it.[... 66 words]
They use stream processing algorithms—they mention trending topics calculation in their technical blog entry about Storm, their open source stream processing software: http://engineering.twitter.com/2...[... 38 words]
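To give a feel for what a stream processing approach looks like, here is a toy Python sketch that counts topics over a sliding window of the most recent items. It is an assumption-laden illustration, not Twitter's algorithm and not Storm's API; the `SlidingWindowCounter` class and its methods are invented for this example.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Count topics over the most recent `window_size` items of a stream."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.counts = Counter()

    def add(self, topic):
        # Record the new item, then evict the oldest one if the window is full.
        self.window.append(topic)
        self.counts[topic] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def top(self, n):
        """Return the n most frequent topics currently in the window."""
        return self.counts.most_common(n)


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_size=3)
    for topic in ["a", "b", "b", "c", "b"]:
        counter.add(topic)
    print(counter.top(1))
```

The point is that each item is processed once as it arrives and memory stays bounded by the window, rather than re-scanning stored data.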
Sorting large amounts of data is one of the first exercises you’ll see described in any Hadoop or map/reduce tutorial—so I’d suggest taking a look at Hadoop.[... 44 words]
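The idea behind those tutorials can be sketched on a single machine: route each value to a bucket by key range (the "map" side), sort each bucket independently (the "reduce" side), then concatenate the buckets in order. This is only a sketch of the partitioning idea in plain Python, not Hadoop code; `partition_sort` and the hand-picked `boundaries` are assumptions for illustration.

```python
from bisect import bisect_right


def partition_sort(values, boundaries):
    """Sort values by range-partitioning them into buckets first.

    `boundaries` must be sorted; it splits the key space into
    len(boundaries) + 1 ranges, one per bucket.
    """
    buckets = [[] for _ in range(len(boundaries) + 1)]
    # "Map" step: send each value to the bucket covering its range.
    for value in values:
        buckets[bisect_right(boundaries, value)].append(value)
    # "Reduce" step: each bucket can be sorted independently (in parallel
    # on a real cluster); concatenating them yields a fully sorted result.
    return [item for bucket in buckets for item in sorted(bucket)]


if __name__ == "__main__":
    print(partition_sort([5, 1, 9, 3, 7, 1], boundaries=[3, 6]))
```

Because the buckets cover disjoint, ordered ranges, no final merge is needed, which is why the approach spreads well across many machines.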
Read “Building Scalable Websites” by Cal Henderson. It’s a few years old now but still very relevant—it basically covers everything he learnt the hard way scaling Flickr. It’s a really fun read, too.[... 98 words]
No, because Scala is harder to master than Java.[... 54 words]
I’ve seen tools that do this, but to be honest it’s very simple to write your own script (especially if you’re using an ORM). Another benefit of writing your own is that you’ll have a much better chance of accurately representing your expected data, sizes etc.[... 221 words]
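A home-grown data generator can be very small. The sketch below uses only the Python standard library and yields plain dicts; the `fake_users` function and its `username` and `post_count` fields are hypothetical, and in practice you would pass each row to your own ORM's create call and tune the ranges to match the data you expect.

```python
import random
import string


def fake_users(count, seed=0, name_length=8):
    """Yield `count` fake user rows; a fixed seed makes runs repeatable."""
    rng = random.Random(seed)
    for i in range(count):
        username = "".join(
            rng.choice(string.ascii_lowercase) for _ in range(name_length)
        )
        yield {
            "id": i + 1,
            "username": username,
            # Assumed range; adjust to resemble your real distribution.
            "post_count": rng.randint(0, 500),
        }


if __name__ == "__main__":
    for row in fake_users(3):
        print(row)
```

Seeding the generator means a slow query found against one generated data set can be reproduced against an identical one later.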
We’re building up a pretty sizable collection of video (and slides) from talks about Django on http://lanyrd.com/—including plenty that talk about scaling issues. Try this: http://lanyrd.com/search/?q=djan...—we have 16 videos and 16 slide decks from talks at events all over the world.[... 102 words]
I don’t fully understand the question, but if you’re talking about doing a single join across multiple tables the Django ORM handles that just fine. Let’s say you want to get every BlogEntry written by a User who belongs to the Group with the name “admins”:[... 67 words]
I’d guess Twitter or Craigslist.[... 19 words]
When your own benchmarks prove that your application’s particular load characteristics will perform better on another database—and the difference is large enough that it’s worth the cost involved in retargeting your code. If that cost is high (and it probably will be) it may be worth paying for some expert consultants to ensure that your implementations against the different databases are properly optimised.[... 102 words]
They wrote about their reasons in detail in a post to the Django sub-reddit a while ago: http://www.reddit.com/r/django/c...[... 165 words]
The BBC have a pretty big CouchDB cluster, which they use mostly as a replicated key-value store. It’s used by their new identity platform which includes customisation features for iPlayer.[... 47 words]
My best guess would be Disqus. Instagram are pretty enormous these days as well.[... 31 words]