Simon Willison’s Weblog

6 items tagged “cassandra”

2021

How Discord Stores Billions of Messages (via) Fascinating article from 2017 describing how Discord migrated their primary message store to Cassandra (from MongoDB, but I could easily see them making the same decision if they had started with PostgreSQL or MySQL). The trick with scalable NoSQL databases like Cassandra is that you need to have a very deep understanding of the kinds of queries you will need to answer—and Discord had exactly that. In the article they talk about their desire to eventually migrate to Scylla (a compatible Cassandra alternative written in C++)—in the Hacker News comments they confirm that in 2021 they are using Scylla for a few things but they still have their core messages in Cassandra. # 24th August 2021, 9:31 pm
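To make the "know your queries first" point concrete, here is a minimal in-memory sketch of query-first modelling in plain Python. It does not use a Cassandra driver and it is not Discord's actual schema. The names `MessageStore`, `partition_key` and `BUCKET_SECONDS`, and the idea of grouping rows by channel plus a time bucket, are illustrative assumptions. The only query it is built to answer is "the most recent messages in one channel", and that is the query the layout makes cheap.

```python
from bisect import insort
from collections import defaultdict

# Hypothetical bucket width: all messages in a channel within this many
# seconds share one partition. The value is an assumption for illustration.
BUCKET_SECONDS = 10 * 24 * 60 * 60


def partition_key(channel_id, timestamp):
    """Group rows by channel and coarse time bucket (assumed layout)."""
    return (channel_id, timestamp // BUCKET_SECONDS)


class MessageStore:
    """Toy stand-in for a partitioned, clustered table."""

    def __init__(self):
        # partition key -> list of (message_id, body), kept sorted by message_id
        self._partitions = defaultdict(list)

    def insert(self, channel_id, timestamp, message_id, body):
        key = partition_key(channel_id, timestamp)
        insort(self._partitions[key], (message_id, body))

    def recent(self, channel_id, timestamp, limit):
        """Newest-first messages from the single partition covering timestamp."""
        rows = self._partitions.get(partition_key(channel_id, timestamp), [])
        return [body for _, body in reversed(rows[-limit:])]


store = MessageStore()
store.insert(channel_id=1, timestamp=100, message_id=1, body="a")
store.insert(channel_id=1, timestamp=200, message_id=3, body="c")
store.insert(channel_id=1, timestamp=150, message_id=2, body="b")
store.insert(channel_id=2, timestamp=100, message_id=4, body="x")
latest = store.recent(channel_id=1, timestamp=250, limit=2)
```

The read touches exactly one partition and never scans other channels. A query the layout was not designed for, such as "all messages by one author", would have to walk every partition, which is why the access patterns need to be understood before the schema is chosen.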

2010

What are the advantages and disadvantages of using MongoDB vs CouchDB vs Cassandra vs Redis?

I see Redis as a different category from the other three, in the same way that you wouldn’t ask “what are the advantages of MySQL vs Memcached”. Redis makes an excellent complement to pretty much any other persistent storage mechanism. I expanded on this here: http://simonwillison.net/2009/Oc...


reddit’s May 2010 “State of the Servers” report. An interesting Cassandra war story: Cassandra scales up, but it doesn’t scale down very well. Running with just three nodes can make recovery from problems a lot trickier. # 18th May 2010, 6:37 pm

Reddit is now running on Cassandra. Migrating their persistent cache over from memcacheDB to Cassandra took one developer just ten days. # 13th March 2010, 12:14 am

2009

Looking to the future with Cassandra. Digg are now using Cassandra for their “green badge” (one of your friends has dugg this story) feature. The resulting denormalised dataset weighs in at 3 TB and 76 billion columns. # 9th September 2009, 9:26 pm

Up and running with Cassandra. Twitter are beginning to use Cassandra, the open source branch of Facebook’s BigTable-like non-relational database. Evan Weaver explains how to get started with it, but warns that it’s not yet a good idea to trust data to it without having a full backup in an unrelated storage engine. # 7th July 2009, 11:18 am