Why Cassandra Kept Winning
Looking across all the system designs, Cassandra appeared repeatedly. The pattern behind every choice:
| System | Access pattern | Why Cassandra |
|---|---|---|
| URL Shortener | slug → long_url | Simple key lookup, 115k reads/sec |
| Notification System | All notifications for user X by time | Time series, 2B writes/day |
| Feed System | Posts by user X by time | Time series, write heavy |
| Ride Sharing | Location pings for ride X | Extreme write throughput, 333k/sec |
Every time Cassandra won, three things were true:
- Access pattern was known upfront — “give me records for this key”
- Scale was massive — billions of records, thousands of writes/sec
- Data was time series or append-only — mostly inserts, rarely updates
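The “known access pattern” point is easiest to see in code. Below is a minimal sketch using the DataStax Python driver, assuming a hypothetical notifications keyspace whose table is partitioned by recipient_id and clustered by created_at; the keyspace, table, and column names are illustrative, not taken from the systems above.

```python
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])            # assumed single local node
session = cluster.connect("notifications")  # hypothetical keyspace

# Assumed table: recipient_id is the partition key, created_at a clustering
# column, so the whole query is answered from a single partition.
rows = session.execute(
    "SELECT created_at, payload FROM notifications_by_user "
    "WHERE recipient_id = %s LIMIT 50",
    (12345,),
)
for row in rows:
    print(row.created_at, row.payload)
```

Because the query targets exactly one partition, one replica set answers it; nothing fans out across the cluster.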
When MongoDB Wins Instead
Use MongoDB when:
- Data structure is unpredictable — product catalog where iPhone has storage/colour options and a T-shirt has size/colour options. Document model fits naturally.
- You need rich ad-hoc queries — “find products under ₹500 with rating > 4 in electronics with free delivery.” MongoDB’s query engine handles this directly (a pymongo sketch follows this list); Cassandra can only answer such a query efficiently when it hits the partition key.
- Moderate scale with complex structure — millions not billions, frequent schema changes during development.
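For contrast, here is roughly what that ad-hoc product query looks like with pymongo. The database, collection, and field names are assumptions made up for this sketch, not a real schema.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
products = client["shop"]["products"]   # hypothetical database and collection

# Filter on several fields at once; no shard or partition key is required.
cursor = products.find({
    "category": "electronics",
    "price": {"$lt": 500},
    "rating": {"$gt": 4},
    "free_delivery": True,
})
for product in cursor:
    print(product["name"], product["price"])
```

Not needing a key up front is exactly the flexibility Cassandra trades away for write throughput.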
The Decision Rule
1. Do I know exactly how I'll query this data?
Yes → Cassandra (optimise for that query)
No → MongoDB (flexible queries)
2. What's my write volume?
Millions+ per day → Cassandra
Thousands per day → either works
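The two questions collapse into a small function. This is only a toy encoding of the heuristic above, with the rough thresholds baked in; it is not a hard rule.

```python
def pick_database(query_pattern_known: bool, writes_per_day: int) -> str:
    """Toy version of the decision rule; thresholds are rough, not hard limits."""
    if not query_pattern_known:
        return "mongodb"      # flexible, ad-hoc queries needed
    if writes_per_day >= 1_000_000:
        return "cassandra"    # known access pattern plus heavy write volume
    return "either"           # modest volume: both work

print(pick_database(True, 2_000_000_000))  # cassandra
print(pick_database(False, 50_000))        # mongodb
```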
Replication — Surviving Failures
Replication = keeping multiple copies of data on different machines.
Master-Slave
Primary (Master) → handles all writes
↓ replicates to
Replica 1 (Slave) → reads only
Replica 2 (Slave) → reads only
Good for read-heavy systems. Primary dies → replica promoted (30-60 sec downtime).
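In application code, master-slave usually shows up as read/write splitting. A sketch, assuming hypothetical connection objects that each expose an execute() method:

```python
import random

class ReplicatedDB:
    def __init__(self, primary, replicas):
        self.primary = primary    # the master: accepts all writes
        self.replicas = replicas  # read-only copies, may lag slightly behind

    def write(self, sql, params=()):
        # Every write goes to the single primary, which replicates it out.
        return self.primary.execute(sql, params)

    def read(self, sql, params=()):
        # Reads are spread across replicas to take load off the primary.
        return random.choice(self.replicas).execute(sql, params)
```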
Multi-Master (Cassandra’s model)
Primary 1 ←sync→ Primary 2 ←sync→ Primary 3
All three accept reads and writes
No single point of failure: a single node dying doesn’t block writes. Trade-off: conflicting writes need resolution (typically last write wins) — this is the AP choice from the CAP theorem.
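Last write wins is simple enough to show directly: each master stamps the write it accepts, and on conflict the newer timestamp survives. A minimal sketch with made-up field names:

```python
def resolve_lww(version_a: dict, version_b: dict) -> dict:
    # Keep whichever copy carries the later write timestamp.
    return version_a if version_a["written_at"] >= version_b["written_at"] else version_b

a = {"value": "dark_mode=on",  "written_at": 1_700_000_000.125}  # accepted by Primary 1
b = {"value": "dark_mode=off", "written_at": 1_700_000_000.300}  # accepted by Primary 2
print(resolve_lww(a, b)["value"])  # "dark_mode=off" survives; the earlier write is silently lost
```

The silent loss of the earlier write is the price of staying available during a partition.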
Replication Factor
Factor 1 → no redundancy (never do this)
Factor 2 → one failure tolerated
Factor 3 → industry standard — data survives two simultaneous node failures
In Cassandra with factor 3 and QUORUM consistency, a write is confirmed once 2 of the 3 replicas acknowledge it.
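That 2-of-3 figure is just quorum arithmetic: a quorum is a strict majority of the replicas, floor(N/2) + 1. A plain-Python sketch:

```python
def quorum(replication_factor: int) -> int:
    # A quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1

for rf in (1, 2, 3, 5):
    print(f"RF={rf}: a QUORUM write needs {quorum(rf)} of {rf} acks")
# RF=3 needs 2 acks, so writes keep succeeding with one replica down.
```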
Sharding — When Data Is Too Big for One Machine
Even with replication, one primary has limits. Sharding = splitting data across multiple database instances. Each instance (shard) holds a subset.
Without sharding: one DB holds all 500M users
With sharding:
Shard 1 → users 1 to 125M
Shard 2 → users 125M to 250M
Shard 3 → users 250M to 375M
Shard 4 → users 375M to 500M
Choosing a Shard Key
Range-based — by user ID range. Problem: new users always get high IDs → all new writes hit last shard → hot shard.
Hash-based ✅
shard_number = hash(userId) % number_of_shards
userId 1001 → hash → Shard 3
userId 1002 → hash → Shard 1
Distribution is roughly uniform, so there are no hot shards. The catch: changing number_of_shards remaps almost every key, which is why Cassandra uses consistent hashing rather than a plain modulo.
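A runnable version of that routine. Python’s built-in hash() is salted per process, so a stable digest (md5 here) stands in for it; the four-shard count is just an example.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(user_id: int) -> int:
    # Stable hash of the key, reduced onto the shard count.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for(1001), shard_for(1002))  # nearby IDs scatter across shards
```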
Geographic — India users → India shard, US users → US shard. Good for latency and GDPR. Problem: uneven load if one region is much larger.
The Cross-Shard Query Problem
Simple query (works great):
"Give me user 1001" → hash(1001) → Shard 3 → done ✅
Complex query (breaks):
"All users who signed up last month with 10+ orders"
→ Must query every shard → merge results
→ Called scatter-gather — expensive ❌
Sharding forces you to design queries around your shard key.
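The difference is easy to see as code. Both helpers below assume a hypothetical list of per-shard connection handles with an execute() method; nothing here is a real driver API.

```python
import hashlib

def shard_index(key, num_shards: int) -> int:
    # Same stable-hash routing used for writes.
    return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % num_shards

def point_lookup(shards, user_id):
    # Shard-key query: route straight to the one shard that owns the key.
    owner = shards[shard_index(user_id, len(shards))]
    return owner.execute("SELECT * FROM users WHERE user_id = %s", (user_id,))

def scatter_gather(shards, sql, params=()):
    # Non-key query: ask every shard, then merge (and often re-sort) in the app.
    results = []
    for shard in shards:              # one round trip per shard
        results.extend(shard.execute(sql, params))
    return results
```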
How the Systems Actually Used These
| System | Shard key | Replication |
|---|---|---|
| URL Shortener | slug (hash) | Factor 3 |
| Notification System | recipient_id | Factor 3 |
| Feed System | user_id | Factor 3 |
| Ride Sharing (PostgreSQL) | city (geographic) | Master-slave |
Replication + Sharding Together
3 shards × replication factor 3 = 9 total nodes
Shard 1 → Primary 1 + Replica 1a + Replica 1b
Shard 2 → Primary 2 + Replica 2a + Replica 2b
Shard 3 → Primary 3 + Replica 3a + Replica 3b
Sharding handles scale. Replication handles availability. Start with replication first — it’s simpler. Add sharding only when a single server genuinely can’t handle the load.
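Counting the boxes above as a quick sanity check; the node names are just labels for this sketch.

```python
NUM_SHARDS = 3
REPLICATION_FACTOR = 3

nodes = [
    f"shard-{s}-copy-{r}"
    for s in range(1, NUM_SHARDS + 1)
    for r in range(1, REPLICATION_FACTOR + 1)
]
print(len(nodes))  # 9: each of the 3 shards keeps 3 copies of its slice
```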
The Interview Line
“I’d use Cassandra with a replication factor of 3 and userId as the partition key. That gives fault tolerance through replication and horizontal scale through consistent hash partitioning.”