System Design Capacity Cheat Sheet (Interview Ballparks)

Tags: system-design · interview · scalability · performance · capacity-planning

If you give only vague scale answers in system design interviews, your design sounds generic. This reference gives practical ballpark numbers you can use to make concrete decisions.

Important: these are estimation ranges, not vendor guarantees. Real numbers depend on hardware, query shape, payload size, indexing, replication, and tuning.


Quick Reference Table

| Component | Ballpark Throughput | Latency (typical) | Notes |
|---|---|---|---|
| App server (4 vCPU) | 1k-5k RPS (with DB calls), 10k-50k pure API | 1-10 ms | CPU- vs IO-bound changes this a lot |
| PostgreSQL | 10k-50k simple reads/s, 5k-10k writes/s | 1-10 ms | Complex joins/aggregations can drop to 100-1k QPS |
| Cassandra (3-node cluster) | 60k-150k reads/s and writes/s | 1-5 ms p99 | Partition-key lookups only; scales near-linearly |
| Redis (single node) | 100k-1M ops/s | <1 ms | RAM-bound, not disk-bound |
| Kafka (cluster) | 1M+ msgs/s (depends on partitioning/msg size) | 5-15 ms e2e | Consumer side is usually the bottleneck |
| RabbitMQ | 20k-50k msgs/s (less with durability) | 1-10 ms | Great for task queues at moderate scale |
| Elasticsearch | 1k-10k search QPS, 1k-10k index writes/s | 5-50 ms | Heavy aggregations slower (100-500 ms) |
| CDN | Millions of RPS globally | 5-50 ms edge | Origin limits hit before CDN limits |
| S3/object storage | 5.5k GET/s and 3.5k PUT/s per prefix | 100-200 ms | Use multiple prefixes + CDN fronting |

Per-Component Notes

App Server

A 4-vCPU instance typically handles 1k-5k RPS when each request touches a database, and 10k-50k RPS for pure in-memory API work. Whether the workload is CPU-bound or IO-bound changes these numbers a lot, so say which one you are assuming.

PostgreSQL

Expect 10k-50k simple indexed reads/s and 5k-10k writes/s on a single well-tuned node. Complex joins and aggregations can drop throughput to 100-1k QPS.

Cassandra

A 3-node cluster handles roughly 60k-150k reads/s and writes/s at 1-5 ms p99, but only for partition-key lookups. Throughput scales near-linearly as you add nodes.

Redis

A single node serves 100k-1M ops/s at sub-millisecond latency. Capacity is bounded by RAM, not disk.

Kafka

A cluster can sustain 1M+ msgs/s depending on partitioning and message size, with 5-15 ms end-to-end latency. The consumer side is usually the bottleneck, not the brokers.

RabbitMQ

Expect 20k-50k msgs/s, less with durability enabled. A solid choice for task queues at moderate scale.

Elasticsearch

Plan for 1k-10k search QPS and 1k-10k index writes/s at 5-50 ms; heavy aggregations run slower (100-500 ms).

CDN and Object Storage

CDNs absorb millions of RPS globally at 5-50 ms edge latency; you usually hit origin limits before CDN limits. S3-style object storage supports about 5.5k GET/s and 3.5k PUT/s per prefix, so spread hot objects across multiple prefixes and front them with a CDN.


Interview Decision Thresholds (Rule of Thumb)

Use these as rough triggers, not hard laws:

- Sustained writes above ~10k/s: beyond a comfortable single-Postgres write profile; consider partitioning, buffering through a queue, or moving append-heavy workloads to Cassandra.
- Any real production traffic: run 3-5 app instances behind a load balancer for resiliency, even if one instance could carry the load.
- Messaging beyond ~50k msgs/s: RabbitMQ's comfortable range ends around there; reach for Kafka.
- Read-heavy hot paths: add a Redis cache before sharding the database.

The key is not “use distributed tech early.” The key is “introduce complexity only when the numbers justify it.”


How to Use These Numbers in an Interview

Step 1 - Estimate load

Example:

10M users
10 requests/user/day
= 100M requests/day
= ~1,200 RPS average
Peak factor 3x -> ~3,600 RPS peak
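
The arithmetic above can be mechanized as a back-of-envelope helper (a sketch; the 10M users, 10 requests/user/day, and 3x peak factor are this example's assumptions, not universal constants):

```python
# Back-of-envelope load estimate: daily request volume -> average and peak RPS.
SECONDS_PER_DAY = 86_400

def estimate_rps(users: int, requests_per_user_per_day: float,
                 peak_factor: float = 3.0) -> tuple[float, float]:
    daily_requests = users * requests_per_user_per_day
    avg_rps = daily_requests / SECONDS_PER_DAY
    return avg_rps, avg_rps * peak_factor

avg, peak = estimate_rps(10_000_000, 10)  # 10M users, 10 req/user/day
print(f"avg ~{avg:,.0f} RPS, peak ~{peak:,.0f} RPS")  # avg ~1,157 RPS, peak ~3,472 RPS
```

The exact average is ~1,157 RPS; rounding up to ~1,200 in the interview keeps the mental math honest and conservative.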

Step 2 - Map load to components

3,600 RPS at the app tier:
a single instance might handle it, but offers no resiliency
-> use 3-5 app instances behind a load balancer

If each request does one DB read:
~3,600 QPS to DB
-> fits Postgres read capacity (with indexing)
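
Step 2 can be written as a quick sanity check (a sketch; `APP_INSTANCE_RPS` is an assumed conservative figure from the low end of the 1k-5k "with DB calls" range, and the 10k read cap is the bottom of the Postgres simple-read range above):

```python
import math

PEAK_RPS = 3_600
APP_INSTANCE_RPS = 1_000   # conservative end of the 1k-5k with-DB-calls range
MIN_INSTANCES = 3          # resiliency floor, independent of load

# Instance count: whichever is larger, the load requirement or the resiliency floor.
needed = max(math.ceil(PEAK_RPS / APP_INSTANCE_RPS), MIN_INSTANCES)
print(needed, "app instances behind the LB")  # 4 app instances behind the LB

# One DB read per request -> DB QPS equals app-tier peak RPS.
db_read_qps = PEAK_RPS * 1
assert db_read_qps <= 10_000  # within Postgres's 10k-50k simple-read range
```

Picking the conservative end of each range builds headroom into the design without extra argument.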

Step 3 - Find bottlenecks

If writes grow to 15k/s sustained:
likely beyond comfortable single Postgres write profile
-> shard, queue, or move specific workloads to Cassandra
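
The same style of check flags the write bottleneck (a sketch; the 10k/s threshold is the top of the comfortable 5k-10k Postgres write range from the table, and the function name is illustrative):

```python
POSTGRES_WRITE_CAP = 10_000   # top of the comfortable 5k-10k writes/s range

def write_plan(sustained_writes_per_s: int) -> str:
    """Return a rough scaling recommendation for a given sustained write rate."""
    if sustained_writes_per_s <= POSTGRES_WRITE_CAP:
        return "single Postgres node is fine"
    return "shard, buffer through a queue, or move append workloads to Cassandra"

print(write_plan(3_600))   # -> single Postgres node is fine
print(write_plan(15_000))  # -> shard/queue/Cassandra path
```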

Step 4 - State migration path

Interviewers like this language:

“At current scale I’d keep Postgres for simplicity and consistency. If sustained writes cross 10k+/s, I would introduce write partitioning and event buffering, then move high-throughput append workloads to Cassandra while retaining Postgres for transactional correctness.”


Caveats You Should Always Say Out Loud

- These are estimation ranges, not vendor guarantees.
- Real throughput depends on hardware, query shape, payload size, indexing, replication, and tuning.
- Averages hide peaks: design for peak load, not mean load.
- Benchmarks on your actual workload beat any cheat sheet.

If you state these caveats alongside concrete ranges, your answer sounds practical and senior.