Introduction
A notification system looks simple on the surface. In practice it touches fan-out scaling, channel routing, idempotency, real-time infrastructure, and hard decisions about complexity versus cost. Every lesson from Phases 1 and 2 converges here.
Step 1: Requirements
Functional:
- Send notifications across channels: push (mobile), email, SMS, in-app
- Trigger types: user-triggered (like, follow), system-triggered (order shipped), scheduled (weekly summary)
- User preferences: disable notification types, choose channels, set Do Not Disturb hours
- Notification history: user can view past notifications
Non-functional:
- High throughput — millions/second during peak events
- Low latency — notifications arrive within seconds
- Reliability — transactional notifications (OTP, order confirmed) must never be lost
- Idempotency — same notification never sent twice
Step 2: Scale Estimation
100M daily active users
20 notifications/user/day
= 2 billion notifications/day
= ~23,000 notifications/second normally
Peak (IPL, flash sale): could spike to millions/second
This immediately tells you:
- Synchronous processing will collapse — async is mandatory
- Message queues are the backbone
- Must handle massive bursts (fan-out)
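A quick back-of-envelope check of those numbers (a sketch; the inputs are just the assumptions stated above):

```python
# Back-of-envelope throughput check for the figures above.
DAU = 100_000_000                      # daily active users (assumption above)
NOTIFS_PER_USER = 20                   # notifications per user per day (assumption above)

daily = DAU * NOTIFS_PER_USER          # 2,000,000,000 per day
steady = daily / 86_400                # seconds in a day
print(f"{daily:,}/day ≈ {steady:,.0f}/s steady state")   # ~23,148/s
```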
Step 3: The Core Challenge — Fan-Out
Fan-out means one event triggers notifications to many users.
Virat Kohli posts a photo
→ 250 million followers
→ 250 million notifications in seconds
Wrong approach:
Post created
→ Fetch all 250M followers from DB (single query, kills DB)
→ Loop through each, send one by one
→ Takes hours
Right approach:
Post created
→ Publish ONE event to Kafka
→ Notification workers consume from Kafka
→ Workers fetch followers in batches of 1,000
→ Thousands of workers process in parallel
→ 250M notifications sent in seconds
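A minimal sketch of that batched fan-out worker, assuming kafka-python, the topic names used later in this post, and a hypothetical paginated follower lookup:

```python
import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("post.events", group_id="fanout-workers",
                         bootstrap_servers="kafka:9092")
producer = KafkaProducer(bootstrap_servers="kafka:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

def follower_batches(author_id, batch_size=1_000):
    # Hypothetical paginated read of the follower store; a real version pages
    # through followers batch_size at a time instead of one giant query.
    yield from ()

for msg in consumer:                        # one post event in...
    event = json.loads(msg.value)
    for batch in follower_batches(event["authorId"]):
        for follower_id in batch:           # ...millions of notification events out
            producer.send("notification.events", {
                "type": "POST_CREATED",
                "recipientId": follower_id,
                "postId": event["postId"],
            })
```

Because all fan-out workers share one consumer group, Kafka spreads the partitions across them and the expansion runs in parallel.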
Step 4: Architecture
Layer 1: Event Sources
All application services publish notification events to Kafka:
Order Service → "order.shipped" event
Social Service → "post.liked" event
Auth Service → "new.login" event
Scheduler → "weekly.summary.due" event
↓
[Kafka]
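What the publish side might look like from the Order Service (a sketch; the topic and field names are assumptions mirroring the examples above):

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="kafka:9092",
                         value_serializer=lambda v: json.dumps(v).encode())

# The Order Service fires one event and moves on; delivery is the
# notification pipeline's problem, not the order flow's.
producer.send("notification.events", {
    "type": "ORDER_SHIPPED",
    "recipientId": "u456",
    "orderId": "o789",
})
producer.flush()
```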
Layer 2: Notification Service (The Brain)
Consumes Kafka events and makes decisions:
Event: { type: "POST_LIKED", senderId: "u123", recipientId: "u456" }
Notification Service:
→ Fetch user u456 preferences (Redis cache)
→ Disabled like notifications? → skip
→ DND hours active? → delay
→ Preferred channels? → push + email
→ Build content
→ Route to channel queues
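A sketch of that decision flow as one function, assuming the Redis key layout above; the TRANSACTIONAL set and the channel defaults are illustrative:

```python
import json
import redis

r = redis.Redis(decode_responses=True)
TRANSACTIONAL = {"OTP", "ORDER_SHIPPED"}       # assumed always-send types

def decide(event: dict, now_hour: int) -> list[tuple[str, dict]]:
    """Return (channel, payload) pairs to enqueue; [] means skip or delay."""
    prefs = json.loads(r.get(f"prefs:{event['recipientId']}") or "{}")

    if event["type"] in prefs.get("disabled_types", []):
        return []                              # user disabled this type: skip
    start, end = prefs.get("dnd_start_hour"), prefs.get("dnd_end_hour")
    if start is not None:
        in_dnd = ((start <= now_hour < end) if start < end
                  else (now_hour >= start or now_hour < end))
        if in_dnd and event["type"] not in TRANSACTIONAL:
            return []                          # DND: write to scheduled_notifications instead
    payload = {"recipientId": event["recipientId"], "type": event["type"]}
    return [(ch, payload) for ch in prefs.get("channels", ["push"])]
```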
Layer 3: Channel Queues (Separate Per Channel)
Notification Service
↓ ↓ ↓ ↓
Push Queue Email Queue SMS Queue In-App Queue
↓ ↓ ↓ ↓
Push Worker Email Worker SMS Worker In-App Worker
↓ ↓ ↓ ↓
APNs/FCM SendGrid Twilio Cassandra
Why separate queues per channel:
- SMS is slow — don’t let it block push delivery
- Email provider down → push and SMS still work
- Each channel scales independently
- A push spike doesn’t affect email throughput
Kafka vs RabbitMQ Decision
Both work for the channel queues. Many companies use Kafka end-to-end:
Kafka: "notification.push"
Kafka: "notification.email"
Kafka: "notification.sms"
Push consumer group → each notification processed by exactly one worker
Use RabbitMQ if you need native priority queues or per-message TTL (both complex in Kafka). Use Kafka if team already uses it and you want one consistent system.
There is no single correct answer. The key is justifying your choice:
“Kafka end-to-end — consumer groups route each message to exactly one worker per channel, consistent technology, handles our scale”
“Kafka at ingestion for fan-out, RabbitMQ at channel layer for native priority queues and simpler retry config”
Both are valid.
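Assuming the Kafka-end-to-end option, a push worker is just a consumer in a shared group; partition assignment is what puts each message on exactly one worker. Topic and group names here are assumptions:

```python
import json
from kafka import KafkaConsumer

def deliver_push(payload: dict) -> None:
    ...   # placeholder for the real APNs/FCM call

consumer = KafkaConsumer(
    "notification.push",
    group_id="push-workers",             # all push workers share the partitions
    bootstrap_servers="kafka:9092",
    enable_auto_commit=False,            # commit only after a successful send
    value_deserializer=lambda v: json.loads(v),
)

for msg in consumer:
    deliver_push(msg.value)
    consumer.commit()
```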
Step 5: Data Model
Notifications table — Cassandra:
id → unique notification ID
recipient_id → partition key
created_at → clustering key (time-series access: each user's notifications sorted by time)
type → "POST_LIKED", "ORDER_SHIPPED"
channel → "push", "email", "sms", "in_app"
status → "pending", "sent", "failed", "read"
metadata → JSON
Cassandra fits: 2B writes/day, time series, access pattern is “all notifications for user u456 sorted by time.”
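One way that table might be expressed in CQL via the DataStax Python driver (keyspace and table names are assumptions):

```python
from cassandra.cluster import Cluster

session = Cluster(["cassandra"]).connect("notifications")
session.execute("""
    CREATE TABLE IF NOT EXISTS notification_history (
        recipient_id text,
        created_at   timeuuid,        -- timeuuid avoids same-millisecond collisions
        type         text,
        channel      text,
        status       text,
        metadata     text,
        PRIMARY KEY ((recipient_id), created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
""")

# The access pattern above: all notifications for u456, newest first.
rows = session.execute(
    "SELECT * FROM notification_history WHERE recipient_id = %s LIMIT 50", ("u456",))
```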
User preferences — PostgreSQL:
user_id → FK to users
push_enabled → boolean
email_enabled → boolean
dnd_start_hour → 22
dnd_end_hour → 8
disabled_types → ["MARKETING"]
timezone → "Asia/Kolkata"
PostgreSQL fits: relational (tied to user accounts), mutable, low volume, ACID for preference updates.
Cache preferences in Redis:
Key: "prefs:u456"
TTL: 1 hour
2B notifications/day hitting PostgreSQL for preferences = disaster. Read from Redis first.
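A cache-aside sketch of that read path (key layout follows the text; the SQL and connection details are assumptions):

```python
import json
import redis
import psycopg2

r = redis.Redis(decode_responses=True)
pg = psycopg2.connect("dbname=notifications")
FIELDS = ["push_enabled", "email_enabled", "dnd_start_hour",
          "dnd_end_hour", "disabled_types", "timezone"]

def get_prefs(user_id: str) -> dict:
    cached = r.get(f"prefs:{user_id}")
    if cached:
        return json.loads(cached)                  # hot path: Redis only
    with pg.cursor() as cur:                       # miss: fall back to PostgreSQL
        cur.execute(f"SELECT {', '.join(FIELDS)} FROM user_preferences"
                    " WHERE user_id = %s", (user_id,))
        row = cur.fetchone()
    prefs = dict(zip(FIELDS, row)) if row else {}
    r.set(f"prefs:{user_id}", json.dumps(prefs), ex=3600)   # 1 hour TTL, as above
    return prefs
```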
Step 6: Hard Problems
Idempotency — No Duplicate Notifications
In distributed systems, failures cause retries. Retry = potential duplicate.
Worker sends push → APNs delivers ✅
Network timeout before confirmation → worker retries
User gets notification twice ❌
Fix — idempotency key:
notificationId = hash(recipientId + eventType + eventId)
Before sending:
→ Check Redis: "sent:{notificationId}" exists?
→ Yes → already sent, skip
→ No → send, set "sent:{notificationId}" in Redis (24hr TTL)
Same event processed multiple times → delivered exactly once.
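A sketch of that check; the hash inputs and key layout follow the text, and the deliver() callable is a placeholder. Using SET NX makes the check-and-record step a single atomic call:

```python
import hashlib
import redis

r = redis.Redis()

def send_once(recipient_id: str, event_type: str, event_id: str, deliver) -> None:
    nid = hashlib.sha256(f"{recipient_id}:{event_type}:{event_id}".encode()).hexdigest()
    key = f"sent:{nid}"
    # SET NX both checks and records the idempotency key atomically,
    # closing the gap between "check" and "set" in the flow above.
    if not r.set(key, 1, nx=True, ex=86_400):     # 24h TTL
        return                                    # already sent: skip
    try:
        deliver()
    except Exception:
        r.delete(key)                             # release so the retry path can run
        raise
```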
Do Not Disturb
Notification arrives at 11 PM, user has DND 10 PM–8 AM:
→ Don't discard — delay it
→ Store in "scheduled_notifications"
→ Scheduler job runs at 8 AM, sends delayed notifications
Transactional notifications override DND:
- OTP → send immediately regardless
- Order confirmed → send immediately
- Marketing email → respect DND
- Like notification → respect DND
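A sketch of the DND window check in the user's own timezone (field names follow the preferences table above; note the window usually wraps past midnight, e.g. 22 → 8):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def in_dnd_window(prefs: dict, now_utc: datetime) -> bool:
    """now_utc must be timezone-aware; prefs comes from the preferences table."""
    local_hour = now_utc.astimezone(ZoneInfo(prefs["timezone"])).hour
    start, end = prefs["dnd_start_hour"], prefs["dnd_end_hour"]
    if start < end:                                    # e.g. 13 → 15, same day
        return start <= local_hour < end
    return local_hour >= start or local_hour < end     # wraps midnight: 22 → 8

# Transactional types skip this check entirely; anything else that lands in the
# window goes to scheduled_notifications and is sent when DND ends.
```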
Third Party Failures
Worker attempts send
→ APNs/SendGrid/Twilio returns error
→ Exponential backoff retry:
Retry 1 → wait 1 min
Retry 2 → wait 5 min
Retry 3 → wait 30 min
Retry 4 → Dead Letter Queue → alert engineers
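A sketch of that schedule; in a real worker you re-enqueue the message with the delay (or use a delay queue) rather than sleeping in the consumer:

```python
RETRY_DELAYS_SECONDS = [60, 300, 1_800]    # 1 min, 5 min, 30 min

def next_action(attempt: int) -> tuple[str, int]:
    """Given how many sends have already failed, decide what happens next."""
    if attempt < len(RETRY_DELAYS_SECONDS):
        return "retry", RETRY_DELAYS_SECONDS[attempt]
    return "dead_letter", 0                # park the message and page a human

# attempt 0 fails -> ("retry", 60); attempt 3 fails -> ("dead_letter", 0)
```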
Priority Queues
OTP for login → Critical — seconds matter
Order confirmed → High — send quickly
Post like → Normal — minutes acceptable
Weekly summary → Low — any time today
Separate queues per priority. Most workers assigned to Critical.
Full Architecture
[Event Sources]
Order / Social / Auth Services
↓
[Kafka]
"notification.events"
↓
[Notification Service]
Checks Redis prefs cache
Checks DND
Builds content
Routes to channel queues
↓
[Channel Queues]
Push | Email | SMS | In-App
↓
[Channel Workers]
↓
APNs/FCM | SendGrid | Twilio | DB
↓
[Storage]
Cassandra → notification history
PostgreSQL → user preferences
Redis → preference cache + idempotency keys
Feature Extension: Notification Grouping
The ask: show “John, Mary and 2 others liked your post” instead of 4 separate notifications.
Redis aggregation window:
Like arrives
→ Don't send immediately
→ Redis key: "notif:group:{postId}:{recipientId}"
Value: ["John", "Mary", "Alex"]
TTL: 30 seconds
New like:
→ Key exists? → append the liker's name, reset the TTL (the window extends)
→ Key missing? → create it and start a new 30-second window
When the TTL finally expires, the window closes:
→ Worker triggered by the expiry → sends "John, Mary and 1 other liked your post"
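A sketch of the window bookkeeping (key layout follows the text). One practical wrinkle: an expired key's value is already gone when the expiry event fires, so implementations typically put the TTL on a separate marker key and keep the name list readable until the grouped notification is sent.

```python
import redis

r = redis.Redis(decode_responses=True)
WINDOW_SECONDS = 30

def record_like(post_id: str, recipient_id: str, liker_name: str) -> None:
    key = f"notif:group:{post_id}:{recipient_id}"
    r.rpush(key, liker_name)            # first like creates the list
    r.expire(key, WINDOW_SECONDS)       # every new like resets the window

def grouped_text(names: list[str]) -> str:
    if len(names) == 1:
        return f"{names[0]} liked your post"
    if len(names) == 2:
        return f"{names[0]} and {names[1]} liked your post"
    others = len(names) - 2
    return (f"{names[0]}, {names[1]} and {others} "
            f"other{'s' if others > 1 else ''} liked your post")

# grouped_text(["John", "Mary", "Alex"]) -> "John, Mary and 1 other liked your post"
```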
Never group:
- OTP, order updates, direct messages, security alerts — each is distinct and time-sensitive
Feature Extension: Real-Time Unread Counter
The ask: bell icon updates in real time without page refresh.
Option A: SSE + Redis Pub/Sub
Notification arrives for u456
→ Increment Redis counter: "unread:u456"
→ Publish to Redis Pub/Sub: channel "user:u456"
→ SSE Server holding u456's connection receives signal
→ Pushes to browser → bell icon updates instantly
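The publish side of Option A is only a few lines (key and channel names follow the text; the SSE server holds a matching subscription):

```python
import redis

r = redis.Redis(decode_responses=True)

def on_notification(recipient_id: str) -> None:
    unread = r.incr(f"unread:{recipient_id}")        # counter is the source of truth
    r.publish(f"user:{recipient_id}", str(unread))   # fire-and-forget signal

# On the SSE server holding u456's connection:
#   pubsub = r.pubsub(); pubsub.subscribe("user:u456")
#   for message in pubsub.listen(): push message["data"] down the open SSE stream
```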
Redis Pub/Sub vs Kafka:
| | Redis Pub/Sub | Kafka |
|---|---|---|
| Message persistence | No — fire and forget | Yes — stored on disk |
| Offline consumers | Message lost | Message waits |
| Replay | No | Yes |
| Latency | Sub-millisecond | Low but higher |
| Use case | Real-time signaling | Event streaming |
Redis Pub/Sub is right here because:
- Message loss is acceptable — the Redis counter is the source of truth
- User offline = no SSE connection = nobody to signal anyway
- Sub-millisecond latency required
The scaling problem: 100M users = 100M SSE connections = 2,000 servers (at ~50,000 connections each) at $300K/month just for a bell icon counter.
Option B: Polling (Often the Right Answer)
Client polls every 30 seconds:
GET /api/notifications/count
→ App server reads Redis "unread:u456"
→ Returns count in <1ms
→ Bell icon updates
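The whole polling path is a single key read. A minimal sketch, assuming FastAPI purely for illustration (any framework and your real auth layer would do):

```python
import redis
from fastapi import FastAPI

app = FastAPI()
r = redis.Redis(decode_responses=True)

@app.get("/api/notifications/count")
def unread_count(user_id: str):          # in practice the id comes from the session
    return {"unread": int(r.get(f"unread:{user_id}") or 0)}
```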
Why polling often wins for this specific feature:
- Redis counter already exists — polling just reads it
- No extra infrastructure, no extra cost
- 30 second delay on a notification counter is acceptable
- 100M users × poll every 30s = ~3.3M req/sec — single-key reads that a replicated Redis tier absorbs comfortably
When to use SSE instead of polling:
- Feature genuinely requires sub-second updates (live auction, live score)
- You already have WebSocket infrastructure (piggyback for free)
- Poll requests themselves become the bottleneck
When to use push notifications (mobile):
- APNs (iOS) and FCM (Android) maintain the connections
- Apple and Google handle delivery to device
- Zero SSE servers needed for mobile users
Start with the simplest solution that meets the requirement. Add complexity only when the simple solution genuinely breaks.
For a notification counter: polling is simpler, cheaper, and sufficient. Build SSE only if polling proves inadequate.
Everything Connects
| Component | Lesson |
|---|---|
| Kafka fan-out | Lesson 8 — Message Queues |
| Channel workers with RabbitMQ/Kafka | Lesson 8 |
| Redis preference cache | Lesson 6 — Caching |
| Redis idempotency keys | Lesson 6 |
| Redis Pub/Sub for SSE routing | Lesson 6 |
| Cassandra for notification history | Lesson 7 — Databases |
| PostgreSQL for user preferences | Lesson 7 |
| Load balancer across notification servers | Lesson 5 |
| AP for notification history, CP for idempotency | Lesson 3 — CAP |
Key Takeaways
- Fan-out is the hardest scaling problem — one event, millions of recipients. Kafka + batched workers solves it.
- Separate queues per channel — failures and slowdowns in one channel don’t affect others.
- Idempotency keys in Redis — the simplest way to prevent duplicate delivery.
- DND means delay, not discard. Transactional notifications always bypass DND.
- Redis Pub/Sub = fire and forget signaling. Kafka = guaranteed delivery with replay. Don’t confuse them.
- Real-time unread counter: polling + Redis counter is often the right answer. SSE is only worth the infrastructure cost when sub-second updates are a genuine requirement.
Part of the system design series. Next: designing a feed system — how 500 million users get a personalised feed in under 100ms.