System Design: Notification System

Introduction

A notification system looks simple on the surface. In practice it touches fan-out scaling, channel routing, idempotency, real-time infrastructure, and hard trade-offs between complexity and cost. Every lesson from Phases 1 and 2 converges here.


Step 1: Requirements

Functional:

Send notifications across push, email, SMS, and in-app channels
Respect per-user preferences, disabled notification types, and Do Not Disturb hours
Support transactional (OTP, order updates) and non-transactional (social, marketing) notifications
Track delivery status and read/unread state

Non-functional:

Scale to billions of notifications/day with large fan-out spikes
Seconds-level latency for critical notifications like OTPs
No lost notifications, no duplicates
Channel isolation: one slow provider must not stall the others


Step 2: Scale Estimation

100M daily active users
20 notifications/user/day
= 2 billion notifications/day
= ~23,000 notifications/second normally

Peak (IPL, flash sale): could spike to millions/second

This immediately tells you:

Synchronous sending is impossible: you need queues and async workers
The write path dominates, so storage must handle heavy, append-style writes
Capacity must absorb spikes orders of magnitude above the baseline


Step 3: The Core Challenge — Fan-Out

Fan-out means one event triggers notifications to many users.

Virat Kohli posts a photo
→ 250 million followers
→ 250 million notifications in seconds

Wrong approach:

Post created
→ Fetch all 250M followers from DB (single query, kills DB)
→ Loop through each, send one by one
→ Takes hours

Right approach:

Post created
→ Publish ONE event to Kafka
→ Notification workers consume from Kafka
→ Workers fetch followers in batches of 1,000
→ Thousands of workers process in parallel
→ 250M notifications sent in seconds
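A minimal sketch of the worker side, assuming kafkajs as the client. fetchFollowerBatch and enqueueNotification are hypothetical helpers standing in for the follower store and the channel router:

```ts
// Fan-out worker sketch using kafkajs (assumed client; any Kafka client works).
import { Kafka } from "kafkajs";

// Hypothetical helpers, not real APIs:
declare function fetchFollowerBatch(
  userId: string,
  cursor: string | null,
  limit: number
): Promise<{ followers: string[]; next: string | null }>;
declare function enqueueNotification(recipientId: string, event: object): Promise<void>;

const kafka = new Kafka({ clientId: "notif-worker", brokers: ["kafka:9092"] });
const consumer = kafka.consumer({ groupId: "fanout-workers" });

async function run() {
  await consumer.connect();
  await consumer.subscribe({ topic: "notification.events" });

  await consumer.run({
    eachMessage: async ({ message }) => {
      const event = JSON.parse(message.value!.toString());
      // Page through followers 1,000 at a time, never one giant query.
      let cursor: string | null = null;
      do {
        const page = await fetchFollowerBatch(event.senderId, cursor, 1000);
        await Promise.all(page.followers.map((f) => enqueueNotification(f, event)));
        cursor = page.next;
      } while (cursor !== null);
    },
  });
}

run();
```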

Step 4: Architecture

Layer 1: Event Sources

All application services publish notification events to Kafka:

Order Service → "order.shipped" event
Social Service → "post.liked" event
Auth Service → "new.login" event
Scheduler → "weekly.summary.due" event

    [Kafka]
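The producing side stays trivial by design: each service emits one small event and moves on. A sketch, again assuming kafkajs:

```ts
// Producer sketch (kafkajs assumed): the Order Service publishes one event and
// is done. It never touches followers, preferences, or delivery channels.
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "order-service", brokers: ["kafka:9092"] });
const producer = kafka.producer();

export async function publishOrderShipped(orderId: string, recipientId: string) {
  await producer.connect();
  await producer.send({
    topic: "notification.events",
    messages: [{
      key: recipientId, // same key, same partition: per-user ordering
      value: JSON.stringify({ type: "ORDER_SHIPPED", orderId, recipientId }),
    }],
  });
}
```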

Layer 2: Notification Service (The Brain)

Consumes Kafka events and makes decisions:

Event: { type: "POST_LIKED", senderId: "u123", recipientId: "u456" }

Notification Service:
→ Fetch user u456 preferences (Redis cache)
→ Disabled like notifications? → skip
→ DND hours active? → delay
→ Preferred channels? → push + email
→ Build content
→ Route to channel queues
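The same chain as a sketch, assuming ioredis for the preference cache; the declared helpers are hypothetical:

```ts
// Routing sketch for the Notification Service.
import Redis from "ioredis";

interface Prefs {
  disabledTypes: string[];
  channels: string[]; // e.g. ["push", "email"]
  dndStartHour: number;
  dndEndHour: number;
  timezone: string;
}

// Hypothetical helpers, not real APIs:
declare function loadPrefsFromDb(userId: string): Promise<Prefs>;
declare function inDndWindow(prefs: Prefs): boolean;
declare function scheduleAfterDnd(event: object): Promise<void>;
declare function buildContent(event: object): string;
declare function routeToChannel(channel: string, userId: string, content: string): Promise<void>;

const redis = new Redis();

async function handleEvent(event: { type: string; senderId: string; recipientId: string }) {
  const cached = await redis.get(`prefs:${event.recipientId}`);
  const prefs: Prefs = cached ? JSON.parse(cached) : await loadPrefsFromDb(event.recipientId);

  if (prefs.disabledTypes.includes(event.type)) return;   // user opted out: skip
  if (inDndWindow(prefs)) return scheduleAfterDnd(event);  // DND: delay, not discard

  const content = buildContent(event);
  for (const channel of prefs.channels) {
    await routeToChannel(channel, event.recipientId, content); // push + email, etc.
  }
}
```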

Layer 3: Channel Queues (Separate Per Channel)

Notification Service
    ↓         ↓         ↓         ↓
Push Queue  Email Queue  SMS Queue  In-App Queue
    ↓         ↓         ↓         ↓
Push Worker Email Worker SMS Worker In-App Worker
    ↓         ↓         ↓         ↓
 APNs/FCM  SendGrid   Twilio    Cassandra

Why separate queues per channel:

Isolation: a SendGrid outage must not back up push notifications
Independent scaling: push volume dwarfs SMS, so worker pools are sized separately
Per-provider behaviour: rate limits, retry policies, and latencies differ across APNs, SendGrid, and Twilio

Kafka vs RabbitMQ Decision

Both work for the channel queues. Many companies use Kafka end-to-end:

Kafka: "notification.push"
Kafka: "notification.email"
Kafka: "notification.sms"

Push consumer group → each notification processed by exactly one worker

Use RabbitMQ if you need native priority queues or per-message TTL (both awkward to emulate in Kafka). Use Kafka if your team already runs it and you want one consistent system.

There is no single correct answer. The key is justifying your choice:

“Kafka end-to-end — consumer groups hand each message to one worker per channel, idempotency keys make delivery effectively exactly-once, one consistent technology, handles our scale”

“Kafka at ingestion for fan-out, RabbitMQ at channel layer for native priority queues and simpler retry config”

Both are valid.


Step 5: Data Model

Notifications table — Cassandra:

id            → unique notification ID
recipient_id  → partition key
created_at    → clustering key (time-series access pattern)
type          → "POST_LIKED", "ORDER_SHIPPED"
channel       → "push", "email", "sms", "in_app"
status        → "pending", "sent", "failed", "read"
metadata      → JSON

Cassandra fits: 2B writes/day, time series, access pattern is “all notifications for user u456 sorted by time.”
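A sketch of the write path with the Node cassandra-driver; the keyspace name is illustrative:

```ts
// Write-path sketch using cassandra-driver (assumed). The statement mirrors
// the table above: recipient_id partitions, created_at clusters by time.
import { Client, types } from "cassandra-driver";

const client = new Client({
  contactPoints: ["cassandra:9042"],
  localDataCenter: "dc1",
  keyspace: "notifications", // illustrative keyspace name
});

async function saveNotification(
  recipientId: string,
  type: string,
  channel: string,
  metadata: Record<string, unknown>
) {
  await client.execute(
    `INSERT INTO notifications
       (id, recipient_id, created_at, type, channel, status, metadata)
     VALUES (?, ?, toTimestamp(now()), ?, ?, 'pending', ?)`,
    [types.TimeUuid.now(), recipientId, type, channel, JSON.stringify(metadata)],
    { prepare: true } // prepared statements handle CQL type mapping correctly
  );
}
```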

User preferences — PostgreSQL:

user_id           → FK to users
push_enabled      → boolean
email_enabled     → boolean
dnd_start_hour    → 22
dnd_end_hour      → 8
disabled_types    → ["MARKETING"]
timezone          → "Asia/Kolkata"

PostgreSQL fits: relational (tied to user accounts), mutable, low volume, ACID for preference updates.

Cache preferences in Redis:

Key: "prefs:u456"
TTL: 1 hour

2B notifications/day hitting PostgreSQL for preference lookups would be a disaster. Read from Redis first and fall back to PostgreSQL only on a miss.
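A minimal cache-aside sketch, assuming ioredis and node-postgres; the table name is illustrative:

```ts
// Cache-aside: Redis first, PostgreSQL only on a miss, repopulate with 1h TTL.
import Redis from "ioredis";
import { Pool } from "pg";

const redis = new Redis();
const pg = new Pool();

async function getPrefs(userId: string): Promise<Record<string, unknown>> {
  const key = `prefs:${userId}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // fast path: serves almost all traffic

  // Miss: read PostgreSQL, then cache the row for the next hour.
  const { rows } = await pg.query(
    "SELECT * FROM user_preferences WHERE user_id = $1", // illustrative table
    [userId]
  );
  await redis.set(key, JSON.stringify(rows[0]), "EX", 3600);
  return rows[0];
}
```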


Step 6: Hard Problems

Idempotency — No Duplicate Notifications

In distributed systems, failures cause retries. Retry = potential duplicate.

Worker sends push → APNs delivers ✅
Network timeout before confirmation → worker retries
User gets notification twice ❌

Fix — idempotency key:

notificationId = hash(recipientId + eventType + eventId)

Before sending:
→ Check Redis: "sent:{notificationId}" exists?
→ Yes → already sent, skip
→ No → send, set "sent:{notificationId}" in Redis (24hr TTL)

Same event processed multiple times → delivered exactly once.
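One subtlety: a separate "check, then set" leaves a race window in which two concurrent workers both pass the check. An atomic SET ... NX EX closes it. A sketch with ioredis; deliverPush is a hypothetical helper:

```ts
// Idempotency sketch (ioredis assumed). SET ... NX EX atomically claims the
// key, closing the race a separate check-then-set would leave open.
import Redis from "ioredis";
import { createHash } from "node:crypto";

declare function deliverPush(recipientId: string, payload: string): Promise<void>; // hypothetical

const redis = new Redis();

async function sendOnce(recipientId: string, eventType: string, eventId: string, payload: string) {
  const notificationId = createHash("sha256")
    .update(`${recipientId}:${eventType}:${eventId}`)
    .digest("hex");

  // "OK" if we claimed the key; null if another worker already did.
  const claimed = await redis.set(`sent:${notificationId}`, "1", "EX", 86400, "NX");
  if (claimed !== "OK") return; // already sent or in flight: skip

  try {
    await deliverPush(recipientId, payload);
  } catch (err) {
    await redis.del(`sent:${notificationId}`); // release the claim so a retry can run
    throw err;
  }
}
```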

Do Not Disturb

Notification arrives at 11 PM, user has DND 10 PM–8 AM:
→ Don't discard — delay it
→ Store in "scheduled_notifications"
→ Scheduler job runs at 8 AM, sends delayed notifications
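The check itself is mostly timezone handling, since DND windows usually wrap midnight. A sketch using the built-in Intl API:

```ts
// DND window check sketch. The second branch handles windows that wrap
// midnight, like 22:00 → 08:00.
function inDndWindow(
  startHour: number,
  endHour: number,
  timezone: string,
  now: Date = new Date()
): boolean {
  const hour = Number(
    new Intl.DateTimeFormat("en-US", {
      hour: "2-digit",
      hourCycle: "h23",
      timeZone: timezone,
    }).format(now)
  );
  return startHour <= endHour
    ? hour >= startHour && hour < endHour  // e.g. 13:00 → 15:00
    : hour >= startHour || hour < endHour; // e.g. 22:00 → 08:00, wraps midnight
}

// Example: with DND 22:00 to 08:00 in "Asia/Kolkata", a call at 11 PM IST
// returns true, so the notification is delayed until morning.
```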

Transactional notifications override DND:

OTPs and security alerts: a login code can't wait until 8 AM
Order and payment updates: the user initiated these and expects them immediately

Third Party Failures

Worker attempts send
→ APNs/SendGrid/Twilio returns error
→ Exponential backoff retry:
   Retry 1 → wait 1 min
   Retry 2 → wait 5 min
   Retry 3 → wait 30 min
   Retry 4 → Dead Letter Queue → alert engineers
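A sketch of that schedule. sendViaProvider and publishToDlq are hypothetical, and in a real queue-based worker the "wait" would be a delayed redelivery rather than an in-process sleep:

```ts
// Retry sketch matching the schedule above: 1 min, 5 min, 30 min, then DLQ.
const BACKOFF_MINUTES = [1, 5, 30];

// Hypothetical helpers:
declare function sendViaProvider(payload: object): Promise<void>;
declare function publishToDlq(payload: object, lastError: unknown): Promise<void>;

async function sendWithRetry(payload: object): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await sendViaProvider(payload);
    } catch (err) {
      if (attempt >= BACKOFF_MINUTES.length) {
        await publishToDlq(payload, err); // give up: park it and alert engineers
        return;
      }
      await new Promise((r) => setTimeout(r, BACKOFF_MINUTES[attempt] * 60_000));
    }
  }
}
```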

Priority Queues

OTP for login → Critical — seconds matter
Order confirmed → High — send quickly
Post like → Normal — minutes acceptable
Weekly summary → Low — any time today

Separate queues per priority. Most workers assigned to Critical.
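One way to express the tiers; topic names and worker counts here are assumptions:

```ts
// Illustrative priority tiers. Topic names and worker counts are assumptions.
const PRIORITY_TIERS = {
  critical: { topic: "notification.push.critical", workers: 50 }, // OTPs
  high:     { topic: "notification.push.high",     workers: 25 }, // order updates
  normal:   { topic: "notification.push.normal",   workers: 15 }, // social
  low:      { topic: "notification.push.low",      workers: 5  }, // digests
} as const;
```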


Full Architecture

[Event Sources]
Order / Social / Auth Services

      [Kafka]
   "notification.events"

[Notification Service]
  Checks Redis prefs cache
  Checks DND
  Builds content
  Routes to channel queues

[Channel Queues]
Push | Email | SMS | In-App

[Channel Workers]
APNs/FCM | SendGrid | Twilio | DB

[Storage]
Cassandra → notification history
PostgreSQL → user preferences
Redis → preference cache + idempotency keys

Feature Extension: Notification Grouping

The ask: show “John, Mary and 2 others liked your post” instead of 4 separate notifications.

Redis aggregation window:

Like arrives
→ Don't send immediately
→ Redis key: "notif:group:{postId}:{recipientId}"
  Value: ["John", "Mary", "Alex"]
  TTL: 30 seconds

New like:
→ Key exists? → append to list, reset TTL
→ Key already expired? → start a new aggregation window

When the TTL expires, a worker listening for Redis key-expiry events sends one grouped notification: "John, Mary and 1 other liked your post"
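One gotcha: by the time Redis fires a key-expiry event, the key's value is already gone. The usual workaround is a pair of keys: a persistent data key holding the names, plus an empty marker key carrying the TTL. A sketch with ioredis (sendGrouped is hypothetical; requires the Redis config notify-keyspace-events "Ex"):

```ts
// Grouping sketch with ioredis. The liker names live in a persistent data key;
// a separate empty marker key carries the 30-second TTL, because an expired
// key's value is unreadable by the time the expiry event fires.
import Redis from "ioredis";

declare function sendGrouped(postId: string, recipientId: string, names: string[]): Promise<void>;

const redis = new Redis();
const subscriber = new Redis(); // a subscribing connection can't run other commands

async function recordLike(postId: string, recipientId: string, likerName: string) {
  await redis.rpush(`notif:group:data:${postId}:${recipientId}`, likerName);
  await redis.set(`notif:group:ttl:${postId}:${recipientId}`, "1", "EX", 30); // resets the window
}

async function listenForExpiry() {
  await subscriber.subscribe("__keyevent@0__:expired");
  subscriber.on("message", async (_channel, expiredKey) => {
    if (!expiredKey.startsWith("notif:group:ttl:")) return;
    const [, , , postId, recipientId] = expiredKey.split(":");
    const dataKey = `notif:group:data:${postId}:${recipientId}`;
    const names = await redis.lrange(dataKey, 0, -1);
    await redis.del(dataKey);
    if (names.length > 0) await sendGrouped(postId, recipientId, names);
  });
}
```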

Never group:

OTPs: each code must arrive on its own, instantly
Security alerts: "3 new logins to your account" buries the urgency
Order updates: "shipped" and "delivered" are distinct events the user tracks individually


Feature Extension: Real-Time Unread Counter

The ask: bell icon updates in real time without page refresh.

Option A: SSE + Redis Pub/Sub

Notification arrives for u456
→ Increment Redis counter: "unread:u456"
→ Publish to Redis Pub/Sub: channel "user:u456"
→ SSE Server holding u456's connection receives signal
→ Pushes to browser → bell icon updates instantly
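A sketch of the server side, assuming Express and ioredis. For clarity it opens one Redis subscriber per connection; production code would share one subscriber and fan out to local connections:

```ts
// SSE endpoint sketch (Express + ioredis assumed).
import express from "express";
import Redis from "ioredis";

const app = express();

app.get("/api/notifications/stream", async (req, res) => {
  const userId = String(req.query.userId); // real systems derive this from auth
  res.set({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  res.flushHeaders();

  const sub = new Redis();
  await sub.subscribe(`user:${userId}`);
  sub.on("message", (_channel, message) => {
    res.write(`data: ${message}\n\n`); // one SSE frame per signal
  });

  req.on("close", () => {
    sub.quit(); // drop the subscription when the browser disconnects
  });
});

app.listen(3000);
```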

Redis Pub/Sub vs Kafka:

|                     | Redis Pub/Sub        | Kafka                |
| ------------------- | -------------------- | -------------------- |
| Message persistence | No — fire and forget | Yes — stored on disk |
| Offline consumers   | Message lost         | Message waits        |
| Replay              | No                   | Yes                  |
| Latency             | Sub-millisecond      | Low but higher       |
| Use case            | Real-time signaling  | Event streaming      |

Redis Pub/Sub is right here because:

The signal is disposable: if the user is offline, the counter is simply re-read from Redis on reconnect
Sub-millisecond latency makes the bell feel instant
No replay or persistence needed: this is real-time signaling, not event streaming

The scaling problem: 100M users = 100M concurrent SSE connections. At roughly 50,000 connections per server, that's 2,000 servers at ~$300K/month, just for a bell icon counter.

Option B: Polling (Often the Right Answer)

Client polls every 30 seconds:
GET /api/notifications/count
→ App server reads Redis "unread:u456"
→ Returns count in <1ms
→ Bell icon updates
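The entire client, as a sketch; "#bell-badge" is a hypothetical DOM element:

```ts
// Client-side polling sketch: one cheap request every 30 seconds.
async function refreshUnreadBadge(): Promise<void> {
  const res = await fetch("/api/notifications/count");
  const { count } = await res.json();
  const badge = document.querySelector("#bell-badge"); // hypothetical element
  if (badge) badge.textContent = count > 0 ? String(count) : "";
}

setInterval(refreshUnreadBadge, 30_000);
refreshUnreadBadge(); // and once on page load
```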

Why polling often wins for this specific feature:

A counter that is up to 30 seconds stale is imperceptible to users
One Redis read per user per 30 seconds is cheap, predictable load
App servers stay stateless: no connection bookkeeping, trivial load balancing
It avoids the 2,000-server SSE fleet entirely

When to use SSE instead of polling:

Sub-second updates are a genuine product requirement, not a nice-to-have
The poll interval needed to feel instant would cost more than holding the connections

When to use push notifications (mobile):

The app is backgrounded or closed: APNs/FCM deliver without your servers holding a connection to the device

Start with the simplest solution that meets the requirement. Add complexity only when the simple solution genuinely breaks.

For a notification counter: polling is simpler, cheaper, and sufficient. Build SSE only if polling proves inadequate.


Everything Connects

| Component                                        | Lesson                    |
| ------------------------------------------------ | ------------------------- |
| Kafka fan-out                                    | Lesson 8 — Message Queues |
| Channel workers with RabbitMQ/Kafka              | Lesson 8                  |
| Redis preference cache                           | Lesson 6 — Caching        |
| Redis idempotency keys                           | Lesson 6                  |
| Redis Pub/Sub for SSE routing                    | Lesson 6                  |
| Cassandra for notification history               | Lesson 7 — Databases      |
| PostgreSQL for user preferences                  | Lesson 7                  |
| Load balancer across notification servers        | Lesson 5                  |
| AP for notification history, CP for idempotency  | Lesson 3 — CAP            |

Key Takeaways

  1. Fan-out is the hardest scaling problem — one event, millions of recipients. Kafka + batched workers solves it.
  2. Separate queues per channel — failures and slowdowns in one channel don’t affect others.
  3. Idempotency keys in Redis — the simplest way to prevent duplicate delivery.
  4. DND means delay, not discard. Transactional notifications always bypass DND.
  5. Redis Pub/Sub = fire and forget signaling. Kafka = guaranteed delivery with replay. Don’t confuse them.
  6. Real-time unread counter: polling + Redis counter is often the right answer. SSE is only worth the infrastructure cost when sub-second updates are a genuine requirement.

Part of the system design series. Next: designing a feed system — how 500 million users get a personalised feed in under 100ms.