Architecture
System design and component overview for personalized video feeds
Interactive Architecture Diagram
The system is designed around a request-driven architecture with intelligent caching and asynchronous event processing to meet stringent latency and scale requirements.
API Gateway
Entry point for all feed requests. Handles rate limiting, validation, and routing decisions.
Key Details
• Rate limiting: 3,000 RPS peak capacity
• Request validation (user_id, tenant_id format)
• Feature flag evaluation (per-tenant personalization)
• Authentication & authorization
• Routes to the personalized or fallback feed service (see the routing sketch below)
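A minimal sketch of that routing decision, assuming hashed user IDs and simple per-tenant feature flags; the validation patterns and function names below are illustrative, not the gateway's actual implementation.

import re

# Assumed formats: a hex-encoded user ID hash and a short lowercase tenant ID.
USER_ID_RE = re.compile(r"^[0-9a-f]{16,64}$")
TENANT_ID_RE = re.compile(r"^[a-z0-9_-]{1,64}$")

def route_feed_request(user_id_hash: str, tenant_id: str, flags: dict) -> str:
    """Validate the request, evaluate the per-tenant flag, and pick a backend."""
    if not USER_ID_RE.match(user_id_hash) or not TENANT_ID_RE.match(tenant_id):
        raise ValueError("invalid user_id or tenant_id format")
    # Per-tenant feature flag decides whether personalization is enabled.
    if flags.get(f"personalization:{tenant_id}", False):
        return "personalized-feed-service"
    return "fallback-feed-service"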
Component Descriptions
Mobile SDK
Embedded in host applications (fitness apps, cooking apps, etc.). Sends hashed user IDs and optional demographic hints. Tracks user events (views, completions, skips, likes) and sends them asynchronously to the event pipeline.
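The event payload below is a sketch of what the SDK might send; the field names and the SHA-256 hashing are assumptions, since this page only states that user IDs are hashed and events are delivered asynchronously.

import hashlib
import time

def make_event(raw_user_id: str, tenant_id: str, video_id: str, event_type: str) -> dict:
    """Build one tracking event; the raw user ID is hashed before leaving the device."""
    assert event_type in {"view", "completion", "skip", "like"}
    return {
        "user_id_hash": hashlib.sha256(raw_user_id.encode()).hexdigest(),
        "tenant_id": tenant_id,
        "video_id": video_id,
        "event_type": event_type,
        "timestamp": int(time.time()),
    }

# Events are queued locally and flushed to the event pipeline in batches, so a
# slow or failed upload never blocks the host app's UI.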
API Gateway
Entry point for all feed requests. Performs rate limiting (3k RPS peak), request validation, and feature flag checks. Routes to personalized or non-personalized service based on tenant configuration and feature flags.
Personalized Feed Service
Core personalization logic. Fetches user signals, tenant configs, and video metadata. Calls ranking engine to score and sort videos. Returns personalized feed with metadata explaining ranking reasons.
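A hypothetical response shape for such a feed; the field names are illustrative, and only the presence of per-video ranking reasons comes from the description above.

example_feed_response = {
    "tenant_id": "cooking-app",           # illustrative tenant
    "user_id_hash": "9f86d081884c7d65",   # shortened, fake hash
    "videos": [
        {"video_id": "v_101", "score": 0.87,
         "reason": "matches watch history: category=desserts"},
        {"video_id": "v_204", "score": 0.79,
         "reason": "editorial boost set in the CMS"},
    ],
    "cache_ttl_seconds": 60,
}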
Cache Layer (Redis)
High-performance caching using a shared Redis cluster. Shown as two nodes in the diagram for visual clarity, but uses the same infrastructure. Personalized feeds use user-specific keys (feed:{tenant_id}:{user_id_hash}) with ~95% hit rate. Non-personalized feeds use tenant-level keys (feed:non-personalized:{tenant_id}) with 99%+ hit rate since all users share the same feed. Feed results cached for 60 seconds, user signals for 5 minutes, tenant configs for 15 minutes.
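A read-through sketch of the two feed cache paths, using the key formats and 60-second TTL from this page; the redis-py usage is illustrative rather than the service's actual code.

import json
import redis

r = redis.Redis(decode_responses=True)

def get_feed(tenant_id: str, user_id_hash: str | None, compute_feed) -> list:
    """Personalized feeds use per-user keys; the fallback feed is shared per tenant."""
    if user_id_hash:
        key = f"feed:{tenant_id}:{user_id_hash}"        # ~95% hit rate
    else:
        key = f"feed:non-personalized:{tenant_id}"      # 99%+ hit rate
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    feed = compute_feed()                # cache miss: build the feed
    r.setex(key, 60, json.dumps(feed))   # 60 s TTL keeps content fresh
    return feed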
Ranking Engine
Scoring algorithm that combines multiple signals: watch history match (category affinity), engagement patterns (completion rate), editorial boosts (CMS-set), and demographic hints. Uses tenant-specific weights to calculate final scores.
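A minimal sketch of that weighted scoring, assuming each signal is pre-normalized to [0, 1]; the weight names and default values are illustrative, since real weights come from tenant configuration.

# Assumed default weights; in production these are per-tenant settings.
DEFAULT_WEIGHTS = {
    "category_affinity": 0.4,   # watch-history match
    "completion_rate": 0.3,     # engagement pattern
    "editorial_boost": 0.2,     # CMS-set boost
    "demographic_match": 0.1,   # optional demographic hints
}

def score_video(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Final score is a weighted sum of the normalized signals."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

# Example: a video in a category the user watches often, with a mild CMS boost.
score = score_video({"category_affinity": 0.9, "completion_rate": 0.6,
                     "editorial_boost": 0.5, "demographic_match": 0.0})
# -> 0.4*0.9 + 0.3*0.6 + 0.2*0.5 + 0.1*0.0 = 0.64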
Event Pipeline
Asynchronous event processing using Kafka or SQS. Batches user events and writes them to the user_signals database. A 5-minute lag is acceptable, which allows efficient batching and reduces write load on the database.
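The batching behaviour can be sketched independently of the queue technology: buffer events, then flush when the batch is full or older than the 5-minute lag budget. The batch size and in-memory buffer here are assumptions; in production the source is Kafka or SQS and the sink is the user_signals table.

import time

class EventBatcher:
    def __init__(self, max_size: int = 500, max_age_s: float = 300.0):
        self.max_size, self.max_age_s = max_size, max_age_s
        self.buffer, self.opened_at = [], time.monotonic()

    def add(self, event: dict) -> None:
        self.buffer.append(event)
        if (len(self.buffer) >= self.max_size
                or time.monotonic() - self.opened_at >= self.max_age_s):
            self.flush()

    def flush(self) -> None:
        if self.buffer:
            # One bulk insert into user_signals instead of one write per event.
            print(f"writing {len(self.buffer)} events to user_signals")
            self.buffer, self.opened_at = [], time.monotonic()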
Databases
PostgreSQL for videos, user_signals, and tenant_configs. The videos table is indexed by (tenant_id, created_at) for fast lookups. user_signals is indexed by (user_id_hash, timestamp), with the 90-day retention enforced by a scheduled cleanup job or time-based partitioning (PostgreSQL has no native TTL).
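A sketch of that scheduled cleanup, assuming a psycopg2 connection and the table and column names described above; it is a one-line delete, not the production job.

import psycopg2

def purge_old_signals(dsn: str) -> int:
    """Delete user_signals rows older than 90 days; returns the number removed."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # A timestamp-only index or time-based partitioning makes this cheaper
        # than scanning via the (user_id_hash, timestamp) index.
        cur.execute(
            "DELETE FROM user_signals WHERE timestamp < now() - interval '90 days'"
        )
        return cur.rowcount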
Caching Strategy
Personalized Feed Results
TTL: 60 seconds | Key: feed:{tenant_id}:{user_id_hash}
Short TTL ensures content freshness (≤60s requirement). Handles most traffic at peak, reducing database load by ~95%. Each user has their own cache entry.
Non-Personalized Feed Results
TTL: 60 seconds | Key: feed:non-personalized:{tenant_id}
Tenant-level caching means all users without personalization share the same cached feed. Achieves 99%+ cache hit rate since many new users and fallback scenarios use this path. Critical for handling user onboarding spikes.
User Signals
TTL: 5 minutes | Key: signals:{user_id_hash}
Matches the acceptable lag for user signal updates. Balances freshness with cache efficiency.
Tenant Configs
TTL: 15 minutes | Key: config:{tenant_id}
Infrequently changed, so longer TTL is acceptable. Reduces config lookup overhead.
Video Metadata
TTL: 60 seconds | Key: video:{video_id}
Ensures new videos appear quickly. Individual video caching reduces repeated queries.
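The entries above, collected into one place as constants. The key templates and TTLs are taken from this page; the constant names and the helper function are illustrative.

CACHE_POLICY = {
    "personalized_feed":     {"key": "feed:{tenant_id}:{user_id_hash}",   "ttl_s": 60},
    "non_personalized_feed": {"key": "feed:non-personalized:{tenant_id}", "ttl_s": 60},
    "user_signals":          {"key": "signals:{user_id_hash}",            "ttl_s": 300},
    "tenant_config":         {"key": "config:{tenant_id}",                "ttl_s": 900},
    "video_metadata":        {"key": "video:{video_id}",                  "ttl_s": 60},
}

def cache_key(kind: str, **ids: str) -> str:
    """Render a cache key, e.g. cache_key("tenant_config", tenant_id="cooking-app")."""
    return CACHE_POLICY[kind]["key"].format(**ids)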
Scalability Considerations
Horizontal Scaling: Feed service is stateless and can scale horizontally behind a load balancer. Each instance connects to shared Redis and PostgreSQL.
Database Sharding: The user_signals table can be sharded by user_id_hash for write scalability; the videos table can be sharded by tenant_id for isolation.
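If user_signals were sharded that way, write routing could look like the sketch below; the shard count and shard naming are assumptions, not part of the current design.

NUM_SHARDS = 8  # assumed shard count

def signals_shard(user_id_hash: str) -> str:
    """Hash-based routing keeps all of one user's signals on the same shard."""
    return f"user_signals_shard_{int(user_id_hash, 16) % NUM_SHARDS}"

# Example: signals_shard("9f86d081884c7d65") -> "user_signals_shard_5"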
Redis Cluster: Redis can run in cluster mode for high availability and additional capacity. Separate read replicas for hot data.
CDN Edge Caching: For popular content, CDN can cache feed responses at the edge (30-second TTL), further reducing backend load.