DS Types

Introduction

Distributed systems can be classified into three main categories based on how data is processed and the timing of that processing.

The classification depends on:

Latency requirements
Input patterns - Is data event-driven or scheduled?
Processing model - Synchronous interaction vs. asynchronous batch processing

Type 1: Online Systems

Systems designed to respond to user queries in real-time with synchronous request-response patterns.

Characteristics

Latency: Low latency required (milliseconds to seconds)
Throughput: Variable, depends on user load
Concurrency: High - many simultaneous users
Input: User-initiated requests (synchronous)
Coupling: Tightly coupled request-response

Examples

Web servers and APIs (REST, GraphQL)
E-commerce platforms
Real-time collaboration tools
Chat applications

Key Considerations

Responsiveness: Must handle user expectations for fast responses
Scalability: Need to handle concurrent users (see partitioning for horizontal scaling)
Reliability: User-facing, so failures are immediately visible
Network resilience: See ds-problems-pad-5nov for handling network delays and failures

Common Patterns

Request-response RPC (Remote Procedure Call)
REST API design
Load balancing across multiple instances

Type 2: Offline Systems

Systems that process large batches of data on a schedule or on-demand, without direct user interaction.

Characteristics

Latency: High latency acceptable (minutes to hours)
Throughput: High - process large volumes efficiently
Concurrency: N/A - single or few batch jobs at a time
Input: Scheduled data (does not depend on online user activity)
Coupling: Loosely coupled, fire-and-forget

Examples

Data warehouse ETL (Extract, Transform, Load)
Analytics pipelines and reporting
Machine learning model training
Data aggregation and cleaning jobs
Scheduled backups and maintenance

Key Considerations

Throughput optimization: Maximize processing efficiency
Resource efficiency: Use resources effectively since jobs run infrequently
Data consistency: Can enforce strong guarantees since no concurrent access
Data partitioning: See partitioning for distributing large datasets across nodes

Common Patterns

MapReduce and batch processing frameworks
Scheduled cron jobs
Data lakes and warehouses
Eventual consistency models

Type 3: Near Real-time Systems

Systems that process continuous streams of data with low-to-moderate latency requirements, often event-driven.

Characteristics

Latency: Moderate (seconds to minutes)
Throughput: Continuous, unbounded stream
Concurrency: Multiple events processing in parallel
Input: Continuous stream of events
Coupling: Loosely coupled via message brokers

Examples

IoT sensor data processing
Video stream processing and analytics
Real-time event aggregation (clickstreams, logs)
Fraud detection systems
Real-time dashboards and monitoring

Key Considerations

Ordering guarantees: May need to preserve event causality
Backpressure handling: What happens when events arrive faster than processing?
Fault tolerance: Missed events can cause data loss
Exactly-once semantics: Balance between at-least-once and at-most-once delivery

Message Brokers: The Core of Near Real-time Systems

Purpose

intermediaries that manage asynchronous communication between producers (senders) and consumers (receivers). They decouple components so they don’t need direct knowledge of each other.

Architecture Pattern: Pub/Sub

Publisher-Subscriber: Publishers emit messages; brokers route them to interested subscribers.

Benefits:

Decoupling: Publishers don’t know about subscribers
Scalability: Add new subscribers without changing publishers
Flexibility: Subscribers can process messages at their own pace

Message Broker Types

1. Traditional Message Brokers (RabbitMQ)

Architecture: Queue-based

Messages are stored in queues
Consumer pulls (or broker pushes) messages
Message is deleted after consumer acknowledges

How it works:

Producer sends message to queue
Consumer fetches message from queue
Consumer processes and acknowledges
Message is removed from queue

Characteristics:

Built-in acknowledgment mechanisms
Support for explicit message routing
Generally lower storage overhead
Good for short-lived, transient messages

Pros:

Simple, well-understood model
Good acknowledgment guarantees
Efficient for low-latency requirements

Cons:

Cannot replay past messages (not persisted long-term)
Message loss if broker crashes before acknowledgment
Less suitable for long-term event streaming

2. Log-based Message Brokers (Kafka)

Architecture: Append-only log

Messages are written to an immutable log
Consumers maintain offset pointers to track their position
Messages persist indefinitely (or until retention period expires)

How it works:

Producer appends message to log
Consumer reads from their last offset
Consumer processes message
Consumer commits their offset (persists progress)
Message remains in log for other consumers or replay

Characteristics:

All messages permanently recorded (until deletion)
Consumers track their own progress via offsets
Multiple consumers can read same message independently
Supports message replay

Pros:

Replay messages from any point in time
Multiple independent consumers
Durable, fault-tolerant persistence
Better for long-term event sourcing and analytics

Cons:

Higher storage overhead (keeping all messages)
More complex offset management
Slightly higher latency due to persistent writes

Comparison: Traditional vs. Log-based

Aspect	Traditional (RabbitMQ)	Log-based (Kafka)
Persistence	Transient (deleted after consumption)	Permanent (append-only log)
Consumption	Message deleted after ack	Multiple independent consumers
Replay	Not possible	Supported via offset
Use Case	Task queues, command delivery	Event streaming, analytics
Storage	Lower	Higher
Semantics	At-most-once to exactly-once	At-least-once to exactly-once

Message Broker Enhancements

1. Backpressure

Problem: Producer generates messages faster than consumer can process them. Queue grows unboundedly → memory exhaustion → system crash.

Definition: Mechanism to slow down producers when consumers fall behind.

Implementation patterns:

Reactive backpressure: Consumer signals to broker when overwhelmed
Rate limiting: Producer throttles message production
Queue depth monitoring: Stop accepting messages when queue exceeds threshold

Best practice: Design systems to handle backpressure gracefully rather than just dropping messages.

2. Load Balancing

Purpose: Distribute work across multiple consumer instances.

Strategies:

Round-robin: Send messages to consumers in sequence
Least-loaded: Send to consumer with fewest pending messages
Sticky routing: Keep related messages going to same consumer (preserves ordering)

Example: Multiple service instances consuming from same Kafka topic with different consumer group IDs.

3. Re-delivery

Problem: Consumer crashes after processing but before acknowledging message. Message must be reprocessed.

Implementation:

Broker doesn’t remove message until consumer acknowledges
Consumer crashes: message automatically re-delivered to another consumer
Idempotent operations (see ds-problems-pad-5nov) ensure safe retries

Risk: Duplicate processing if consumer fails after processing but before acknowledgment. Use idempotent operations to make this safe.

4. Dead-letter Channel

Problem: Message causes consumer to crash repeatedly (poison message). Can’t be processed, blocks other messages.

Solution: Route problematic messages to separate “dead-letter” queue.

Process:

Message fails to process N times
Moved to dead-letter queue
Manual inspection and remediation later
Prevents infinite retry loops

Message Broker Patterns

1. Supervisor Pattern

Purpose: Monitor worker processes and handle failures intelligently.

How it works:

Supervisor watches multiple workers
If worker crashes, supervisor can:
- Restart it (Erlang “let it crash” philosophy)
- Route work to healthy workers
- Escalate if repeated failures

Related: See PAD-intro for process and thread models.

2. Worker Pool Pattern

Purpose: Distribute work across multiple identical workers.

Architecture:

Queue of tasks
Fixed number of worker processes/threads
Each worker pulls task, processes, returns result

Benefit: Bounded concurrency - never overwhelm system with unlimited threads.

3. Aggregator Pattern

Purpose: Combine results from multiple workers into single output.

Example:

Scatter request to 10 data sources
Each returns partial result
Aggregator combines results
Returns unified response

4. Batcher Pattern

Purpose: Accumulate stream of individual events into batches, then process batch.

Why useful:

Database writes: Inserting 100,000 rows one at a time is slow. Batch insert is much faster.
Network efficiency: Send 100 small messages vs. 1 large batch
Processing efficiency: Process items together rather than individually

Implementation:

Collect events until batch size reached OR timeout expires
Whichever comes first
Process batch as atomic unit

5. Event Log Pattern

Purpose: Keep immutable record of all significant events that occur in system.

Key characteristics:

Append-only: Events never modified or deleted
Ordered: Events recorded in sequence
Comprehensive: Captures all state changes

Uses:

Audit trails: Compliance and forensics
Replay: Rebuild system state from events
Analytics: Analyze event stream

Related: Log-based message brokers like Kafka are natural fit for event logging.

6. Derived Data Pattern

Purpose: Transform raw event stream into different, optimized views.

Example:

Raw event: User clicked “buy” button
Derived data 1: Per-user purchase count (for recommendations)
Derived data 2: Revenue by category (for analytics)
Derived data 3: Fraud risk score (for security)

Advantage: Single source of truth (event log) can feed multiple derived systems without duplication.

Concurrency & Safety: CSP (Communicating Sequential Processes)

1. Fan-in

Definition: Multiple input sources → Single processor.

pad

Pattern:

Producer A ─┐
Producer B ├─→ Aggregator → Output
Producer C ─┘

Use case: Merge multiple data streams into single stream.

2. Fan-out

Definition: Single input → Multiple processors.

Pattern:

                ┌─→ Processor A
Input ──────────┼─→ Processor B
                └─→ Processor C

Use case: Broadcast message to multiple independent consumers.

3. Actor Model

Key principle: Each actor has isolated memory - no shared state.

Each actor is independent unit
Actors communicate via message passing (not shared variables)
By design, eliminates race conditions

Advantage over threads: See PAD-intro for race conditions, mutexes, semaphores.

Traditional threads: Shared memory → race conditions → need locks → deadlock risk
Actors: Message passing → no shared state → no races → no need for locks

Languages/frameworks:

Erlang/Elixir: Built on actor model
Akka (JVM): Actor library
Go: Goroutines with channels

Why this matters for distributed systems:

Scales naturally to distributed network (message passing works remotely too)
Fault isolation: One actor crash doesn’t bring down others
Concurrency without complexity of locks

Summary Table

Factor	Online	Offline	Near Real-time
Latency	Milliseconds	Minutes/hours	Seconds
Input	User queries	Batch data	Event stream
Throughput	Variable	High	Continuous
Concurrency	High	Low	Medium
Example Tech	REST APIs, gRPC	Hadoop, Spark	Kafka, Storm
Best for	Web, mobile apps	Analytics, reports	Monitoring, alerts
Scaling approach	Horizontal (more servers)	Horizontal (data partitioning)	Horizontal (consumer groups)

Key Takeaways

Online systems prioritize responsiveness and concurrency
Offline systems prioritize throughput and cost efficiency
Near real-time systems balance both via message brokers
Message brokers enable loose coupling and fault tolerance
Traditional brokers suit task queues; log-based brokers suit event streaming
Design patterns (supervisor, worker pool, batcher) address common distributed problems
Actor model and CSP provide alternatives to shared-memory concurrency
No single approach - choose based on latency, throughput, and consistency requirements

DV

Recent Notes

Gallery

3-2-1 backup strategy

Language, personal writing, LLMs

Explorer

Introduction

Type 1: Online Systems

Characteristics

Examples

Key Considerations

Common Patterns

Type 2: Offline Systems

Characteristics

Examples

Key Considerations

Common Patterns

Type 3: Near Real-time Systems

Characteristics

Examples

Key Considerations

Message Brokers: The Core of Near Real-time Systems

Purpose

Architecture Pattern: Pub/Sub

Message Broker Types

1. Traditional Message Brokers (RabbitMQ)

2. Log-based Message Brokers (Kafka)

Comparison: Traditional vs. Log-based

Message Broker Enhancements

1. Backpressure

2. Load Balancing

3. Re-delivery

4. Dead-letter Channel

Message Broker Patterns

1. Supervisor Pattern

2. Worker Pool Pattern

3. Aggregator Pattern

4. Batcher Pattern

5. Event Log Pattern

6. Derived Data Pattern

Concurrency & Safety: CSP (Communicating Sequential Processes)

1. Fan-in

2. Fan-out

3. Actor Model

Summary Table

Key Takeaways

Graph View

Table of Contents