Enabling Reliable and Scalable Event Streams in Distributed Systems

Introduction

Event streams provide a framework for processing and transmitting continuously generated data in distributed systems. Rather than relying on static batch files, streams enable real-time or near-real-time communication between producers and consumers. The challenge lies in efficiently transmitting these streams, ensuring scalability, reliability, and fault tolerance. This subchapter focuses on two key mechanisms: messaging systems and partitioned logs.

Messaging Systems

A messaging system provides the foundation for transmitting events between producers and consumers. Unlike direct communication methods (e.g., HTTP or TCP connections), messaging systems decouple producers from consumers by introducing an intermediary called a message broker.

How Messaging Systems Work

Producers send events (messages) to a broker, often organized by topics.
The broker stores these messages temporarily or persistently.
Consumers subscribe to topics, receiving messages either in real-time or when they are ready to process them.

Key Features of Messaging Systems

Durability with Message Brokers
Message brokers, such as RabbitMQ and ActiveMQ, provide durability by writing messages to disk. This ensures resilience to failures, enabling consumers to retrieve messages that were stored before the crash of a producer or a consumer node.
Load Balancing
When multiple consumers subscribe to a topic, the broker can distribute messages in two ways:
- Load Balancing: Distribute messages among consumers to parallelize processing.
- Fan-Out: Broadcast all messages to every consumer subscribed to the topic.

Limitations of Messaging Systems

Short-Term Storage: Traditional brokers are optimized for transient workloads and delete messages after they are acknowledged. Therefore, they are unsuitable for long-term message storage.
Message Ordering: Depending on the configuration, messages may arrive out-of-order if brokers redistribute them following consumer failures.

Partitioned Logs

A log-based message broker builds on the durable and append-only log structure seen in replication and storage engines. Well-known examples include Apache Kafka and Amazon Kinesis. These systems address some of the challenges of traditional brokers by retaining messages long-term and providing better reliability when dealing with unbounded streams.

How Partitioned Logs Work

Producers send events to a topic, which is divided into partitions for scalability. Each event in a partition is assigned a monotonically increasing offset, ensuring per-partition message order.
Consumers independently read from assigned partitions at their own pace, tracking offsets to avoid reprocessing.

Advantages of Log-Based Brokers

Persistence for Long-Term Availability: Unlike traditional brokers, logs retain messages until they are explicitly deleted. This allows new consumers to replay past events and catch up on historical data.
Fan-Out Without Performance Loss: Multiple consumers can read the same data from partitions without affecting each other, enabling efficient stream processing and replication tasks.
Efficient Sequential Reads: Consumers read sequentially from partitions, enabling high throughput.

Message Broker vs. Partitioned Log

Aspect	Traditional Message Broker	Log-Based Broker
Message Retention	Messages are deleted after acknowledgment.	Messages are retained until explicitly deleted.
Delivery Mechanism	Push-based (broker pushes messages to consumer).	Pull-based (consumer reads messages from log).
Message Ordering	Limited guarantees, may vary during redelivery.	Strong per-partition ordering guarantees.
Scalability	Limited by broker processing capacity.	Horizontal scalability via partitions.

Challenges in Event Transmission

Regardless of the mechanism, transmitting event streams presents inherent challenges:

Backpressure and Overload
- In traditional brokers, unbounded queues caused by slow consumers risk degrading overall performance.
- Partitioned logs mitigate this by allowing streams to accumulate independently per partition.
Crash Recovery
- Brokers use acknowledgments to confirm message delivery, relying on replays to recover unprocessed messages.
Distributed Order Guarantees
- Partition-level ordering ensures integrity within each topic. However, maintaining order across partitions adds complexity and is generally avoided unless explicitly needed.

Conclusion

Messaging systems and partitioned logs represent two complementary approaches to transmitting event streams in distributed systems. Traditional brokers excel at transient workloads, while log-based brokers offer persistent storage, replayability, and scalability. By understanding the trade-offs between these mechanisms, engineers can design stream-based architectures that balance performance, reliability, and durability—key pillars for modern data-intensive applications.

Enabling Reliable and Scalable Event Streams in Distributed Systems

Introduction

Messaging Systems

How Messaging Systems Work

Key Features of Messaging Systems

Limitations of Messaging Systems

Partitioned Logs

How Partitioned Logs Work

Advantages of Log-Based Brokers

Message Broker vs. Partitioned Log

Challenges in Event Transmission

Conclusion

More to read

Ethical Data Practices for Building Better Systems (Mar 12, 2023)

Building Correct Systems in Distributed Environments (Mar 3, 2023)

Unbundling Monolithic Databases for Flexibility (Feb 26, 2023)

Integrating Distributed Systems for Unified Data Pipelines (Feb 18, 2023)

Enabling Reliable and Scalable Event Streams in Distributed Systems

Introduction

Messaging Systems

How Messaging Systems Work

Key Features of Messaging Systems

Limitations of Messaging Systems

Partitioned Logs

How Partitioned Logs Work

Advantages of Log-Based Brokers

Message Broker vs. Partitioned Log

Challenges in Event Transmission

Conclusion

Want to get blog posts over email?

More to read

Ethical Data Practices for Building Better Systems (Mar 12, 2023)

Building Correct Systems in Distributed Environments (Mar 3, 2023)

Unbundling Monolithic Databases for Flexibility (Feb 26, 2023)

Integrating Distributed Systems for Unified Data Pipelines (Feb 18, 2023)