Kafka is a scalable distributed message queue: it allows producers to push high volumes of messages to a queue and consumers to pull messages from it. In this post I will describe Kafka’s key abstractions and architecture principles.
A Kafka cluster is composed of nodes called brokers. A producer client connects to a broker and pushes messages to it. Internally, the broker assigns a sequence number to each message and appends the message to a log file on the node’s local hard disk. For efficient fetching of a message, the broker maintains an offset table whose key is the sequence number and whose value is the seek location of the message in the log file together with its size.

In Kafka, the consumer is responsible for tracking which messages it has already consumed, not the server. Other message queues approach this tracking differently: their server tracks this data for each of its consumers. After a message is consumed by a client it is not removed by the server, so other consumers can consume it later, including the consumer that just consumed it. This is helpful when data needs to be processed again (replayed) by a consumer after a fault in the first processing. Messages are kept for some retention period; this too is handled differently in other message queue systems, which usually delete a message from the queue once a consumer has consumed it. To emphasize: the queue is actually implemented as a continuous log file on the node’s local disk.
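To make the log-plus-offset-table idea concrete, here is a minimal sketch (plain Python, not Kafka’s actual code; the class and method names are mine) of a broker-side log segment: each appended message gets a sequence number, and an in-memory table maps that number to the message’s position and size in the file, so a fetch is a single seek instead of a scan.

```python
# Toy model of a broker's log segment: append-only file on disk plus
# an offset table {sequence number -> (seek position, size)}.
import os
import tempfile

class LogSegment:
    def __init__(self, path):
        self.f = open(path, "ab+")
        self.index = {}      # sequence number -> (position, size)
        self.next_seq = 0

    def append(self, payload: bytes) -> int:
        self.f.seek(0, os.SEEK_END)
        pos = self.f.tell()
        self.f.write(payload)
        seq = self.next_seq
        self.index[seq] = (pos, len(payload))
        self.next_seq += 1
        return seq

    def read(self, seq: int) -> bytes:
        pos, size = self.index[seq]   # O(1) lookup instead of scanning the log
        self.f.seek(pos)
        return self.f.read(size)

path = os.path.join(tempfile.mkdtemp(), "topic-0.log")
log = LogSegment(path)
log.append(b"first message")
log.append(b"second message")
print(log.read(1))  # b'second message'
```

Note that consuming a message only reads it back by sequence number; nothing is deleted, which is what makes replay possible.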
You might wonder at this point how Kafka manages efficient reads and writes to the local disk. Kafka’s interaction with the local disk has two characteristics: 1) it appends new writes to an open file (i.e., the message log file), and 2) it usually reads a contiguous area from disk. These characteristics let Kafka rely on the OS disk cache to optimize its reads from and writes to disk. The OS cache is highly efficient, always enabled, and optimized for append calls and for repeated reads from a contiguous area on disk. Further advantages of using the OS-level cache are that Kafka’s processes use little JVM heap space, avoiding GC-related issues, and that when a broker restarts it does not need any cache warm-up, since the OS cache is not cleared when a process terminates. Another optimization strategy used by Kafka is reducing unnecessary copying of data, for example copying data from the OS network buffers into application buffers. This is achieved with Linux’s sendfile API, which reads data directly from disk (hopefully from the OS disk cache) and routes it directly to a network socket without passing through the application, avoiding the unnecessary buffer copies.
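As a small illustration of the zero-copy path (this is not Kafka code, just a self-contained demo over a local TCP connection), Python’s `socket.sendfile()` uses the `sendfile(2)` syscall where available, so file bytes flow from the OS page cache to the socket without a round trip through user-space buffers:

```python
# Demo: serve a file's bytes over a socket via sendfile, without the
# application copying the data between buffers itself.
import socket
import tempfile
import threading

payload = b"message bytes straight from the OS page cache"
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(payload)
tmp.flush()

server = socket.socket()
server.bind(("127.0.0.1", 0))   # pick any free local port
server.listen(1)

def serve():
    conn, _ = server.accept()
    with open(tmp.name, "rb") as f:
        conn.sendfile(f)        # kernel-level copy: file -> socket
    conn.close()

t = threading.Thread(target=serve)
t.start()

client = socket.create_connection(server.getsockname())
received = b""
while True:
    chunk = client.recv(4096)
    if not chunk:
        break
    received += chunk
t.join()
client.close()
server.close()
```

On platforms without `sendfile`, Python transparently falls back to regular send calls, so the demo still runs, just without the zero-copy benefit.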
Now, consider the use case where several producers and consumers are interested in different types of messages. To support this use case, Kafka uses the abstraction of a ‘topic‘. A producer sends messages to a topic and consumers read messages from a topic.
To support a large volume of messages and parallel consumption, Kafka introduced the concept of topic partitioning: when producing to a topic, the producer also specifies how messages are partitioned, e.g., via round robin, or by specifying a partition function and a message field that together determine a partition per message. For example, if a single topic has 5 partitions on a Kafka cluster with 5 brokers, then each broker will manage (lead) one of the partitions. Ordering is guaranteed only between messages in the same partition. High availability is achieved by replicating each partition to other brokers, so that if one node fails, another node takes over managing its partitions. Usually there are more topics and partitions than cluster nodes, so a typical broker manages several partitions and also holds several replicas of partitions managed by other brokers. Replication works by having a broker act like a regular client that consumes a specific partition. All producers to a partition write to the broker that manages that partition, and the same goes for the partition’s consumers: they all read from that same broker.
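The two partitioning strategies mentioned above can be sketched as follows (a hypothetical partitioner, not Kafka’s actual one; the function name and the choice of CRC32 as the hash are my assumptions). Hashing a message field (the key) keeps all messages with the same key in one partition, which is what preserves their relative order:

```python
# Hypothetical partitioner: round robin for keyless messages,
# stable hash of the key otherwise.
import itertools
import zlib

NUM_PARTITIONS = 5
_rr = itertools.count()

def partition_for(key):
    if key is None:
        return next(_rr) % NUM_PARTITIONS            # round robin
    # stable hash (CRC32 here) so one key always maps to one partition
    return zlib.crc32(key.encode()) % NUM_PARTITIONS

print(partition_for("user-42") == partition_for("user-42"))  # True
print([partition_for(None) for _ in range(5)])  # [0, 1, 2, 3, 4]
```

Because `"user-42"` always lands in the same partition, all of that user’s messages are consumed in the order they were produced; messages partitioned round robin get no such cross-partition ordering guarantee.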
Kafka introduced the concept of the consumer-group, which allows consumers to share the consumption of a topic so that each message is delivered to only one consumer in the group. A topic can be consumed by multiple consumer-groups, and if a set of consumers would all like to receive the same messages of a topic, they each need to belong to a different consumer-group.
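A toy model of these semantics (not the real group-rebalance protocol; the `assign` helper and round-robin strategy are illustrative assumptions): within one group, a topic’s partitions are divided among the group’s consumers, while each additional group independently gets the whole topic.

```python
# Toy consumer-group assignment: partitions of one topic are spread
# across a group's consumers; every group sees all partitions.
partitions = {0: ["m0", "m3"], 1: ["m1", "m4"], 2: ["m2"]}

def assign(partition_ids, consumers):
    """Spread partition ids across a group's consumers round-robin."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partition_ids)):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

# One group with two consumers shares the topic between them...
group_a = assign(partitions, ["a1", "a2"])
# ...while a second, independent group gets its own full view.
group_b = assign(partitions, ["b1"])

print(group_a)  # {'a1': [0, 2], 'a2': [1]}
print(group_b)  # {'b1': [0, 1, 2]}
```

Since a partition is consumed by one consumer per group, each message reaches exactly one consumer in `group_a`, yet `group_b` still receives every message.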