Kafka
Kafka is an open-source event streaming and messaging platform, designed for distributed and fault-tolerant operation.
Architecture
flowchart TB
Producer1
Producer2
subgraph Cluster
subgraph Broker1
subgraph TopicA
direction LR
TopicAPartition1
TopicAPartition2
end
subgraph TopicB
direction LR
TopicBPartition1
TopicBPartition2
end
subgraph TopicC
direction LR
TopicCPartition1
TopicCPartition2
end
end
subgraph Broker2
subgraph TopicX
direction LR
TopicXPartition1
TopicXPartition2
end
subgraph TopicY
direction LR
TopicYPartition1
TopicYPartition2
end
subgraph TopicZ
direction LR
TopicZPartition1
TopicZPartition2
end
end
end
subgraph ConsumerGroupA
ConsumerGroupAConsumer1
ConsumerGroupAConsumer2
end
subgraph ConsumerGroupB
ConsumerGroupBConsumer1
ConsumerGroupBConsumer2
end
Producer1-->Cluster
Producer2-->Cluster
ConsumerGroupA-->Cluster
ConsumerGroupB-->Cluster
Concepts
- Messages are the atomic units of data moving through the cluster. They are byte arrays, so Kafka itself is agnostic of their format.
- Topics categorise messages into segregated streams.
- Partitions allow horizontal scalability by sharding topics. Messages can be assigned to partitions:
  - randomly, if no key, partition, or custom partitioning logic is specified;
  - based on a key specified on each message;
  - by a partition specified explicitly on each message; or
  - based on custom partitioning logic.
- Brokers are the individual nodes which make up the Kafka cluster.
- Producers produce messages which are published to the Kafka cluster.
- Consumers read from partitions, in order. The last-read offset can be stored so that a consumer can resume where it left off after an interruption. It's possible to replay events by altering this offset value.
- Consumer Groups are sets of consumers which share the work of reading a topic: each partition is assigned to one consumer in the group, so each consumer receives a proportion of the topic's messages. As consumers join and leave the group, the elected coordinator for the group reassigns partitions to consumers (rebalancing), creating a new generation of the group.
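
To make key-based partitioning, offsets, and consumer group assignment more concrete, here is a minimal in-memory sketch in Python. It does not talk to Kafka or use any Kafka client: `Topic`, `choose_partition`, and `assign_partitions` are hypothetical names, CRC32 stands in for the hash a real client would use (the Java client's default partitioner uses murmur2), and round-robin is only one of several possible assignment strategies.

```python
from __future__ import annotations

import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash, so equal keys always land together.

    Assumption: CRC32 is used here for simplicity; real clients use other hashes.
    """
    return zlib.crc32(key) % num_partitions


class Topic:
    """Toy in-memory topic: each partition is an append-only list of byte payloads."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: list[list[bytes]] = [[] for _ in range(num_partitions)]

    def publish(self, key: bytes, value: bytes) -> tuple[int, int]:
        """Append a message and return its (partition, offset)."""
        partition = choose_partition(key, len(self.partitions))
        self.partitions[partition].append(value)
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int) -> list[bytes]:
        """Return the messages in one partition from `offset` onward, in order."""
        return self.partitions[partition][offset:]


def assign_partitions(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Spread partitions over the consumers of a group, round-robin."""
    ordered = sorted(consumers)
    assignment: dict[str, list[int]] = {c: [] for c in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


topic = Topic(num_partitions=4)
partition, first = topic.publish(b"user-1", b"created")
_, second = topic.publish(b"user-1", b"updated")  # same key, same partition

committed = second + 1                   # the consumer has read both messages
print(topic.read(partition, committed))  # nothing new to read yet
print(topic.read(partition, first))      # rewinding the offset replays both

print(assign_partitions([0, 1, 2, 3], ["c1", "c2"]))        # {'c1': [0, 2], 'c2': [1, 3]}
print(assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3"]))  # new assignment after c3 joins
```

Because ordering is only guaranteed within a partition, routing all messages for one key to the same partition is what keeps them in order for the consumer. The second `assign_partitions` call mimics a rebalance: the same partitions are redistributed over the new membership.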