Apache Kafka® concepts
A comprehensive glossary of essential Apache Kafka® terms and their meanings.
Broker
A server that runs Apache Kafka and is responsible for storing, processing, and delivering messages. Brokers are typically deployed as a cluster for scalability and reliability; each broker runs independently but contributes to the cluster's overall operation. Brokers are distinct from companion tools such as Apache Kafka Connect.
Consumer
An application that reads data from Apache Kafka, often processing or acting upon it. Various tools used with Apache Kafka ultimately function as either a producer or a consumer when communicating with Apache Kafka.
Consumer groups
Consumer groups let an application scale beyond a single instance. Multiple instances of the application join the same group and coordinate to handle messages, with each consumer in the group assigned different partitions so that the workload is spread evenly.
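To illustrate how partitions can be shared out within a group, here is a minimal sketch in plain Python. It is a simplified model, not Kafka's actual assignment logic: the function name `assign_partitions` and the round-robin strategy are assumptions made for the example.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    return assignment


# Six partitions shared by three instances of the same application.
assignment = assign_partitions(range(6), ["app-1", "app-2", "app-3"])
print(assignment)
# {'app-1': [0, 3], 'app-2': [1, 4], 'app-3': [2, 5]}
```

In this model, a group with more consumers than partitions leaves the extra consumers with an empty list, which mirrors the idle consumers you would see in that situation.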
Event-driven architecture
Application architecture centered around responding to and processing events.
Event
A single discrete data unit in Apache Kafka, consisting of a value (the message body) and often a key (for quick identification) and headers (metadata about the message).
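The three parts of an event can be pictured with a small data structure. This is only a sketch: the `Event` class below is a hypothetical type written for illustration, not a class from any Kafka client library, and the sample key, value, and header are made up.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical model of an event: a value, plus optional key and headers."""
    value: bytes                                  # the message body
    key: Optional[bytes] = None                   # identifies the event
    headers: Tuple[Tuple[str, bytes], ...] = ()   # (name, value) metadata pairs


event = Event(
    value=b'{"temperature": 21.5}',
    key=b"sensor-42",
    headers=(("content-type", b"application/json"),),
)
print(event.key, event.value, dict(event.headers))
```

Only the value is required in this model; the key and headers default to empty, matching the "often" in the definition above.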
Kafka node
See Broker
Kafka server
See Broker
Message
See Event
Partitioning
A method used by Apache Kafka to distribute a topic across multiple servers. Each partition has a single server that acts as its leader, so the topic's data is sharded across servers while message order is preserved within each partition.
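A common way to decide which partition a message goes to is to hash its key. The sketch below shows the idea under stated assumptions: `choose_partition` is a hypothetical helper, and CRC32 is used only as a stand-in hash for the example (Kafka's default partitioner uses a different hash function for keyed messages).

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition number with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition, so its messages stay ordered.
first = choose_partition(b"sensor-42", 6)
second = choose_partition(b"sensor-42", 6)
print(first == second)  # True
```

Because the result depends on the number of partitions, changing the partition count of a topic can move a key to a different partition in this model.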
Producer
An application that writes data into Apache Kafka without concern for the data consumers. The data can range from well-structured to simple text, often accompanied by metadata.
Pub/sub
A publish-subscribe messaging architecture where messages are broadcast by publishers and received by any listening subscribers, unlike point-to-point systems, where each message is delivered to a single receiver.
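The broadcast behaviour can be sketched with a tiny in-memory example. This is a generic pub/sub model, not Kafka's implementation: the `Broadcaster` class is hypothetical, and it pushes messages to callbacks, whereas Kafka consumers pull messages from the broker.

```python
from collections import defaultdict


class Broadcaster:
    """Hypothetical in-memory pub/sub hub: every subscriber gets every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Deliver the message to all subscribers of the topic, not just one.
        for callback in self._subscribers[topic]:
            callback(message)


hub = Broadcaster()
dashboard, archive = [], []
hub.subscribe("sensor-readings", dashboard.append)
hub.subscribe("sensor-readings", archive.append)
hub.publish("sensor-readings", "21.5")
print(dashboard, archive)  # ['21.5'] ['21.5']
```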
Queueing
A messaging system where messages are sent and received in the order they are produced. In Apache Kafka, ordering is guaranteed within a partition, and Kafka tracks a position (an offset) for each consumer group in each partition to record how far the group has read.
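The position tracking described above (Kafka calls this position an offset) can be sketched as an append-only list plus a counter. The `PartitionLog` class and the variable names are hypothetical and only model the idea for a single partition and a single reader.

```python
class PartitionLog:
    """Hypothetical append-only log for one partition."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        self._messages.append(message)
        return len(self._messages) - 1  # offset of the message just written

    def read_from(self, offset):
        return self._messages[offset:]


log = PartitionLog()
for text in ["first", "second", "third"]:
    log.append(text)

committed_offset = 0                            # next offset the reader will fetch
batch = log.read_from(committed_offset)[:2]     # read two messages, in order
committed_offset += len(batch)                  # remember how far we got
remaining = log.read_from(committed_offset)
print(batch, remaining)  # ['first', 'second'] ['third']
```

Because the log itself is never modified by reading, a second reader with its own offset could consume the same messages independently.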
Record
See Event
Replication
Apache Kafka's mechanism for copying data across multiple servers so that it is preserved even if a server fails. The number of copies (the replication factor) is configurable per topic.
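To see why copies on several servers protect the data, consider this simplified model. The functions `place_replicas` and `surviving_partitions` and the round-robin placement are assumptions for illustration, not Kafka's actual replica placement algorithm.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's copies on consecutive brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        partition: [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        for partition in range(num_partitions)
    }


def surviving_partitions(placement, failed_brokers):
    """Partitions that still have at least one copy on a working broker."""
    return [
        partition
        for partition, replicas in placement.items()
        if any(broker not in failed_brokers for broker in replicas)
    ]


placement = place_replicas(3, ["broker-1", "broker-2", "broker-3"], 2)
print(surviving_partitions(placement, {"broker-1"}))              # [0, 1, 2]
print(surviving_partitions(placement, {"broker-1", "broker-2"}))  # [1, 2]
```

With two copies per partition, losing one broker loses no data in this model, while losing two brokers loses the partition whose copies were both on them.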
Topic
Logical channels in Apache Kafka through which messages are organized. Topics are named in a human-readable manner, like sensor-readings or kubernetes-logs.