Interview prep · Junior → Senior

Kafka Interview Questions

The questions a Kafka interviewer actually asks — the core concepts and the append-only log, producers, consumers and consumer groups, partitions and the load-bearing ordering guarantee, delivery semantics (at-least-once, at-most-once, and the much-misunderstood exactly-once), replication and durability (the ISR and the acks=all / min.insync.replicas pairing), storage, retention and performance, the Connect / Streams / Schema ecosystem, and the KRaft / ZooKeeper architecture — each answered with a config snippet, a CLI command, or a short worked example and a link to the source. This page covers Apache Kafka the open-source platform; for what changed in each release — when exactly-once landed, when ZooKeeper went away — see the Kafka version reference.

Difficulty

Junior — the core vocabulary: topics, partitions, offsets, producers, consumers, brokers.

Mid — Kafka in practice: consumer groups, acks, retention vs compaction, Connect and Streams.

Senior — the hard guarantees: exactly-once, the ISR and min.insync.replicas, rebalancing, KRaft, and the design tradeoffs.

1 · Core concepts & the log

What is Apache Kafka? Junior

Kafka is a distributed event-streaming platform built around one core abstraction: the append-only commit log. Producers append records to the end of a log; consumers read forward from a position they control. Unlike a traditional queue, a read is not destructive — the record stays on the log until it ages out by a retention policy, so many independent consumers can read the same stream at their own pace and replay it. That single design choice is why Kafka became infrastructure for event streaming, event sourcing, and stream processing rather than just point-to-point messaging. Around the log it grew a full platform: Kafka Connect (moving data in and out), Kafka Streams (stream processing), and exactly-once semantics.

# A topic is a named, partitioned, replayable log.
$ kafka-console-producer.sh --bootstrap-server localhost:9092 --topic orders
# ... consumers read the SAME records, independently, from offset 0 or "latest":
$ kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic orders --from-beginning

Why it's asked / follow-up: it sets the frame for everything else — if you think of Kafka as “a message queue,” half the later answers come out wrong. Follow-up: “is it a queue or a database?” — neither exactly; it's a durable, replayable log, and the queue comparison and database comparison are their own questions.

Source: Apache Kafka — Introduction. History and release timeline: the Kafka version reference.

What are topics, partitions, and offsets? Junior

A topic is a named stream of records — the logical channel you publish to and subscribe from. Each topic is split into one or more partitions, and a partition is the physical log: an ordered, immutable, append-only sequence of records. Every record in a partition has an offset, a monotonically increasing integer that is its position in that partition. Offsets are per-partition, not per-topic. A consumer tracks “where I am” as an offset per partition. The partition is the unit of both parallelism and ordering, so it's the most important object on the page.

// topic "orders", 3 partitions — each an independent, ordered log:
P0: [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ]→          // offsets are per-partition
P1: [ 0 ][ 1 ][ 2 ]→
P2: [ 0 ][ 1 ][ 2 ][ 3 ]→              // there is NO global "topic offset"

Why it's asked / follow-up: conflating a topic with a partition is the single most common Kafka misconception, and it poisons the ordering and consumer-group answers. Follow-up: “does an offset mean the same thing across partitions?” — no; offset 5 in P0 is unrelated to offset 5 in P1.

Source: Apache Kafka — Main concepts and terminology.

What is in a Kafka record (message)? Junior

A record is more than a payload. It carries a key (optional; drives partitioning and compaction), a value (the payload), a timestamp (creation-time or log-append-time), optional headers (free-form key/value metadata, useful for tracing or a schema id), and — once written — a partition and an offset. Keys and values are opaque bytes to the broker; turning your objects into bytes is the serializer's job on the producer and the deserializer's on the consumer. The key is the field that most affects behavior: same key → same partition → ordering and compaction both hinge on it.

new ProducerRecord<>(
    "orders",          // topic
    "customer-42",     // key   → decides the partition (and compaction identity)
    "{...json...}");   // value → the payload (opaque bytes to the broker)
// headers, timestamp, partition, offset round out the record.

Why it's asked / follow-up: it checks that you know the key is a first-class field with consequences, not just “a label.” Follow-up: “what happens if the key is null?” — the record is spread across partitions (sticky/round-robin), so you lose the same-key ordering guarantee.

Source: Apache Kafka — Message format.

What is a broker, and what is a Kafka cluster? Junior

A broker is a single Kafka server; it stores partition data on disk and serves produce and fetch requests. A cluster is a set of brokers that together hold the topics. A topic's partitions are spread across brokers, and each partition's replicas live on different brokers so the data survives a node loss. A client only needs one or two bootstrap brokers to start; from there it fetches cluster metadata (which broker leads which partition) and connects directly to the right brokers.

# the client bootstraps from any broker, then learns the full topology:
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
# partitions are distributed across brokers; replicas sit on different brokers.

Why it's asked / follow-up: it separates the logical model (topics/partitions) from the physical one (brokers/replicas). Follow-up: “why list more than one bootstrap server?” — so the client can still discover the cluster if the first broker it tries is down; it's a discovery list, not the full membership.

Source: Apache Kafka — Concepts.

How does Kafka differ from a traditional message queue? Mid

A classic broker (RabbitMQ, ActiveMQ, SQS) treats a message as a unit of work that is deleted once acknowledged — the queue is a to-do list that drains. Kafka treats the stream as a retained, replayable log: consuming does not remove anything; the consumer just advances its offset. Consequences: multiple unrelated consumer groups can read the same data independently; a new consumer can start from offset 0 and replay history; and you can rewind after a bug and reprocess. The tradeoff is that Kafka doesn't do per-message routing, priorities, or selective ack/redelivery the way a queue does — historically it assigned whole partitions to consumers rather than handing out individual messages (Share Groups later added a queue-like mode).

// traditional queue:  deliver → ack → message GONE (destructive read)
// kafka:              read at offset N → advance to N+1; record STAYS
//                     retention (not consumption) is what removes data.

Why it's asked / follow-up: it's the frame that makes replay, multi-consumer fan-out, and event sourcing possible — and the reason “just use a queue” is sometimes the right call (see when NOT to use Kafka). Follow-up: “can Kafka do queue semantics?” — yes, since Share Groups / Queues for Kafka (production-ready in 4.2), but that's a recent addition on top of the log, not its native model.

Source: Apache Kafka — Design (persistence & the log).

2 · Producers

How does a producer decide which partition a record goes to? Junior

The producer, not the broker, chooses the partition. Three cases: (1) if the record names an explicit partition, that wins; (2) if it has a key, the default partitioner hashes the key (murmur2) modulo the partition count — so the same key always lands on the same partition; (3) if the key is null, modern clients use the sticky partitioner, filling one partition's batch before moving on (better batching than the old round-robin). You can also supply a custom partitioner. The keyed case is the one that matters: it's how you get ordering and locality for a logical entity.

// keyed → deterministic placement:
partition = toPositive(murmur2(keyBytes)) % numPartitions;   // sign bit masked off first; same key → same partition
// null key → sticky partitioner batches to one partition at a time

Why it's asked / follow-up: it connects producing to the ordering and skew answers. Follow-up: “what breaks the key→partition mapping?” — changing the partition count, because the modulo changes (see adding partitions).

Source: Apache Kafka — Producer configuration.

What does acks do, and what are the tradeoffs? Mid

acks controls how many replicas must acknowledge a write before the producer considers it durable. acks=0: fire-and-forget, no acknowledgement — fastest, can silently lose data. acks=1: the leader writes and acks — you lose the record if the leader dies before a follower replicates it. acks=all (aka -1): the leader waits for all in-sync replicas — the durable setting. Crucially, acks=all is only as strong as the ISR is large, which is why it must be paired with min.insync.replicas (its own question). Newer clients default acks=all alongside idempotence.

acks=all                     # wait for all in-sync replicas (durable)
# acks=1  → leader only  (fast, small loss window)
# acks=0  → no ack       (fastest, can drop data)

Why it's asked / follow-up: it's the headline durability knob and the setup for the ISR answer. Follow-up: “does acks=all alone prevent data loss?” — no; with only one in-sync replica, all means “one,” so you also need min.insync.replicas.

Source: Apache Kafka — Producer config: acks.

What is the idempotent producer? Mid

Without it, a producer retry after a network hiccup can write the same record twice (the ack was lost, not the write). The idempotent producer (enable.idempotence=true) makes retries safe: the broker assigns each producer a producer id, and the producer stamps each record batch with a monotonically increasing sequence number per partition, so the broker can detect and drop a duplicate and reject an out-of-sequence batch. It gives you exactly-once per partition, per producer session for produce, which is the foundation the transactional/EOS story builds on. It's essentially free and on by default in modern clients (since Kafka 3.0).

enable.idempotence=true      # dedupe retried writes (default on modern clients)
# implies: acks=all, retries>0, max.in.flight.requests.per.connection≤5
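
The duplicate check described above can be sketched as a toy model. This is not Kafka's broker code: the class name, the map key format, and the one-sequence-number-per-batch simplification are all assumptions made for illustration (real brokers track sequence ranges per record batch).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of broker-side duplicate detection; hypothetical, not Kafka's implementation. */
final class SequenceTracker {
    enum Result { APPENDED, DUPLICATE, OUT_OF_ORDER }

    // Last accepted sequence number per "producerId:partition" (illustrative key format).
    private final Map<String, Integer> lastSeq = new HashMap<>();

    Result tryAppend(long producerId, int partition, int seq) {
        String key = producerId + ":" + partition;
        int last = lastSeq.getOrDefault(key, -1);
        if (seq <= last) {
            return Result.DUPLICATE;      // retry of a batch that was already written
        }
        if (seq != last + 1) {
            return Result.OUT_OF_ORDER;   // gap: an earlier batch has not arrived yet
        }
        lastSeq.put(key, seq);
        return Result.APPENDED;
    }
}
```

Sending sequence 0 twice to the same partition yields APPENDED and then DUPLICATE, which is the case a lost ack followed by a retry produces.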

Why it's asked / follow-up: it's the difference between “retries might duplicate” and “retries are safe,” and it's the base layer under transactions. Follow-up: “does it survive a producer restart?” — no; the guarantee is per producer session. Cross-session exactly-once needs transactions.

Source: Apache Kafka — enable.idempotence.

How do linger.ms, batch.size, and compression affect throughput? Mid

The producer batches records per partition before sending. batch.size is the byte cap on a batch; linger.ms is how long the producer waits for a batch to fill before sending it anyway. Raising linger.ms above the historical default of 0 (the default became 5 ms in Kafka 4.0) trades a little latency for much larger batches, which means fewer requests and far better throughput and compression ratios. Compression (compression.type=lz4/zstd/snappy/gzip) is applied to the whole batch, so it compresses better the bigger the batch, and with the broker's default compression.type=producer the batch is stored and delivered to the consumer still compressed.

linger.ms=10               # wait up to 10ms to fill a batch
batch.size=65536          # 64 KB per-partition batch cap
compression.type=zstd     # compress the whole batch (better when batches are bigger)

Why it's asked / follow-up: it's the everyday throughput-tuning lever, and it surfaces the latency↔throughput tradeoff. Follow-up: “what does linger.ms=0 do?” — send as soon as possible; low latency but small batches and more overhead under load.

Source: Apache Kafka — Producer configuration.

How do retries and max.in.flight.requests.per.connection interact with ordering? Senior

The classic trap: with retries enabled and more than one request in flight to a partition, a failed-then-retried batch can land after a later batch that already succeeded — reordering records within a partition. Historically the only safe fix was max.in.flight.requests.per.connection=1, which throttles throughput. The idempotent producer solves this: because each batch carries a sequence number, the broker rejects out-of-order batches and the client retries them in order, so you keep ordering and pipelining up to 5 in-flight requests. So on a modern client the answer is “enable idempotence and leave in-flight at ≤5,” not “set in-flight to 1.”

// non-idempotent + retries + inflight>1  → possible reorder within a partition
enable.idempotence=true                          // preserves order AND allows pipelining
max.in.flight.requests.per.connection=5       // safe with idempotence; must be ≤5

Why it's asked / follow-up: it's a real production footgun and it tests whether you know the modern fix rather than the folklore one. Follow-up: “why the cap of 5?” — the broker only tracks the last 5 batches per producer to detect duplicates/gaps, so idempotence requires in-flight ≤5.

Source: Apache Kafka — max.in.flight.requests.per.connection.

3 · Consumers & consumer groups

What is a consumer group, and how are partitions assigned? Junior

Consumers that share a group.id form a consumer group, and Kafka divides the topic's partitions among the group's members so that each partition is consumed by exactly one member of the group. That's how you scale out: add consumers to spread the partitions. The consequence people miss is the ceiling — partition count caps useful parallelism. With 6 partitions, a 7th consumer in the group sits idle. Different groups are independent: two groups each get their own full copy of the stream (fan-out). Consumers pull (poll) rather than being pushed to, so a slow consumer just falls behind rather than being overwhelmed.

group.id=order-processors
// topic with 3 partitions, group with 2 consumers:
//   C1 ← P0, P1     C2 ← P2      (each partition → exactly one consumer)
// a 4th consumer would be idle; a second group gets its OWN copy of all 3.
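
A minimal sketch of how a range-style split hands out partitions, assuming a single topic and consumers identified only by their index. The class and method names are hypothetical, and Kafka's real assignors (range, round-robin, sticky, cooperative-sticky) handle many details this ignores.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative range-style split of one topic's partitions; not Kafka's assignor code. */
final class RangeAssignmentSketch {
    /** Returns, for each consumer index, the partition numbers it would own. */
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> result = new ArrayList<>();
        int base = numPartitions / numConsumers;    // every consumer gets at least this many
        int extra = numPartitions % numConsumers;   // the first `extra` consumers get one more
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            int count = base + (c < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                owned.add(next++);
            }
            result.add(owned);
        }
        return result;
    }
}
```

With 3 partitions, assign(3, 2) gives [[0, 1], [2]], matching the layout above, and assign(3, 4) leaves the fourth consumer with an empty list: the idle member.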

Why it's asked / follow-up: it's the core scaling model and the source of the “why won't adding consumers help?” puzzle. Follow-up: “how do you raise the parallelism ceiling?” — add partitions (with the key-stability caveat), not just consumers.

Source: Apache Kafka — Consumers & consumer groups.

What is a consumer-group rebalance, and what was the “stop-the-world” problem? Senior

A rebalance re-assigns partitions across the group when membership changes — a consumer joins, leaves, or is declared dead (missed heartbeats / max.poll.interval.ms). In the original eager protocol every consumer revoked all its partitions and stopped processing until a new assignment was computed — the “stop-the-world” pause, painful for large groups that rebalanced often. Cooperative / incremental rebalancing (KIP-429) let consumers keep the partitions they retain and only hand off the ones that move. The next-generation protocol (KIP-848) goes further, moving assignment logic from a client-side group leader to the broker and making it fully incremental — it reached GA in Kafka 4.0 (opt in with group.protocol=consumer).

// eager (old):     ALL consumers revoke ALL partitions → recompute → resume
// cooperative:     keep what you hold; only reassigned partitions move (KIP-429)
group.protocol=consumer    // opt into the KIP-848 broker-side protocol (GA in 4.0)

Why it's asked / follow-up: rebalancing storms are a top production pain point, and this is a fast-moving area. Follow-up: “what triggers needless rebalances?” — long processing that blows past max.poll.interval.ms, frequent scaling, or short session timeouts; static membership (group.instance.id) avoids rebalancing on a rolling restart.

Source: KIP-848 — next-gen consumer rebalance protocol; arrival on the version reference.

How does offset management work — auto vs manual commit? Mid

A committed offset records “my group has processed up to here” so a restarting or reassigned consumer resumes at the right place. With enable.auto.commit=true the client commits the offsets returned by earlier polls on a timer (auto.commit.interval.ms, 5 seconds by default). That is convenient, but the commit is decoupled from your processing: it can commit records you haven't finished processing, for example when work is handed off to another thread (crash → loss), and it can leave already-processed records uncommitted (crash before commit → duplicates). Manual commit (commitSync/commitAsync after processing) gives you control, and the order of “process” vs “commit” is exactly what chooses your delivery semantics: commit after processing → at-least-once; commit before → at-most-once.

enable.auto.commit=false
// at-least-once: process first, THEN commit
records = consumer.poll(...);
process(records);
consumer.commitSync();          // crash before this → records re-delivered (dupes), not lost
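
To make the effect of commit ordering concrete, here is a small simulation with no Kafka client involved. It assumes one record per poll, a single crash while handling a chosen offset, and a restart from the last committed offset; the class and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simulates one crash and restart under the two commit orderings; purely illustrative. */
final class CommitOrderSketch {
    /** Returns every offset whose processing completed, in order, including repeats. */
    static List<Integer> run(int total, int crashAt, boolean commitBeforeProcess) {
        List<Integer> processed = new ArrayList<>();
        int committed = 0;        // offset the group resumes from after a restart
        boolean crashed = false;  // the crash happens only once
        int offset = 0;
        while (offset < total) {
            boolean crashNow = !crashed && offset == crashAt;
            if (commitBeforeProcess) {
                committed = offset + 1;      // at-most-once: commit first
                if (crashNow) {              // crash before processing: record is skipped
                    crashed = true;
                    offset = committed;
                    continue;
                }
                processed.add(offset);
            } else {
                processed.add(offset);       // at-least-once: process first
                if (crashNow) {              // crash before commit: record is redelivered
                    crashed = true;
                    offset = committed;
                    continue;
                }
                committed = offset + 1;
            }
            offset++;
        }
        return processed;
    }
}
```

For three records with a crash at offset 1, run(3, 1, false) returns [0, 1, 1, 2] (a duplicate, nothing lost) and run(3, 1, true) returns [0, 2] (offset 1 lost, nothing duplicated).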

Why it's asked / follow-up: commit timing is your delivery guarantee — a lot of “Kafka lost my data” stories are really auto-commit misconfigurations. Follow-up: “where are offsets stored?” — in the internal __consumer_offsets topic (not in ZooKeeper, for modern clients).

Source: Apache Kafka — Consumer config: offset commits.

What is the __consumer_offsets topic? Mid

Committed offsets have to live somewhere durable, and since roughly 0.9 that place is an internal, compacted Kafka topic named __consumer_offsets — Kafka storing its own bookkeeping on Kafka. A commit is just a produce to this topic keyed by (group, topic, partition); because it's log-compacted, only the latest offset per key is retained. This replaced the old design of storing offsets in ZooKeeper, which didn't scale to high commit rates. It's also where group metadata lives, coordinated by the group coordinator broker for each group.

# it's a real (internal, compacted) topic — 50 partitions by default:
$ kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets
# key = (group, topic, partition); value = committed offset + metadata
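
The effect of compaction on a stream of commits can be sketched as "last write per key wins". This is a simplification under stated assumptions: string keys stand in for the (group, topic, partition) tuple, and tombstones, segment boundaries, and cleaner timing are ignored. The class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Keeps only the latest value per key, which is what compaction converges to. */
final class CompactionSketch {
    static Map<String, Long> compact(List<Map.Entry<String, Long>> log) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (Map.Entry<String, Long> commit : log) {
            latest.put(commit.getKey(), commit.getValue());  // a later commit overwrites an earlier one
        }
        return latest;
    }
}
```

Three commits for two keys collapse to two entries, each holding the most recent committed offset for its key.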

Why it's asked / follow-up: it demystifies “where do offsets go” and reinforces that compaction is a first-class Kafka feature. Follow-up: “why compacted?” — you only ever need the latest committed offset per group/partition, which is exactly what compaction keeps.

Source: Apache Kafka — Offset tracking.

What is consumer lag, and why does it matter? Mid

Lag is, per partition, the difference between the partition's log-end offset (the newest record) and the group's committed offset (how far it's processed) — i.e. how many records the consumer is behind. It's the single most important health metric for a streaming pipeline: steady, low lag means consumers keep up; growing lag means they're falling behind and latency is climbing. You read it with kafka-consumer-groups.sh --describe or from the consumer's metrics (Burrow / a monitoring stack in production).

$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --describe --group order-processors
TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
orders  0          10432           10440             8   ← 8 records behind
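
The LAG column is simple arithmetic, sketched below. The method name is hypothetical, and treating a partition with no commit as offset 0 is an assumption for this example only; a real consumer's starting point depends on auto.offset.reset.

```java
import java.util.Map;
import java.util.TreeMap;

/** Per-partition lag = log-end offset minus committed offset. */
final class LagSketch {
    static Map<Integer, Long> lagPerPartition(Map<Integer, Long> logEnd, Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : logEnd.entrySet()) {
            long done = committed.getOrDefault(e.getKey(), 0L);  // assumption: no commit yet means 0
            lag.put(e.getKey(), e.getValue() - done);
        }
        return lag;
    }
}
```

Feeding in the numbers from the output above (log-end 10440, committed 10432) gives a lag of 8 for partition 0.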

Why it's asked / follow-up: it's the operational vocabulary for “is my pipeline healthy?” Follow-up: “lag is growing — now what?” — that's the diagnose-a-lagging-group scenario: add partitions + consumers, speed up processing, or check for a hot partition.

Source: Apache Kafka — Checking consumer position (lag).

4 · Partitions, keys & ordering

Why does Kafka partition topics at all? Junior

Partitions are how Kafka scales a single topic beyond one machine and one consumer. A partition is the unit of parallelism (different partitions live on different brokers and are consumed by different group members) and the unit of ordering (records are ordered within a partition). Those two roles are inseparable and slightly in tension: more partitions means more throughput but only per-partition ordering. If a topic had a single partition it would be perfectly ordered but capped at one consumer's throughput.

# parallelism: partitions spread across brokers and consumers
# ordering:    guaranteed WITHIN a partition, not across the topic
# the same object (partition) gives you both — that's the key idea.

Why it's asked / follow-up: it frames the central tradeoff of the whole system. Follow-up: “so how many partitions should a topic have?” — enough to hit your throughput and consumer-parallelism target with headroom, but not so many that metadata, open files, and rebalance cost balloon (see changing the count later).

Source: Apache Kafka — Partitions.

Does Kafka guarantee message ordering? Mid

Only within a partition — never across a topic. Records in one partition are strictly ordered by offset; records in different partitions have no ordering relationship at all. So “does Kafka guarantee ordering?” has a precise answer: yes per-partition, no globally. The practical consequence: if a set of records must be processed in order (all events for one account, say), they must share a key so they land in the same partition. Global total ordering across a topic requires a single partition — which sacrifices all parallelism, so it's rarely the right design.

// same key → same partition → ordered relative to each other
send("account-7", "debit 50");   // P1 offset 100
send("account-7", "credit 20");  // P1 offset 101  (ordered ✓)
send("account-9", "debit 10");   // P2 — NO ordering vs account-7

Why it's asked / follow-up: “Kafka guarantees ordering” stated flatly is one of the most common wrong answers on the open web; the correct scoping is the whole point. Follow-up: “how do you get total ordering?” — one partition (and accept the throughput cost), or order downstream by an event timestamp / sequence in the payload.

Source: Apache Kafka — Message delivery & ordering.

How do you choose a partition key, and what is key skew? Mid

Pick the key that is the unit you need ordered together — account id, user id, device id — because same-key records land on the same partition and stay ordered. The failure mode is key skew / hot partitions: if one key (or a few) carries a disproportionate share of traffic, its partition becomes a bottleneck while others idle, and the consumer owning it lags. High-cardinality, evenly-distributed keys avoid this; a low-cardinality key (like “country” when 80% of traffic is one country) invites it. When you need ordering per entity and even load, a well-distributed entity id is the sweet spot.

// good:  key = userId        → high cardinality, even spread, per-user order
// risky: key = countryCode   → few values → one huge "hot" partition
// null key → no ordering guarantee, but load spreads evenly

Why it's asked / follow-up: it links the ordering guarantee to a real design decision with a real failure mode. Follow-up: “you have a hot key you can't change — options?” — a composite key, salting (spread one key across N sub-partitions and re-order downstream), or accepting the skew if ordering for that key matters more than balance.

Source: Apache Kafka — Producer configuration (partitioning).

Can you change a topic's partition count? What breaks? Senior

You can add partitions but not remove them, and adding has a sharp edge: it breaks the key→partition mapping. Placement is hash(key) % numPartitions, so changing the partition count changes where existing keys route — “account-7” that always went to P1 may now go to P3, which means new records for that key can be processed out of order relative to its history and any per-key state downstream is now split. Removing partitions isn't supported at all because it would orphan committed data and offsets. So partition count is a decision you want to get roughly right up front, with headroom.

$ kafka-topics.sh --bootstrap-server localhost:9092 \
      --alter --topic orders --partitions 12   # increase only; can't decrease
# after this, hash(key) % 12 can differ from hash(key) % 6  → same key may move partitions
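
A rough way to see the remapping, assuming a stand-in hash: the sketch below uses Arrays.hashCode rather than Kafka's murmur2, so the partition numbers will not match a real cluster, but the modulo effect is the same for any hash. Class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Shows how changing the partition count changes key placement; stand-in hash, not murmur2. */
final class RepartitionSketch {
    static int partitionFor(String key, int numPartitions) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % numPartitions;  // mask keeps the value non-negative
    }

    /** Counts how many keys land on a different partition after the count changes. */
    static int movedKeys(String[] keys, int before, int after) {
        int moved = 0;
        for (String key : keys) {
            if (partitionFor(key, before) != partitionFor(key, after)) {
                moved++;
            }
        }
        return moved;
    }
}
```

Calling movedKeys over a sample of your real keys with before=6 and after=12 estimates how much per-key history would be split. When the count doubles, a key either stays put or moves to its old partition number plus 6, so some keys are unaffected while others move.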

Why it's asked / follow-up: it's a real operational trap and it tests whether you understand why the modulo makes repartitioning dangerous. Follow-up: “how do you repartition safely?” — usually create a new topic with the target count and migrate, rather than altering in place, if key-order continuity matters.

Source: Apache Kafka — Modifying topics.

5 · Delivery semantics

At-most-once, at-least-once, exactly-once — what's the difference? Mid

Three delivery guarantees, defined by what happens on failure. At-most-once: each record is delivered zero or one time — you may lose some, never duplicate (commit the offset before processing). At-least-once: never lose, may duplicate (process then commit; a crash before commit re-delivers) — the common, sensible default. Exactly-once: each record affects the result once, no loss and no duplicate — the strong guarantee, achievable within Kafka via the idempotent producer plus transactions. The lever for the first two is simply commit ordering; exactly-once needs real machinery.

// at-most-once:  commit offset → process   (crash after commit → record lost)
// at-least-once: process → commit offset    (crash before commit → duplicate)
// exactly-once:  idempotent producer + transactions (EOS)

Why it's asked / follow-up: it's the vocabulary the harder delivery questions build on, and it exposes whether you understand these as failure-mode definitions, not marketing tiers. Follow-up: “which should I default to?” — at-least-once plus idempotent processing downstream is simpler and often preferable to full EOS.

Source: Apache Kafka — Message delivery semantics.

How does Kafka actually achieve exactly-once (EOS)? Senior

Two mechanisms, introduced together in Kafka 0.11. The idempotent producer removes duplicates from producer retries (per partition, per session). Transactions let a producer write to multiple partitions atomically and — the key part — commit its consumer offsets in the same transaction. That makes the read-process-write loop atomic: consume from A, produce to B, and commit the input offset, all-or-nothing. Consumers reading B with isolation.level=read_committed never see records from an aborted transaction. This is exactly-once for Kafka-to-Kafka pipelines (and it's what Kafka Streams turns on with one config).

producer.initTransactions();
producer.beginTransaction();
producer.send(outRecord);                              // write to topic B
producer.sendOffsetsToTransaction(inOffsets, groupMD); // commit input offset...
producer.commitTransaction();                          // ...atomically with the output

Why it's asked / follow-up: it's a top senior cluster, and most people can say “exactly-once” but not how. Follow-up: “in Kafka Streams?” — set processing.guarantee=exactly_once_v2; Streams wires the transactional producer and offset commit for you.

Source: KIP-98 — Exactly-once & transactional messaging; arrival on the version reference (0.11).

What does isolation.level=read_committed do? Senior

It's the consumer side of transactions. With the default read_uncommitted, a consumer sees all records, including those from transactions that later abort. With read_committed, the consumer only delivers records from committed transactions and never reads past an open transaction's boundary (the “last stable offset”) until it resolves. That's what makes the downstream half of exactly-once real: an aborted transaction's writes are invisible. The cost is a little added latency, because a consumer may have to wait for a pending transaction to commit or abort before advancing.

isolation.level=read_committed   # hide records from aborted/open transactions
# default is read_uncommitted → sees everything, including aborted writes
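
A simplified model of what a read_committed consumer delivers from one partition. It assumes every record belongs to some transaction, uses the array index as the offset, and ignores the high watermark; the class name and the transaction ids are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of read_committed filtering and the last stable offset (LSO). */
final class ReadCommittedSketch {
    /** txnOfOffset[i] is the id of the transaction that wrote offset i. */
    static List<Integer> visibleOffsets(String[] txnOfOffset, Set<String> committed, Set<String> aborted) {
        // LSO: the first offset whose transaction is still undecided, or the end of the log.
        int lso = txnOfOffset.length;
        for (int i = 0; i < txnOfOffset.length; i++) {
            String txn = txnOfOffset[i];
            if (!committed.contains(txn) && !aborted.contains(txn)) {
                lso = i;
                break;
            }
        }
        List<Integer> visible = new ArrayList<>();
        for (int i = 0; i < lso; i++) {               // never read past the LSO
            if (committed.contains(txnOfOffset[i])) {
                visible.add(i);                       // records of aborted transactions are skipped
            }
        }
        return visible;
    }
}
```

For a log written by t1, t2, t1, t3, t2 with t1 committed, t2 aborted, and t3 still open, the consumer sees offsets [0, 2] and waits at offset 3 until t3 resolves.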

Why it's asked / follow-up: people describe transactions from the producer side and forget the consumer must opt in to actually benefit. Follow-up: “what is the last stable offset (LSO)?” — the offset up to which all transactions are decided; a read_committed consumer won't read past it.

Source: Apache Kafka — isolation.level.

Why is exactly-once subtler than it sounds — where does it not apply? Senior

Kafka's exactly-once is scoped to Kafka-to-Kafka: reading from Kafka, processing, and writing back to Kafka, with offsets committed transactionally. The moment your processing has a side effect outside Kafka — charge a credit card, send an email, write to an external database that isn't in the transaction — Kafka can't make that side effect exactly-once, because it can't roll back the outside world. There you're back to at-least-once, and the honest fix is to make the side effect idempotent (dedupe on a business key, use conditional writes / upserts). Selling “exactly-once” as an unqualified checkbox is the misconception; the correct framing is “exactly-once within Kafka, idempotency at the edges.”

// inside Kafka:  read-process-write + txn  → exactly-once ✓
// external DB / API side effect:            → at-least-once; make it idempotent
upsert(orderId, ...);      // dedupe on a business key instead of relying on EOS

Why it's asked / follow-up: it separates people who parrot “exactly-once” from those who know its boundary. Follow-up: “so is exactly-once useless with external systems?” — no; you push the guarantee to the sink by making writes idempotent, and Kafka's part still removes in-pipeline duplicates.

Source: Apache Kafka — Message delivery semantics.

6 · Replication, durability & availability

What are replication factor, leader, and follower replicas? Mid

Each partition is replicated onto several brokers; the replication factor is how many copies exist (3 is the common production choice). Among the replicas, one is the leader and the rest are followers. All writes for a partition, and by default all reads, go through its leader; followers continuously fetch from the leader to stay caught up. If the leader's broker dies, an in-sync follower is promoted to leader, so the partition survives. Replicas for one partition are placed on different brokers (and ideally different racks), so a single broker loss never takes all copies.

$ kafka-topics.sh --bootstrap-server localhost:9092 --create --topic orders \
      --partitions 6 --replication-factor 3    # 3 copies of every partition
# per partition: 1 leader (serves reads/writes) + 2 followers (replicate)

Why it's asked / follow-up: replication is the whole durability/availability story, and the leader/follower split is often muddled. Follow-up: “do consumers read from followers?” — traditionally no (leader only); follower/rack-aware fetching exists (KIP-392) mainly to cut cross-datacenter bandwidth, not for scale.

Source: Apache Kafka — Replication.

What is the in-sync replica set (ISR)? Senior

The ISR is the subset of a partition's replicas that are currently caught up with the leader (within replica.lag.time.max.ms). A follower that falls behind or goes offline is dropped from the ISR and re-added when it catches up. The ISR is the linchpin of durability because a record is considered committed once all ISR members have it, and acks=all waits for exactly that set — not all replicas, just the in-sync ones. So the ISR is what “all” in acks=all actually means, and it can shrink to just the leader, which is why the next question matters.

$ kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic orders
Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,2,3
              (if broker 3 lags/dies → Isr: 1,2 — "all" now means 2)

Why it's asked / follow-up: it's the concept that makes acks=all and min.insync.replicas make sense together. Follow-up: “what if the ISR shrinks to just the leader?” — then acks=all is as weak as acks=1 unless min.insync.replicas blocks the write.

Source: Apache Kafka — Replication & the ISR.

How do acks=all and min.insync.replicas work together? Senior

This pairing is what actually protects against data loss, and interviewers love it because each half is useless alone. acks=all makes the producer wait for the whole ISR — but if the ISR has shrunk to one replica, “all” is “one.” min.insync.replicas (a topic/broker config) sets the floor: if fewer than that many replicas are in-sync, the broker rejects the write (NotEnoughReplicas) rather than accept a record that might not survive. The canonical durable config is RF=3, min.insync.replicas=2, acks=all: you can lose one broker and still write, but you'll never acknowledge a record that lives on only one node.

# durable trio (survives one broker loss, never acks a single-copy write):
default.replication.factor=3   # broker default; or --replication-factor 3 at topic creation
min.insync.replicas=2       # <2 in-sync → producer write is REJECTED
acks=all                    # wait for the whole ISR
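
The interaction is easy to check with a few lines of Python. This is a toy model of the broker-side decision only, with hypothetical function names; it assumes the floor is enforced only for acks=all producers.

```python
def broker_accepts(acks, isr_size, min_insync_replicas):
    """Toy version of the broker-side check for one partition."""
    if acks == "all" and isr_size < min_insync_replicas:
        return False  # the real broker answers with a NotEnoughReplicas error
    return True


def tolerated_broker_losses(replication_factor, min_insync_replicas):
    """How many replicas can drop out while acks=all writes are still accepted."""
    return replication_factor - min_insync_replicas


for isr_size in (3, 2, 1):
    print(isr_size, broker_accepts("all", isr_size, min_insync_replicas=2))
# 3 True
# 2 True
# 1 False
```

With RF=3, a floor of 2 tolerates one lost replica; a floor of 3 tolerates none, which is the availability cost raised in the follow-up.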

Why it's asked / follow-up: it's the durability question, and it tests whether you see that acks=all without a floor is a false sense of safety. Follow-up: “why not min.insync.replicas=3 with RF=3?” — then losing any one broker stops all writes; 2-of-3 balances durability against availability.

Source: Apache Kafka — min.insync.replicas.

What is unclean leader election? Senior

It's the availability-vs-durability knob. If every in-sync replica for a partition is down, Kafka has a choice: wait for an ISR member to come back (stay unavailable but lose nothing), or promote an out-of-sync replica that's missing the newest records (available again, but those records are lost). unclean.leader.election.enable=false (the safe default) chooses durability — the partition stays offline until a caught-up replica returns. Setting it true chooses availability at the risk of silent data loss. It's a pure CAP-style tradeoff, and the interview wants you to name both sides.

unclean.leader.election.enable=false   # default: prefer durability (may stall)
# =true → an out-of-sync replica can become leader → newest records lost
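
The choice can be sketched as a small decision function. This is an illustrative Python model under simplifying assumptions (one partition, replicas tried in assignment order), not the controller's actual election code; elect_leader and its parameters are hypothetical names.

```python
def elect_leader(replicas, isr, alive, unclean_enabled=False):
    """Pick a new leader for one partition, or return None if it must stay offline.

    replicas: broker ids in assignment order
    isr:      broker ids that were in sync before the failure
    alive:    broker ids currently up
    """
    for broker in replicas:
        if broker in isr and broker in alive:
            return broker        # clean election: no committed data is lost
    if unclean_enabled:
        for broker in replicas:
            if broker in alive:
                return broker    # unclean election: this replica may be missing records
    return None                  # prefer durability: wait for an ISR member to return


# Brokers 1 and 2 were in sync and both are down; only the lagging broker 3 is up.
print(elect_leader([1, 2, 3], isr={1, 2}, alive={3}))                        # None
print(elect_leader([1, 2, 3], isr={1, 2}, alive={3}, unclean_enabled=True))  # 3
```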

Why it's asked / follow-up: it's a clean way to probe whether you understand that Kafka forces an explicit durability/availability choice. Follow-up: “when would you ever enable it?” — for a topic where staying available matters more than a few lost records (some metrics/telemetry streams), never for financial or source-of-truth data.

Source: Apache Kafka — unclean.leader.election.enable.

What happens when a broker fails? Mid

For every partition that broker led, the controller elects a new leader from the ISR on a surviving broker, and clients transparently re-route to it after refreshing metadata (a brief blip, not an outage). For every partition it merely followed, that replica just drops out of the ISR until the broker returns and catches up. Producers with acks=all and a healthy min.insync.replicas keep writing as long as the floor is met; consumers keep reading from the new leaders. When the broker comes back, it re-fetches what it missed and rejoins the ISR. This is why replication factor ≥ 3 is standard: it tolerates a broker loss with writes still flowing.

// broker 2 dies:
//   partitions it LED     → controller promotes an ISR follower to leader
//   partitions it FOLLOWED→ dropped from ISR; rejoins after catching up
// clients refresh metadata and continue; RF=3 keeps writes flowing.

Why it's asked / follow-up: it ties replication, ISR, and the controller into one failure story. Follow-up: “who runs the election?” Answer: the controller, which historically was a broker elected through ZooKeeper and is now the active node of the KRaft controller quorum.

Source: Apache Kafka — Replication (failover).

7 · Storage, retention & performance

How is a partition stored on disk (log segments)? Mid

A partition's log isn't one giant file; it's a sequence of segments. The newest segment is active and being appended to; older ones are sealed and immutable. Each segment has a .log (the records), plus .index and .timeindex files that map offsets and timestamps to byte positions for fast lookup. Retention and compaction operate at segment granularity — Kafka deletes or compacts whole old segments, which is cheap because they're immutable. This segmented, append-only layout is also why Kafka's writes are sequential and its reads are fast.

# /var/lib/kafka/orders-0/
00000000000000000000.log   00000000000000000000.index   00000000000000000000.timeindex
00000000000000369125.log   ...                          # ← active segment (being appended)
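
Because each segment file is named after its first (base) offset, finding the segment that holds a given offset is a binary search over the base offsets. A minimal Python sketch of that lookup follows; segment_for_offset is a hypothetical helper, and the real broker additionally consults the .index file to find the byte position inside the segment.

```python
from bisect import bisect_right


def segment_for_offset(base_offsets, offset):
    """Return the file name of the segment that contains `offset`.

    base_offsets: sorted list of each segment's first offset.
    """
    if not base_offsets or offset < base_offsets[0]:
        raise ValueError("offset is below the start of the log")
    index = bisect_right(base_offsets, offset) - 1
    return f"{base_offsets[index]:020d}.log"


segments = [0, 369125]
print(segment_for_offset(segments, 1000))    # 00000000000000000000.log
print(segment_for_offset(segments, 369125))  # 00000000000000369125.log
```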

Why it's asked / follow-up: it grounds retention, compaction, and tiered storage in the actual on-disk model. Follow-up: “why segment at all?” — so retention/compaction can drop or rewrite an immutable old file without touching the active one, and so index lookups stay small.

Source: Apache Kafka — The log (segments).

Retention vs log compaction — when do you use each? Mid

Two different cleanup policies. Retention (cleanup.policy=delete) drops records older than a time or size threshold — it's the “keep the last 7 days” policy, right for event streams where old events stop mattering. Compaction (cleanup.policy=compact) keeps the latest value per key forever and garbage-collects superseded values — it's for changelog / state topics where you want the current state of every key, not the full history. Compaction is why __consumer_offsets and Kafka Streams state topics work. You can even combine both (compact,delete). The mental model: retention = a time-boxed event stream; compaction = a keyed snapshot / table.

cleanup.policy=delete   retention.ms=604800000   # keep 7 days (event stream)
cleanup.policy=compact                          # keep latest value per key (changelog/state)
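
A toy Python version of compaction shows the "latest value per key" rule and how tombstones fit in. It is a sketch of the semantics only: the real log cleaner works segment by segment and keeps tombstones for a grace period before dropping them. The function names and sample records are illustrative.

```python
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000  # 604800000, the retention.ms value above


def compact(records):
    """Keep only the latest record per key, in offset order.

    records: list of (offset, key, value); a value of None is a tombstone.
    """
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)
    return sorted(latest.values(), key=lambda record: record[0])


def drop_tombstones(records):
    """Final cleanup step: remove keys whose latest record is a tombstone."""
    return [record for record in records if record[2] is not None]


log = [(0, "u1", "a"), (1, "u2", "b"), (2, "u1", "c"), (3, "u2", None)]
compacted = compact(log)
print(compacted)                   # [(2, 'u1', 'c'), (3, 'u2', None)]
print(drop_tombstones(compacted))  # [(2, 'u1', 'c')]
```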

Why it's asked / follow-up: people conflate the two or think compaction is “compression” — it isn't; it's key-based deduplication. Follow-up: “how do you delete a key under compaction?” — write a tombstone (a record with that key and a null value), which compaction eventually removes along with prior values.

Source: Apache Kafka — Log compaction.

Why is Kafka so fast despite writing everything to disk? Senior

Three ideas. (1) Sequential I/O: because the log is append-only, writes and reads are sequential, which is dramatically faster than random access on both spinning and solid-state disks — Kafka leans on the disk's best case. (2) The OS page cache: Kafka doesn't maintain its own in-process cache; it writes to the filesystem and lets the kernel cache hot data, so recent records are usually served from RAM without Kafka managing it. (3) Zero-copy: on the read path Kafka uses sendfile to move bytes straight from the page cache to the network socket, skipping copies into user space. Batching and compression amplify all three.

// append-only log      → SEQUENTIAL disk I/O (the fast case)
// OS page cache        → recent data served from RAM, managed by the kernel
// sendfile (zero-copy) → page cache → socket, no user-space copy
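
The zero-copy idea can be tried from Python's standard library, which exposes the same system call. The sketch below is only an illustration of the mechanism, not Kafka code: a temporary file stands in for a log segment and a local socket pair stands in for the broker-to-consumer connection.

```python
import socket
import tempfile

payload = b"record-batch " * 256  # 3,328 bytes standing in for a slice of a log segment

with tempfile.TemporaryFile() as segment:
    segment.write(payload)
    segment.flush()
    segment.seek(0)

    consumer_side, broker_side = socket.socketpair()
    with consumer_side, broker_side:
        # socket.sendfile uses os.sendfile where the platform provides it, so the
        # kernel moves the bytes from the page cache to the socket; on other
        # platforms it falls back to ordinary send() calls.
        sent = broker_side.sendfile(segment)
        broker_side.shutdown(socket.SHUT_WR)

        received = b""
        while chunk := consumer_side.recv(4096):
            received += chunk

print(sent == len(payload), received == payload)
```

The application never reads the file contents into its own buffer on the sending side, which is the copy that sendfile avoids.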

Why it's asked / follow-up: it tests systems intuition — the surprising claim that a disk-backed log can outrun an in-memory queue, and why. Follow-up: “does zero-copy still apply with TLS or compression re-encoding?” — TLS and broker-side re-compression can defeat sendfile, which is a real throughput consideration.

Source: Apache Kafka — Design: persistence & efficiency.

What is tiered storage? Mid

Tiered storage (KIP-405) lets brokers offload older, sealed log segments to cheap remote object storage (S3, GCS, HDFS) while keeping recent segments on local disk. It decouples how much history you retain from how much local disk you buy, so you can keep months or years of data affordably and still add brokers for throughput without dragging huge local logs around. Consumers reading old data fetch it transparently from the remote tier. It reached production-ready (GA) in Kafka 3.9 after early access in 3.6.

remote.storage.enable=true                 # per-topic: offload old segments to object storage
local.retention.ms=3600000                # keep ~1h locally; older segments live remotely

Why it's asked / follow-up: it's a recent, high-interest capability that changes cost/scaling math, so it flags whether you follow current Kafka. Follow-up: “what does it not solve?” — latency for cold reads (object storage is slower than local disk), so it's for retention economics, not for making old data hot.

Source: KIP-405 — Tiered storage; arrival on the version reference (3.9 GA).

Can you use Kafka as a database? Senior

Partly, and it's a useful thing to reason about — but mostly no. Kafka has database-like traits: durable storage, infinite retention (with tiered storage), and compacted topics that hold the latest value per key (a table-like view over a log). Event sourcing and Kafka Streams' state stores lean on exactly this. But Kafka lacks what makes a database a database: no random access by primary key (you scan a partition; you don't index-seek a row), no ad-hoc queries or secondary indexes, no updates in place, and no cross-key transactions/joins like a relational engine. The idiomatic pattern is Kafka as the source-of-truth log and a real database (or a Streams state store / search index) as the materialized, queryable view.

// Kafka = durable, ordered log (source of truth, event history)
// query needs (get-by-id, joins, search) → materialize into Postgres / a KTable
// compacted topic ≈ "latest value per key", but still not a queryable DB.

Why it's asked / follow-up: it's a great “do you know the boundaries of your tool” question, and it connects to event sourcing and CQRS. Follow-up: “what's a compacted topic closest to?” — a durable key→latest-value changelog you can rebuild state from, not a general-purpose store.

Source: Apache Kafka — Design; see also Kafka vs a queue.

8 · The ecosystem: Connect, Streams & Schema

What is Kafka Connect, and when do you use it instead of writing a consumer? Mid

Kafka Connect is a framework for streaming data between Kafka and external systems using reusable, configuration-driven connectors — source connectors pull data into Kafka (a database via CDC, a file, an API), sink connectors push it out (to a warehouse, Elasticsearch, S3). It runs as its own cluster of workers and handles the tedious parts — offset tracking, retries, scaling, restarts — so you configure JSON instead of writing and operating bespoke producer/consumer apps. Reach for Connect when the job is plumbing between Kafka and a common system; write a consumer when you have real per-record business logic a generic connector can't express.

// a sink connector is just config, not code:
{ "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
  "topics": "orders", "connection.url": "jdbc:postgresql://..." }

Why it's asked / follow-up: it separates “I'd hand-roll a consumer for everything” from “I'd use the right tool.” Follow-up: “what are single-message transforms (SMTs)?” — lightweight per-record tweaks (rename a field, mask a value) you configure in the connector without a full stream-processing job.
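
A sink-connector config like the one above is typically submitted to a Connect worker's REST API as a POST to /connectors with a name and a config map. The Python sketch below only builds that request and does not send it; the worker address, connector name, and JDBC URL are placeholder assumptions.

```python
import json
import urllib.request


def build_connector_request(worker_url, name, config):
    """Build (but do not send) the POST /connectors request for a Connect worker."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{worker_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_connector_request(
    "http://localhost:8083",   # assumed worker address
    "orders-jdbc-sink",        # hypothetical connector name
    {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "topics": "orders",
        "connection.url": "jdbc:postgresql://db.example.internal/shop",  # placeholder
        "tasks.max": "1",
    },
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would submit it to a running worker.
```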

Source: Apache Kafka — Kafka Connect.

Kafka Streams vs a plain consumer — when and why? Senior

A plain consumer gives you records; you write all the processing, state, and fault-tolerance yourself. Kafka Streams is a client library for stateful stream processing on top of the consumer/producer, giving you a high-level DSL: KStream (an unbounded record stream), KTable (a changelog interpreted as the latest value per key, i.e. a table), joins, aggregations, and windowing for time-bucketed computation. Its state stores are backed by compacted changelog topics, so state survives restarts, and it supports exactly-once with one config (processing.guarantee=exactly_once_v2). Use Streams when you need aggregations, joins, or windowed state; a raw consumer when the work is simple, stateless per-record handling.

StreamsBuilder builder = new StreamsBuilder();
builder.stream("orders")
       .groupByKey()
       .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
       .count();          // windowed, stateful — Streams manages the state store + changelog
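
What that windowed count computes can be shown without the library. The Python sketch below is a toy, in-memory equivalent of a 5-minute tumbling-window count per key; it has no state store, changelog, or late-record handling, and the event data is made up.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000


def tumbling_window_counts(events, window_ms=WINDOW_MS):
    """Count events per (key, window_start); events are (key, timestamp_ms) pairs."""
    counts = Counter()
    for key, timestamp_ms in events:
        # Tumbling windows are aligned to the epoch: [start, start + window_ms)
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


events = [("u1", 10_000), ("u1", 299_999), ("u1", 300_000), ("u2", 20_000)]
counts = tumbling_window_counts(events)
print(counts)  # {('u1', 0): 2, ('u1', 300000): 1, ('u2', 0): 1}
```

The record at exactly 300,000 ms falls into the next window, because each window includes its start and excludes its end.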

Why it's asked / follow-up: it checks whether you know the stream-vs-table (KStream/KTable) duality and when the library earns its complexity. Follow-up: “KStream vs KTable?” — a KStream is every event (inserts); a KTable is the current value per key (upserts) — the log/table duality at the API level.

Source: Apache Kafka — Kafka Streams.

What is a schema registry, and what is schema compatibility? Mid

Kafka records are opaque bytes, so producers and consumers must agree on a format. A schema registry (a Confluent component, not part of Apache Kafka itself) stores versioned schemas — typically Avro, Protobuf, or JSON Schema — and the producer stamps each record with a small schema id instead of the full schema. Its real value is enforcing compatibility as schemas evolve: backward (new consumers read old data), forward (old consumers read new data), or full. It rejects an incompatible schema change at registration time, so a producer can't silently break every downstream consumer — the schema equivalent of a migration check.

// backward-compatible change: ADD a field WITH a default
{ "name": "discount", "type": "double", "default": 0.0 }
// adding a field WITHOUT a default, or renaming a field → rejected under BACKWARD compat
// deleting a field passes BACKWARD, but breaks FORWARD/FULL if it had no default
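
The compatibility modes reduce to one question: can a reader with schema A decode data written with schema B? The Python sketch below models that with a deliberately simplified, Avro-style rule on a toy schema format (a dict of field specs). It ignores type promotion, aliases, and unions, and all names in it are hypothetical.

```python
def can_read(reader, writer):
    """Can a reader using `reader` decode data written with `writer`?

    A schema is {field_name: {"type": str, "default": <optional>}}.
    """
    for name, spec in reader.items():
        if name not in writer:
            if "default" not in spec:
                return False   # the reader needs a value the data cannot supply
        elif writer[name]["type"] != spec["type"]:
            return False       # toy rule: types must match exactly
    return True                # fields present only in the writer are ignored


def is_backward_compatible(new, old):
    return can_read(reader=new, writer=old)   # new consumers read old data


def is_forward_compatible(new, old):
    return can_read(reader=old, writer=new)   # old consumers read new data


old = {"id": {"type": "string"}, "amount": {"type": "double"}}
with_default = {**old, "discount": {"type": "double", "default": 0.0}}
without_default = {**old, "discount": {"type": "double"}}
dropped_amount = {"id": {"type": "string"}}

print(is_backward_compatible(with_default, old))     # True
print(is_backward_compatible(without_default, old))  # False
print(is_backward_compatible(dropped_amount, old))   # True
print(is_forward_compatible(dropped_amount, old))    # False
```

Dropping a field with no default passes the backward check but fails the forward one: consumers still on the old schema have nothing to fill it with.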

Why it's asked / follow-up: schema evolution is where real pipelines break, and this shows you've operated one. Follow-up: “backward vs forward?” Answer: backward means upgrade consumers first (new code reads old data); forward means upgrade producers first (old code reads new data).

Source: Confluent developer docs, Schema compatibility (schema registry is a Confluent component).

Where does ksqlDB fit? Mid

ksqlDB (also a Confluent product, not Apache Kafka core) puts a SQL interface over stream processing: you declare streams and tables and write continuous SELECTs that run forever, and under the hood it compiles to Kafka Streams. It lowers the barrier for filtering, transforming, joining, and aggregating streams without writing a JVM app — good for straightforward streaming ETL and quick materialized views. The tradeoff is less control than hand-written Streams or a full processing framework, so complex topologies often outgrow it. Conceptually: ksqlDB is to Kafka Streams roughly what SQL is to a hand-written data-processing program.

CREATE TABLE orders_per_user AS
  SELECT userId, COUNT(*) FROM orders GROUP BY userId
  EMIT CHANGES;        -- a continuous, always-updating query

Why it's asked / follow-up: it rounds out the “how do I process streams” spectrum — raw consumer → Kafka Streams → ksqlDB → Flink/Spark. Follow-up: “ksqlDB or Kafka Streams?” — ksqlDB for SQL-shaped work and speed of delivery; Streams when you need full programmatic control.

Source: Confluent developer docs, ksqlDB (a Confluent product). Core stream processing: Apache Kafka Streams.

9 · Architecture: controller, ZooKeeper & KRaft

What is the controller in a Kafka cluster? Mid

The controller is the brain that manages cluster metadata: it tracks which brokers are alive, elects partition leaders, and propagates leadership and ISR changes to the rest of the cluster. It's a role rather than a fixed machine: historically one elected broker held it (via ZooKeeper); under KRaft it's held by the active node of a dedicated controller quorum, with the other controllers standing by to take over. Ordinary produce/consume traffic does not go through the controller; it's the coordination plane that decides leadership and reassignments, while the data plane (leaders/followers) moves the records.

// control plane:  controller → leader elections, ISR/metadata changes
// data plane:     producers/consumers ↔ partition leaders (never the controller)

Why it's asked / follow-up: it's the setup for the ZooKeeper→KRaft story and clarifies that metadata and data are separate planes. Follow-up: “what happens if the controller fails?” — a new one is elected (a ZooKeeper election historically; a Raft election among controller nodes under KRaft); data keeps flowing meanwhile.

Source: Apache Kafka — Design; KRaft timeline.

What did ZooKeeper do for Kafka, and why was it removed? Mid

For most of Kafka's history, a separate Apache ZooKeeper ensemble stored cluster metadata (broker membership, topic/partition configuration, ACLs) and arbitrated controller election. It worked, but it meant running and tuning a second distributed system, living with a metadata scaling ceiling (very large clusters strained ZooKeeper), and accepting slow failovers where the new controller had to reload state from ZooKeeper. Removing it (KIP-500) simplifies operations to a single system, scales to far more partitions, and makes controller failover much faster.

// pre-KRaft: [ Kafka brokers ]  +  [ ZooKeeper ensemble ]  ← two systems to run
// KRaft:     [ Kafka brokers + controllers ]              ← one system

Why it's asked / follow-up: it's a defining architectural change every current Kafka engineer should be able to explain. Follow-up: “when did it actually go away?” — ZooKeeper was removed entirely in Kafka 4.0; 3.9 was the last release to support it.

Source: KIP-500; the version reference's KRaft callout.

What is KRaft — and do I still need ZooKeeper? Senior

KRaft (Kafka Raft) is Kafka's built-in metadata quorum: a set of controller nodes that run the Raft consensus protocol and store all cluster metadata as an internal Kafka log (the __cluster_metadata topic, kept bounded by periodic snapshots rather than key-based compaction). It replaces ZooKeeper entirely, so metadata now lives in Kafka itself. The direct answer to the interview's favorite version of this: no, you no longer need ZooKeeper. KRaft was production-ready for new clusters from 3.3, the ZooKeeper→KRaft migration path became production-ready in 3.6 (after early access in 3.4), and 4.0 dropped ZooKeeper altogether, so a current cluster is KRaft-only.

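# KRaft controller node: one member of a dedicated controller quorum, no ZooKeeper
process.roles=controller     # brokers set process.roles=broker; "broker,controller" is combined mode
node.id=1
controller.quorum.voters=1@c1:9093,2@c2:9093,3@c3:9093
# metadata lives in the internal __cluster_metadata log (snapshotted, not key-compacted)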
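
The sizing of the controller quorum follows the usual majority arithmetic of Raft-style consensus. A short Python sketch, with hypothetical helper names, parses a controller.quorum.voters string in the id@host:port form shown above and reports how many controller failures the quorum can absorb.

```python
def parse_quorum_voters(value):
    """Parse 'id@host:port,...' into a list of (node_id, host, port) tuples."""
    voters = []
    for entry in value.split(","):
        node_id, endpoint = entry.split("@", 1)
        host, port = endpoint.rsplit(":", 1)
        voters.append((int(node_id), host, int(port)))
    return voters


def majority(voter_count):
    """Votes needed to elect a leader or commit a metadata record."""
    return voter_count // 2 + 1


def tolerated_failures(voter_count):
    """Controllers that can be down while a majority is still reachable."""
    return voter_count - majority(voter_count)


voters = parse_quorum_voters("1@c1:9093,2@c2:9093,3@c3:9093")
print(len(voters), majority(len(voters)), tolerated_failures(len(voters)))  # 3 2 1
```

Three controllers survive one failure and five survive two; an even count adds a node without adding tolerance, which is why odd quorum sizes are the norm.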

Why it's asked / follow-up: “do I still need ZooKeeper?” is one of the highest-intent Kafka questions today, and the version boundaries are the crisp answer. Follow-up: “why Raft over the old model?” — a single system to operate, metadata that scales to millions of partitions, and much faster controller failover (no reload from an external store).

Source: KIP-500 — KRaft; the version reference's KRaft timeline.

What is rack awareness? Mid

Rack awareness tells Kafka the failure domain each broker sits in — a rack, or more usefully an availability zone — via broker.rack. When placing a partition's replicas, Kafka then spreads them across different racks/zones, so a whole-rack or whole-AZ outage can't take every replica of a partition at once. It's the config that turns “replication factor 3” into real fault tolerance in a cloud deployment, and it also feeds rack-aware consumer fetching (reading from a same-zone replica) to cut cross-AZ network cost.

broker.rack=us-east-1a   # label the broker's zone; Kafka spreads replicas across zones
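
The effect of rack-aware placement can be illustrated with a toy assignment function. This Python sketch is not Kafka's actual assignment algorithm; it only demonstrates the invariant that matters (no two replicas of a partition in the same rack), and the cluster layout and broker ids are made up.

```python
def place_replicas(brokers_by_rack, replication_factor, partition):
    """Toy rack-aware placement: one replica per rack, rotated by partition number."""
    racks = sorted(brokers_by_rack)
    if replication_factor > len(racks):
        raise ValueError("this sketch only handles one replica per rack")
    placement = []
    for i in range(replication_factor):
        rack = racks[(partition + i) % len(racks)]
        brokers = brokers_by_rack[rack]
        placement.append(brokers[partition % len(brokers)])
    return placement


cluster = {"us-east-1a": [1, 4], "us-east-1b": [2, 5], "us-east-1c": [3, 6]}
rack_of = {broker: rack for rack, brokers in cluster.items() for broker in brokers}

for partition in range(3):
    replicas = place_replicas(cluster, 3, partition)
    print(partition, replicas, [rack_of[b] for b in replicas])
# 0 [1, 2, 3] ['us-east-1a', 'us-east-1b', 'us-east-1c']
# 1 [5, 6, 4] ['us-east-1b', 'us-east-1c', 'us-east-1a']
# 2 [3, 1, 2] ['us-east-1c', 'us-east-1a', 'us-east-1b']
```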

Why it's asked / follow-up: it shows you think about real deployment topology, not just logical replication. Follow-up: “RF=3 but all three brokers in one AZ — safe?” — no; an AZ outage takes all three; rack awareness across three AZs is what makes RF=3 meaningful.

Source: Apache Kafka — Rack awareness.

10 · Design & scenario questions

How would you guarantee ordering and no data loss for a critical stream? Senior

This is where the earlier answers combine into one config recipe. Ordering: give records that must stay ordered a shared key so they share a partition (ordering is per-partition), and enable the idempotent producer so retries don't reorder within it. No loss: replication factor 3, min.insync.replicas=2, and acks=all, so a record is never acknowledged unless it's on at least two in-sync replicas; keep unclean.leader.election.enable=false so a stale replica can't be promoted over committed data. On the consumer, commit offsets after processing (at-least-once) and make processing idempotent, or use transactions for end-to-end EOS inside Kafka.

# producer
enable.idempotence=true   acks=all   # order-safe retries + full-ISR durability
# broker defaults (per topic: --replication-factor 3 at creation, min.insync.replicas as a topic config)
default.replication.factor=3   min.insync.replicas=2   unclean.leader.election.enable=false
# consumer: process → commit (at-least-once) + idempotent processing (or EOS txns)
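
The ordering half of the recipe rests on one property: the same key always maps to the same partition. The Python sketch below illustrates that with a stand-in hash (CRC32); Kafka's default partitioner uses murmur2, so the actual partition numbers would differ, and the record data is invented.

```python
import zlib


def partition_for(key, num_partitions):
    """Toy key-to-partition mapping (CRC32 here; Kafka's default partitioner uses murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(records, num_partitions):
    """Append each (key, value) record to the log of the partition its key hashes to."""
    partitions = {p: [] for p in range(num_partitions)}
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


records = [
    ("order-42", "created"),
    ("order-7", "created"),
    ("order-42", "paid"),
    ("order-42", "shipped"),
]
partitions = route(records, num_partitions=6)
home = partition_for("order-42", 6)
history = [value for key, value in partitions[home] if key == "order-42"]
print(history)  # ['created', 'paid', 'shipped']
```

All events for order-42 land in one partition in the order they were sent, which is the only ordering Kafka promises; events for different keys carry no relative order.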

Why it's asked / follow-up: it's the synthesis question — it proves you can assemble the individual knobs into a coherent guarantee. Follow-up: “what's the cost?” — latency and a throughput ceiling from per-key partitioning and full-ISR acks; you buy correctness with throughput.

Source: composed from min.insync.replicas, ordering, and idempotence; base reference Apache Kafka — delivery semantics.

A consumer group is lagging — how do you diagnose and fix it? Senior

Start by localizing the lag with kafka-consumer-groups.sh --describe. Is every partition lagging (the group is globally under-provisioned) or just one or two (a hot partition from key skew, or a slow/stuck consumer)? Globally: add consumers up to the partition count, add partitions if you are already at that ceiling (for keyed topics this changes the key-to-partition mapping, so per-key ordering is not preserved across the change), and speed up per-record processing (batch external calls, tune max.poll.records). One partition: look for key skew or a poison message blocking progress. Also rule out rebalancing storms (processing exceeding max.poll.interval.ms keeps kicking consumers out) and downstream backpressure (a slow database is the real bottleneck). The discipline is measure → localize → fix the actual cause, not “add consumers” reflexively (which does nothing past the partition count).

$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group order-processors  # which partitions lag?
# all partitions   → add partitions + consumers; speed up processing
# one partition    → hot key / poison message / one stuck consumer
# frequent rebalances → processing > max.poll.interval.ms, or scaling churn
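
The localization step is simple arithmetic on the numbers the describe output reports: lag per partition is the log-end offset minus the committed offset. A small Python sketch of that triage, with hypothetical function names and an arbitrary threshold:

```python
def partition_lag(log_end_offsets, committed_offsets):
    """Lag per partition = log-end offset minus the group's committed offset."""
    return {p: log_end_offsets[p] - committed_offsets[p] for p in log_end_offsets}


def classify(lags, threshold):
    """Decide whether lag is absent, localized to some partitions, or global."""
    lagging = sorted(p for p, lag in lags.items() if lag > threshold)
    if not lagging:
        return "healthy", lagging
    if len(lagging) == len(lags):
        return "all partitions lagging", lagging
    return "localized lag", lagging


log_end = {0: 1000, 1: 1000, 2: 1000}
committed = {0: 990, 1: 120, 2: 995}
lags = partition_lag(log_end, committed)
print(lags)                           # {0: 10, 1: 880, 2: 5}
print(classify(lags, threshold=100))  # ('localized lag', [1])
```

A localized result points at key skew, a poison message, or one stuck consumer; a global result points at capacity or a slow downstream dependency.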

Why it's asked / follow-up: it's the most common Kafka operations scenario, and it rewards a diagnostic method over a canned answer. Follow-up: “adding consumers didn't help. Why?” Answer: you're at the partition ceiling. Within a group, each partition is assigned to at most one consumer, so consumers beyond the partition count sit idle; you need more partitions or faster processing.

Source: Apache Kafka — Consumer lag; see key skew and rebalancing.

When would you not use Kafka? Senior

An honest answer names Kafka's costs. It's a heavyweight distributed system — if you need simple task queuing with per-message priorities, complex routing, or per-message ack/redelivery, a traditional broker like RabbitMQ (or SQS) is simpler and a better fit. If you need request/response or synchronous RPC, use an API, not a log. If you need ad-hoc queries, get-by-key, or joins, use a database — Kafka isn't queryable. For small-scale or low-throughput workloads, Kafka's operational overhead (a cluster, replication, monitoring) often isn't worth it. Kafka shines for high-throughput, durable, replayable event streams consumed by many independent readers; outside that shape, something simpler usually wins.

// priorities / per-message routing / redelivery → RabbitMQ / SQS
// synchronous request/response                 → an API / RPC
// get-by-key, joins, ad-hoc queries            → a database
// low volume, simple needs                     → don't pay for a cluster

Why it's asked / follow-up: it's a maturity check — overusing Kafka is a common architectural mistake, and interviewers want to hear you reach for the simpler tool when it fits. Follow-up: “Kafka vs RabbitMQ in one line?” — Kafka is a replayable log for streams and fan-out; RabbitMQ is a smart queue/router for tasks — different shapes, not one strictly better.

Source: Apache Kafka — Use cases; and Kafka vs a queue.

How do you reprocess or replay events? Mid

Because consuming doesn't delete anything, replay is just moving a consumer group's committed offset backward (or to a timestamp) and letting it read forward again. You can seekToBeginning/seek in code, or reset offsets operationally with kafka-consumer-groups.sh --reset-offsets (to earliest, to a specific offset, or to a datetime); the tool only resets a group that has no active members, so stop the consumers first. Common uses: recovering from a processing bug, backfilling a new downstream store, or bootstrapping a fresh consumer from history. The prerequisites are that the data is still within retention (or in tiered storage) and that reprocessing is safe, which is why idempotent processing matters: replay re-delivers records that were already handled once.

$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
      --group order-processors --topic orders \
      --reset-offsets --to-datetime 2026-07-01T00:00:00.000 --execute
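
Resetting to a datetime means finding the earliest offset whose timestamp is at or after the target, then committing that offset. The Python sketch below shows that lookup on an in-memory list. It assumes timestamps never decrease along the log, which real topics with producer-assigned timestamps do not guarantee, and the sample records are invented.

```python
from bisect import bisect_left


def offset_for_timestamp(records, target_ms):
    """Earliest offset whose timestamp is >= target_ms, or None if there is none.

    records: list of (offset, timestamp_ms) with non-decreasing timestamps.
    """
    timestamps = [timestamp for _, timestamp in records]
    index = bisect_left(timestamps, target_ms)
    return records[index][0] if index < len(records) else None


partition_log = [(0, 1000), (1, 2000), (2, 2000), (3, 3000)]
print(offset_for_timestamp(partition_log, 2000))  # 1
print(offset_for_timestamp(partition_log, 2500))  # 3
print(offset_for_timestamp(partition_log, 4000))  # None
```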

Why it's asked / follow-up: replayability is a headline Kafka capability, and this checks you know the mechanics and the idempotency caveat. Follow-up: “what limits how far back you can replay?” — the topic's retention (time/size), extended by tiered storage or by using a compacted/long-retention topic for source-of-truth data.

Source: Apache Kafka — Managing consumer groups (offset reset).

How do you evolve a message schema without breaking consumers? Senior

The rule is: make only compatible changes and let a compatibility policy enforce it. Under BACKWARD compatibility (new readers must decode old data) you can add fields that carry defaults and delete fields, but you cannot add a field without a default, rename a field, or change a type incompatibly. Deleting a field that has no default is the forward-incompatible case: readers still on the old schema break, so if old consumers will outlive the change, use FULL compatibility and only add or remove fields that have defaults. Pair that with a deploy order: for backward compatibility, upgrade consumers before producers (new readers cope with both old and new data); for forward compatibility, upgrade producers first. For a genuinely breaking change, don't mutate in place; version the event (a new topic or a schema version field) and run both until consumers migrate. A schema registry automates the check; the discipline is the same with or without one.

// safe (backward-compatible):
+ optional discount = 0.0     // add field WITH a default
// unsafe → version the event instead:
- required customerId          // removing/renaming a required field breaks readers still on the old schema

Why it's asked / follow-up: schema evolution is where long-lived pipelines actually break, and it ties together the registry, compatibility modes, and deployment order. Follow-up: “how do you ship a truly breaking change?” — dual-write to a new topic/schema version, migrate consumers, then retire the old one — never a hard in-place break.

Source: Confluent developer docs, Schema evolution & compatibility (schema registry is a Confluent component). See schema registry.

Every answer links its primary source inline — the Apache Kafka documentation (design, producer/consumer configuration, replication, and delivery-semantics sections) and the relevant Kafka Improvement Proposals (KIP-98 exactly-once, KIP-405 tiered storage, KIP-500 KRaft, KIP-848 the next-gen consumer protocol, KIP-932 Queues) for feature-level provenance. A few ecosystem answers describe Confluent components (the schema registry, ksqlDB) that are not part of Apache Kafka core; those link Confluent's developer documentation and are paraphrased, never lifted. The questions are a curated set of the topics a Kafka interviewer commonly covers — core concepts, producers, consumers, ordering, delivery semantics, replication, storage, the ecosystem, and the KRaft architecture — not a copy of any question bank. This page covers Apache Kafka the open-source platform; feature-arrival details cross-link to the Kafka version reference. Last updated July 2026.

Mungomash LLC · More on Kafka