The shape of the workload decides who looks fast. Kafka’s throughput comes from sequential appends, producer batching, and zero-copy reads — a continuous firehose of small messages is its best case, and it degrades gracefully as load climbs. RabbitMQ is the opposite profile: with short queues and moderate load its tail latency is excellent, often beating Kafka at p99, because messages pass through memory and rarely touch disk. Let queues grow deep, though, and RabbitMQ starts paging messages out, throughput drops, and latency goes from the best in the field to the worst. Kafka’s log does not care how far behind a consumer is.
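How quickly a queue becomes "deep" is simple arithmetic on the gap between publish and consume rates. The sketch below uses made-up rates and hypothetical helper names (`backlog_after`, `backlog_bytes`). It does not model any broker's memory settings, so it only shows how fast a backlog builds, not the point at which paging begins.

```python
def backlog_after(seconds, publish_rate, consume_rate, start_depth=0):
    """Queue depth (in messages) after `seconds` of steady rates, never below zero."""
    depth = start_depth + (publish_rate - consume_rate) * seconds
    return max(0, depth)


def backlog_bytes(depth, message_bytes):
    """Payload bytes held in the backlog, ignoring per-message overhead."""
    return depth * message_bytes


# Assumed figures: 12,000 msg/s published, 10,000 msg/s consumed, 1 KB messages.
depth = backlog_after(600, publish_rate=12_000, consume_rate=10_000)
print(depth)                       # 1200000 messages after ten minutes
print(backlog_bytes(depth, 1024))  # 1228800000 bytes, roughly 1.2 GB
```

A consumer deficit of under 20 percent is enough to put more than a gigabyte of payload behind the consumers in ten minutes, which is the regime the paragraph above describes.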
The OpenMessaging benchmark cited above deserves a caveat: it measured sustained throughput and latency on three-broker clusters pushing 1 KB messages with replication factor 3 and durable producers. That is Kafka’s home game — a firehose with no routing. It did not measure topic-exchange fan-out, priority queues, per-message TTLs, or selective routing, which is the work RabbitMQ exists to do. A benchmark of those patterns would flatter RabbitMQ the way the firehose flatters Kafka.
Durability settings dominate the spread more than the engines do. A Kafka producer with acks=1 is answering a different question than one with acks=all writing to a topic with min.insync.replicas=2. On top of that, Kafka by default relies on replication and the OS page cache rather than fsyncing every message, while RabbitMQ’s quorum queues fsync before confirming. Comparing a loosely configured system against a strictly configured one measures the settings, not the software. Most published head-to-heads, including vendor ones, blur this line somewhere.
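One way to keep a comparison honest is to write down what each run actually promises before it acknowledges a write. The sketch below is a simplification under stated assumptions: `acks`, `min.insync.replicas`, and `x-queue-type` are real Kafka and RabbitMQ settings, but the `DurabilityProfile` class, its two fields, and the helper functions are hypothetical names invented for this illustration. The RabbitMQ profile assumes a three-node cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurabilityProfile:
    """Hypothetical summary of what a broker promises before it acks a write."""
    name: str
    replicas_acked: int     # minimum copies holding the message at ack time
    fsync_before_ack: bool  # whether the ack waits for a disk flush


# Kafka producer with acks=1: only the partition leader has the message,
# and it sits in the OS page cache rather than being fsynced.
kafka_loose = DurabilityProfile("kafka acks=1", 1, False)

# Kafka producer with acks=all on a topic with min.insync.replicas=2:
# at least two in-sync replicas have it, still with no per-message fsync.
kafka_strict = DurabilityProfile("kafka acks=all, min.insync.replicas=2", 2, False)

# RabbitMQ quorum queue (x-queue-type=quorum) with publisher confirms,
# assuming three nodes: a majority (2 of 3) has written and fsynced.
rabbit_quorum = DurabilityProfile("rabbitmq quorum queue + confirms", 2, True)


def comparable(a, b):
    """True only when both runs make the same durability promise."""
    return (a.replicas_acked, a.fsync_before_ack) == (b.replicas_acked, b.fsync_before_ack)


def mismatches(a, b):
    """Dimensions on which two runs differ, suitable for a benchmark footnote."""
    diffs = []
    if a.replicas_acked != b.replicas_acked:
        diffs.append("replicas_acked")
    if a.fsync_before_ack != b.fsync_before_ack:
        diffs.append("fsync_before_ack")
    return diffs


print(comparable(kafka_loose, kafka_strict))    # False
print(mismatches(kafka_strict, rabbit_quorum))  # ['fsync_before_ack']
```

Even the strictest Kafka profile here differs from the quorum queue on one dimension, so a latency gap between them partly measures that difference and not only the software.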
And most teams never reach territory where any of it matters. Below roughly 50K messages per second, either broker is loafing; the bottleneck is almost always the consumer — deserialization, the database write per message, the external API call. If your throughput target fits in that range, pick by shape (replay and fan-out versus routing and acks), not by benchmark deltas you will never get close to.
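A back-of-the-envelope calculation shows why the consumer tends to be the limit. The per-message costs below (2 ms for a database write, 50 microseconds for deserialization alone) are assumed figures, not measurements, and the helper names are hypothetical. The model assumes each consumer handles one message at a time with no batching.

```python
def per_consumer_rate(micros_per_message):
    """Messages per second one serial consumer can finish."""
    return 1_000_000 / micros_per_message


def consumers_needed(target_rate_per_s, micros_per_message):
    """Smallest number of serial consumers whose combined rate reaches the target."""
    busy_micros_per_second = target_rate_per_s * micros_per_message
    return -(-busy_micros_per_second // 1_000_000)  # integer ceiling division


# Assumed cost: a 2 ms database write per message.
print(per_consumer_rate(2000))         # 500.0 messages per second
print(consumers_needed(50_000, 2000))  # 100 consumers to sustain 50K msg/s

# Assumed cost: 50 microseconds of deserialization and nothing else.
print(consumers_needed(50_000, 50))    # 3
```

Under these assumptions a single consumer doing one database write per message tops out at 500 messages per second, two orders of magnitude below the 50K figure, so the consumer fleet is sized by its own per-message work long before broker throughput comes into play.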