With the announcement of KIP-932, Queues for Kafka, I thought it was worth revisiting the subject of queues vs logs and how we can actually build better queues on top of logs.
Posts I wrote on the RabbitMQ blog in 2020
Kafka and RabbitMQ blog posts I wrote elsewhere in 2019
Since I started working at companies that run Messaging-as-a-service (84codes) or actually build the messaging systems themselves (VMware, Splunk) I have been writing blog posts, but not on my own blog. I don’t want the confusion of double posting so I’m just going to start posting links to this content on my blog and perhaps add some commentary. So here goes for 2019:
Why I'm Not Writing Much On My Blog These Days
Firstly, I joined the RabbitMQ core team which is a demanding job that takes most of my energy, and the second reason is that I pretty much only blog about RabbitMQ now and those posts go on the RabbitMQ blog. So if you are interested in my writing about RabbitMQ, then please head over to our blog.
I also have posts I’d like to write about Apache Pulsar, Apache Kafka, Pravega, Redis and NATS. But I don’t have much time and while I think I would be impartial, I wouldn’t expect others to think so. I have skin in the game now.
But I still spend time understanding how other systems work and how they are positioned in the market. Knowing how the industry evolves and what customers expect helps us evolve RabbitMQ while keeping it “rabbity”. RabbitMQ will always aim to be a general purpose message broker, not a data platform nor a big data complex event processing system. But just like object oriented languages have benefited from incorporating some functional language paradigms, RabbitMQ can benefit from incorporating aspects of other messaging paradigms - but without losing its soul or the reasons why users already love it.
Back to writing… blog posts can be a bit like benchmarks: if it’s one vendor vs another then your scepticism level should go through the roof, probably into orbit. Not only might it be an apples to oranges comparison, but a biased one. Likewise if I am writing about why I don’t like some aspect of another messaging system, is that biased or is it an impartial analysis? So I’ll stick to RabbitMQ for now.
If you like my writing about RabbitMQ, I will be posting at least monthly on the RabbitMQ blog about things that I find interesting and that I think will be valuable to the community. Feel free to suggest subjects to me that you’d like me to cover.
A Look at Multi-Topic Subscriptions with Apache Pulsar
This is a sister post to one I am writing about multi-topic subscriptions with Apache Kafka that you can read soon on the CloudKarafka blog (link coming soon). I will provide a summary of those results before we get started with Apache Pulsar. I run the same tests against both technologies.
The objective is to get an understanding of what to expect from multi-topic subscriptions; specifically, we are testing message ordering. Message ordering is a fundamental component of messaging systems and even though cross-topic ordering is not guaranteed by Pulsar or Kafka, I find it interesting and useful to know what to expect.
Quorum Queues - Making RabbitMQ More Competitive in Reliable Messaging
The multiple design defects of RabbitMQ Mirrored Queues have been well documented by the community and acknowledged by the RabbitMQ team. In an age where new messaging systems are appearing that compete in the reliable messaging space, it is critical for RabbitMQ to improve its replicated queue story in order to continue to compete in that space. Which is why it is so exciting to see that the RabbitMQ team have been working hard to deliver a new replicated queue type based on the Raft consensus algorithm. Quorum queues are still in beta and as such are subject to change before release. Likewise, their capabilities will no doubt evolve and improve over future releases. There are currently limitations to the features of Quorum Queues but if data safety is your most important requirement then they aim to satisfy your needs.
In this post we're going to look at the design of Quorum Queues and then in a later post we'll run a series of chaos tests to test the durability claims of this new queue type.
Why I Am Not a Fan of the RabbitMQ Sharding Plugin
I recently spoke at the RabbitMQ Summit in London about using the Consistent Hash Exchange to maintain processing order guarantees while scaling out consumers. Afterwards I was asked why I don’t opt for the Sharding Plugin instead. One of the downsides of the Consistent Hash Exchange I spoke of in the talk was that you don’t get automatic queue assignment for your consumers. The Sharding Plugin makes an attempt to address this problem but doesn’t go all the way. In this post I’ll describe my issues with the Sharding Plugin.
Testing Producer Deduplication in Apache Kafka and Apache Pulsar
Failures can induce message duplication on both the producer and consumer side. In this post we’ll focus solely on producer-side duplication, looking at how the deduplication feature works in Apache Pulsar and Apache Kafka. I have run many hours of deduplication tests against both messaging systems and we’ll see the results of those tests.
On the producer side, when a producer sends a message and an error occurs, such as a TCP connection failure, the producer has no way to know whether the message was persisted or not. We have two choices: send the message again to ensure it gets delivered and risk duplication, or not send it again and risk the message never getting delivered.
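To make the trade-off concrete, here is a minimal sketch of how a broker could let the producer retry safely by tracking a per-producer sequence number. It is a toy in-memory model written for illustration only: the names `DedupBroker`, `append` and `send_with_retry` are hypothetical and this is not the actual Kafka or Pulsar implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DedupBroker:
    """Toy broker (hypothetical) that drops resends using a per-producer sequence number."""
    log: list = field(default_factory=list)
    last_seq: dict = field(default_factory=dict)  # producer id -> highest sequence stored

    def append(self, producer_id: str, seq: int, payload: str) -> bool:
        # A sequence number at or below the last stored one is a resend:
        # report it as a duplicate and do not store it again.
        if seq <= self.last_seq.get(producer_id, -1):
            return False
        self.log.append(payload)
        self.last_seq[producer_id] = seq
        return True


def send_with_retry(broker, producer_id, seq, payload, lose_first_ack):
    """Send once; if the ack is simulated as lost, resend the same sequence number."""
    broker.append(producer_id, seq, payload)
    if lose_first_ack:
        # The producer cannot tell whether the first attempt was persisted, so it retries.
        broker.append(producer_id, seq, payload)


broker = DedupBroker()
send_with_retry(broker, "p1", 0, "a", lose_first_ack=True)
send_with_retry(broker, "p1", 1, "b", lose_first_ack=False)
print(broker.log)  # ['a', 'b']
```

Without the sequence check, the retried message "a" would appear twice in the log; with it, the producer can always choose to resend.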
How to (not) Lose Messages on an Apache Pulsar Cluster
In this post we’ll put the protocols we covered in the Understanding How Apache Pulsar Works post to the test. As in previous tests of How to Lose Messages on a RabbitMQ Cluster and How to Lose Messages on an Apache Kafka Cluster, I’ll be using Blockade to kill off nodes, slow down the network and lose packets. Unlike in those previous tests, these tests are automated and go further, not only testing for data loss but also correct ordering and duplication.
In each scenario we’ll stand up a new Blockade cluster with a specific configuration of:
Apache Pulsar broker count
Apache BookKeeper node (Bookie) count
Ensemble size (E)
Write quorum size (Qw)
Ack quorum size (Qa)
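As a rough illustration, the parameters above could be captured per scenario like this. The class name `ScenarioConfig`, its field names and the example values are hypothetical and not taken from the test code; the check encodes the usual BookKeeper constraint that the ensemble is at least as large as the write quorum, which is at least as large as the ack quorum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioConfig:
    """Hypothetical container for one chaos-test scenario."""
    brokers: int       # Apache Pulsar broker count
    bookies: int       # Apache BookKeeper node count
    ensemble: int      # E
    write_quorum: int  # Qw
    ack_quorum: int    # Qa

    def validate(self) -> None:
        # E >= Qw >= Qa >= 1 must hold for the quorum settings to make sense.
        if not (self.ensemble >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError("expected E >= Qw >= Qa >= 1")
        # An ensemble of E bookies cannot be formed from fewer than E nodes.
        if self.bookies < self.ensemble:
            raise ValueError("need at least E bookies to form an ensemble")


# Illustrative values only, not a scenario from the post.
config = ScenarioConfig(brokers=3, bookies=5, ensemble=3, write_quorum=3, ack_quorum=2)
config.validate()
```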
Understanding How Apache Pulsar Works
I will be writing a series of blog posts about Apache Pulsar, including some Kafka vs Pulsar posts. First up though I will be running some chaos tests on a Pulsar cluster like I have done with RabbitMQ and Kafka to see what failure modes it has and its message loss scenarios.
I will try to do this by exploiting design defects, implementation bugs or poor configuration on the part of the admin or developer.
In this post we’ll go through the Apache Pulsar design so that we can better design the failure scenarios. This post is not for people who want to understand how to use Apache Pulsar but for those who want to understand how it works. I have struggled to write a clear overview of its architecture in a way that is simple and easy to understand. I appreciate any feedback on this write-up.