Quorum Queues - Making RabbitMQ More Competitive in Reliable Messaging

The multiple design defects of RabbitMQ Mirrored Queues have been well documented by the community and acknowledged by the RabbitMQ team. In an age where new messaging systems are appearing that compete in the reliable messaging space, it is critical for RabbitMQ to improve its replicated queue story in order to continue to compete in that space. That is why it is so exciting to see that the RabbitMQ team have been working hard to deliver a new replicated queue type based on the Raft consensus algorithm. Quorum queues are still in beta and as such are subject to change before release. Likewise, their capabilities will no doubt evolve and improve over future releases. Quorum Queues currently have feature limitations, but if data safety is your most important requirement then they aim to satisfy your needs.

In this post we're going to look at the design of Quorum Queues and then in a later post we'll run a series of chaos tests to test the durability claims of this new queue type.

Why I Am Not a Fan of the RabbitMQ Sharding Plugin

I recently spoke at the RabbitMQ Summit in London about using the Consistent Hash Exchange to maintain processing order guarantees while scaling out consumers. Afterwards I was asked why I don’t opt for the Sharding Plugin instead. One of the downsides of the Consistent Hash Exchange I spoke of in the talk was that you don’t get automatic queue assignment for your consumers. The Sharding Plugin makes an attempt to address this problem but doesn’t go all the way. In this post I’ll describe my issues with the Sharding Plugin.
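To illustrate why hashing on a routing key preserves per-key processing order while spreading load over several queues, here is a minimal sketch. It uses plain modulo hashing in memory and is not the algorithm used by the Consistent Hash Exchange or the Sharding Plugin; `pick_queue`, `route` and the queue names are hypothetical names chosen for illustration.

```python
import hashlib


def pick_queue(routing_key, queues):
    """Map a routing key to one queue; the same key always maps to the same queue."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(queues)
    return queues[index]


def route(events, queues):
    """Distribute (routing_key, payload) events across queues in arrival order."""
    contents = {q: [] for q in queues}
    for key, payload in events:
        contents[pick_queue(key, queues)].append((key, payload))
    return contents


if __name__ == "__main__":
    events = [
        ("order-1", "created"),
        ("order-2", "created"),
        ("order-1", "paid"),
        ("order-1", "shipped"),
    ]
    for queue, items in route(events, ["q1", "q2", "q3"]).items():
        print(queue, items)
```

Because every event for a given key lands on a single queue, one consumer per queue sees that key's events in order. What the sketch does not solve is deciding which consumer reads which queue, which is the assignment problem the paragraph above refers to.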

How to Lose Messages on a RabbitMQ Cluster

In my RabbitMQ vs Kafka series Part 5 post I covered the theory of RabbitMQ clustering and some of the gotchas. In this post we'll demonstrate the message loss scenarios described in that post using Docker and Blockade. I recommend you read that post first as this post assumes understanding of the topics covered.

Blockade is a really easy way to test out how distributed systems cope with network partitions, flaky networks and slow networks. It was inspired by the Jepsen series. In this post we'll either be killing off nodes, partitioning the cluster, introducing packet loss or slowing down the network. So with Blockade and some Bash and Python scripts, we’ll test out some failure scenarios.

RabbitMQ vs Kafka Part 5 - Fault Tolerance and High Availability with RabbitMQ Clustering

Fault tolerance and High Availability are big subjects and so we'll tackle RabbitMQ and Kafka in separate posts. In this post we'll look at RabbitMQ and in Part 6 we'll look at Kafka while making comparisons to RabbitMQ. This is a long post, even though we only look at RabbitMQ, so get comfortable.

In this post we'll look at the strategies for fault tolerance, consistency and high availability (HA) and the trade-offs each strategy makes. RabbitMQ can operate as a cluster of nodes and as such can be classed as a distributed system. When it comes to distributed data systems we often speak about consistency and availability.

We talk about consistency and availability with distributed systems because they describe how the system behaves under failure. A network link fails, a server fails, a hard disk fails, a server is temporarily unavailable due to GC or a network link is lossy or slow. All these things can cause outages, data loss or data conflicts. It turns out that it is generally not possible to provide a system that is ultimately consistent (no data loss, no data divergence) and available (will accept reads and writes) under all failure modes.

We'll see that consistency and availability are at two ends of a spectrum and you'll need to choose which of those you'll optimize for. The good news is that with RabbitMQ this is a choice that you can make. It gives you the nerd knobs required to tune it for greater consistency or greater availability.

In this post we'll be paying close attention to what configurations produce data loss of acknowledged writes. There is a chain of responsibility between producers, brokers and consumers. Once a message has been handed off to a broker, it is the broker's job not to lose that message. When the broker acknowledges receipt of a message to the publisher, we don't expect that message to be lost. But we'll see that this indeed can happen depending on your broker and publisher configuration.

RabbitMQ Work Queues: Avoiding Data Inconsistency with Rebalanser

With RabbitMQ we can scale out our consumers by simply adding more, but we can also scale out our queues. There are a few reasons why scaling out our queues might be preferable to simply adding more consumers to a single queue (competing consumers). One of those reasons is the work queue pattern.

Event-Driven Architectures - Queue vs Log - A Case Study

In the previous post we looked at relative event ordering and the decoupling of publishers and consumers among other things. In this post we'll take those concepts and look at an example architecture. We'll look at the various modelling possibilities we have with RabbitMQ representing a queue based system, and Kafka representing a log based system.

RabbitMQ vs Kafka Part 4 - Message Delivery Semantics and Guarantees

Both RabbitMQ and Kafka offer durable messaging guarantees. Both offer at-most-once and at-least-once guarantees, but Kafka also offers exactly-once guarantees in a very limited scenario.

Let's first understand what these guarantees mean:

  • At-most-once delivery. This means that a message will never be delivered more than once but messages might be lost.
  • At-least-once delivery. This means that we'll never lose a message but a message might end up being delivered to a consumer more than once.
  • Exactly-once delivery. The holy grail of messaging. All messages will be delivered exactly one time.

Delivery is probably the wrong word for the above terms; processing might be a better way of putting it. After all, what we care about is whether a consumer can process a message and whether that is at-most-once, at-least-once or exactly-once. But using the word processing complicates things: exactly-once delivery makes less sense now, as we may need a message to be delivered twice in order to process it successfully once. If the consumer dies during processing, the message must be delivered a second time, to a new consumer.
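In practice the difference between at-most-once and at-least-once comes down to whether the consumer acknowledges a message before or after processing it. The following is a minimal in-memory simulation of that idea, not RabbitMQ or Kafka client code; the `Broker` class, the `run` function and the `crash_on` parameter are all hypothetical names made up for illustration, and the consumer is assumed to crash exactly once, at the worst possible moment.

```python
from collections import deque


class Broker:
    """Toy in-memory queue that redelivers messages that were never acknowledged."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = []

    def deliver(self):
        msg = self.ready.popleft()
        self.unacked.append(msg)
        return msg

    def ack(self, msg):
        self.unacked.remove(msg)

    def consumer_died(self):
        # Delivered-but-unacked messages return to the front of the queue.
        self.ready.extendleft(reversed(self.unacked))
        self.unacked.clear()


def run(messages, ack_first, crash_on):
    """Consume every message; the consumer crashes once while handling `crash_on`.

    Returns the messages that were actually processed, in order.
    """
    broker = Broker(messages)
    processed = []
    crashed = False
    while broker.ready:
        msg = broker.deliver()
        crash_now = msg == crash_on and not crashed
        if ack_first:
            broker.ack(msg)  # acknowledge, then process
            if crash_now:
                crashed = True
                broker.consumer_died()  # dies before processing: message lost
                continue
            processed.append(msg)
        else:
            processed.append(msg)  # process, then acknowledge
            if crash_now:
                crashed = True
                broker.consumer_died()  # dies before the ack: redelivered
                continue
            broker.ack(msg)
    return processed


if __name__ == "__main__":
    print(run(["a", "b", "c"], ack_first=True, crash_on="b"))   # at-most-once
    print(run(["a", "b", "c"], ack_first=False, crash_on="b"))  # at-least-once
```

With acknowledge-first, message `b` is never processed; with process-first, `b` is processed twice. Neither ordering yields exactly-once processing on its own, which is why the consumer usually needs to be idempotent when at-least-once is chosen.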

RabbitMQ vs Kafka Part 3 - Kafka Messaging Patterns

In Part 2 we covered the patterns and topologies that RabbitMQ enables. In this part we'll look at Kafka and contrast it against RabbitMQ to get some perspective on their differences. Remember that this comparison is within the context of an event-driven application architecture rather than data processing pipelines, although the line between them can be a bit grey. Perhaps it is more like a continuum and this comparison focuses on the event-driven applications end of that continuum.

RabbitMQ vs Kafka Part 2 - RabbitMQ Messaging Patterns

In this part we're going to forget about the low level details in the protocols and concentrate on the higher level patterns and message topologies that can be achieved in RabbitMQ. In Part 3 of the series we'll do the same for Apache Kafka.

First we'll cover the building blocks, or routing primitives, of RabbitMQ:

  • Exchange types and bindings
  • Queues
  • Dead letter exchanges
  • Ephemeral exchanges and queues
  • Alternate Exchanges
  • Priority Queues

Then we'll combine them all into a set of example patterns.