September 6, 2019

A Look at Multi-Topic Subscriptions with Apache Pulsar

This is a sister post to one I am writing about multi-topic subscriptions with Apache Kafka, which you will soon be able to read on the CloudKarafka blog (link coming soon). I will provide a summary of those results before we get started with Apache Pulsar. I run the same tests against both technologies.

The objective is to understand what to expect from multi-topic subscriptions; specifically, we are testing message ordering. Message ordering is a fundamental component of messaging systems, and even though cross-topic ordering is not guaranteed by Pulsar or Kafka, I find it interesting and useful to know what to expect.

Jack Vanlightly

November 2, 2018

Messaging Systems

Testing Producer Deduplication in Apache Kafka and Apache Pulsar

Failures can induce message duplication on both the producer and consumer side. In this post we’ll focus solely on producer-side duplication, looking at how the deduplication feature works in Apache Pulsar and Apache Kafka. I have run many hours of deduplication tests on both messaging systems and we’ll see the results of those tests.

On the producer side, when a producer sends a message and an error occurs, such as a TCP connection failure, the producer has no way to know whether the message was persisted. We have two choices: send the message again to ensure it gets delivered and risk duplication, or not send it again and risk the message never being delivered.
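To make the retry dilemma concrete, here is a minimal sketch of one common way to make retries safe: the producer tags each message with a sequence number, and the log ignores any sequence number it has already stored for that producer. The `DedupLog` class and its names are hypothetical and purely illustrative; this is a simplification under those assumptions, not the actual implementation in Pulsar or Kafka.

```python
from dataclasses import dataclass, field


@dataclass
class DedupLog:
    """Toy append-only log that drops retried messages (hypothetical, illustrative only)."""

    entries: list = field(default_factory=list)
    # Producer name -> highest sequence number stored so far.
    last_seq: dict = field(default_factory=dict)

    def append(self, producer: str, seq: int, payload: str) -> bool:
        """Store the message unless this producer already stored this sequence number."""
        if seq <= self.last_seq.get(producer, -1):
            # A retry of something already stored: safe to acknowledge, but do not store again.
            return False
        self.entries.append((producer, seq, payload))
        self.last_seq[producer] = seq
        return True


log = DedupLog()
log.append("producer-1", 0, "a")
log.append("producer-1", 1, "b")

# Assume the acknowledgement for sequence 1 was lost (for example, a TCP failure),
# so the producer cannot know whether it was persisted and sends it again.
stored = log.append("producer-1", 1, "b")
print(stored, len(log.entries))  # False 2
```

With this scheme the producer can always choose to resend, because a resend of an already persisted message is dropped rather than stored twice.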

Jack Vanlightly

October 21, 2018

Messaging Systems

How to (not) Lose Messages on an Apache Pulsar Cluster

In this post we’ll put the protocols we covered in the Understanding How Apache Pulsar Works post to the test. As in previous tests of How to Lose Messages on a RabbitMQ Cluster and How to Lose Messages on an Apache Kafka Cluster, I’ll be using Blockade to kill off nodes, slow down the network and lose packets. Unlike those previous tests, these tests are automated and go further, testing not only for data loss but also for correct ordering and duplication.

In each scenario we’ll stand up a new Blockade cluster with a specific configuration of:

Apache Pulsar broker count
Apache BookKeeper node (Bookie) count
Ensemble size (E)
Write quorum size (Qw)
Ack quorum size (Qa)
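
As a rough illustration, the five parameters above could be captured in a small structure like the one below. The `ScenarioConfig` name, its fields and the example values are all hypothetical; they are not taken from the test code. The validation assumes the usual BookKeeper relationship between the values (bookies >= E >= Qw >= Qa >= 1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioConfig:
    """Hypothetical description of one test scenario (names are illustrative)."""

    brokers: int        # Apache Pulsar broker count
    bookies: int        # Apache BookKeeper node (Bookie) count
    ensemble_size: int  # E
    write_quorum: int   # Qw
    ack_quorum: int     # Qa

    def validate(self) -> None:
        """Raise ValueError if the values cannot describe a workable cluster."""
        if self.brokers < 1:
            raise ValueError("at least one broker is required")
        ordered = (
            self.bookies >= self.ensemble_size >= self.write_quorum >= self.ack_quorum >= 1
        )
        if not ordered:
            raise ValueError("expected bookies >= E >= Qw >= Qa >= 1")


# Example values chosen only for illustration.
config = ScenarioConfig(brokers=3, bookies=5, ensemble_size=3, write_quorum=3, ack_quorum=2)
config.validate()
```

Checking the relationship up front keeps a misconfigured scenario from being mistaken for a genuine failure of the system under test.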

Jack Vanlightly

October 3, 2018

Messaging Systems

Understanding How Apache Pulsar Works

I will be writing a series of blog posts about Apache Pulsar, including some Kafka vs Pulsar posts. First up, though, I will be running some chaos tests on a Pulsar cluster, as I have done with RabbitMQ and Kafka, to see what failure modes it has and what its message loss scenarios are.

I will try to do this by exploiting design defects, implementation bugs or poor configuration on the part of the admin or developer.

In this post we’ll go through the Apache Pulsar design so that we can better design the failure scenarios. This post is not for people who want to learn how to use Apache Pulsar but for those who want to understand how it works. I have struggled to write a clear overview of its architecture in a way that is simple and easy to understand. I appreciate any feedback on this write-up.