How to (not) Lose Messages on an Apache Pulsar Cluster — Jack Vanlightly

In this post we’ll put the protocols we covered in the Understanding How Apache Pulsar Works post to the test. As in the previous tests of How to Lose Messages on a RabbitMQ Cluster and How to Lose Messages on an Apache Kafka Cluster, I’ll be using Blockade to kill off nodes, slow down the network and lose packets. Unlike those previous tests, these tests are automated and go further, testing not only for data loss but also for correct ordering and duplication.

The code is in the Pulsar directory of my ChaosTestingCode repo on Github.

In each scenario we’ll stand up a new blockade cluster with a specific configuration of:

  • Apache Pulsar broker count

  • Apache BookKeeper node (Bookie) count

  • Ensemble size (E)

  • Write quorum size (Qw)

  • Ack quorum size (Qa)

The clients communicate with the brokers via a proxy.

In each test we’ll send messages and perform some kind of chaos action, such as killing the topic owner Pulsar broker, killing a bookie that is a member of the current ledger ensemble, isolating nodes from ZooKeeper etc.

Once publishing is complete, we start a reader, count the messages, and ensure that all acknowledged messages were received and that they were read in the correct order.

Feel free to skip to the end where I summarize the results. If you care about the test method, then read on.

Anatomy of a Test

All tests are run with a producer sending messages as fast as it can, which with my setup is about 20,000-30,000 messages a second. Some tests take longer, so rather than slow down the producer I increase the total number of messages to send.

Deduplication is disabled.

Tests are run by calling the pulsar-test.py Python script. This script accepts the following arguments:

  • test type (no-fail, kill-broker, kill-bookie, kill-bookies[n], isolate-broker-from-zk, isolate-bookie-from-zk, custom-isolation[partition|partition]). You’ll see what each one means in the scenarios below

  • test name

  • number of test runs

  • number of messages to send

  • message marker for when to kick off the chaos action. 50000, for example, means: on receiving the 50,000th ack, perform the chaos action. The action usually takes time to kick in, so it may take place a few hundred thousand messages later.

  • ledger config (E, Qw, Qa) in the format: E-Qw-Qa, such as 2-2-1

  • node counts in format: brokers-bookies, such as 3-3

  • deduplication enabled (true/false)
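As an illustration, the argument list above could be wired up like this (a hypothetical sketch: the argument names and the use of argparse are mine, not necessarily how the real script parses its input):

```python
import argparse

# Hypothetical sketch of the pulsar-test.py argument list; the attribute
# names are illustrative and may not match the real script.
def parse_args(argv):
    parser = argparse.ArgumentParser(description="Pulsar chaos test runner")
    parser.add_argument("test_type")        # e.g. no-fail, kill-broker, kill-bookie
    parser.add_argument("test_name")
    parser.add_argument("runs", type=int)         # number of test runs
    parser.add_argument("messages", type=int)     # messages to send per run
    parser.add_argument("chaos_at", type=int)     # ack count that triggers the chaos action
    parser.add_argument("ledger_config")          # E-Qw-Qa, e.g. 2-2-1
    parser.add_argument("node_counts")            # brokers-bookies, e.g. 3-3
    parser.add_argument("dedup", choices=["true", "false"])
    args = parser.parse_args(argv)
    # Split the compound arguments into integers
    args.e, args.qw, args.qa = (int(x) for x in args.ledger_config.split("-"))
    args.brokers, args.bookies = (int(x) for x in args.node_counts.split("-"))
    return args
```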

Various blockade.yml files are stored in the cluster/blockade-files directory. Each one has a different number of brokers and bookies. Before calling “blockade up”, the correct file is copied to the cluster directory.

The pulsar-test.py script performs the following:

  1. If a blockade cluster is running, destroy it with the “blockade destroy” command

  2. Copy the corresponding blockade.yml

  3. blockade up

  4. Start publishing messages. The content of each message is a monotonically increasing integer, which is used in the ordering check.

  5. Perform chaos action

  6. Once publishing is complete, start reading

  7. Analyze and log results

  8. Repeat “number of test runs” times

All output is displayed in the terminal. A more concise output is logged to: [test-name]-output.txt in the automated/test-output directory. Details of duplicates and out-of-order messages are logged to [test-name]-duplicates.txt and [test-name]-out-of-order.txt.

The results of each run come in the format:

Final send count: 2000000
Final ack count: 2000000
Final positive ack count: 2000000
Final negative ack count: 0
Messages received: 2000000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0

Messages are tracked as follows:

  • When a message is sent, we add its integer value to a dictionary “messages_sent” as the key and an empty list as the value.

  • When a message is positively acked we add the integer value to the set “messages_pos_acked”.

  • When a message is negatively acked we add the integer value to the set “messages_neg_acked”.

On reading a message we look it up in the “messages_sent” dictionary and add the message id to its list. From there, we can detect missing messages, duplicates etc:

  • Acked messages missing = any key in the “messages_sent” dictionary with an empty message id list that also exists in the pos acked set.

  • Non-acked messages received = any key in the “messages_sent” dictionary with a non-empty message id list that does not exist in the pos acked set.

  • Duplicates = any key in the “messages_sent” dictionary with multiple message ids in its list.

  • Out-of-order = any message whose integer value is lower than the previous message’s and that has an empty message id list in the dictionary (i.e. it has not already been consumed).
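The tracking and analysis rules above can be sketched as follows (a simplified model; the class and method names are mine, not the real script’s code):

```python
class MessageTracker:
    """Sketch of the tracking rules described above. Keys are the
    monotonically increasing integers carried in each message body."""
    def __init__(self):
        self.messages_sent = {}          # int value -> list of message ids received
        self.messages_pos_acked = set()
        self.messages_neg_acked = set()
        self.last_value = -1
        self.out_of_order = 0

    def on_send(self, value):
        self.messages_sent[value] = []

    def on_ack(self, value, positive):
        target = self.messages_pos_acked if positive else self.messages_neg_acked
        target.add(value)

    def on_receive(self, value, msg_id):
        # Out-of-order: a lower value than the previous message that has
        # not already been consumed (its message id list is still empty)
        if value < self.last_value and not self.messages_sent[value]:
            self.out_of_order += 1
        self.messages_sent[value].append(msg_id)
        self.last_value = value

    def results(self):
        return {
            "acked_missing": sum(1 for v, ids in self.messages_sent.items()
                                 if not ids and v in self.messages_pos_acked),
            "non_acked_received": sum(1 for v, ids in self.messages_sent.items()
                                      if ids and v not in self.messages_pos_acked),
            "duplicates": sum(1 for ids in self.messages_sent.values()
                              if len(ids) > 1),
            "out_of_order": self.out_of_order,
        }
```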

Note that the producer only allows up to 10000 messages in flight at a time. Once the acks lag 10000 behind the send count, it pauses itself to let the acks catch up a little. We’ll see the number 10000 in some of the tests for this reason.
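This in-flight cap behaves like the following sketch (an illustrative model only; the real producer sends asynchronously and this class is not part of the repo):

```python
import threading

class InFlightLimiter:
    """Illustrative model of the producer's self-imposed cap: sending
    pauses once acks lag 10,000 behind sends, and resumes as acks arrive."""
    def __init__(self, max_in_flight=10000):
        self.max_in_flight = max_in_flight
        self.sent = 0
        self.acked = 0
        self.cond = threading.Condition()

    def before_send(self):
        with self.cond:
            # Block while the ack count lags too far behind the send count
            while self.sent - self.acked >= self.max_in_flight:
                self.cond.wait()
            self.sent += 1

    def on_ack(self):
        with self.cond:
            self.acked += 1
            self.cond.notify()
```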

Finally, regarding duplicates: the Pulsar client automatically retries messages on failure, up until the send timeout. This means duplicates can be created when a message is successfully written to BookKeeper but the Pulsar broker dies or fails before it can send an ack to the client. Unless explicitly specified, the deduplication feature is off in all tests.

First Gotcha - Default E, Qw, Qa

I experienced strange behaviour where readers and consumers could not consume a topic after killing off bookies. It turned out to be related to the default E, Qw and Qa settings. The cursors are stored in ledgers created with those default values, and if there are not enough bookies to satisfy them then subscriptions cannot be created.

Those defaults are set in the environment section of each bookie in the blockade.yml:

environment: {
   "clusterName": "cluster-1",
   "zookeeperServers": "zk1:2181",
   "managedLedgerDefaultEnsembleSize": 4,
   "managedLedgerDefaultWriteQuorum": 3,
   "managedLedgerDefaultAckQuorum": 2
}

With the above configuration you’ll need at least 4 bookies for your cluster to work. So when testing with smaller numbers of bookies, make sure those values match the lowest number of bookies your scenario will use (taking bookie failures into account as well).

All the scenarios use defaults of 2, 2, 2 and I always have a minimum of 3 bookies, ensuring that we don’t lose cursor data when killing one bookie and that we can still create new readers and consumers.
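A pre-flight check along these lines would catch the gotcha early (a hypothetical helper, not part of the repo):

```python
def validate_ledger_config(bookies, ensemble, write_quorum, ack_quorum,
                           tolerated_bookie_failures=0):
    """Hypothetical sanity check for the gotcha above: ledgers (including
    the cursor ledgers) cannot be created unless enough bookies remain
    alive to satisfy the ensemble size, even after the planned chaos
    action kills some of them."""
    if not (ensemble >= write_quorum >= ack_quorum >= 1):
        raise ValueError("Require E >= Qw >= Qa >= 1")
    survivors = bookies - tolerated_bookie_failures
    if survivors < ensemble:
        raise ValueError(
            f"Only {survivors} bookies would remain but ensemble size is "
            f"{ensemble}; new ledgers cannot be created")
    return True
```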

Scenario 0 - Control (no-fail test)

Our first scenario will simply publish 2,000,000 messages without performing any chaos action, then check that all messages are read, in the right order, without duplication. The test is run 5 times.

Below I include the terminal output to give an idea of what the script does. For brevity, I have omitted the test output of all but the last run.

$ python pulsar-test.py no-fail no-fail-test1 5 2000000 50000 2-2-1 3-3 false
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #1 on topic no-fail-test1_5  ------------
... (omitted)

-----------------------------------------------
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #2 on topic no-fail-test1_5  ------------
... (omitted)

-----------------------------------------------
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #3 on topic no-fail-test1_5  ------------
... (omitted)

-----------------------------------------------
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #4 on topic no-fail-test1_5  ------------
... (omitted)

-----------------------------------------------
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #5 on topic no-fail-test1_5  ------------

2018-10-19 16:30:44.795 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.208.9:6650
2018-10-19 16:30:44.799 INFO  ClientConnection:285 | [192.168.208.1:34476 -> 192.168.208.9:6650] Connected to broker
2018-10-19 16:30:45.965 INFO  BatchMessageContainer:41 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://vanlightly/cluster-1/ns1/no-fail-test1_5] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2018-10-19 16:30:45.965 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, ] Getting connection from pool
2018-10-19 16:30:47.313 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar3:6650
2018-10-19 16:30:47.313 INFO  ClientConnection:287 | [192.168.208.1:34482 -> 192.168.208.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar3:6650
2018-10-19 16:30:47.894 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, ] Created producer on broker [192.168.208.1:34482 -> 192.168.208.9:6650] 
Send count: 56645 Ack count: 50000 Pos: 50000 Neg: 0
No chaos action to perform
Send count: 102275 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 151578 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 207610 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 255241 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 309300 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 354595 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 407730 Ack count: 400000 Pos: 400000 Neg: 0
Send count: 459060 Ack count: 450000 Pos: 450000 Neg: 0
Send count: 503590 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 557272 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 601064 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 653045 Ack count: 650000 Pos: 650000 Neg: 0
Send count: 701051 Ack count: 700000 Pos: 700000 Neg: 0
Send count: 752087 Ack count: 750000 Pos: 750000 Neg: 0
Send count: 800455 Ack count: 800000 Pos: 800000 Neg: 0
Send count: 851194 Ack count: 850000 Pos: 850000 Neg: 0
Send count: 904089 Ack count: 900000 Pos: 900000 Neg: 0
Send count: 958704 Ack count: 950000 Pos: 950000 Neg: 0
Send count: 1007863 Ack count: 1000000 Pos: 1000000 Neg: 0
Send count: 1053645 Ack count: 1050000 Pos: 1050000 Neg: 0
Send count: 1102725 Ack count: 1100000 Pos: 1100000 Neg: 0
Send count: 1150921 Ack count: 1150000 Pos: 1150000 Neg: 0
Send count: 1201317 Ack count: 1200000 Pos: 1200000 Neg: 0
Send count: 1254143 Ack count: 1250000 Pos: 1250000 Neg: 0
Send count: 1303459 Ack count: 1300000 Pos: 1300000 Neg: 0
Send count: 1358449 Ack count: 1350000 Pos: 1350000 Neg: 0
Send count: 1408474 Ack count: 1400000 Pos: 1400000 Neg: 0
Send count: 1456386 Ack count: 1450000 Pos: 1450000 Neg: 0
Send count: 1505650 Ack count: 1500000 Pos: 1500000 Neg: 0
Send count: 1550343 Ack count: 1550000 Pos: 1550000 Neg: 0
Send count: 1601060 Ack count: 1600000 Pos: 1600000 Neg: 0
Send count: 1652008 Ack count: 1650000 Pos: 1650000 Neg: 0
Send count: 1701559 Ack count: 1700000 Pos: 1700000 Neg: 0
Send count: 1751662 Ack count: 1750000 Pos: 1750000 Neg: 0
Send count: 1801550 Ack count: 1800000 Pos: 1800000 Neg: 0
Send count: 1852799 Ack count: 1850000 Pos: 1850000 Neg: 0
Send count: 1901281 Ack count: 1900000 Pos: 1900000 Neg: 0
Send count: 1951079 Ack count: 1950000 Pos: 1950000 Neg: 0
Send count: 2000000 Ack count: 2000000 Pos: 2000000 Neg: 0
2018-10-19 16:31:51.177 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, cluster-1-2-0] Closed producer
2018-10-19 16:31:51.346 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.208.9:6650
2018-10-19 16:31:51.348 INFO  ClientConnection:285 | [192.168.208.1:34574 -> 192.168.208.9:6650] Connected to broker
2018-10-19 16:31:51.357 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, reader-5a64ed, 0] Getting connection from pool
2018-10-19 16:31:51.645 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar3:6650
2018-10-19 16:31:51.645 INFO  ClientConnection:287 | [192.168.208.1:34578 -> 192.168.208.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar3:6650
2018-10-19 16:31:51.716 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, reader-5a64ed, 0] Created consumer on broker [192.168.208.1:34578 -> 192.168.208.9:6650] 
Last confirmed entry: [0, 2557]
Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 58]
Received: 100000 Curr Entry: [0, 115]
Received: 150000 Curr Entry: [0, 177]
Received: 200000 Curr Entry: [0, 236]
Received: 250000 Curr Entry: [0, 300]
Received: 300000 Curr Entry: [0, 386]
Received: 350000 Curr Entry: [0, 446]
Received: 400000 Curr Entry: [0, 506]
Received: 450000 Curr Entry: [0, 569]
Received: 500000 Curr Entry: [0, 629]
Received: 550000 Curr Entry: [0, 695]
Received: 600000 Curr Entry: [0, 754]
Received: 650000 Curr Entry: [0, 816]
Received: 700000 Curr Entry: [0, 892]
Received: 750000 Curr Entry: [0, 955]
Received: 800000 Curr Entry: [0, 1021]
Received: 850000 Curr Entry: [0, 1086]
Received: 900000 Curr Entry: [0, 1152]
Received: 950000 Curr Entry: [0, 1211]
Received: 1000000 Curr Entry: [0, 1268]
Received: 1050000 Curr Entry: [0, 1339]
Received: 1100000 Curr Entry: [0, 1410]
Received: 1150000 Curr Entry: [0, 1472]
Received: 1200000 Curr Entry: [0, 1534]
Received: 1250000 Curr Entry: [0, 1595]
Received: 1300000 Curr Entry: [0, 1654]
Received: 1350000 Curr Entry: [0, 1713]
Received: 1400000 Curr Entry: [0, 1779]
Received: 1450000 Curr Entry: [0, 1836]
Received: 1500000 Curr Entry: [0, 1896]
Received: 1550000 Curr Entry: [0, 1971]
Received: 1600000 Curr Entry: [0, 2043]
Received: 1650000 Curr Entry: [0, 2101]
Received: 1700000 Curr Entry: [0, 2164]
Received: 1750000 Curr Entry: [0, 2226]
Received: 1800000 Curr Entry: [0, 2297]
Received: 1850000 Curr Entry: [0, 2367]
Received: 1900000 Curr Entry: [0, 2433]
Received: 1950000 Curr Entry: [0, 2496]
Received: 2000000 Curr Entry: [0, 2557]
Read phase complete with message (0,2557,378,-1)
2018-10-19 16:32:35.080 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/no-fail-test1_5, reader-5a64ed, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 2000000
Final ack count: 2000000
Final positive ack count: 2000000
Final negative ack count: 0
Messages received: 2000000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0

We can see the concise results in test-output/no-fail-test1-output.txt:

16:11:45 Start test
16:13:54: 
16:13:54: Test Run #1 on topic no-fail-test1_1  ------------
16:15:41: Results --------------------------------------------
16:15:41: Final send count: 2000000
16:15:41: Final ack count: 2000000
16:15:41: Final positive ack count: 2000000
16:15:41: Final negative ack count: 0
16:15:41: Messages received: 2000000
16:15:41: Acked messages missing: 0
16:15:41: Non-acked messages received: 0
16:15:41: Out-of-order: 0
16:15:41: Duplicates: 0
16:15:41: ----------------------------------------------------
16:17:57: 
16:17:57: Test Run #2 on topic no-fail-test1_2  ------------
16:19:44: Results --------------------------------------------
16:19:44: Final send count: 2000000
16:19:44: Final ack count: 2000000
16:19:44: Final positive ack count: 2000000
16:19:44: Final negative ack count: 0
16:19:44: Messages received: 2000000
16:19:44: Acked messages missing: 0
16:19:44: Non-acked messages received: 0
16:19:44: Out-of-order: 0
16:19:44: Duplicates: 0
16:19:44: ----------------------------------------------------
16:22:01: 
16:22:01: Test Run #3 on topic no-fail-test1_3  ------------
16:23:52: Results --------------------------------------------
16:23:52: Final send count: 2000000
16:23:52: Final ack count: 2000000
16:23:52: Final positive ack count: 2000000
16:23:52: Final negative ack count: 0
16:23:52: Messages received: 2000000
16:23:52: Acked messages missing: 0
16:23:52: Non-acked messages received: 0
16:23:52: Out-of-order: 0
16:23:52: Duplicates: 0
16:23:52: ----------------------------------------------------
16:26:20: 
16:26:20: Test Run #4 on topic no-fail-test1_4  ------------
16:28:18: Results --------------------------------------------
16:28:18: Final send count: 2000000
16:28:18: Final ack count: 2000000
16:28:18: Final positive ack count: 2000000
16:28:18: Final negative ack count: 0
16:28:18: Messages received: 2000000
16:28:18: Acked messages missing: 0
16:28:18: Non-acked messages received: 0
16:28:18: Out-of-order: 0
16:28:18: Duplicates: 0
16:28:18: ----------------------------------------------------
16:30:43: 
16:30:43: Test Run #5 on topic no-fail-test1_5  ------------
16:32:36: Results --------------------------------------------
16:32:36: Final send count: 2000000
16:32:36: Final ack count: 2000000
16:32:36: Final positive ack count: 2000000
16:32:36: Final negative ack count: 0
16:32:36: Messages received: 2000000
16:32:36: Acked messages missing: 0
16:32:36: Non-acked messages received: 0
16:32:36: Out-of-order: 0
16:32:36: Duplicates: 0
16:32:36: ----------------------------------------------------

As you can see, it’s not fast: the five tests took 20 minutes to run. The main issue is that creating the blockade cluster is pretty slow. With RabbitMQ and Kafka it was quick, but this Pulsar cluster is really CPU intensive and slow to start up. Part of that may be that we normally run 8 nodes (1 ZooKeeper, 1 proxy, 3 brokers and 3 bookies), and sometimes up to 11.

Also note that there was no message loss, duplication or ordering issue.

Scenario 1 - Kill the Topic Broker Owner (kill-broker test)

Configuration 2-2-1 3-3.

In this test, pulsar-test.py uses shell scripts to detect which broker is the topic owner, then kills it during message transmission.

This scenario has the minimum possible redundancy, with E=2, Qw=2 and Qa=1 on 3 Pulsar brokers and 3 bookies. This means we have two copies of each message, but an ack from only a single bookie is required for the broker to send an ack to the client.

Killing a broker during message transmission should not cause any data loss: messages are only acked once they have been persisted to disk by Qa bookies, and it is the broker itself that sends the ack. At worst we might see unacknowledged messages being read by the reader: a message may be persisted by BookKeeper, but the Pulsar broker dies before it can send the ack to the client. When a new broker takes over, the recovery process will detect those messages and ensure they are replicated before closing the ledger, making them available to readers.

I have included the full output of one run to give you an idea of when the script performs each action. Each test run sends 1,000,000 messages, starting the process of killing the owner at message 50000. The test is performed 5 times.

$ python pulsar-test.py kill-broker kill-broker-test1 5 1000000 50000 2-2-1 3-3 false
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1
Test Run #1 on topic kill-broker-test1_1  ------------
-------------------------------------------------
WRITE PHASE
-------------------------------------------------
2018-10-19 20:31:48.389 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar2:6650
2018-10-19 20:31:48.389 INFO  ClientConnection:287 | [192.168.96.1:52202 -> 192.168.96.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar2:6650
Send count: 51706 Ack count: 50000 Pos: 50000 Neg: 0
Send count: 101199 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 155831 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 208015 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 250923 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 303241 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 350970 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 400041 Ack count: 400000 Pos: 400000 Neg: 0
Send count: 450612 Ack count: 450000 Pos: 450000 Neg: 0
-------------------------------------------------
pulsar2 is the topic owner, killing pulsar2!!!!!!
-------------------------------------------------
Send count: 501553 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 551276 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 602319 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 651554 Ack count: 650000 Pos: 650000 Neg: 0
2018-10-19 20:32:07.710 INFO  ClientConnection:1237 | [192.168.96.1:52202 -> 192.168.96.9:6650] Connection closed
2018-10-19 20:32:07.710 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test1_1, cluster-1-1-0] Schedule reconnection in 0.1 s
2018-10-19 20:32:07.710 INFO  ClientConnection:195 | [192.168.96.1:52202 -> 192.168.96.9:6650] Destroyed connection
2018-10-19 20:32:07.810 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test1_1, cluster-1-1-0] Getting connection from pool
2018-10-19 20:32:07.815 ERROR ClientConnection:792 | [192.168.96.1:52196 -> 192.168.96.9:6650] Failed lookup req_id: 3 error: 6
2018-10-19 20:32:07.815 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test1_1, cluster-1-1-0] Schedule reconnection in 0.192 s
2018-10-19 20:32:08.009 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test1_1, cluster-1-1-0] Getting connection from pool
2018-10-19 20:32:08.500 INFO  ConnectionPool:53 | Deleting stale connection from pool for pulsar://pulsar2:6650 use_count: -1 @ 0
2018-10-19 20:32:08.500 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar2:6650
2018-10-19 20:32:08.501 INFO  ClientConnection:287 | [192.168.96.1:52276 -> 192.168.96.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar2:6650
-------------------------------------------------
pulsar2 KILLED!
-------------------------------------------------
Send count: 702977 Ack count: 700000 Pos: 692977 Neg: 7023
2018-10-19 20:32:38.508 INFO  ClientConnection:1237 | [192.168.96.1:52276 -> 192.168.96.9:6650] Connection closed
2018-10-19 20:32:38.509 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, cluster-1-1-0] Schedule reconnection in 0.1 s
2018-10-19 20:32:38.509 INFO  ClientConnection:195 | [192.168.96.1:52276 -> 192.168.96.9:6650] Destroyed connection
2018-10-19 20:32:38.610 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, cluster-1-1-0] Getting connection from pool
2018-10-19 20:32:38.647 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-19 20:32:38.647 INFO  ClientConnection:287 | [192.168.96.1:52306 -> 192.168.96.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-19 20:32:39.592 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, cluster-1-1-0] Created producer on broker [192.168.96.1:52306 -> 192.168.96.9:6650] 
Send count: 750954 Ack count: 750000 Pos: 740000 Neg: 10000
Send count: 805568 Ack count: 800000 Pos: 790000 Neg: 10000
Send count: 853879 Ack count: 850000 Pos: 840000 Neg: 10000
Send count: 904020 Ack count: 900000 Pos: 890000 Neg: 10000
Send count: 953552 Ack count: 950000 Pos: 940000 Neg: 10000
Send count: 1000000 Ack count: 1000000 Pos: 990000 Neg: 10000
2018-10-19 20:32:47.320 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, cluster-1-1-0] Closed producer
2018-10-19 20:32:47.466 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.96.9:6650
2018-10-19 20:32:47.468 INFO  ClientConnection:285 | [192.168.96.1:52406 -> 192.168.96.9:6650] Connected to broker
2018-10-19 20:32:47.477 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, reader-63a5b2, 0] Getting connection from pool
2018-10-19 20:32:47.496 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-19 20:32:47.496 INFO  ClientConnection:287 | [192.168.96.1:52410 -> 192.168.96.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-19 20:32:47.562 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, reader-63a5b2, 0] Created consumer on broker [192.168.96.1:52410 -> 192.168.96.9:6650]
-------------------------------------------------
READ PHASE
-------------------------------------------------
Last confirmed entry: [1, 387]
Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 72]
Received: 100000 Curr Entry: [0, 148]
Received: 150000 Curr Entry: [0, 207]
Received: 200000 Curr Entry: [0, 272]
Received: 250000 Curr Entry: [0, 335]
Received: 300000 Curr Entry: [0, 400]
Received: 350000 Curr Entry: [0, 466]
Received: 400000 Curr Entry: [0, 534]
Received: 450000 Curr Entry: [0, 611]
Received: 500000 Curr Entry: [0, 688]
Received: 550000 Curr Entry: [0, 747]
Received: 600000 Curr Entry: [0, 815]
Received: 650000 Curr Entry: [0, 913]
Received: 700000 Curr Entry: [1, 7]
Received: 750000 Curr Entry: [1, 83]
Received: 800000 Curr Entry: [1, 157]
Received: 850000 Curr Entry: [1, 227]
Received: 900000 Curr Entry: [1, 287]
Received: 950000 Curr Entry: [1, 343]
Read phase complete with message (1,387,469,-1)
2018-10-19 20:33:11.943 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/kill-broker-test2_10, reader-63a5b2, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 1000000
Final ack count: 1000000
Final positive ack count: 990000
Final negative ack count: 10000
Messages received: 990000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0

We see that midway through sending, pulsar2 is identified as the owner and shortly afterwards is killed. The client experiences a connection failure and reconnects, via the proxy, to pulsar1, the new owner. It then resumes sending its messages.

10000 messages are negatively acknowledged (timeouts), because the producer’s in-flight limit means there are usually 10000 messages in flight when the broker dies. That leaves 990000 positively acknowledged messages, all of which were read, in the correct order, without any duplicates.

Example results of five test runs:

19:59:08: 
19:59:08: Test Run #1 on topic kill-broker-test_1  ------------
20:00:41: Results --------------------------------------------
20:00:41: Final send count: 1000000
20:00:41: Final ack count: 1000000
20:00:41: Final positive ack count: 990000
20:00:41: Final negative ack count: 10000
20:00:41: Messages received: 990000
20:00:41: Acked messages missing: 0
20:00:41: Non-acked messages received: 0
20:00:41: Out-of-order: 0
20:00:41: Duplicates: 0
20:00:41: ----------------------------------------------------
20:02:54: 
20:02:54: Test Run #2 on topic kill-broker-test_2  ------------
20:03:54: Results --------------------------------------------
20:03:54: Final send count: 1000000
20:03:54: Final ack count: 1000000
20:03:54: Final positive ack count: 1000000
20:03:54: Final negative ack count: 0
20:03:54: Messages received: 1000000
20:03:54: Acked messages missing: 0
20:03:54: Non-acked messages received: 0
20:03:54: Out-of-order: 0
20:03:54: Duplicates: 0
20:03:54: ----------------------------------------------------
20:06:06: 
20:06:06: Test Run #3 on topic kill-broker-test_3  ------------
20:07:31: Results --------------------------------------------
20:07:31: Final send count: 1000000
20:07:31: Final ack count: 1000000
20:07:31: Final positive ack count: 990000
20:07:31: Final negative ack count: 10000
20:07:31: Messages received: 991065
20:07:31: Acked messages missing: 0
20:07:31: Non-acked messages received: 1065
20:07:31: Out-of-order: 0
20:07:31: Duplicates: 0
20:07:31: ----------------------------------------------------
20:09:42: 
20:09:42: Test Run #4 on topic kill-broker-test_4  ------------
20:11:18: Results --------------------------------------------
20:11:18: Final send count: 1000000
20:11:18: Final ack count: 1000000
20:11:18: Final positive ack count: 990000
20:11:18: Final negative ack count: 10000
20:11:18: Messages received: 990000
20:11:18: Acked messages missing: 0
20:11:18: Non-acked messages received: 0
20:11:18: Out-of-order: 0
20:11:18: Duplicates: 0
20:11:18: ----------------------------------------------------
20:13:36: 
20:13:36: Test Run #5 on topic kill-broker-test_5  ------------
20:15:09: Results --------------------------------------------
20:15:09: Final send count: 1000000
20:15:09: Final ack count: 1000000
20:15:09: Final positive ack count: 990000
20:15:09: Final negative ack count: 10000
20:15:09: Messages received: 990000
20:15:09: Acked messages missing: 0
20:15:09: Non-acked messages received: 0
20:15:09: Out-of-order: 0
20:15:09: Duplicates: 0
20:15:09: ----------------------------------------------------

No message loss. Run 3 saw 1065 unacknowledged messages get read. This is expected and is not a problem unless duplication caused by retries is an issue; the deduplication feature, when enabled, should avoid that, and we’ll test it later on.

As predicted, broker fail-over produces no message loss.


Scenario 2 - Kill a Bookie in the Current Ledger Ensemble (kill-bookie test)

Configuration 2-2-1 3-3.

Again, we will test the resiliency of Pulsar with the minimum redundancy configuration of 2-2-1, with 3 brokers and 3 bookies. Midway through sending 1,000,000 messages, a bookie in the current ledger ensemble will be identified and killed. Then we’ll check whether the reader reads all 1,000,000 messages in the right order.

I cheat a little and assume that the current ledger is the first ledger. I have performed many, many tests and, in all of them, at least the first million messages end up as entries in the first ledger. So we simply target the ensemble of the first ledger, which has a known path in ZooKeeper.

We shouldn’t see any message loss, as we only kill one bookie and we have a Qw of 2. If the bookie that acks a message gets killed, we have only one copy of that entry until AutoRecovery can re-replicate it, but we’d have to lose both bookies to lose the data. There is no split-brain because the bookies are followers.

I show the full output of a single run, then show the results of 5 test runs.

$ python pulsar-test.py kill-bookie kill-bookie-test 5 1000000 50000 2-2-1 3-3 false
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #1 on topic kill-bookie-test_1  ------------
-------------------------------------------------
WRITE PHASE
-------------------------------------------------
2018-10-19 23:46:10.939 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.224.9:6650
2018-10-19 23:46:12.877 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar3:6650
2018-10-19 23:46:12.877 INFO  ClientConnection:287 | [192.168.224.1:50588 -> 192.168.224.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar3:6650
2018-10-19 23:46:13.397 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/kill-bookie-test_1, ] Created producer on broker [192.168.224.1:50588 -> 192.168.224.9:6650] 
Send count: 52575 Ack count: 50000 Pos: 50000 Neg: 0
Send count: 100560 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 156455 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 203545 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 255199 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 305732 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 357709 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 407634 Ack count: 400000 Pos: 400000 Neg: 0
-------------------------------------------------
bookie1 is in the current ledger ensemble, killing bookie1!!!!!!
-------------------------------------------------
Send count: 451809 Ack count: 450000 Pos: 450000 Neg: 0
Send count: 501234 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 551348 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 601729 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 650566 Ack count: 650000 Pos: 650000 Neg: 0
-------------------------------------------------
bookie1 KILLED!
-------------------------------------------------
Send count: 701961 Ack count: 700000 Pos: 700000 Neg: 0
Send count: 751294 Ack count: 750000 Pos: 750000 Neg: 0
Send count: 802939 Ack count: 800000 Pos: 800000 Neg: 0
Send count: 850390 Ack count: 850000 Pos: 850000 Neg: 0
Send count: 903739 Ack count: 900000 Pos: 900000 Neg: 0
Send count: 953286 Ack count: 950000 Pos: 950000 Neg: 0
Send count: 1000000 Ack count: 1000000 Pos: 1000000 Neg: 0
2018-10-19 23:46:47.281 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/kill-bookie-test_1, cluster-1-2-0] Closed producer
-------------------------------------------------
READ PHASE
-------------------------------------------------
2018-10-19 23:46:57.499 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.224.9:6650
2018-10-19 23:46:57.500 INFO  ClientConnection:285 | [192.168.224.1:50756 -> 192.168.224.9:6650] Connected to broker
2018-10-19 23:46:57.508 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-bookie-test_1, reader-4d4064, 0] Getting connection from pool
2018-10-19 23:46:57.566 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar2:6650
2018-10-19 23:46:57.566 INFO  ClientConnection:287 | [192.168.224.1:50760 -> 192.168.224.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar2:6650
2018-10-19 23:46:57.750 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/kill-bookie-test_1, reader-4d4064, 0] Created consumer on broker [192.168.224.1:50760 -> 192.168.224.9:6650] 
LCE. broker pulsar2 lac_line "1:475",
Last confirmed entry: [1, 475]
Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 70]
Received: 100000 Curr Entry: [0, 165]
Received: 150000 Curr Entry: [0, 270]
Received: 200000 Curr Entry: [0, 339]
Received: 250000 Curr Entry: [0, 403]
Received: 300000 Curr Entry: [0, 462]
Received: 350000 Curr Entry: [0, 519]
Received: 400000 Curr Entry: [0, 579]
Received: 450000 Curr Entry: [0, 658]
Received: 500000 Curr Entry: [0, 717]
Received: 550000 Curr Entry: [0, 782]
Received: 600000 Curr Entry: [0, 848]
Received: 650000 Curr Entry: [0, 942]
Received: 700000 Curr Entry: [1, 57]
Received: 750000 Curr Entry: [1, 119]
Received: 800000 Curr Entry: [1, 195]
Received: 850000 Curr Entry: [1, 267]
Received: 900000 Curr Entry: [1, 357]
Received: 950000 Curr Entry: [1, 418]
Received: 1000000 Curr Entry: [1, 475]
Read phase complete with message (1,475,324,-1)
2018-10-19 23:47:51.236 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/kill-bookie-test_1, reader-4d4064, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 1000000
Final ack count: 1000000
Final positive ack count: 1000000
Final negative ack count: 0
Messages received: 1000000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0
----------------------------------------------------

We saw that midway through sending, bookie1 got killed. Message transmission carried on unhampered and the reader was able to read all 1,000,000 messages in the correct order without duplication.

The results of five runs:

23:52:20 Start test
23:54:49: 
23:54:49: Test Run #1 on topic kill-bookie-test_1  ------------
23:56:38: Results --------------------------------------------
23:56:38: Final send count: 1000000
23:56:38: Final ack count: 1000000
23:56:38: Final positive ack count: 1000000
23:56:38: Final negative ack count: 0
23:56:38: Messages received: 1000000
23:56:38: Acked messages missing: 0
23:56:38: Non-acked messages received: 0
23:56:38: Out-of-order: 0
23:56:38: Duplicates: 0
23:56:38: ----------------------------------------------------
23:58:54: 
23:58:54: Test Run #2 on topic kill-bookie-test_2  ------------
00:00:50: Results --------------------------------------------
00:00:50: Final send count: 1000000
00:00:50: Final ack count: 1000000
00:00:50: Final positive ack count: 1000000
00:00:50: Final negative ack count: 0
00:00:50: Messages received: 1000000
00:00:50: Acked messages missing: 0
00:00:50: Non-acked messages received: 0
00:00:50: Out-of-order: 0
00:00:50: Duplicates: 0
00:00:50: ----------------------------------------------------
00:03:12: 
00:03:12: Test Run #3 on topic kill-bookie-test_3  ------------
00:05:01: Results --------------------------------------------
00:05:01: Final send count: 1000000
00:05:01: Final ack count: 1000000
00:05:01: Final positive ack count: 1000000
00:05:01: Final negative ack count: 0
00:05:01: Messages received: 1000000
00:05:01: Acked messages missing: 0
00:05:01: Non-acked messages received: 0
00:05:01: Out-of-order: 0
00:05:01: Duplicates: 0
00:05:01: ----------------------------------------------------
00:07:15: 
00:07:15: Test Run #4 on topic kill-bookie-test_4  ------------
00:09:04: Results --------------------------------------------
00:09:04: Final send count: 1000000
00:09:04: Final ack count: 1000000
00:09:04: Final positive ack count: 1000000
00:09:04: Final negative ack count: 0
00:09:04: Messages received: 1000000
00:09:04: Acked messages missing: 0
00:09:04: Non-acked messages received: 0
00:09:04: Out-of-order: 0
00:09:04: Duplicates: 0
00:09:04: ----------------------------------------------------
00:11:23: 
00:11:23: Test Run #5 on topic kill-bookie-test_5  ------------
00:13:23: Results --------------------------------------------
00:13:23: Final send count: 1000000
00:13:23: Final ack count: 1000000
00:13:23: Final positive ack count: 1000000
00:13:23: Final negative ack count: 0
00:13:23: Messages received: 1000000
00:13:23: Acked messages missing: 0
00:13:23: Non-acked messages received: 0
00:13:23: Out-of-order: 0
00:13:23: Duplicates: 0
00:13:23: ----------------------------------------------------

No message loss, no duplicates, no ordering issues.

Next test.


Scenario 3 - Isolate the Topic Owner Broker from ZooKeeper (isolate-broker-from-zk test)

Configuration 2-2-1 3-3.

In this test we’ll isolate the topic owner broker from ZooKeeper, but not from anything else. The predicted behaviour is that the broker should receive writes until it realizes that it cannot see ZK, at which point it will restart itself. A different broker should then take over.

We should see similar behaviour to that of killing the node.

Below is the full output of a single run:

$ python pulsar-test.py isolate-broker-from-zk iso-broker 1 2000000 50000 2-2-1 3-3 false
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #1 on topic iso-broker_1  ------------
-------------------------------------------------
WRITE PHASE
-------------------------------------------------
2018-10-20 08:22:43.961 INFO  ConnectionPool:63 | Created connection for pulsar://172.22.0.9:6650
2018-10-20 08:22:43.963 INFO  ClientConnection:285 | [172.22.0.1:46968 -> 172.22.0.9:6650] Connected to broker
2018-10-20 08:22:44.664 INFO  BatchMessageContainer:41 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://vanlightly/cluster-1/ns1/iso-broker_1] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2018-10-20 08:22:44.664 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, ] Getting connection from pool
2018-10-20 08:22:45.254 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 08:22:45.254 INFO  ClientConnection:287 | [172.22.0.1:46972 -> 172.22.0.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 08:22:46.329 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, ] Created producer on broker [172.22.0.1:46972 -> 172.22.0.9:6650] 
Send count: 57639 Ack count: 50000 Pos: 50000 Neg: 0
pulsar1 is the topic owner, isolating pulsar1 from zookeepr!!!!!!
Send count: 100918 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 151902 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 200658 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 251287 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 306076 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 353796 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 403469 Ack count: 400000 Pos: 400000 Neg: 0
Send count: 455195 Ack count: 450000 Pos: 450000 Neg: 0
-------------------------------------------------
pulsar1 ISOLATED!
-------------------------------------------------
Send count: 506817 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 557051 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 601668 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 652490 Ack count: 650000 Pos: 650000 Neg: 0
Send count: 709134 Ack count: 700000 Pos: 700000 Neg: 0
Send count: 757860 Ack count: 750000 Pos: 750000 Neg: 0
Send count: 805959 Ack count: 800000 Pos: 800000 Neg: 0
Send count: 858283 Ack count: 850000 Pos: 850000 Neg: 0
Send count: 907911 Ack count: 900000 Pos: 900000 Neg: 0
Send count: 951597 Ack count: 950000 Pos: 950000 Neg: 0
Send count: 1002732 Ack count: 1000000 Pos: 1000000 Neg: 0
Send count: 1050689 Ack count: 1050000 Pos: 1050000 Neg: 0
Send count: 1104743 Ack count: 1100000 Pos: 1100000 Neg: 0
Send count: 1157332 Ack count: 1150000 Pos: 1150000 Neg: 0
Send count: 1202361 Ack count: 1200000 Pos: 1200000 Neg: 0
Send count: 1256378 Ack count: 1250000 Pos: 1250000 Neg: 0
Send count: 1301586 Ack count: 1300000 Pos: 1300000 Neg: 0
Send count: 1352458 Ack count: 1350000 Pos: 1350000 Neg: 0
Send count: 1407455 Ack count: 1400000 Pos: 1400000 Neg: 0
2018-10-20 08:23:40.624 INFO  ClientConnection:1237 | [172.22.0.1:46972 -> 172.22.0.9:6650] Connection closed
2018-10-20 08:23:40.624 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Schedule reconnection in 0.1 s
2018-10-20 08:23:40.624 INFO  ClientConnection:195 | [172.22.0.1:46972 -> 172.22.0.9:6650] Destroyed connection
2018-10-20 08:23:40.724 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Getting connection from pool
2018-10-20 08:23:41.103 INFO  ConnectionPool:53 | Deleting stale connection from pool for pulsar://pulsar1:6650 use_count: -1 @ 0
2018-10-20 08:23:41.103 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 08:23:41.104 INFO  ClientConnection:287 | [172.22.0.1:47050 -> 172.22.0.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 08:23:42.139 INFO  ClientConnection:1237 | [172.22.0.1:47050 -> 172.22.0.9:6650] Connection closed
2018-10-20 08:23:42.139 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Schedule reconnection in 0.186 s
2018-10-20 08:23:42.139 INFO  ClientConnection:195 | [172.22.0.1:47050 -> 172.22.0.9:6650] Destroyed connection
2018-10-20 08:23:42.325 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Getting connection from pool
2018-10-20 08:23:42.750 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar3:6650
2018-10-20 08:23:42.751 INFO  ClientConnection:287 | [172.22.0.1:47054 -> 172.22.0.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar3:6650
2018-10-20 08:23:43.632 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Created producer on broker [172.22.0.1:47054 -> 172.22.0.9:6650] 
Send count: 1459178 Ack count: 1450000 Pos: 1450000 Neg: 0
Send count: 1505703 Ack count: 1500000 Pos: 1500000 Neg: 0
Send count: 1551367 Ack count: 1550000 Pos: 1550000 Neg: 0
Send count: 1601042 Ack count: 1600000 Pos: 1600000 Neg: 0
Send count: 1651310 Ack count: 1650000 Pos: 1650000 Neg: 0
Send count: 1707836 Ack count: 1700000 Pos: 1700000 Neg: 0
Send count: 1751159 Ack count: 1750000 Pos: 1750000 Neg: 0
Send count: 1802089 Ack count: 1800000 Pos: 1800000 Neg: 0
Send count: 1851082 Ack count: 1850000 Pos: 1850000 Neg: 0
Send count: 1906930 Ack count: 1900000 Pos: 1900000 Neg: 0
Send count: 1951920 Ack count: 1950000 Pos: 1950000 Neg: 0
Send count: 2000000 Ack count: 2000000 Pos: 2000000 Neg: 0
2018-10-20 08:23:57.505 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, cluster-1-0-0] Closed producer
-------------------------------------------------
READ PHASE
-------------------------------------------------
2018-10-20 08:24:07.620 INFO  ConnectionPool:63 | Created connection for pulsar://172.22.0.9:6650
2018-10-20 08:24:07.621 INFO  ClientConnection:285 | [172.22.0.1:47156 -> 172.22.0.9:6650] Connected to broker
2018-10-20 08:24:07.635 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, reader-c1ba0b, 0] Getting connection from pool
2018-10-20 08:24:07.665 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar3:6650
2018-10-20 08:24:07.665 INFO  ClientConnection:287 | [172.22.0.1:47160 -> 172.22.0.9:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar3:6650
2018-10-20 08:24:07.749 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, reader-c1ba0b, 0] Created consumer on broker [172.22.0.1:47160 -> 172.22.0.9:6650] 
Last confirmed entry: [1, 658]
Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 61]
Received: 100000 Curr Entry: [0, 142]
Received: 150000 Curr Entry: [0, 212]
Received: 200000 Curr Entry: [0, 288]
Received: 250000 Curr Entry: [0, 349]
Received: 300000 Curr Entry: [0, 420]
Received: 350000 Curr Entry: [0, 487]
Received: 400000 Curr Entry: [0, 554]
Received: 450000 Curr Entry: [0, 612]
Received: 500000 Curr Entry: [0, 675]
Received: 550000 Curr Entry: [0, 731]
Received: 600000 Curr Entry: [0, 790]
Received: 650000 Curr Entry: [0, 847]
Received: 700000 Curr Entry: [0, 904]
Received: 750000 Curr Entry: [0, 961]
Received: 800000 Curr Entry: [0, 1022]
Received: 850000 Curr Entry: [0, 1080]
Received: 900000 Curr Entry: [0, 1136]
Received: 950000 Curr Entry: [0, 1194]
Received: 1000000 Curr Entry: [0, 1252]
Received: 1050000 Curr Entry: [0, 1310]
Received: 1100000 Curr Entry: [0, 1370]
Received: 1150000 Curr Entry: [0, 1428]
Received: 1200000 Curr Entry: [0, 1487]
Received: 1250000 Curr Entry: [0, 1545]
Received: 1300000 Curr Entry: [0, 1602]
Received: 1350000 Curr Entry: [0, 1660]
Received: 1400000 Curr Entry: [0, 1721]
Received: 1450000 Curr Entry: [1, 1]
Received: 1500000 Curr Entry: [1, 60]
Received: 1550000 Curr Entry: [1, 124]
Received: 1600000 Curr Entry: [1, 186]
Received: 1650000 Curr Entry: [1, 247]
Received: 1700000 Curr Entry: [1, 303]
Received: 1750000 Curr Entry: [1, 361]
Received: 1800000 Curr Entry: [1, 420]
Received: 1850000 Curr Entry: [1, 478]
Received: 1900000 Curr Entry: [1, 539]
Received: 1950000 Curr Entry: [1, 598]
Received: 2000000 Curr Entry: [1, 658]
Read phase complete with message (1,658,138,-1)
2018-10-20 08:24:44.361 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/iso-broker_1, reader-c1ba0b, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 2000000
Final ack count: 2000000
Final positive ack count: 2000000
Final negative ack count: 0
Messages received: 2000000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0
----------------------------------------------------

In this scenario we sent 2,000,000 messages, as it takes a while to isolate the owner broker and for that change to take effect. At about the 1.4M message mark the fail-over occurred and reconnection happened quickly. The reader read all 2,000,000 messages in the correct order without duplication.
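The "Schedule reconnection in 0.1 s" and "0.186 s" lines in the output above show the client retrying with growing, jittered delays. The Pulsar client's exact policy aside, a minimal capped, jittered exponential backoff looks something like this (all names and parameters here are illustrative):

```python
import random

def backoff_delays(initial=0.1, cap=60.0, attempts=5, seed=1):
    """Generate capped, jittered, exponentially growing reconnect delays."""
    rnd = random.Random(seed)  # seeded only to make the sketch deterministic
    delay, out = initial, []
    for _ in range(attempts):
        out.append(round(delay, 3))
        # double the delay with +/-25% jitter, never exceeding the cap
        delay = min(cap, delay * 2 * (0.75 + rnd.random() / 2))
    return out

delays = backoff_delays()  # starts at 0.1s and grows each attempt
```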

The results of 5 runs:

08:33:38 Start test
08:35:55: 
08:35:55: Test Run #1 on topic iso-broker_1  ------------
08:38:10: Results --------------------------------------------
08:38:10: Final send count: 2000000
08:38:10: Final ack count: 2000000
08:38:10: Final positive ack count: 2000000
08:38:10: Final negative ack count: 0
08:38:10: Messages received: 2000000
08:38:10: Acked messages missing: 0
08:38:10: Non-acked messages received: 0
08:38:10: Out-of-order: 0
08:38:10: Duplicates: 0
08:38:10: ----------------------------------------------------
08:40:29: 
08:40:29: Test Run #2 on topic iso-broker_2  ------------
08:42:36: Results --------------------------------------------
08:42:36: Final send count: 2000000
08:42:36: Final ack count: 2000000
08:42:36: Final positive ack count: 2000000
08:42:36: Final negative ack count: 0
08:42:36: Messages received: 2000000
08:42:36: Acked messages missing: 0
08:42:36: Non-acked messages received: 0
08:42:36: Out-of-order: 0
08:42:36: Duplicates: 0
08:42:36: ----------------------------------------------------
08:44:55: 
08:44:55: Test Run #3 on topic iso-broker_3  ------------
08:47:08: Results --------------------------------------------
08:47:08: Final send count: 2000000
08:47:08: Final ack count: 2000000
08:47:08: Final positive ack count: 2000000
08:47:08: Final negative ack count: 0
08:47:08: Messages received: 2001000
08:47:08: Acked messages missing: 0
08:47:08: Non-acked messages received: 0
08:47:08: Out-of-order: 0
08:47:08: Duplicates: 1000
08:47:08: ----------------------------------------------------
08:49:33: 
08:49:33: Test Run #4 on topic iso-broker_4  ------------
08:51:52: Results --------------------------------------------
08:51:52: Final send count: 2000000
08:51:52: Final ack count: 2000000
08:51:52: Final positive ack count: 2000000
08:51:52: Final negative ack count: 0
08:51:52: Messages received: 2001090
08:51:52: Acked messages missing: 0
08:51:52: Non-acked messages received: 0
08:51:52: Out-of-order: 0
08:51:52: Duplicates: 1090
08:51:52: ----------------------------------------------------
08:54:18: 
08:54:18: Test Run #5 on topic iso-broker_5  ------------
08:56:41: Results --------------------------------------------
08:56:41: Final send count: 2000000
08:56:41: Final ack count: 2000000
08:56:41: Final positive ack count: 2000000
08:56:41: Final negative ack count: 0
08:56:41: Messages received: 2000000
08:56:41: Acked messages missing: 0
08:56:41: Non-acked messages received: 0
08:56:41: Out-of-order: 0
08:56:41: Duplicates: 0
08:56:41: ----------------------------------------------------

No message loss or ordering issues, but this time we saw some message duplication: in two of the five runs, 1000 and 1090 messages respectively were duplicated.

When the reader sees that the integer value of a message is lower than or equal to that of the previously read message, and that value has already been read (meaning it is a duplicate), it logs the message id and integer value of both the previously consumed message and the current message to the [test-name]-duplicates.txt file.

[test-output]$ cat iso-broker_duplicates.txt
run|last_msg_id|last_value|curr_msg_id|curr_value
3|(0,1643,999,-1)|1356634|(1,0,0,-1)|1355635
4|(0,1597,792,-1)|1220380|(1,0,0,-1)|1219588
4|(1,712,296,-1)|1811705|(2,0,0,-1)|1811409

There is one entry for run 3, indicating that the duplicates formed a single contiguous block of messages. The integer value 1,355,635 was read a second time at message id (1,0,0,-1), which is the first message of ledger 1, entry 0. It seems that 1000 messages were written at the end of ledger 0 and then those same 1000 messages were written again at the beginning of the next ledger.

The same is true for two contiguous blocks of duplicates in run 4.
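The reader-side check that produced this file can be sketched as follows (illustrative names, not the actual test harness code):

```python
def classify_message(value, last_value, seen):
    """Classify an incoming message's integer value against what was read before."""
    if value <= last_value and value in seen:
        return "duplicate"      # would be logged to [test-name]-duplicates.txt
    if value <= last_value:
        return "out-of-order"   # jumped backwards to a value never read
    return "ok"

seen = {1, 2, 3}
classify_message(2, 3, seen)  # -> "duplicate"
classify_message(4, 3, seen)  # -> "ok"
```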

The duplicates were most likely caused by the client resending messages after the first broker restarted itself: 1000 messages are sent to pulsar1, which writes them to a BookKeeper ensemble, but shuts down on losing ZooKeeper connectivity before acking the client. The client then reconnects, now to pulsar2, the new owner, and sends the same 1000 messages again. pulsar2 recovered the last ledger (owned by pulsar1), closed it and opened a new ledger ready to accept new messages; it received the 1000 resent messages and wrote them to its new ledger. That would explain what we see in the duplicates log.

We can test that theory by enabling the deduplication feature. When enabled, a broker stores the last sequence number of each producer in a hashtable; when it receives a sequence number at or below that, it knows it is a duplicate and ignores it. The (producer, sequence number) data is stored per topic in a cursor so that, on broker fail-over, the new broker can rebuild the hashtable. The hashtable is snapshotted periodically, so a broker failure could lose the most recent sequence numbers; relying on the snapshot alone would leave a window for duplication during fail-over. To close that gap, the new Pulsar owner also reads the last N entries from the ledger and adds them to the hashtable, covering any entries written since the last snapshot.
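The core of that mechanism can be sketched in a few lines (a simplification: the real broker also snapshots this state and persists it in a cursor, as described above):

```python
class DedupState:
    """Per-topic map of producer name -> highest sequence number persisted."""

    def __init__(self):
        self.last_seq = {}

    def should_persist(self, producer, seq):
        if seq <= self.last_seq.get(producer, -1):
            return False  # duplicate (e.g. a resend after broker fail-over)
        self.last_seq[producer] = seq
        return True

state = DedupState()
state.should_persist("producer-a", 0)  # True: first message
state.should_persist("producer-a", 1)  # True: next sequence number
state.should_persist("producer-a", 1)  # False: a resend, ignored
```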

Scenario 4 - Isolate the Topic Owner Broker from ZooKeeper with Deduplication Enabled (isolate-broker-from-zk test)

Configuration 2-2-1 3-3 with deduplication.

Exactly the same test as scenario 3, except that we enable the deduplication feature and see whether it prevents the duplication.

I enable the deduplication feature by adding brokerDeduplicationEnabled to the environment section of each broker in the blockade.yml.
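For reference, the change amounts to a fragment like this in blockade.yml (illustrative; the real file defines many more settings per broker):

```yaml
pulsar1:
  environment:
    brokerDeduplicationEnabled: "true"
```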

The results of 5 runs:

09:24:34 Start test
09:26:56: 
09:26:56: Test Run #1 on topic iso-broker_1  ------------
09:29:27: Results --------------------------------------------
09:29:27: Final send count: 2000000
09:29:27: Final ack count: 2000000
09:29:27: Final positive ack count: 2000000
09:29:27: Final negative ack count: 0
09:29:27: Messages received: 2000000
09:29:27: Acked messages missing: 0
09:29:27: Non-acked messages received: 0
09:29:27: Out-of-order: 0
09:29:27: Duplicates: 0
09:29:27: ----------------------------------------------------
09:32:48: 
09:32:48: Test Run #2 on topic iso-broker_2  ------------
09:35:06: Results --------------------------------------------
09:35:06: Final send count: 2000000
09:35:06: Final ack count: 2000000
09:35:06: Final positive ack count: 2000000
09:35:06: Final negative ack count: 0
09:35:06: Messages received: 2000000
09:35:06: Acked messages missing: 0
09:35:06: Non-acked messages received: 0
09:35:06: Out-of-order: 0
09:35:06: Duplicates: 0
09:35:06: ----------------------------------------------------
09:37:21: 
09:37:21: Test Run #3 on topic iso-broker_3  ------------
09:39:21: Results --------------------------------------------
09:39:21: Final send count: 2000000
09:39:21: Final ack count: 2000000
09:39:21: Final positive ack count: 2000000
09:39:21: Final negative ack count: 0
09:39:21: Messages received: 2000000
09:39:21: Acked messages missing: 0
09:39:21: Non-acked messages received: 0
09:39:21: Out-of-order: 0
09:39:21: Duplicates: 0
09:39:21: ----------------------------------------------------
09:41:39: 
09:41:39: Test Run #4 on topic iso-broker_4  ------------
09:43:51: Results --------------------------------------------
09:43:51: Final send count: 2000000
09:43:51: Final ack count: 2000000
09:43:51: Final positive ack count: 2000000
09:43:51: Final negative ack count: 0
09:43:51: Messages received: 2000000
09:43:51: Acked messages missing: 0
09:43:51: Non-acked messages received: 0
09:43:51: Out-of-order: 0
09:43:51: Duplicates: 0
09:43:51: ----------------------------------------------------
09:46:17: 
09:46:17: Test Run #5 on topic iso-broker_5  ------------
09:48:53: Results --------------------------------------------
09:48:53: Final send count: 2000000
09:48:53: Final ack count: 2000000
09:48:53: Final positive ack count: 2000000
09:48:53: Final negative ack count: 0
09:48:53: Messages received: 2000000
09:48:53: Acked messages missing: 0
09:48:53: Non-acked messages received: 0
09:48:53: Out-of-order: 0
09:48:53: Duplicates: 0
09:48:53: ----------------------------------------------------

That seems to have solved the duplication issue we saw in scenario 3.

Scenario 5 - Isolate a Bookie from ZooKeeper (isolate-bookie-from-zk test)

Configuration 2-2-1 3-3.

Bookies do not require ZooKeeper for reads or writes; they use it only for registering themselves, garbage collection and AutoRecovery (this I still need to confirm). Accordingly, isolating a bookie from ZooKeeper should have no effect. Let’s see.

The test identifies a bookie in the ensemble of the current ledger and isolates it from ZooKeeper midway through message transmission. I will omit the full output of a single run, for brevity. The results of 5 runs are:

$ cat iso-bookie_output.txt
12:03:52 Start test
12:06:10: 
12:06:10: Test Run #1 on topic iso-bookie_1  ------------
12:08:18: Results --------------------------------------------
12:08:18: Final send count: 2000000
12:08:18: Final ack count: 2000000
12:08:18: Final positive ack count: 2000000
12:08:18: Final negative ack count: 0
12:08:18: Messages received: 2000000
12:08:18: Acked messages missing: 0
12:08:18: Non-acked messages received: 0
12:08:18: Out-of-order: 0
12:08:18: Duplicates: 0
12:08:18: ----------------------------------------------------
12:10:35: 
12:10:35: Test Run #2 on topic iso-bookie_2  ------------
12:12:40: Results --------------------------------------------
12:12:40: Final send count: 2000000
12:12:40: Final ack count: 2000000
12:12:40: Final positive ack count: 2000000
12:12:40: Final negative ack count: 0
12:12:40: Messages received: 2000000
12:12:40: Acked messages missing: 0
12:12:40: Non-acked messages received: 0
12:12:40: Out-of-order: 0
12:12:40: Duplicates: 0
12:12:40: ----------------------------------------------------
12:14:55: 
12:14:55: Test Run #3 on topic iso-bookie_3  ------------
12:16:54: Results --------------------------------------------
12:16:54: Final send count: 2000000
12:16:54: Final ack count: 2000000
12:16:54: Final positive ack count: 2000000
12:16:54: Final negative ack count: 0
12:16:54: Messages received: 2000000
12:16:54: Acked messages missing: 0
12:16:54: Non-acked messages received: 0
12:16:54: Out-of-order: 0
12:16:54: Duplicates: 0
12:16:54: ----------------------------------------------------
12:19:10: 
12:19:10: Test Run #4 on topic iso-bookie_4  ------------
12:21:16: Results --------------------------------------------
12:21:16: Final send count: 2000000
12:21:16: Final ack count: 2000000
12:21:16: Final positive ack count: 2000000
12:21:16: Final negative ack count: 0
12:21:16: Messages received: 2000000
12:21:16: Acked messages missing: 0
12:21:16: Non-acked messages received: 0
12:21:16: Out-of-order: 0
12:21:16: Duplicates: 0
12:21:16: ----------------------------------------------------
12:23:36: 
12:23:36: Test Run #5 on topic iso-bookie_5  ------------
12:25:39: Results --------------------------------------------
12:25:39: Final send count: 2000000
12:25:39: Final ack count: 2000000
12:25:39: Final positive ack count: 2000000
12:25:39: Final negative ack count: 0
12:25:39: Messages received: 2000000
12:25:39: Acked messages missing: 0
12:25:39: Non-acked messages received: 0
12:25:39: Out-of-order: 0
12:25:39: Duplicates: 0
12:25:39: ----------------------------------------------------

No message loss, no duplication, ordering ok.

Scenario 6 - Kill Multiple Bookies (kill-bookies[n] test)

Configuration 3-3-1 3-5.

This scenario is the same as scenario 2 except we have redundancy of 3 and kill off 2 bookies. Pulsar tolerates up to Qw-1 bookie failures without message loss.

We can use the “kill-bookies[n]” command to kill an arbitrary number of bookies in the current ensemble.

The full output of one run:

$ python pulsar-test.py kill-bookies[2] kill-m-bookies 1 2000000 50000 3-3-1 3-5 false
Creating blockade cluster
Running test with config: E 3 Qw 3 Qa 1 with nodes bookie1 bookie2 bookie3 bookie4 bookie5 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 3-3-1

Test Run #1 on topic kill-m-bookies_1  ------------
-------------------------------------------------
WRITE PHASE
-------------------------------------------------
2018-10-20 22:41:21.932 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.176.11:6650
2018-10-20 22:41:21.932 INFO  ClientConnection:285 | [192.168.176.1:47178 -> 192.168.176.11:6650] Connected to broker
2018-10-20 22:41:22.599 INFO  BatchMessageContainer:41 | { BatchContainer [size = 0] [batchSizeInBytes_ = 0] [maxAllowedMessageBatchSizeInBytes_ = 131072] [maxAllowedNumMessagesInBatch_ = 1000] [topicName = persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1] [producerName_ = ] [batchSizeInBytes_ = 0] [numberOfBatchesSent = 0] [averageBatchSize = 0]} BatchMessageContainer constructed
2018-10-20 22:41:22.599 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, ] Getting connection from pool
2018-10-20 22:41:23.140 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 22:41:23.140 INFO  ClientConnection:287 | [192.168.176.1:47182 -> 192.168.176.11:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 22:41:23.631 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, ] Created producer on broker [192.168.176.1:47182 -> 192.168.176.11:6650] 
Send count: 53933 Ack count: 50000 Pos: 50000 Neg: 0
-------------------------------------------------
Identifing first 2 bookies in ensemble
-------------------------------------------------
Send count: 105567 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 150962 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 202992 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 251576 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 304059 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 354476 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 401795 Ack count: 400000 Pos: 400000 Neg: 0
Send count: 451880 Ack count: 450000 Pos: 450000 Neg: 0
Send count: 500607 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 550300 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 600474 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 650469 Ack count: 650000 Pos: 650000 Neg: 0
Send count: 702927 Ack count: 700000 Pos: 700000 Neg: 0
Send count: 751328 Ack count: 750000 Pos: 750000 Neg: 0
Send count: 805205 Ack count: 800000 Pos: 800000 Neg: 0
Send count: 853940 Ack count: 850000 Pos: 850000 Neg: 0
-------------------------------------------------
2 BOOKIES KILLED!
-------------------------------------------------
Send count: 901111 Ack count: 900000 Pos: 900000 Neg: 0
Send count: 950963 Ack count: 950000 Pos: 950000 Neg: 0
Send count: 1000561 Ack count: 1000000 Pos: 1000000 Neg: 0
Send count: 1053648 Ack count: 1050000 Pos: 1050000 Neg: 0
Send count: 1103723 Ack count: 1100000 Pos: 1100000 Neg: 0
Send count: 1155762 Ack count: 1150000 Pos: 1150000 Neg: 0
Send count: 1202638 Ack count: 1200000 Pos: 1200000 Neg: 0
Send count: 1253326 Ack count: 1250000 Pos: 1250000 Neg: 0
Send count: 1300821 Ack count: 1300000 Pos: 1300000 Neg: 0
Send count: 1355902 Ack count: 1350000 Pos: 1350000 Neg: 0
Send count: 1409752 Ack count: 1400000 Pos: 1400000 Neg: 0
Send count: 1451236 Ack count: 1450000 Pos: 1450000 Neg: 0
Send count: 1500961 Ack count: 1500000 Pos: 1500000 Neg: 0
Send count: 1554802 Ack count: 1550000 Pos: 1550000 Neg: 0
Send count: 1600702 Ack count: 1600000 Pos: 1600000 Neg: 0
Send count: 1652755 Ack count: 1650000 Pos: 1650000 Neg: 0
Send count: 1705403 Ack count: 1700000 Pos: 1700000 Neg: 0
Send count: 1757240 Ack count: 1750000 Pos: 1750000 Neg: 0
Send count: 1804954 Ack count: 1800000 Pos: 1800000 Neg: 0
Send count: 1850631 Ack count: 1850000 Pos: 1850000 Neg: 0
Send count: 1907237 Ack count: 1900000 Pos: 1900000 Neg: 0
Send count: 1951117 Ack count: 1950000 Pos: 1950000 Neg: 0
Send count: 2000000 Ack count: 2000000 Pos: 2000000 Neg: 0
2018-10-20 22:42:20.572 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, cluster-1-0-0] Closed producer
-------------------------------------------------
READ PHASE
-------------------------------------------------
2018-10-20 22:42:30.708 INFO  ConnectionPool:63 | Created connection for pulsar://192.168.176.11:6650
2018-10-20 22:42:30.710 INFO  ClientConnection:285 | [192.168.176.1:47374 -> 192.168.176.11:6650] Connected to broker
2018-10-20 22:42:30.715 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, reader-b23b69, 0] Getting connection from pool
2018-10-20 22:42:31.028 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 22:42:31.028 INFO  ClientConnection:287 | [192.168.176.1:47378 -> 192.168.176.11:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 22:42:31.110 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, reader-b23b69, 0] Created consumer on broker [192.168.176.1:47378 -> 192.168.176.11:6650] 
Last confirmed entry: [0, 2546]
Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 62]
Received: 100000 Curr Entry: [0, 135]
Received: 150000 Curr Entry: [0, 203]
Received: 200000 Curr Entry: [0, 276]
Received: 250000 Curr Entry: [0, 341]
Received: 300000 Curr Entry: [0, 409]
Received: 350000 Curr Entry: [0, 470]
Received: 400000 Curr Entry: [0, 532]
Received: 450000 Curr Entry: [0, 599]
Received: 500000 Curr Entry: [0, 661]
Received: 550000 Curr Entry: [0, 724]
Received: 600000 Curr Entry: [0, 799]
Received: 650000 Curr Entry: [0, 866]
Received: 700000 Curr Entry: [0, 943]
Received: 750000 Curr Entry: [0, 1005]
Received: 800000 Curr Entry: [0, 1073]
Received: 850000 Curr Entry: [0, 1133]
Received: 900000 Curr Entry: [0, 1205]
Received: 950000 Curr Entry: [0, 1268]
Received: 1000000 Curr Entry: [0, 1340]
Received: 1050000 Curr Entry: [0, 1402]
Received: 1100000 Curr Entry: [0, 1460]
Received: 1150000 Curr Entry: [0, 1523]
Received: 1200000 Curr Entry: [0, 1588]
Received: 1250000 Curr Entry: [0, 1647]
Received: 1300000 Curr Entry: [0, 1705]
Received: 1350000 Curr Entry: [0, 1764]
Received: 1400000 Curr Entry: [0, 1825]
Received: 1450000 Curr Entry: [0, 1884]
Received: 1500000 Curr Entry: [0, 1941]
Received: 1550000 Curr Entry: [0, 1998]
Received: 1600000 Curr Entry: [0, 2064]
Received: 1650000 Curr Entry: [0, 2122]
Received: 1700000 Curr Entry: [0, 2184]
Received: 1750000 Curr Entry: [0, 2241]
Received: 1800000 Curr Entry: [0, 2295]
Received: 1850000 Curr Entry: [0, 2364]
Received: 1900000 Curr Entry: [0, 2425]
Received: 1950000 Curr Entry: [0, 2481]
Received: 2000000 Curr Entry: [0, 2546]
Read phase complete with message (0,2546,561,-1)
2018-10-20 22:44:15.596 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_1, reader-b23b69, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 2000000
Final ack count: 2000000
Final positive ack count: 2000000
Final negative ack count: 0
Messages received: 2000000
Acked messages missing: 0
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0
----------------------------------------------------
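
The checks behind these result lines can be sketched in Python. This is a simplified, hypothetical version of the verification logic, not the actual code from pulsar-test.py; it assumes each message carries a monotonically increasing integer payload, which is how the tests detect loss, reordering and duplication.

```python
def verify(acked, received):
    """Compare positively acked message values against the values read back.

    acked    -- set of integer payloads that received a positive ack
    received -- list of integer payloads in the order the reader saw them
    """
    received_set = set(received)
    missing = len(acked - received_set)        # acked but never read back
    non_acked = len(received_set - acked)      # read back without a positive ack
    duplicates = len(received) - len(received_set)

    # Ordering check: each newly seen value must exceed all values before it.
    out_of_order = 0
    highest = -1
    seen = set()
    for value in received:
        if value in seen:
            continue  # duplicates are counted separately, not as ordering errors
        if value < highest:
            out_of_order += 1
        highest = max(highest, value)
        seen.add(value)

    return {"missing": missing, "non_acked": non_acked,
            "out_of_order": out_of_order, "duplicates": duplicates}
```

A clean run like the one above yields zeros across the board, matching the results block.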

The results of 5 runs:

$ cat kill-m-bookies_output.txt
22:59:01 Start test
23:01:37: 
23:01:37: Test Run #1 on topic kill-m-bookies_1  ------------
23:04:49: Results --------------------------------------------
23:04:49: Final send count: 2000000
23:04:49: Final ack count: 2000000
23:04:49: Final positive ack count: 2000000
23:04:49: Final negative ack count: 0
23:04:49: Messages received: 2000000
23:04:49: Acked messages missing: 0
23:04:49: Non-acked messages received: 0
23:04:49: Out-of-order: 0
23:04:49: Duplicates: 0
23:04:49: ----------------------------------------------------
23:07:26: 
23:07:26: Test Run #2 on topic kill-m-bookies_2  ------------
23:09:23: Results --------------------------------------------
23:09:23: Final send count: 2000000
23:09:23: Final ack count: 2000000
23:09:23: Final positive ack count: 2000000
23:09:23: Final negative ack count: 0
23:09:23: Messages received: 2005107
23:09:23: Acked messages missing: 0
23:09:23: Non-acked messages received: 0
23:09:23: Out-of-order: 0
23:09:23: Duplicates: 5107
23:09:23: ----------------------------------------------------
23:11:56: 
23:11:56: Test Run #3 on topic kill-m-bookies_3  ------------
23:14:48: Results --------------------------------------------
23:14:48: Final send count: 2000000
23:14:48: Final ack count: 2000000
23:14:48: Final positive ack count: 2000000
23:14:48: Final negative ack count: 0
23:14:48: Messages received: 2000000
23:14:48: Acked messages missing: 0
23:14:48: Non-acked messages received: 0
23:14:48: Out-of-order: 0
23:14:48: Duplicates: 0
23:14:48: ----------------------------------------------------
23:17:25: 
23:17:25: Test Run #4 on topic kill-m-bookies_4  ------------
23:20:24: Results --------------------------------------------
23:20:24: Final send count: 2000000
23:20:24: Final ack count: 2000000
23:20:24: Final positive ack count: 2000000
23:20:24: Final negative ack count: 0
23:20:24: Messages received: 2000000
23:20:24: Acked messages missing: 0
23:20:24: Non-acked messages received: 0
23:20:24: Out-of-order: 0
23:20:24: Duplicates: 0
23:20:24: ----------------------------------------------------
23:23:05: 
23:23:05: Test Run #5 on topic kill-m-bookies_5  ------------
23:25:03: Results --------------------------------------------
23:25:03: Final send count: 2000000
23:25:03: Final ack count: 2000000
23:25:03: Final positive ack count: 2000000
23:25:03: Final negative ack count: 0
23:25:03: Messages received: 2000000
23:25:03: Acked messages missing: 0
23:25:03: Non-acked messages received: 0
23:25:03: Out-of-order: 0
23:25:03: Duplicates: 0
23:25:03: ----------------------------------------------------

Notice that we got over 5000 duplicates in one test run. I inspected the full output of that run and saw a series of connection errors regarding an invalid producer id.

Send count: 557331 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 601885 Ack count: 600000 Pos: 600000 Neg: 0
2018-10-20 23:07:49.283 INFO  HandlerBase:129 | [persistent://vanlightly/cluster-1/ns1/kill-m-bookies_2, cluster-1-2-0] Schedule reconnection in 0.1 s
2018-10-20 23:07:49.319 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 648684
2018-10-20 23:07:49.319 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 649684
2018-10-20 23:07:49.320 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 650684
2018-10-20 23:07:49.320 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 651684
2018-10-20 23:07:49.321 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 652684
2018-10-20 23:07:49.321 ERROR ClientConnection:638 | [192.168.208.1:58276 -> 192.168.208.11:6650] Got invalid producer Id in SendReceipt: 0 -- msg: 653047

This requires further investigation; it may or may not be linked to the bookie failures. Either way, it only caused duplication, which could have been avoided by enabling the deduplication feature.

But still no message loss or ordering issues.

What happens if we kill Qw bookies though? That is an unfair test as we would be destroying all the replicated data. Still, let’s run it to confirm the behaviour.

Scenario 7 - Kill Qw Bookies (kill-bookies[n] test)

Configuration 2-2-1 3-5.

In this scenario we kill both bookies that make up the ensemble of the current ledger.

The full output of a single run:

$ python pulsar-test.py kill-bookies[2] kill-qw-bookies 1 2000000 50000 2-2-1 3-5 false
Destroying blockade cluster
Creating blockade cluster
Running test with config: E 2 Qw 2 Qa 1 with nodes bookie1 bookie2 bookie3 bookie4 bookie5 proxy pulsar1 pulsar2 pulsar3 zk1 
Creating tenant and namespace with configuration 2-2-1

Test Run #1 on topic kill-qw-bookies_1  ------------
-------------------------------------------------
WRITE PHASE
-------------------------------------------------
2018-10-20 23:46:24.587 INFO  ConnectionPool:63 | Created connection for pulsar://172.20.0.11:6650
2018-10-20 23:46:24.588 INFO  ClientConnection:285 | [172.20.0.1:40032 -> 172.20.0.11:6650] Connected to broker
2018-10-20 23:46:25.620 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 23:46:25.620 INFO  ClientConnection:287 | [172.20.0.1:40036 -> 172.20.0.11:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 23:46:26.098 INFO  ProducerImpl:155 | [persistent://vanlightly/cluster-1/ns1/kill-qw-bookies_1, ] Created producer on broker [172.20.0.1:40036 -> 172.20.0.11:6650] 
Send count: 56001 Ack count: 50000 Pos: 50000 Neg: 0
-------------------------------------------------
Identifying first 2 bookies in ensemble
-------------------------------------------------
Send count: 101422 Ack count: 100000 Pos: 100000 Neg: 0
Send count: 150870 Ack count: 150000 Pos: 150000 Neg: 0
Send count: 204342 Ack count: 200000 Pos: 200000 Neg: 0
Send count: 252210 Ack count: 250000 Pos: 250000 Neg: 0
Send count: 304805 Ack count: 300000 Pos: 300000 Neg: 0
Send count: 357891 Ack count: 350000 Pos: 350000 Neg: 0
Send count: 400852 Ack count: 400000 Pos: 400000 Neg: 0
Send count: 450256 Ack count: 450000 Pos: 450000 Neg: 0
Send count: 502910 Ack count: 500000 Pos: 500000 Neg: 0
Send count: 551386 Ack count: 550000 Pos: 550000 Neg: 0
Send count: 604204 Ack count: 600000 Pos: 600000 Neg: 0
Send count: 653056 Ack count: 650000 Pos: 650000 Neg: 0
Send count: 709835 Ack count: 700000 Pos: 700000 Neg: 0
-------------------------------------------------
2 BOOKIES KILLED!
-------------------------------------------------
Send count: 753606 Ack count: 750000 Pos: 750000 Neg: 0
Send count: 803318 Ack count: 800000 Pos: 800000 Neg: 0
Send count: 853127 Ack count: 850000 Pos: 850000 Neg: 0
Send count: 905068 Ack count: 900000 Pos: 900000 Neg: 0
Send count: 951923 Ack count: 950000 Pos: 950000 Neg: 0
Send count: 1004825 Ack count: 1000000 Pos: 1000000 Neg: 0
Send count: 1051416 Ack count: 1050000 Pos: 1050000 Neg: 0
Send count: 1102301 Ack count: 1100000 Pos: 1100000 Neg: 0
Send count: 1151166 Ack count: 1150000 Pos: 1150000 Neg: 0
Send count: 1205605 Ack count: 1200000 Pos: 1200000 Neg: 0
Send count: 1251999 Ack count: 1250000 Pos: 1250000 Neg: 0
Send count: 1306006 Ack count: 1300000 Pos: 1300000 Neg: 0
Send count: 1357391 Ack count: 1350000 Pos: 1350000 Neg: 0
Send count: 1406744 Ack count: 1400000 Pos: 1400000 Neg: 0
Send count: 1458207 Ack count: 1450000 Pos: 1450000 Neg: 0
Send count: 1509467 Ack count: 1500000 Pos: 1500000 Neg: 0
Send count: 1558339 Ack count: 1550000 Pos: 1550000 Neg: 0
Send count: 1603861 Ack count: 1600000 Pos: 1600000 Neg: 0
Send count: 1654170 Ack count: 1650000 Pos: 1650000 Neg: 0
Send count: 1702673 Ack count: 1700000 Pos: 1700000 Neg: 0
Send count: 1751781 Ack count: 1750000 Pos: 1750000 Neg: 0
Send count: 1803040 Ack count: 1800000 Pos: 1800000 Neg: 0
Send count: 1850616 Ack count: 1850000 Pos: 1850000 Neg: 0
Send count: 1906137 Ack count: 1900000 Pos: 1900000 Neg: 0
Send count: 1955752 Ack count: 1950000 Pos: 1950000 Neg: 0
Send count: 2000000 Ack count: 2000000 Pos: 2000000 Neg: 0
2018-10-20 23:47:28.728 INFO  ProducerImpl:467 | [persistent://vanlightly/cluster-1/ns1/kill-qw-bookies_1, cluster-1-0-0] Closed producer
-------------------------------------------------
READ PHASE
-------------------------------------------------
2018-10-20 23:47:38.923 INFO  ConnectionPool:63 | Created connection for pulsar://172.20.0.11:6650
2018-10-20 23:47:38.924 INFO  ClientConnection:285 | [172.20.0.1:40216 -> 172.20.0.11:6650] Connected to broker
2018-10-20 23:47:38.931 INFO  HandlerBase:53 | [persistent://vanlightly/cluster-1/ns1/kill-qw-bookies_1, reader-1b446f, 0] Getting connection from pool
2018-10-20 23:47:39.176 INFO  ConnectionPool:63 | Created connection for pulsar://pulsar1:6650
2018-10-20 23:47:39.176 INFO  ClientConnection:287 | [172.20.0.1:40220 -> 172.20.0.11:6650] Connected to broker through proxy. Logical broker: pulsar://pulsar1:6650
2018-10-20 23:47:39.276 INFO  ConsumerImpl:168 | [persistent://vanlightly/cluster-1/ns1/kill-qw-bookies_1, reader-1b446f, 0] Created consumer on broker [172.20.0.1:40220 -> 172.20.0.11:6650] 
Last confirmed entry: [0, 2506]

Notice that the sending of messages continued unhindered. When both bookies failed, the broker created a new fragment with a new ensemble. There were originally 5 bookies, leaving 3 available to host the new fragment with an E and Qw of 2.

However, the reader blocks on read_next(), as it is trying to read from a fragment whose data we no longer have (unless the killed bookies come back with their data intact).

We can inspect in ZooKeeper which ledgers are underreplicated. I have a bash script that monitors the underreplication path.

$ bash monitor-underrep.sh 
[0000]

We could also inspect the details of the first ledger:

$ bash show-ledger.sh 00/0000/L0000
...(omitted a bunch of uninteresting output)
quorumSize: 2
ensembleSize: 2
length: 0
lastEntryId: -1
state: OPEN
segment {
  ensembleMember: "172.20.0.3:3181"
  ensembleMember: "172.20.0.5:3181"
  firstEntryId: 0
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.4:3181"
  firstEntryId: 949
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.6:3181"
  firstEntryId: 959
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.6:3181"
  firstEntryId: 1340
}
digestType: CRC32C
password: ""
ackQuorumSize: 1

The first fragment, the one our reader is blocking on, has bookies with IP addresses 172.20.0.3 and 172.20.0.5, which correspond to the two bookies we killed.
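
The test script's "Identifying first 2 bookies in ensemble" step relies on exactly this metadata. A hypothetical sketch of how the textual dump above could be parsed to find each fragment's ensemble (the real scripts may read the metadata differently):

```python
import re

def parse_segments(ledger_dump):
    """Parse the segment blocks of a textual ledger metadata dump
    (the format printed by show-ledger.sh) into a list of
    (first_entry_id, [ensemble members]) tuples."""
    segments = []
    for block in re.findall(r"segment \{(.*?)\}", ledger_dump, re.S):
        members = re.findall(r'ensembleMember: "([^"]+)"', block)
        first = int(re.search(r"firstEntryId: (\d+)", block).group(1))
        segments.append((first, members))
    return segments

def current_ensemble(segments):
    """The ensemble of the current (highest firstEntryId) fragment."""
    return max(segments, key=lambda s: s[0])[1]
```

Fed the dump above, the first segment's ensemble comes back as 172.20.0.3 and 172.20.0.5, the two bookies the test killed.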

The ledger cannot be recovered because the data of the first fragment is gone. If the data on those two bookies is gone forever, all we could do is skip to entry id 949 to continue reading.

But if we could bring back just one of those two bookies, then AutoRecovery could ensure that the fragment gets fully replicated and is available again. Let’s simulate that by starting bookie1 using the command “blockade start bookie1” from the cluster directory.

The first thing we notice is that the reader springs into life and consumes 650,000 messages, then blocks again, and finally reaches the 60-second timeout I gave it, ending the test run.

Start reading from (0,0,0,-1)
Received: 50000 Curr Entry: [0, 66]
Received: 100000 Curr Entry: [0, 158]
Received: 150000 Curr Entry: [0, 246]
Received: 200000 Curr Entry: [0, 312]
Received: 250000 Curr Entry: [0, 382]
Received: 300000 Curr Entry: [0, 444]
Received: 350000 Curr Entry: [0, 501]
Received: 400000 Curr Entry: [0, 566]
Received: 450000 Curr Entry: [0, 635]
Received: 500000 Curr Entry: [0, 698]
Received: 550000 Curr Entry: [0, 761]
Received: 600000 Curr Entry: [0, 826]
Received: 650000 Curr Entry: [0, 895]
Read phase complete with message (0,946,478,-1)
2018-10-21 00:04:15.760 INFO  ConsumerImpl:761 | [persistent://vanlightly/cluster-1/ns1/kill-qw-bookies_1, reader-1b446f, 0] Closed consumer 0
Results --------------------------------------------
Final send count: 2000000
Final ack count: 2000000
Final positive ack count: 2000000
Final negative ack count: 0
Messages received: 691566
Acked messages missing: 1308433
Non-acked messages received: 0
Out-of-order: 0
Duplicates: 0
----------------------------------------------------

Notice that the last entry it read before giving up was 946, just three short of entry 949 where the second fragment begins. It got stuck at the end of the first fragment, but why?

UPDATE: I have updated this section from the original after feedback from Ivan Kelly.

Well, with a Qa of 1, when we killed both bookies of the ensemble, the last three entries only got persisted to a single bookie. That bookie is still down, so those last three entries are not readable.

I use the reader.py script to consume from that topic. It also blocks at the 650,000 message mark. Now I start bookie3: blockade start bookie3.

The reader.py then gets past that point, slowly at first, then rapidly consumes all 2,000,000 messages. This is because exponential back-off is applied to failed reads; when reading resumes, the client does a TCP-like slow start, gradually increasing the batch sizes it reads from the bookies.
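
The back-off and ramp-up behaviour can be illustrated with a toy model. This is not the actual BookKeeper client code; the constants (initial 100 ms back-off, 1 s cap, batch cap of 256) are made up for the sketch.

```python
def simulate_reads(attempts_until_available, target_entries=1000):
    """Toy model of a reader: exponential back-off while reads fail,
    then slow-start-style doubling of the read batch size once they succeed.

    attempts_until_available -- number of failed read attempts before the
                                bookie starts serving entries again
    Returns (total_backoff_ms, list_of_batch_sizes).
    """
    backoff_ms, total_backoff = 100, 0
    for _ in range(attempts_until_available):
        total_backoff += backoff_ms
        backoff_ms = min(backoff_ms * 2, 1000)  # exponential back-off, capped

    batch, batches, read = 1, [], 0
    while read < target_entries:
        take = min(batch, target_entries - read)
        batches.append(take)
        read += take
        batch = min(batch * 2, 256)  # grow gradually, like TCP slow start
    return total_backoff, batches
```

Running simulate_reads(3) waits 100+200+400 ms before succeeding, then reads in batches of 1, 2, 4, 8 and so on up to the cap, which is why consumption starts slowly and then accelerates.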

I inspect the underreplication path in ZooKeeper and ledger 0000 no longer appears. In theory, now that the ledger is fully replicated, I could kill bookie3 again, and because bookie1 now has all the entries of the first fragment, AutoRecovery should kick in and rereplicate the ledger to a second bookie.

$ blockade kill bookie3
$ bash monitor-underrep.sh 
[]
[]
[]
[]
[]
[]
[]
[]
[]
[0000]
[0000]
[0000]
[0000]
[]
[]

Each [] means that no ledgers are underreplicated. Each inspection of the underreplication path takes about 2 seconds (exec into the Docker container, run zookeeper-shell etc.), so it took about 16 seconds to detect that ledger 0000 was underreplicated, but then just 8 seconds to perform the rereplication.

If we inspect the ledger again, we should see a different ensemble for the first fragment:

$bash show-ledger.sh 00/0000/L0000
...
quorumSize: 2
ensembleSize: 2
length: 0
lastEntryId: -1
state: OPEN
segment {
  ensembleMember: "172.20.0.6:3181"
  ensembleMember: "172.20.0.4:3181"
  firstEntryId: 0
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.4:3181"
  firstEntryId: 949
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.6:3181"
  firstEntryId: 959
}
segment {
  ensembleMember: "172.20.0.7:3181"
  ensembleMember: "172.20.0.6:3181"
  firstEntryId: 1340
}
digestType: CRC32C
password: ""
ackQuorumSize: 1

bookie3 (172.20.0.5) is no longer a member of the ensemble for the first fragment. AutoRecovery has done its job.

So what did we learn?

That you can only afford to lose up to Qw-1 bookies. If you lose Qw bookies then you really want to bring those bookies back online with their data intact; in that case you will not have lost data.

AutoRecovery also cannot repair an under-replicated ledger where the number of lost bookies of its ensemble is equal to its ack quorum (Qa). So with Qa=1 and a single lost bookie, AutoRecovery will not be able to repair the ledger.

Regarding this inability for BookKeeper to recover a ledger in this situation, Ivan Kelly helped out by commenting that: “Its the close operation that is failing. Close need to decide what the last entry of the segment is. With Qa=1, this is not possible if 1 bookie of the write quorum is down, as that bookie could have acknowledged a write (or could yet acknowledge a write if we're being strict)”.

So Qa=1 is a dangerous configuration choice as all you need is a single downed bookie and an under-replicated ledger cannot be repaired.
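
Ivan's point reduces to a simple rule of thumb: to close a ledger, recovery must hear from enough bookies of each write set that no possibly-acknowledged entry can be missed, i.e. Qw - Qa + 1 responders, which means at most Qa - 1 failed bookies are tolerable. A hypothetical helper expressing that arithmetic (not from the Pulsar or BookKeeper codebases):

```python
def recovery_possible(qw, qa, failed_in_ensemble):
    """Can ledger recovery (the close operation) decide the last entry?

    An entry may have been acked after reaching only qa bookies, so recovery
    needs responses from qw - qa + 1 bookies to be sure it has seen every
    possibly-acked entry. Hence at most qa - 1 failures are tolerable.
    """
    responders = qw - failed_in_ensemble
    return responders >= qw - qa + 1  # equivalently: failed_in_ensemble <= qa - 1
```

With Qw=2, Qa=1 a single downed bookie already makes recovery impossible, whereas Qw=2, Qa=2 tolerates one.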


Results Summary

Total acked messages missing: 0

Total out-of-order messages: 0

None of the tests were able to produce message loss. None of the tests were able to produce ordering issues.

I am ignoring the kill-Qw-bookies test, though once we brought back both bookies we were able to consume all the messages.

Conclusions

Today we’ve seen that neither killing a broker nor isolating a broker from ZooKeeper can cause message loss, but both can cause non-acknowledged messages to be read (which is normal and unavoidable) and can cause duplication if the deduplication feature is not enabled. We saw that duplication can occur even in a fail-over situation. I will be testing deduplication further in the next part.

Killing a single bookie also cannot produce message loss with a Qw of 2 or more. Isolating a bookie from ZooKeeper also has no effect.

We saw that losing all the bookies of an ensemble will block readers until those bookies come back. If they don’t come back then your data is gone. But we can’t expect anything else when we lose all copies of the data.

Minimum values:

  • Write Quorum (Qw). You need a Qw of at least 2 to survive the loss of a bookie.

  • Ack Quorum (Qa). You need a Qa of at least 2 for AutoRecovery to be able to rereplicate a ledger after the loss of a bookie.

  • You need at least the default E bookies available in order to create readers and consumers, as their cursors are stored in ledgers created with that value.
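
Those minimums could be captured in a small sanity check. This helper is hypothetical (not part of the test repo) and encodes the E >= Qw >= Qa invariant plus the lessons above:

```python
def check_persistence_config(e, qw, qa):
    """Return warnings for an (E, Qw, Qa) persistence configuration,
    based on the failure modes observed in these tests."""
    warnings = []
    if not (e >= qw >= qa >= 1):
        warnings.append("invalid: must satisfy E >= Qw >= Qa >= 1")
        return warnings
    if qw < 2:
        warnings.append("Qw < 2: losing a single bookie loses data")
    if qa < 2:
        warnings.append("Qa < 2: AutoRecovery cannot repair a ledger "
                        "after losing a single bookie of its ensemble")
    return warnings
```

For example, the 2-2-1 configuration used in scenario 7 would be flagged for its Qa of 1, while 3-3-2 passes cleanly.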

All in all, Apache Pulsar is looking very robust so far and hard to misconfigure. I have found a couple of little things that warrant further investigation, such as producers slowing down significantly after a broker fail-over, if the number of messages in flight at the time was close to the internal pending message queue size. But regarding the claims of zero message loss and ordering guarantees, Apache Pulsar has come out flawless.

I have a shopping list of further tests to run, and we’ll see which ones I have time for:

  • Deduplication and broker fail-over

  • Long-running test (a few hours) with slow producers, randomly killing a node (broker or bookie) every 5 minutes. Check for message loss and ordering issues.

  • Send a load of messages. Then over a period of hours, kill a random bookie every 5-10 minutes. Check at the end if AutoRecovery has managed to maintain full replication without message loss.

  • Partitioned topics - all types of tests

  • Slow/flaky network tests

Thanks for reading.