January 22, 2023

Paper: VR Revisited - Checkpoint-Based Replica Recovery (part 6)

In part 5 we looked at basic and log-suffix recovery and concluded that a persistent State Machine Replication (SMR) protocol should perhaps not combine asynchronous log flushing with a recovery protocol for recovering from data loss, at least not one that gets stuck after a cluster crash. Checkpoint-based recovery doesn’t change that position, but it does introduce a very important component of any SMR protocol: checkpointing application state.

Jack Vanlightly

January 17, 2023

Paper: VR Revisited - Log-Based Replica Recovery (part 5)

Jack Vanlightly

One of the selling points of VR Revisited is that replicas do not need to write anything to stable storage, or can choose to write to storage asynchronously, which can give the protocol a latency advantage over protocols that require fsyncing key operations.

January 2, 2023

Paper: VR Revisited - Application state and commit-number monotonicity (part 4)

Jack Vanlightly

Part 4 was going to focus on the replica recovery sub-protocol, but while writing the replica recovery specification I discovered that I had failed to enforce a critical property: commit-number monotonicity.