This is a review of Impossibility of Distributed Consensus with One Faulty Process by M. J. Fischer, N. A. Lynch, and M. S. Patterson (referred to as the FLP result). I restate the paper's proof no consensus algorithm running in an asynchronous network can be totally correct. I've filled in the use of "by easy induction" with induction, use slightly different notation, and you can hover or click underlined terms to show their definition. I likely guild the lily.
We are presented with a set of processes, , which implement a consensus algorithm. Each may communicate with other processes with messages. Pending messages are stored in a message buffer. When attempts to receive a message, it receives a random pending message addressed to it, or . A process receiving infinitely many times is guaranteed to receive all pending messages. Though, for a single receive, it is valid for the system to return despite a pending message being available.
At any time, may take a step where it attempts to receive a message. A step by is an event, denoted . A run, denoted , is a sequence of events. We assume processes don't have clocks (no timeouts), and thus only change their state upon receiving a message.
The set of possible configurations of the system is denoted . Each configuration contains a possible system state. The configuration after applying an event is denoted , and after applying a run . A configuration is reachable from if . Accessible configurations are reachable from an initial configuration.
Lemma 0: If an event is applicable to it is applicable to all reachable from .
We say correctly implements a consensus algorithm, insofar as all messages sent can be received. Messages may be delated infinitely and are ordered randomly; thus, messages are applicable to all configurations reachable from the one in which they are sent.
Lemma 1: If the set of processes taking steps in and are disjoint, .
This follows, as the runs modify disjoint parts of the system's state.
Each has a decision value . As is correct, if any has a decision value, all will eventually have the same one. The goal of consensus is for all processes to agree on a decision value.
An initial configuration is a configuration with an empty message buffer, for which no process has a decision value, i.e. .
Partially correct consensus algorithms satisfy two requirements. (1) No accessible configuration has more than one decision value. (2) Both 0 and 1 are accessible decision values.
These correctness requirements seem reasonable. (1) requires processes come to agreement, and (2) they can come to agreement on all options.
A configuration is deciding if a process has a decision value. A consensus algorithm is totally correct if it is partially correct, and every admissible run is deciding. The difference between totally and partially correct consensus algorithms is, partially correct ones may never make a decision.
A configuration with a decision value of 1 is 1-valent, and 0 is 0-valent. If there are 1-valent and 0-valent configurations reachable from , is bivalent. If there are only 1-valent or only 0-valent configurations reachable from , is univalent.
Lemma 2: All totally correct consensus algorithms have a bivalent initial configuration.
Imagine each initial configuration arranged in a graph. Two configurations are connected if they differ by the state of only one process. As we've placed no constraints on initial configurations, other than the message buffer be empty and no decision value be reached, each configuration is connected to every other configuration by a chain of configurations.
We prove Lemma 2 by contradiction. Assume there is no bivalent initial configuration. For a totally correct consensus algorithm, each configuration must be deciding, and, as there are no bivalent configurations, must be one of 0-valent or 1-valent. Thus, our graph must contain a 0-valent configuration connected to a 1-valent one. Call these two configurations and respectively.
Let be the process who's state differs between and , and be a run from in which takes no steps. As and differ only by the state of , and takes no steps in , . As is reachable from , it must be 0-valent. As is reachable from , it must be 1-valent. Thus, we have arrived at a contradiction and one of or is bivalent, proving Lemma 2.
Lemma 3: Let be a bivalent configuration of a partially correct consensus algorithm, and be an event applicable to . Let denote the set of configurations reachable from without applying , and the set of configurations reachable from by applying : . contains a bivalent configuration.
By Lemma 0, as is applicable to , is applicable to all configurations in . This validates our construction of .
We prove Lemma 3 by contradiction. Assume there is no bivalent configuration in . Let be an -valent configuration reachable from , as is bivalent . If , is in and -valent, as is -valent. Otherwise, . In either case can be or , so contains both 0-valent and 1-valent configurations.
Call two configurations neighbors if one is reachable from the other via a single event. Let be an -valent configuration . As contains both 0 and 1 valent configurations, is a -valent configuration. Let be a run where . As the valence of and is different, there must exist neighbor configurations and and an event , such that , is -valent, and is -valent.
Case 1, . By Lemma 1 ; however, this is a contradiction, as is -valent, and is -valent (as is assumed to contain only univalent configurations, and is -valent).
Case 2, . Consider any deciding run from in which takes no steps, denoted . By Lemma 1, , which is valent per the definition of . Note also, , which is -valent, by the definition of and . Thus, is bivalent; however, this is a contradiction, as is deciding (and thus univalent).
Lemma 3 tells us, for any event , and bivalent configuration , there is an admissible run, , such that is bivalent. Lemma 1 tells us there is always a bivalent initial configuration. We can use these to construct an algorithm for advancing a partially correct consensus algorithm, such that it never reaches a decision value.
Starting from a bivalent configuration , when a process sends a message to be received in , apply , then apply . is bivalent, so this can be repeated indefinitely.
A previous essay described RAFT (a contemporary consensus algorithm), even RAFT is not totally correct. If the network is partitioned, and no single partition contains the majority of nodes, RAFT will stop advancing.