Time, Clocks, and the Ordering of Events in a Distributed System - Paper Review
This is a review of Leslie Lamport's paper Time, Clocks, and the Ordering of Events in a Distributed System. I focus on the paper's proof of the conditions under which a distributed system of physical clocks can produce an ordering of events that does not violate causality. The proof lies in the appendix of the paper, and is extremely brief: not unlike reading a mystery novel. I expand on the paper by presenting the proof with all the math worked out, explained at a less mysterious level of brevity.
The Happened-Before Relation
The happened-before relation $\to$ is defined by two rules (plus transitivity): if $a$ and $b$ are events in the same process and $a$ comes before $b$, then $a \to b$; and if $a$ is the sending of a message and $b$ its receipt, then $a \to b$.
A nice property of $\to$ is that $a$ can causally affect $b$ if and only if $a \to b$. For events $a$ and $b$ occurring on $P_i$ and $P_j$, $a$ can only affect $b$ if $P_j$ has learned about $a$ by receiving a message from $P_i$.
The clock conditions mirror these rules. C1: if $a$ and $b$ are events in process $P_i$, and $a$ comes before $b$, then $C_i\langle a \rangle < C_i\langle b \rangle$. C2: if $a$ is the sending of a message by $P_i$ and $b$ its receipt by $P_j$, then $C_i\langle a \rangle < C_j\langle b \rangle$. C1 ensures the first rule of $\to$ is respected, and C2 the second. Thus, a system of clocks satisfying the clock conditions will order events consistently with the $\to$ relation.
Logical Clocks
Before considering physical clocks, we consider a simpler clock. Logical clocks satisfy the clock conditions, but don't order events in a way compatible with our usual notion of time.
To order events with a logical clock, processes follow two rules.
IR1: Each process $P_i$ increments $C_i$ between any two successive events.
IR2: (a) If $P_i$ sends a message $m$, then $m$ contains a timestamp $T_m$ equal to $C_i$ at the time of sending. (b) Upon receiving $m$, process $P_j$ sets $C_j$ greater than or equal to its present value and greater than $T_m$.
Processes implementing these rules satisfy the clock conditions C1 and C2, by IR1 and IR2 respectively.
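The two rules can be sketched as a tiny Python simulation. This is a minimal sketch of IR1 and IR2, assuming a `Process` class of my own design; only the rules themselves come from the paper.

```python
class Process:
    """A process carrying a Lamport logical clock (IR1 and IR2)."""

    def __init__(self):
        self.clock = 0

    def local_event(self):
        # IR1: increment the clock between any two successive events.
        self.clock += 1

    def send(self):
        # IR2(a): the message carries the sender's timestamp T_m.
        self.clock += 1
        return self.clock

    def receive(self, t_m):
        # IR2(b): advance past both the local clock and T_m.
        self.clock = max(self.clock, t_m) + 1


p1, p2 = Process(), Process()
p1.local_event()           # C_1 = 1
t_m = p1.send()            # C_1 = 2; T_m = 2
p2.receive(t_m)            # C_2 jumps from 0 to 3
print(p1.clock, p2.clock)  # 2 3
```

Because the receiver's clock always lands past $T_m$, the send is stamped before the receive, which is exactly condition C2.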
Time, and Logical Clock Time
In a sense, logical clock time and the time we are used to line up. The total order $\Rightarrow$, obtained by sorting events by their clock values, guarantees that events happening before a message is sent are ordered before events on the receiving process after the message is received. This preserves causality: for an event on $P_i$ to cause an event on $P_j$, a message must be sent between the processes.
When events cannot cause each other, our logical clocks order them according to the number of events preceding them. This has no basis in our typical notion of time. To address this, we turn to physical clocks.
Physical Clocks
Physical clocks track time as we perceive it. A watch is a physical clock, and like a watch, physical clocks can become out of sync. This can break causality: if one clock is so far ahead of another that the receiving timestamp is smaller than the sending one, then receive $\Rightarrow$ send.
This section models a physical clock, derives the conditions under which causality can break, and proves which conditions ensure it does not.
Physical Clock Conditions
To model a physical clock, we put two conditions on its behavior.
PC1: There exists a constant $k \ll 1$ such that for all $i$:
$$\left| \frac{dC_i(t)}{dt} - 1 \right| < k$$
PC2: There exists a constant $\epsilon$ such that for all $P_i$, $P_j$:
$$|C_i(t) - C_j(t)| < \epsilon$$
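To make the two conditions concrete, here is a small sanity check against idealized linear clocks $C_i(t) = r_i t + o_i$; the rates, offsets, and bounds are made-up numbers, not values from the paper.

```python
# Idealized linear clocks C_i(t) = rate * t + offset (made-up values).
clocks = [
    (1.0001, 0.000),   # runs slightly fast
    (0.9999, 0.002),   # runs slightly slow, starts slightly ahead
]

k = 1e-3        # PC1: bound on drift rate
epsilon = 5e-3  # PC2: bound on mutual skew

# PC1: |dC_i/dt - 1| < k. A linear clock's derivative is its rate.
pc1 = all(abs(rate - 1) < k for rate, _ in clocks)

# PC2: |C_i(t) - C_j(t)| < epsilon, spot-checked at a few times t.
readings = [[rate * t + offset for rate, offset in clocks] for t in range(10)]
pc2 = all(abs(a - b) < epsilon for row in readings for a in row for b in row)

print(pc1, pc2)  # True True
```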
There are two important things to note about these conditions. First, physical clocks are not naturally imbued with them. In a real system, a process might turn off indefinitely, in which case its clock ceases advancing and no finite $\epsilon$ satisfies PC2. In this sense, the math in this paper isn't easily mapped to reality. Second, while physical clocks will generally run at rates close to correct (making PC1 reasonable), they will not automatically synchronize (making PC2 unreasonable without a synchronization algorithm).
Anomalous Clocks
Clocks become out of sync when they run at the wrong rate. Mathematically speaking, we look to the derivative. A clock running at the correct speed has $\frac{dC(t)}{dt} = 1$; one running too fast, $\frac{dC(t)}{dt} > 1$; too slow, $\frac{dC(t)}{dt} < 1$.
Breaking Causality
For any two processes communicating, there is a lower bound on how quickly they may exchange messages. At the limit, this is the distance between the processes divided by the speed of light. We call this lower bound $\mu$.
Our goal is to find values of $\epsilon$ for which, for any message sent by $P_i$ at $t$ and received by $P_j$ at $t'$, $C_i(t) < C_j(t')$. If this is true, $\Rightarrow$ will satisfy C2. As $\mu$ is a lower bound on message transmission time, $t' > t + \mu$. Thus, to satisfy C2:
$$C_i(t) \le C_j(t + \mu) \tag{2}$$
By PC1, the slowest possible clock advances at a rate $\frac{dC_j(t)}{dt} > 1 - k$. Thus, $C_j(t + \mu) > C_j(t) + (1 - k)\mu$. As we are interested in satisfying the inequality for all possible values of $C_j(t + \mu)$, we plug this lower bound into the right side of (2). By PC2, $C_i(t)$ may exceed $C_j(t)$ by as much as $\epsilon$, yielding a bound on $\epsilon$:
$$\epsilon \le (1 - k)\mu \tag{3}$$
If this inequality is true, $\Rightarrow$ will satisfy C2. And thus, for software satisfying C1, $\Rightarrow$ will preserve causality. If the inequality is false, it will be possible to order the receipt of a message before its sending, causing $\Rightarrow$ to break causality.
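A quick numeric reading of the bound: the helper below and its sample numbers are my own illustration, not from the paper.

```python
def causality_safe(epsilon, k, mu):
    """True when epsilon <= (1 - k) * mu, i.e. even the slowest allowed
    clock advances past the sender's timestamp during transmission."""
    return epsilon <= (1 - k) * mu


# Made-up system: drift bound k = 1e-6, minimum transit time mu = 1 ms.
print(causality_safe(0.0005, k=1e-6, mu=1e-3))  # True: skew is under the bound
print(causality_safe(0.0020, k=1e-6, mu=1e-3))  # False: a receipt could be ordered before its send
```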
Preserving Causality
Having discovered conditions under which causality breaks, we set about
describing an algorithm which preserves causality, and prove it works.
Our algorithm is the same one used for logical clocks, slightly modified
for physical time.
IRP1: The physical clock of each process is always advancing: $\frac{dC_i(t)}{dt} > 0$.
IRP2: (a) For each $i$, if $P_i$ sends a message $m$ at physical time $t$, then $m$ contains a timestamp $T_m = C_i(t)$. (b) Upon receiving a message $m$ at $t'$, process $P_j$ sets $C_j(t')$ to $\max(C_j(t'), T_m + \mu_m)$.
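The modified rules can be sketched as follows; the `PhysicalClock` class, its rates, and the delays are illustrative assumptions, with only the two rules taken from the text.

```python
class PhysicalClock:
    """A continuously advancing clock obeying IRP1 and IRP2."""

    def __init__(self, rate):
        self.rate = rate   # IRP1: rate > 0, the clock always advances
        self.value = 0.0

    def tick(self, dt):
        self.value += self.rate * dt

    def timestamp(self):
        # IRP2(a): a message carries T_m = C_i(t) at its send time.
        return self.value

    def receive(self, t_m, mu_m):
        # IRP2(b): never read below T_m plus the minimum delay mu_m.
        self.value = max(self.value, t_m + mu_m)


sender, receiver = PhysicalClock(rate=1.0), PhysicalClock(rate=0.98)
sender.tick(10.0)
receiver.tick(10.0)              # the slow clock reads 9.8
t_m = sender.timestamp()         # T_m = 10.0
receiver.tick(0.1)               # message in flight; receiver reads 9.898
receiver.receive(t_m, mu_m=0.05)
print(receiver.value)            # 10.05: the receipt is stamped after the send
```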
We model our processes as being arranged in a graph whose edges represent communication links. We make three assumptions about our graph and the behavior of its processes.
(1) The diameter $d$ of the graph is the largest number of links on a shortest path between any two processes; that is, any two processes are connected by a path of at most $d$ links.
(2) For a message sent at physical time $t$ and received at $t'$, we define the total delay $\nu_m = t' - t$, with $\nu_m < \xi_m + \mu_m$, where $\xi_m$ is the maximum unpredictable delay of the message, and $\mu_m$ the minimum predictable delay for the message. For speaking about the entire graph, we define $\nu = \max(\nu_m)$, $\xi = \max(\xi_m)$, and $\mu = \max(\mu_m)$, over all possible $m$.
(3) Every $\tau$ seconds a message is sent along every communication link.
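Assumption (1)'s diameter can be computed by breadth-first search; the five-process line topology below is a made-up example.

```python
from collections import deque

# Made-up topology: five processes joined in a line by communication links.
links = {
    "P1": ["P2"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
}


def link_distances(graph, start):
    """Fewest links from start to every reachable process (BFS)."""
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist


# The diameter d: the largest fewest-link distance over all pairs.
d = max(max(link_distances(links, p).values()) for p in links)
print(d)  # 4: P1 to P5 takes four links
```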
With these assumptions, we set about finding a value for $\epsilon$. To do so, we will find lower and upper bounds on $C_i(t)$, then subtract them to find a value for $\epsilon$.
The Slowest Clock
Say process $P_1$ sends a message to $P_2$ at time $t_1$, which is received at time $t_2$. Per PC1, for $t \ge t_2$, $C_2(t)$ must advance at least as fast as the slowest permitted clock:
$$C_2(t) \ge C_2(t_2) + (1 - k)(t - t_2) \tag{4}$$
Per IRP2, after the message has been received, the clock of the receiving process must satisfy $C_2(t_2) \ge C_1(t_1) + \mu_m$. Since $\mu_m > \nu_m - \xi_m \ge (t_2 - t_1) - \xi$, combining this with (4):
$$C_2(t) \ge C_1(t_1) + (1 - k)(t - t_1) - \xi \tag{9}$$
(9) sets a lower bound on the receiving clock's time, in terms of the sending clock. Now, imagine a series of processes $[P_1, \ldots, P_N]$. At time $t_i$, process $P_i$ receives a message from process $P_{i-1}$, sent at time $t'_{i-1}$, where $t'_{i-1} \le t_i \le t'_i$. Consider what happens when we repeatedly apply (9) to $C_3$:
$$C_3(t) \ge C_2(t'_2) + (1 - k)(t - t'_2) - \xi \ge C_1(t'_1) + (1 - k)(t - t'_1) - 2\xi$$
Every time we apply the inequality, it subtracts $\xi$ and decreases the time and clock subscripts by one. For a process $P_i$, there is a chain of $(i - 1)$ messages leading up to it, so (9) can be applied $(i - 1)$ times to find a lower bound on $P_i$'s clock: $C_i(t) \ge C_1(t'_1) + (1 - k)(t - t'_1) - (i - 1)\xi$.
Now imagine this scenario applied to our graph of processes. Recall assumptions (1) and (3), which say our graph has diameter $d$, and every process sends a message along each of its links every $\tau$ units of time. If we imagine timestamps propagating through the graph as messages are sent, every $d(\tau + \nu)$ units of time a timestamp from every process will have had the opportunity to propagate to every other process.
We can thus apply our earlier inequality for the minimum value of a clock in a series of processes, for a chain of length $d$. For any two processes $P_i$ and $P_j$, when $t \ge t_1 + d(\tau + \nu)$:
$$C_i(t) \ge C_j(t_1) + (1 - k)(t - t_1) - d\xi \tag{13}$$
This sets a lower bound on the value of a clock, in terms of some known value of another clock and the amount of time that has passed. We are now halfway to finding $\epsilon$. In the next section, we look for an upper bound.
The Fastest Clock
Say at time $t_x$, $C_x$ is the clock with the highest value. By PC1, no clock advances at a rate faster than $1 + k$, so for all $i$ and $t \ge t_x$:
$$C_i(t) \le C_x(t_x) + (1 + k)(t - t_x) \tag{14}$$
However, this argument is subtly incomplete. When a process receives a message, it sets its clock's value to be $\ge T_m + \mu_m$ (IRP2). Imagine $P_x$ sends a message at $t_x$, setting $T_m = C_x(t_x)$. Upon receiving the message at time $t'$, the receiver sets their clock to $T_m + \mu_m$. Is it possible for $T_m + \mu_m > C_x(t_x) + (1 + k)(t' - t_x)$? If so, (14) would be incorrect.
Substituting $T_m = C_x(t_x)$, the question becomes whether $\mu_m > (1 + k)(t' - t_x)$ is possible. Since $k \ge 0$, this would require $t' - t_x < \mu_m$, which is false: by definition $\mu_m$ is at most the smallest possible transmission time for the message, so $t' - t_x \ge \mu_m$. Hence, (14) remains true, even when considering clocks setting themselves ahead upon receiving messages from the fastest clock.
Synchronicity
Equation (13) holds for any $i, j$, so setting $j = x$, $t_1 = t_x$, and combining (13) and (14):
$$C_x(t_x) + (1 - k)(t - t_x) - d\xi \le C_i(t) \le C_x(t_x) + (1 + k)(t - t_x) \tag{16}$$
This holds for all $i$, and thus bounds the value of every clock. The farthest apart two clocks can be is the gap between the fastest and slowest possible clocks; that is, the difference between the upper and lower bounds of (16):
$$|C_i(t) - C_j(t)| \le 2k(t - t_x) + d\xi \tag{18}$$
Notice how (18) applies to all $t$, $t_x$ where $t \ge t_x + d(\tau + \nu)$. This means that for any $t$, we can find a $t_x$ such that $t - t_x = d(\tau + \nu)$. This is the lowest value for $t - t_x$, so we can replace the $t - t_x$ term in (18):
$$\epsilon \le d(2k(\tau + \nu) + \xi) \tag{20}$$
This concludes our proof of a bound on $\epsilon$. Our algorithm takes $d(\tau + \nu)$ units of time to synchronize, after which the amount two clocks may be out of sync is bounded.
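Plugging made-up system parameters into the bound gives a feel for its size; the function and numbers are my own illustration.

```python
def epsilon_bound(d, k, tau, nu, xi):
    """Skew bound after synchronization: d * (2k(tau + nu) + xi)."""
    return d * (2 * k * (tau + nu) + xi)


# Made-up system: diameter 3, drift 1e-6, a message on each link every
# 10 s, worst total delay 0.2 s, worst unpredictable delay 0.01 s.
print(epsilon_bound(d=3, k=1e-6, tau=10.0, nu=0.2, xi=0.01))
```

For numbers like these the unpredictable-delay term $d\xi$ dominates; the drift term only matters when messages are very infrequent.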
Conclusion
In equation (3), we found the maximum $\epsilon$ that preserves causality. We can combine (3) and (20) to investigate this further:
$$\epsilon \le (1 - k)\mu$$
$$d(2k(\tau + \nu) + \xi) \le (1 - k)\mu$$
All of these variables except $\tau$ describe physical properties of the system. $\tau$, the interval at which messages are sent, is a property of the software. So, we rearrange the inequality in terms of $\tau$:
$$\tau \le \frac{(1 - k)\mu/d - \xi}{2k} - \nu$$
Selecting a value for $\tau$ which satisfies this inequality guarantees ordering events such that, if $a$ caused $b$, $a \Rightarrow b$.
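To close the loop, the rearranged inequality can be evaluated for concrete numbers; the helper and its sample parameters are illustrative assumptions, not values from the paper.

```python
def tau_max(k, mu, d, xi, nu):
    """Largest message interval tau that still preserves causality:
    tau <= ((1 - k) * mu / d - xi) / (2 * k) - nu."""
    return ((1 - k) * mu / d - xi) / (2 * k) - nu


# Made-up system: drift 1e-6, minimum transit 1 ms, diameter 3,
# unpredictable delay 0.1 ms, total delay 2 ms.
limit = tau_max(k=1e-6, mu=1e-3, d=3, xi=1e-4, nu=2e-3)
print(limit)  # about 116.7 seconds between messages suffices here
```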