Analysis of Errors and
Correctness Principles
Introduction
This appendix contains an analysis of errors arising
in the generation and processing of NTP timestamps and the determination of delays and offsets. It establishes error bounds as a
function of measured roundtrip delay and dispersion to the root (primary reference source) of the synchronization subnet. It also
discusses correctness assertions about these error bounds and the timetransfer, filtering and selection algorithms used in
NTP.
The notation w = [u,v] in the following describes
the interval in which u is the lower limit and v the upper limit, inclusive. Thus, u = min (w) <= v = max (w), and for scalar a, w
+ a = [u + a, v + a]. Table 12 shows a summary of other notation used in the analysis. The notation <x> designates the
(infinite) average of x, which is usually approximated by an exponential average, while the notation x "hat" designates an estimator
for x. The lowercase Greek letters theta, delta and epsilon are used to designate measurement data for the local clock to a peer
clock, while the uppercase Greek letters THETA, DELTA and EPSILON are used to designate measurement data for the local clock relative
to the primary reference source at the root of the synchronization subnet. Exceptions will be noted as they arise.
Timestamp Errors
The standard second (1 s) is defined as
"9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the
cesium133 atom" [ALL74b], which implies a granularity of about 1.1x1010 s. Other intervals can be determined as rational multiples
of 1 s. While NTP time has an inherent resolution of about 2.3x1010 s, local clocks ordinarily have resolutions much worse than this,
so the inherent error in resolving NTP time relative to the 1 s can be neglected.
In this analysis the local clock is represented by a
counter/divider which increments at intervals of s seconds and is driven by an oscillator which operates at frequency f sub c = n over
s for some integer n. A timestamp T(t) is determined by reading the clock at an arbitrary time t (the argument t will be usually
omitted for
conciseness). Strictly speaking, s is not known
exactly, but can be assumed bounded from above by the maximum reading error rho. The reading error itself is represented by the random
variable r bounded by the interval [rho ,0], where rho depends on the particular clock implementation. Since the intervals between
reading the same clock are almost always independent of and much larger than s, successive readings can be considered independent and
identically distributed. The frequency error of the clock oscillator is represented by the random
variable f bounded by the interval [phi ,phi ],
where phi represents the maximum frequency tolerance of the oscillator throughout its service life. While f for a particular clock is
a random variable with respect to the population of all clocks, for any one clock it ordinarily changes only slowly with time and can
usually be assumed a constant for that clock. Thus, an NTP timestamp can be represented by the random variable T:
T = t + r + f tau
,
where t represents a clock reading, tau represents
the time interval since this reading and minor approximations inherent in the measurement of tau are neglected.
In order to assess the nature and expected magnitude
of timestamp errors and the calculations based on them, it is useful to examine the characteristics of the probability density
functions (pdf) p sub r (x) and p sub f (x) for r and f respectively. Assuming the clock reading and counting processes are
independent, the pdf for r is uniform
over the interval [rho ,0]. With conventional
manufacturing processes and temperature variations the pdf for f can be approximated by a truncated, zeromean Gaussian distribution
with standard deviation sigma. In conventional manufacturing processes sigma is maneuvered so that the fraction of samples rejected
outside the interval [phi ,phi ] is acceptable. The pdf for the total timestamp error epsilon (x) is thus the sum of the r and f
contributions, computed as
which appears as a bellshaped curve, symmetric
about rho over 2 and bounded by the interval
Since f changes only slowly over time for any single
clock,
where epsilon without argument designates the
interval and epsilon (x) designates the pdf. In the following development subscripts will be used on various quantities to indicate to
which entity or timestamp the quantity applies. Occasionally, epsilon will be used to designate an absolute maximum error, rather than
the interval, but the distinction will be clear from context.
Measurement Errors
In NTP the roundtrip delay and clock offset between
two peers A and B are determined by a procedure in which timestamps are exchanged via the network paths between them. The procedure
involves the four most recent timestamps numbered as shown in Figure 14, where the theta sub 0 represents the true clock offset of
peer B relative to peer A. The T sub 1 and T sub 4 timestamps are determined relative to the A clock, while the T sub 2 and T sub 3
timestamps are determined relative to the B clock. The measured roundtrip delay delta and clock offset theta of B relative to A are
given by
The errors inherent in determining the timestamps
T1, T2, T3 and T4 are, respectively,
For specific peers A and B, where f sub A and f sub
B can be considered constants, the interval containing the maximum error inherent in determining delta is given by
In the NTP local clock model the residual frequency
errors f sub A and f sub B are minimized through the use of a typeII phaselock loop (PLL). Under most conditions these errors will
be small and can be ignored. The pdf for the remaining errors is symmetric, so that delta "hat" = <delta> is an unbiased
maximumlikelihood estimator for the true roundtrip delay, independent of the particular values of rho sub A and rho sub B.
However, in order to reliably bound the errors under
all conditions of component variation and operational regimes, the design of the PLL and the tolerance of its intrinsic oscillator
must be controlled so that it is not possible under any circumstances for f sub A or f sub B to exceed the bounds [phi sub A ,phi sub
A ] or [phi sub B ,phi sub B ], respectively. Setting rho = max ( rho sub A ,rho sub B ) for convenience, the absolute maximum error
epsilon sub delta inherent in determining roundtrip delay delta is given by
neglecting residuals.
As in the case for delta, where f sub A and f sub B
can be considered constants, the interval containing the maximum error inherent in determining theta is given by
Under most conditions the errors due to f sub A and
f sub B will be small and can be ignored. If rho sub A = rho sub B = rho; that is, if both the A and B clocks have the same
resolution, the pdf for the remaining errors is symmetric, so that theta "hat" = < theta > is an unbiased maximumlikelihood
estimator for the true clock offset theta sub 0, independent of the particular value of rho. If rho sub A != rho sub
B, < theta > is not an unbiased estimator; however, the bias error is in the order of
and can usually be neglected.
Again setting rho = max ( rho sub A , rho sub
B ) for convenience, the absolute maximum error epsilon sub theta inherent in determining clock offset theta is given by
Network Errors
In practice, errors due to stochastic network delays
usually dominate. In general, it is not possible to characterize network delays as a stationary random process, since network queues
can grow and shrink in chaotic fashion and arriving customer traffic is frequently bursty. However, it is a simple exercise to
calculate bounds on clock offset
errors as a function of measured delay. Let T sub 2
 T sub 1 = a and T sub 3  T sub 4 = b. Then,
The true offset of B relative to A is called theta
sub 0 in Figure 14. Let x denote the actual delay between the departure of a message from A and its arrival at B. Therefore, x + theta
sub 0 = T sub 2  T sub 1 == a . Since x must be positive in our universe, x = a  theta sub 0 >= 0, which requires theta sub 0
<= a. A similar argument requires that b <= theta sub 0, so surely b <= theta sub 0 <= a. This inequality can also be
expressed
which is equivalent to
In the previous section bounds on delay and offset
errors were determined. Thus, the inequality can be written
where epsilon sub theta is the maximum offset error
and epsilon sub delta is the maximum delay error derived previously. The quantity
called the peer dispersion, defines the maximum
error in the inequality. Thus, the correctness interval I can be defined as the interval
in which the clock offset C = theta is the midpoint.
By construction, the true offset theta sub 0 must lie somewhere in this interval.
Inherited Errors
As described in the NTP specification, the NTP time
server maintains the local clock THETA, together with the root roundtrip delay delta and root dispersion EPSILON relative to the
primary reference source at the root of the synchronization subnet. The values of these variables are either included in each update
message or can be derived as described in the NTP specification. In addition, the protocol exchange and clockfilter algorithm provide
the clock offset THETA and roundtrip delay delta of the local clock relative to the peer clock, as well as various error accumulations
as described below. The following discussion establishes how errors inherent in the timetransfer process accumulate within the subnet
and contribute to the overall error budget at each server.
An NTP measurement update includes three parts:
clock offset theta, roundtrip delay delta and maximum error or dispersion epsilon of the local clock relative to a peer clock. In case
of a primary clock update, these values are usually all zero, although EPSILON can be tailored to reflect the specified maximum error
of the primary reference source itself. In other cases THETA and delta are calculated directly from the four most recent timestamps,
as described in the NTP specification. The dispersion EPSILON includes the following contributions:
• 
1. Each time the local clock is read a reading error is incurred due
to the finite granularity or precision of the implementation. This is called the measurement dispersion rho. 
• 
2. Once an offset is determined, an error due to frequency offset or
skew accumulates with time. This is called the skew dispersion phi tau, where phi represents the skewrate constant (NTP.MAXSKEW over
NTP.MAXAGE in the NTP specification) and tau is the interval since the dispersion was last updated. 
• 
3. When a series of offsets are determined at regular intervals and
accumulated in a window of samples, as in the NTP clockfilter algorithm, the (estimated) additional error due to offset sample
variance is called the filter dispersion epsilon sub sigma. 
• 
4. When a number of peers are considered for synchronization and two
or more are determined to be correctly synchronized to a primary reference source, as in the NTP clockselection algorithm, the
(estimated)additional error due to offset sample variance is called the selection dispersion epsilon sub xi. 
Figure 15 shows how these errors accumulate in the
ordinary course of NTP processing. Received messages from a single peer are represented by the packet variables. From the four most
recent timestamps T1, T2, T3 and T4 the clock offset and roundtrip delay sample for the local clock relative to the peer clock are
calculated directly. Included in the message are the root roundtrip delay DELTA prime and root dispersion EPSILON prime of the peer
itself; however, before sending, the peer adds the measurement dispersion rho and skew dispersion phi tau, where these quantities are
determined by the peer and tau is the interval according to the peer clock since its clock was last updated.
The NTP clockfilter procedure saves the most recent
samples theta sub i and delta sub i in the clock filter as described in the NTP specification. The quantities rho and phi characterize
the local clock maximum reading error and frequency error, respectively. Each sample includes the dispersion epsilon sub i = rho + phi
(T sub 4  T sub 1 ), which is set upon arrival. Each time a new sample arrives all samples in the filter are updated with the skew
dispersion phi tau sub i, where tau sub i is the interval since the last sample arrived, as recorded in the variable peer.update. The
clockfilter algorithm determines the selected clock offset theta (peer.offset), together with the associated roundtrip delay delta
(peer.delay) and filter dispersion epsilon sub sigma, which is added to the associated sample dispersion epsilon sub i to form the
peer dispersion epsilon (peer.dispersion).
The NTP clockselection procedure selects a single
peer to become the synchronization source as described in the NTP specification. The operation of the algorithm determines the final
clock offset THETA (local clock), roundtrip delay DELTA (sys.rootdelay) and dispersion EPSILON (sys.rootdispersion) relative to the
root of the synchronization subnet, as shown in Figure 15. Note the inclusion of the selected peer dispersion and skew accumulation
since the dispersion was last updated, as well as the select dispersion epsilon sub xi computed by the clockselect algorithm itself.
Also, note that, in order to preserve overall synchronization subnet stability, the final clock offset THETA is in fact determined
from the offset of the local clock relative to the peer clock, rather than the root of the subnet. Finally, note that the packet
variables DELTA prime and EPSILON prime are in fact determined from the latest message received, not at the precise time the offset
selected by the clockfilter algorithm was determined. Minor errors arising due to these simplifications will be ignored. Thus, the
total dispersion accumulation relative to the root of the synchronization subnet is
where tau is the time since the peer variables were
last updated and  THETA  is the initial absolute error in setting the local clock.
The three values of clock offset, roundtrip delay
and dispersion are all additive; that is, if THETA sub i, DELTA sub i and EPSILON sub i represent the values at peer i relative to the
root of the synchronization subnet, the values
represent the clock offset, roundtrip delay and
dispersion of peer j at time t. The time dependence of theta sub j (t) and epsilon sub j (t) represents the localclock correction and
dispersion accumulated since the last update was received from peer i, while the term epsilon sub i represents the dispersion
accumulated by peer i from the time its clock was last set until the latest update was sent to peer j. Note that, while the offset of
the local clock relative to the peer clock can be determined directly, the offset relative to the root of the synchronization subnet
is not directly determinable, except on a probabilistic basis and within the bounds established in this and the previous
section.
The NTP synchronization subnet topology is that of a
tree rooted at the primary server(s). Thus, there is an unbroken path from every time server to the primary reference source. Accuracy
and stability are proportional to synchronization distance LAMBDA, defined as
The selection algorithm favors the minimumdistance
paths and thus maximizes accuracy and stability. Since THETA sub 0, DELTA sub 0 and EPSILON sub 0 are all zero, the sum of the clock
offsets, roundtrip delays and dispersions of each server along the minimumdistance path from the root of the synchronization subnet
to a given server i are the clock offset THETA sub i, roundtrip delay DELTA sub i and dispersion EPSILON sub i inherited by and
characteristic of that server.
Correctness Principles
In order to minimize the occurrence of errors due to
incorrect clocks and maximize the reliability of the service, NTP relies on multiple peers and disjoint peer paths whenever possible.
In the previous development it was shown that, if the primary reference source at the root of the
synchronization subnet is in fact a correct clock,
then the true offset theta sub 0 relative to that clock must be contained in the interval
When a number of clocks are involved, it is not
clear beforehand which are correct and which are not; however, as cited previously, there are a number of techniques based on
clustering and filtering principles which yield a high probability of detecting and discarding incorrect clocks. Marzullo and Owicki
[MAR85] devised an algorithm designed to find an appropriate interval containing the correct time given the confidence intervals of m
clocks, of which no more than f are considered incorrect. The algorithm finds the smallest single intersection containing all points
in at least m  f of the given confidence intervals.
Figure 16 illustrates the operation of this
algorithm with a scenario involving four clocks A, B, C and D, with the calculated time (shown by the up arrow symbol) and confidence
interval shown for each. These intervals are computed as described in previous sections of this appendix. For instance, any point in
the A interval may possibly represent the actual time associated with that clock. If all clocks are correct, there must exist a
nonempty intersection including all four intervals; but, clearly this is not the case in this scenario. However, if it is assumed that
one of the clocks is incorrect (e.g., D), it might be possible to find a nonempty intersection including all but one of the intervals.
If not, it might be possible to find a nonempty intersection including all but two of the intervals and so on.
The algorithm proposed by DEC for use in the Digital
Time Service [DEC89] is based on these principles. For the scenario illustrated in Figure 16, it computes the interval for m = 4
clocks, three of which turn out to be correct and one not. The low endpoint of the intersection is found as follows. A variable f is
initialized with the number of presumed incorrect clocks, in this case zero, and a counter i is initialized at zero. Starting from the
lowest endpoint, the algorithm increments i at each low endpoint, decrements i at each high endpoint, and stops when i >= m  f .
The counter records the number of intersections and thus the number of presumed correct clocks. In the example the counter never
reaches four, so f is increased by one and the procedure is repeated. This time the counter reaches three and stops at the low
endpoint of the intersection marked DTS. The upper endpoint of this intersection is found using a similar procedure.
This algorithm will always find the smallest single
intersection containing points in at least one of the original m  f confidence intervals as long as the number of incorrect clocks is
less than half the total f < m over 2. However, some points in the intersection may not be contained in all m  f of the original
intervals; moreover, some or all of the calculated times (such as for C in Figure 16) may lie outside the intersection. In the NTP
clockselection procedure the above algorithm is modified so as to include at least m f of the calculated times. In the modified
algorithm a counter c is initialized at zero. When starting from either endpoint, c is incremented at each calculated time;
however, neither f nor c are reset between finding the low and high endpoints of the intersection. If after both endpoints have been
found c > f, f is increased by one and the entire procedure is repeated. The revised algorithm finds the smallest
intersection of m  f intervals containing at least m  f calculated times. As shown in Figure 16, the modified algorithm
produces the intersection marked NTP and including the calculated time for C.
In the NTP clockselection procedure the peers
represented by the clocks in the final intersection, called the survivors, are placed on a candidate list. In the remaining steps of
the procedure one or more survivors may be discarded from the list as outlyers. Finally, the clockcombining algorithm described in
Appendix F provides a weighted average of the remaining survivors based on synchronization distance. The resulting estimates represent
a synthetic peer with offset between the maximum and minimum offsets of the remaining survivors. This defines the clock offset THETA,
total roundtrip total delay delta and total dispersion EPSILON which the local clock inherits. In principle, these values could be
included in the time interface provided by the operating system to the user, so that the user could evaluate the quality of
indications directly.
