TL;DR
Deep dive into why keeping clocks synchronized across distributed systems is fundamentally hard, covering NTP, PTP, logical clocks, and Google's TrueTime approach.
Key Points
- Quartz oscillators in standard computer clocks drift with temperature: a 10°C change can cause roughly 110 seconds of deviation per year (about 3.5 ppm)
- NTP achieves accuracy in the tens of milliseconds over the internet and sub-millisecond on LANs, which is insufficient for microsecond-precision domains such as high-frequency trading (HFT)
- Google's TrueTime returns time intervals with bounded uncertainty rather than single timestamps, enabling strong consistency in Spanner
- CockroachDB uses hybrid logical clocks (HLC) on commodity hardware, with a configurable maximum clock offset (500 ms by default) that bounds how far node clocks may diverge before consistency is at risk
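
To make the hybrid logical clock idea concrete, here is a minimal sketch of the general HLC update rules with a maximum-offset check. It is not CockroachDB's implementation: the names `Timestamp`, `HybridLogicalClock`, `now`, `update`, and `MAX_OFFSET_MS` are hypothetical, and rejecting remote timestamps that run too far ahead is a simplified assumption about how such a bound might be enforced.

```python
from dataclasses import dataclass

# Hypothetical constant mirroring the 500 ms figure above; illustrative only.
MAX_OFFSET_MS = 500


@dataclass(frozen=True, order=True)
class Timestamp:
    wall_ms: int   # physical component, milliseconds
    logical: int   # counter that breaks ties within the same wall_ms


class HybridLogicalClock:
    def __init__(self, physical_clock_ms, max_offset_ms=MAX_OFFSET_MS):
        self._now = physical_clock_ms      # callable returning wall time in ms
        self._max_offset_ms = max_offset_ms
        self._wall_ms = 0
        self._logical = 0

    def now(self):
        """Timestamp a local or send event."""
        pt = self._now()
        if pt > self._wall_ms:
            # Physical time moved forward: adopt it and reset the counter.
            self._wall_ms, self._logical = pt, 0
        else:
            # Physical time stalled or went backwards: bump the counter.
            self._logical += 1
        return Timestamp(self._wall_ms, self._logical)

    def update(self, remote):
        """Merge a timestamp received from another node."""
        pt = self._now()
        # Assumption: refuse timestamps further ahead than the allowed offset.
        if remote.wall_ms - pt > self._max_offset_ms:
            raise ValueError("remote timestamp exceeds max clock offset")
        prev = self._wall_ms
        self._wall_ms = max(prev, remote.wall_ms, pt)
        if self._wall_ms == prev and self._wall_ms == remote.wall_ms:
            self._logical = max(self._logical, remote.logical) + 1
        elif self._wall_ms == prev:
            self._logical += 1
        elif self._wall_ms == remote.wall_ms:
            self._logical = remote.logical + 1
        else:
            self._logical = 0
        return Timestamp(self._wall_ms, self._logical)
```

Passing the physical clock in as a callable keeps the sketch testable with a fake clock. The timestamps it returns never go backwards even if the wall clock does, which is the property that lets HLC run on commodity hardware.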
Why It Matters
Clock synchronization underpins critical distributed systems, from database consistency to financial transactions to debugging traces. Engineers must understand the tradeoffs between accuracy, latency, and complexity when choosing between NTP, PTP, logical clocks, or custom solutions like TrueTime. Wrong choices can silently break consistency guarantees or add unacceptable latency.
Source: arpitbhayani.me