20250226

Designing Data-Intensive Applications - Chapter 9 - Consistency and Consensus

Linearizability

Linearizability makes a distributed system behave like a single-threaded program: all operations appear to execute atomically on a single copy of the data, and once an operation completes, its effect is visible to every operation that starts afterwards in real time. The concept is simple and appealing, but achieving it is challenging because all nodes must agree on a single global order of operations, which makes the system sensitive to network delays and failures.
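To make the real-time requirement concrete, here is a minimal sketch of a stale-read detector for a single register. The `Op` record and `stale_reads` function are hypothetical names made up for this note, and the check assumes writes never overlap each other and write distinct values. It only tests a necessary condition (no read returns a value that was already overwritten before the read began); it is not a full linearizability checker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str    # "write" or "read"
    value: int   # value written, or value returned by the read
    start: int   # real time at which the client sent the request
    end: int     # real time at which the client received the response


def stale_reads(history):
    """Return the reads that observed an already-overwritten value.

    Assumption (for illustration only): writes do not overlap in real
    time and each write uses a distinct value.
    """
    writes = sorted((op for op in history if op.kind == "write"),
                    key=lambda op: op.start)
    stale = []
    for read in (op for op in history if op.kind == "read"):
        # Writes that had fully completed before this read was issued.
        completed = [w for w in writes if w.end < read.start]
        if not completed:
            continue  # nothing to compare against; initial value is fine
        latest = completed[-1]
        # Legal results: the latest completed write, or any later write
        # that started before the read finished (concurrent with it).
        allowed = {w.value for w in writes
                   if w.start >= latest.start and w.start < read.end}
        if read.value not in allowed:
            stale.append(read)
    return stale


history = [
    Op("write", 1, start=0, end=1),
    Op("write", 2, start=2, end=3),
    Op("read", 1, start=4, end=5),   # stale: value 2 was already written
    Op("read", 2, start=4, end=5),   # fine
]
violations = stale_reads(history)
```

In this history the first read returns 1 even though the write of 2 completed before the read started, which a linearizable register would never allow.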

Causality

Causal consistency is a weaker guarantee than linearizability. It does not enforce a single global ordering of operations; it only ensures that causally related operations are observed in the correct order. That means when operation A depends on operation B, every node that sees A must already have seen B. However, if operation C is independent of A and B (concurrent with them), there is no enforced ordering between C and the other two.
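One common way to track "depends on" versus "independent" is a vector clock. The sketch below is an illustration under assumptions, not something prescribed by the chapter summary above: clocks are plain dicts mapping a node name to a counter, and the helper names (`tick`, `merge`, `happened_before`, `concurrent`) and node names (`n1`, `n2`, `n3`) are made up for this note.

```python
def tick(clock, node):
    """Return a copy of the clock with this node's counter incremented."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum, used when a node receives another node's clock."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def happened_before(a, b):
    """True if a is causally before b: a <= b everywhere and a < b somewhere."""
    nodes = a.keys() | b.keys()
    return (all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
            and any(a.get(n, 0) < b.get(n, 0) for n in nodes))


def concurrent(a, b):
    """True if the clocks differ and neither is causally before the other."""
    nodes = a.keys() | b.keys()
    differ = any(a.get(n, 0) != b.get(n, 0) for n in nodes)
    return differ and not happened_before(a, b) and not happened_before(b, a)


# Operation B happens on node n1.
clock_b = tick({}, "n1")
# Node n2 sees B, then performs A, so A depends on B.
clock_a = tick(merge({}, clock_b), "n2")
# Operation C happens on node n3 without having seen A or B.
clock_c = tick({}, "n3")
```

Here `happened_before(clock_b, clock_a)` holds, so B must be applied before A everywhere, while `concurrent(clock_a, clock_c)` holds, so replicas are free to apply A and C in either order.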

CAP Theorem and Linearizability vs Causality

The CAP theorem is often summarized as: only two of consistency (C), availability (A), and partition tolerance (P) can be achieved in a distributed system. However, networks are inherently unreliable, so partition tolerance (P) is not optional. In practice, the choice is between consistency (C) and availability (A) while a partition is in progress. (NOTE: The CAP theorem isn’t about picking “two out of three”, but rather about balancing trade-offs during partitions.)

Linearizability corresponds to strong consistency (the C in CAP), ensuring a single globally ordered state. Causal consistency is weaker than linearizability but still maintains order among dependent operations, and it can remain available during network faults where strict linearizability cannot. Systems that fully prioritize availability often settle for eventual consistency, which imposes even fewer ordering constraints than causal consistency.

Consensus Problem

Even with the relaxed requirements of causal consistency, nodes in a distributed system still need a mechanism to track and enforce causal dependencies, which introduces coordination challenges similar to the consensus problem. There are various approaches to coordinate distributed systems, such as quorum-based mechanisms, versioning with epoch numbers, and commit protocols like Two-Phase Commit (2PC).
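As a rough illustration of the commit-protocol idea, here is a minimal in-memory sketch of Two-Phase Commit. The `Participant` class and `two_phase_commit` function are hypothetical names for this note. The sketch deliberately leaves out everything that makes 2PC hard in practice: network calls, timeouts, durable logs, and recovery when the coordinator crashes between the two phases.

```python
class Participant:
    """A toy participant that votes yes or no when asked to prepare."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: promise to commit if asked, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (prepare): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit or abort): the decision is all-or-nothing.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


ok = [Participant("db1"), Participant("db2")]
ok_result = two_phase_commit(ok)

mixed = [Participant("db1"), Participant("db2", can_commit=False)]
mixed_result = two_phase_commit(mixed)
```

A single "no" vote in the prepare phase forces every participant to abort, which is what gives the protocol its atomicity.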


TODO:


index 20250225 20250227