20250119
- Zero Second Thinking
- did some household chores
This week will be busy. I hope I can make it through.
Designing
Data-Intensive Applications - Chapter 7 - Transactions
Transactions are a database feature that helps prevent anomalies
caused by read/write conflicts, allowing the application layer to avoid
handling these situations.
An example of a conflict is when two clients increment the same
counter. Both clients read the counter (almost) simultaneously, seeing
the value as 3
, and each writes 4
to the
database. Ideally, the counter should be 5
because the
increment occurs twice, but concurrent reads lead to this conflict. This
is called lost updates.
In addition to lost updates, the following are examples of race
conditions caused by concurrency:
- Dirty reads: A client reads data that includes another client’s
uncommitted writes.
- Dirty writes: A client overwrites data that includes another
client’s uncommitted writes.
This chapter also covers other concurrency issues such as read
skew, write skew, and phantom reads.
To prevent concurrency issues, three widely used isolation levels
are:
- Read Committed: Prevent dirty reads and dirty writes.
- Snapshot Isolation: Builds on read committed by also preventing read
skew. (implemented with MVCC - multi-version concurrency control)
- Serializable: Prevents all conflicts mentioned above by virtually
ordering requests to the database, ensuring full isolation.
There are three main approaches to implementing serializable
transactions.
- Literal Serial Execution: Transactions are executed one at a time in
a strict serial order. While it is a quite simple approach, this
significantly limits concurrency and is rarely used in practice.
- Two-Phase Locking (2PL): This approach ensures serializability by
requiring transactions to require locks on all the records they access
(there are specific rules on handling two-phase locking). Locks are held
until the transaction is completed, preventing conflicts but potentially
causing performance bottlenecks due to contention and deadlocks. It has
been widely used, but SSI is taking over its popularity.
- Serializable Snapshot Isolation (SSI): Unlike locking-based methods,
SSI does not lock records in advance. Instead, It detects conflicts at
the time of committing a transaction. If a conflict is found, the
transaction is aborted. By postponing conflict checks until the end, SSI
improves concurrency and performance, making it promising candidate for
future standards. Both SI and SSI use version-based isolation (MVCC:
multi-version concurrency control)
PostgreSQL:
Documentation: 17:13.2. Transaction Isolation
The Serializable isolation level is implemented using a technique
known in academic database literature as Serializable Snapshot
Isolation…
TODO:
- 2044E, F
- 2048D
- Naturalization Interview (20250207)
index 20250117 20250120