Anomalies

An anomaly or read phenomena can happen when a transaction reads data that may have been modified by another concurrent transaction.

Dirty read

A dirty read happens when a transaction T1 reads data that has been modified by a concurrent transaction T2 that has not has been committed or rolled back yet. T1 ends up working with stale data if T2 does not commit.

T2 starts executing and sets x to a new value, T1 starts executing and reads x, the value of x is the value just set by T2, T2 rolls back, the value of x is not persisted to the database but T1 will move forward with the stale value of x that was written before T2 rolled back.

Non-repeatable read

A non-repeatable read happens when a transaction T1 reads a value x before and after it is modified by a transaction T2 and T2 has committed. If x has been assigned a new value by T2, T1 will have seen two different values for the same variable in a transaction.

T1 reads x with value 0, T2 updates x to 1 and commits and then T1 reads x again but the value is now 1.

Phantom reads

A phantom read is almost like a non-repeatable read but it is said to happen when more than one row is being selected.

A transaction T1 executes a statement to select a set of rows, a concurrent transaction T2 executes a statement that modifies, inserts or deletes a row and commits. T1 executes the first statement again hoping to select the same set of rows but a different set is returned.

T1 starts executing and selects a set of rows from a, while T1 executing, T2 inserts a new row into a and commits, when T1 tries to read the same set of rows from a it gets a different set because a new element was added by T2.

Lost updates

A lost update may happen when transactions T1 and T2 both read and try to update x using the just read value. Similarly to what happens when two threads try to update a variable without synchronization, one of the transactions may read x just a little bit before the other one updates it and end up working with a stale value.

In this example, there are two transactions T1and T2. Both are executing concurrently and both read x and increment its value by 1, if x had the value 0 the expectation would be that the new value of x would be 2 after both transactions commit but the reality is that the new value of x will be 1 because both transactions read 0 before incrementing it by 1.

Dirty writes

A dirty write happens when a transaction reads an uncommitted value from another transaction, modifies and writes it.

T1 and T2 are executing concurrently, T2 sets x to 1, T1 reads x with value 1, T2 rolls back, T1 increments x by 1 resulting in 2 and writes it. The value of x is 2 instead of its previous value increased by 1.

Write skew

A write skew happens when concurrent transactions respect database invariants(e.g table.x > 0) but when committed break one or more invariants.

Assuming that a.balance + b.balance > 0 is an invariant, both transactions T1 and T2 respect the invariant on their own and are allowed to commit, but after both transactions are committed the invariant is broken.

Isolation levels

Read and write anomalies can be avoided by choosing the right isolation level.

Read uncomitted

Dirty reads are allowed, transactions can read data that has been modified by other transactions even if they have not been committed yet.

Read committed

Allows transaction T1 to read data that has been committed by other transactions while T1 is still executing, repeatable reads are not guaranteed.

Repeatable reads

Same as read committed but reading x will always result in the same value.

Serializable

Each transaction behaves like they are executed to completion before other transactions starts executing.

References

Database internals