5.3.4 Detecting Concurrent Writes
Figure 5-12. Concurrent writes in a Dynamo-style datastore: there is no well-defined ordering.
- Dynamo-style DB allow concurrent writes
==> conflicts will occur even if strict quorums are used
If each node simply overwrote the value for a key whenever it received a write request from a client, the nodes would become permanently inconsistent
- to achieve eventual consistency, the replicas should converge toward the same value
5.3.4.1 last write wins (discarding current writes)
- each replica need only store the most "recent" value and allow "older" values to be overwritten and discarded
- then if we could define what is "recent", the replicas will eventually converge
- but with concurrent writes, the order is undefined
LWW (last write wins) -- force an arbitrary order by attaching a timestamp to each write
usage of LWW
DB |
description |
Cassandra |
only supported conflict resolution method |
Riak |
optional feature |
cost of LWW : durability
- for w successful concurrent writes, only 1 will survive and others will be discarded
- LWW may even drop non-concurrent writes
==> if losing data is not acceptable, LWW is a poor choice for conflict resolution
The only safe way of using a database with LWW is to ensure that a key is only written once and thereafter treated as immutable, thus avoiding any concurrent updates to the same key.
5.3.4.2 the "happens-before" relationship and concurrency
- B is causally dependent on A if B's operation builds upon A's operation.
==> B must have happened later than A
- A happens before B if B depends on A
- 2 operations are concurrent if neither happens before the other
possibilities of A & B
- A before B
- B before A
- A & B concurrent ==> need an algo to determine if concurrent or not
concurrency, time, and relativity
- for defining concurrency, exact time doesn't matter (problem of clocks in distributed system)
- 2 operations are called concurrent if they are both unaware of each other
- relativity in physics : information cannot travel faster than the speed of light
Consequently, two events that occur some distance apart cannot possibly affect each other if the time between the events is shorter than the time it takes light to travel the distance between them.
Reference
Designing Data-Intensive Applications by Martin Kleppman