Consistency Model
Introduction
The three main properties of correctness
,scalability
and fault tolerance
, consistency model takes care of the correctness
aspect. It applies to any distributed systems, it cant be avoided
Distributed systems involve concurrent reads and writes
- concurrent read and write on the same shared data structures may lead to in consistency in the data
Consistency model is a middleware before the application layer
Distributed Shared Memory(DSM)
Each machine in the network can access a common address space, they also have a local copy
of all memory
Even though theres 3 programs running, but its a sequential code since p2
busy waits for p1
to be done and p3
wait for p2
to be done to access both v1
and v2
Naive DSM protocol
On read:
ready any variable from local memory
On write:
perform asynchronous write, sends an update message to all other machine, do not wait
Problem
Causality violation
p2
might not get the most updated value of v1
and it sends done2
to p3
before v1
can reach p3
This will lead to p3 to carry out f3 with outdated value of v1 and wrong value of v2 since v2 is computed base on outdated value of v1 and hence lead to unexpected outcome
Summary
This protocol is fast but has unexpected consequences to user applications if protocol features are not known
What is consistency model
- Distributed systems promise to behave according to certain rules
- these rules constitute the consistency model
- A correct application must be aware of these
rules
- Unlike protocols, there is no correct or incorrect consistency models
- The choice of a consistency model depends on the application context
- A strict set of rules usually provide programming ease at the application layer, whereas a set of relaxed rules make application programming much harder
- Implementing certain consistency model in distributed systems is extremely hard problem due to
faults
andByzantine failures.
Strict Consistency
- Requirement: Each operation is stamped with a global wall clock time
- Rules:
- Each Read gets the
latest
written value (In terms of physical timestamp) - all operations in one machine are executed
in order
of their timestamps
- Each Read gets the
Excercise
W(v1)P1 -> w(done1)P1
w(done1)P1 -> r(done1)P2
r(done1)p2 -> r(v1)p2
hence w(v1)p1 -> r(v1)p2
Therefore, during r(done1)P2, we already have w(v1)P1 and therefore, while reading updated done1, we already have updated v1.
Sequential Consistency
- Requirement
Each operation is stamped with a global wall clock time
- Rules:
- All operations in one machine are executed in order
- all machines observe results according to
some
total order
It doesnt require you to read thelatest
write as long as you are reading it via a certain order
Example
Is this situation ever possible if the system implements Sequential Consistency?
Infer
Read 2.2 -> Write 1.1 (by construction)
Write 1.1 -> write1.2 (by rule 1)
Write 1.2 -> Read 2.1 (by construction)
Read 2.1 -> Read 2.2(by rule 1)
Therefore Read 2.2 -> Read 2.1 Which is a contradiction
Naive DSM protocol and sequential consistency
When you read, you read from the memory, and when you write you send the write to other machine's memory
Possible violation
- All operations in one machine are executed in order (e.g. in program order)
- Naïve DSM protocol does not wait for the Write to complete. Thus can go ahead to read from
local memory
before thepreceding
write finishes. This violates Rule #1 of sequential consistency.
- Naïve DSM protocol does not wait for the Write to complete. Thus can go ahead to read from
- All machines observe results according to some total order (i.e. reads see most recent writes)
- We note that two machines can issue concurrent Writes and different machines can receive these writes in different order, thus violating the total order enforced by sequential consistency model.
Causal Consistency
All causally-related
read/write ops were executed in an order that reflects their causality
Rules
- All operations in one machine are executed in order (e.g. in program order)
- Read/Write operations in the same machines are causally related w.r.t. the order they appear
- All
causally-related
operations are executed according tosome order capturing their causality
- Concurrent operations may be observed in different order by different machines
Example
Is this situation ever possible if the system implements causal Consistency?
Infer
- Read 2.2 -> Write 1.1 (by construction)
- Write 1.2 -> Read 2.1 (causal order)
- Read 2.1 -> Read 2.2(causal order)
- write 1.1 -> write 1.2 (causal order)
Therefore base on 1 and 5, Read 2.1 -> write 1.2
Advantages
- Parallel Operations
- parallel operations, that are not causally dependent, can be executed in different orders by different machines
- Sequential consistency enforces a global ordering among ALL operations
- Thus, performance should be better than sequentially consistent systems (strict order among read/write on the same object)
- On the same object, causally consistent systems never change the order of causally dependent updates across machines
- Programming Ease?
- Performance?
Total Store Order
Machines might be able to change the order of read and write if they are on different objects, it does have to respect the program order
a machine might execute w(x)a and r(y)? in any order. However, it cannot reorder, w(x)a and r(x)?
Rules
- Read by other machines
cannot return
a new value of an objectuntil the respective write to the same object
is observed byall machines
.
Implementation
- Store buffer
- fast, faster than Dram
- periodically flush to DSM
- For
Write
- Invalidation
- During
Read
- Invalidate all remote/local copies (other machine will be force to go to the share memory instead of their own memory)
- During
Eventual Consistency
Rules
- Allow stale reads (i.e reading outdated value) but ensure that reads will eventually reflect previously written values
Advantage
If a machine is disconnected from the network can still allow operation
- better to read stale values than nothing
- better to same in local than nothing
Given the relaxed orders on writes, allows increased parallelism
Disadvantage
Stale reads
Writes are not ordered, they may create conflicts later
- might involve application user intervention
Eventual vs Sequential Consistency
- Sequential Consistency is Pessimistic
- It thinks that something might go wrong and hence order should enforced
- It decides on the order of operations as they are executed
- Eventual Consistency is Optimistic
- Let updates happen as they occur, worry about their order later
- Imagine changing code in Git with your project partners. You may update local machine as they happen, later need to resolve conflict with the code of team members.
Example
File Synchorizer: A set of machines containing common files.
Goal
- The content in a common file eventually become identical in all machine
- Do not replace new content of the file with old one
Dealing with Conflicts
Depends on the type of application
- Resolve mailboxes with different messages (Easy)
- Changes to different files in a coding project (Medium)
- Changes to the same line of a file in a coding Project (Difficult)
Case Study: IVY
Read Protocol
Write Protocol
Properties
- Every page has exactly one owner at a time
- If the page is mapped r/w by the owner, then there does not exist any other copies
- If the page is mapped r/o by the owner, then the copy is identical to other copies
- The central manager is aware of all copies of a page in the network