Data versioning

Introduction

Dynamo provides eventual consistency, which allows for updates to be propagated to all replicas asynchronously.

Design details

Conflicting versions

In the event of concurrent update and failures, there might be multiple versions of the data, resulting in a conflict

Client must perform reconciliation in order to collapse multiple branches of data evolution back into one

Example

Merge
Multiple versions of shopping cart available
Merging is performed to make sure no data is lost
deleted item may resurface

Causality

Dynamo uses vector clocks to capture causality between different versions of the same object

Info

Vector clocks is effectively a list of (node,counter) pairs in this case

Example

D1 ([Sx,1])D2 ([Sx,2])D3 ([Sx,2],[Sy,1])D4 ([Sx,2],[Sz,1])D5 ([Sx,3],[Sy,1],[Sz,1])write handled by Sxwrite handled by Sxwrite handled by Szwrite handled by Syreconciled and written by Sx

  • A client creates a new object
  • Sx handles the write and create a new object D1 with vector clock [Sx,1]
  • client update the object, request still handled by Sx hence only increment the sequence number by 1 and a new object D2
  • D2 descends from D1, hence overwriting D1
    • There may be replicas of D1 elsewhere that has not seen D2
  • Client try to update again, but this time handle by Sy, resulting in creation ob object D3, since it is a new machines, new vector clock is appended, thus the final vector clock ([Sx,2],[Sy,1])
  • Then another client try to update again, but the request is handled by Sz and Sz only has the version of D2 available, hence arriving at object D4 with ([Sx,2],[Sz,1])
  • Base on the vector clock, a node that is aware of D3 and receives D4 will know that they dont have a causal relationship and reconciliation have to be carried out
  • Assume a client performs a read, the summary of the clocks of D3 and D4 will be sent
  • If reconciliation is performed a final version of D5 is written by Sx with the combination of all the vector clocks ([Sx,3],[Sy,1],[Sz,1])

One can determine whether two versions of an object are on parallel branches or causal ordering by examine they vector clocks

if the properties 1 and 2 of vector comparison if not satisfied then there may be a conflict and require reconciliation, else can be ignored

Vector clock issues

The size of the vector clocks may gets too big, if many servers coordinate the writes to a object

Hint

  • the writes are usually handled by one of the top N nodes in the preference list
  • In case of network partition or multiple server failures, write request may be handled by nodes that are not in the top N nodes
  • This causing the size of the vector clocks to grow (if not it will always be < N)

Vector clock truncation scheme

dynamo store a timestamp(along with (node,counter) pair) that indicates the last time the node updated the data item.
When the number of (node,counter) pair exceeds a threshold
The oldest pair is deleted

Warning

May lead to inefficiencies in reconciliation as the descendant relationship cannot be derived accurately