Interfacing
Introduction
From the paper, there are only two operations that the interface needs to expose: get and put. However, there are many considerations and implementation details around these two operations that we need to take care of
How requests are handled
- It provides two core operations: get(key) and put(key, value, context)
- The key is treated as an opaque array of bytes. Dynamo applies a hash function #updateLink on the key to generate a storage location
- The value is also treated as an opaque array of bytes representing the data object
Write context
There may be instances where multiple versions of the same data are present, hence a write context has to be passed to the put request to determine which version of the data the write applies to
The write context contains information such as the vector clock
more about the use of vector clocks in versioning here
The context information is stored along with the object so that the system can verify the validity of the context object supplied to the put request
For concurrent writes, the coordinator node generates a new write context that subsumes all the vector clocks of the conflicting versions. This new reconciled write context is returned to the client on success.
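A minimal sketch of how a coordinator might subsume conflicting vector clocks into one reconciled context. The paper gives no code; `merge_clocks` and the dict-of-counters clock representation are assumptions for illustration.

```python
def merge_clocks(clocks):
    """Subsume a set of conflicting vector clocks by taking the
    element-wise maximum of every (node, counter) entry.
    Hypothetical helper, not the paper's actual implementation."""
    merged = {}
    for clock in clocks:
        for node, counter in clock.items():
            merged[node] = max(merged.get(node, 0), counter)
    return merged

# Two concurrent versions produced by different coordinators:
v1 = {"A": 2, "B": 1}
v2 = {"A": 1, "C": 3}
reconciled_context = merge_clocks([v1, v2])  # subsumes both branches
```

The merged clock causally descends from every conflicting version, which is what lets the reconciled write supersede all of them.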
Execution
- the node to handle the request is selected by either
  - a load balancer that selects based on load information
  - a partition-aware client library
- a coordinator is the node that handles the read and write operations, typically the first among the top N nodes in the preference list
  - If the requests are received through a load balancer, requests may be routed to any random node in the ring
  - If the node receiving the request is not in the top N nodes, it will not coordinate the request but will forward it to the first node among the top N nodes in the preference list
- This may lead to imbalanced load distribution
  - can be solved by allowing any of the top N nodes in the preference list to be the coordinator
  - The coordinator is then chosen to be the node that replied the fastest to the previous read request
  - This information should be stored in the context
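The forwarding rule above can be sketched as a small helper. `choose_coordinator` and its arguments are hypothetical names, assuming the preference list is already ordered by rank.

```python
def choose_coordinator(receiving_node, preference_list, n):
    """If the node that received the request is among the top N nodes
    of the key's preference list, it coordinates the request itself;
    otherwise it forwards to the first top-N node. Hypothetical helper."""
    top_n = preference_list[:n]
    if receiving_node in top_n:
        return receiving_node
    return top_n[0]  # forward to the first node in the top N

# A load balancer routed the request to node "D", which is outside
# the top 3 for this key, so it forwards to "A":
coordinator = choose_coordinator("D", ["A", "B", "C", "D"], n=3)
```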
- Clients can configure Dynamo's consistency and durability parameters like R, W and N on a per-instance basis to control tradeoffs.
- R (Read quorum size) - The minimum number of nodes that must participate in a successful read operation
- W (Write quorum size) - The minimum number of nodes that must participate in a successful write operation
- N (Replication factor) - The number of nodes that each data item is replicated to, more can be found in Replication
- Increasing R and W improves consistency but reduces availability. Setting lower values improves availability but risks inconsistency.
- N controls the durability and number of replicas for each data item. Higher N means more copies are stored.
- Typical values are R=2, W=2 and N=3. Because the quorum is sloppy, this still provides only eventual consistency, but high availability
- Setting W=1 and N=1 means a write is considered successful even if it succeeds on only one node. This maximizes write availability.
- Setting R=1 means reads can be satisfied from any node without checking other replicas. This improves read performance.
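The R/W/N tradeoff can be captured in a small config sketch. The `QuorumConfig` class and `quorums_overlap` check are illustrative assumptions; only the R + W > N overlap rule comes from the paper.

```python
from dataclasses import dataclass

@dataclass
class QuorumConfig:
    n: int  # replication factor: copies of each data item
    r: int  # read quorum: nodes that must answer a read
    w: int  # write quorum: nodes that must acknowledge a write

    def quorums_overlap(self) -> bool:
        # R + W > N guarantees every read quorum intersects
        # every write quorum (a quorum-like overlap property)
        return self.r + self.w > self.n

typical = QuorumConfig(n=3, r=2, w=2)       # common Dynamo setting
write_optimised = QuorumConfig(n=3, r=1, w=1)  # fast but risks stale reads
```

Lowering R or W trades the overlap guarantee for latency and availability, which is exactly the per-instance tuning the notes describe.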
Put request handling
- Upon receiving a put() request for a key, the coordinator generates a vector clock for the new version and writes the new version locally
- The coordinator then sends the new version with the vector clock to the N highest-ranked reachable nodes; if at least W-1 nodes respond, the write is considered successful
- The client must specify which version it is updating by passing the context it obtained from an earlier read operation
- Replication is required for consistency and durability
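The coordinator-side put flow above can be sketched as follows. Everything here (`increment`, `put`, the `send_write` transport callback) is a hypothetical simplification: the coordinator's own local write counts as one replica, so only W-1 remote acknowledgements are needed.

```python
def increment(clock, node):
    """Advance this node's counter in a dict-based vector clock."""
    clock = dict(clock)
    clock[node] = clock.get(node, 0) + 1
    return clock

def put(key, value, context, coordinator, replicas, w, send_write):
    """Sketch of coordinator-side put: bump the vector clock from the
    client-supplied context (the version being updated), then send the
    new version to the N-1 other highest-ranked reachable nodes.
    The write succeeds once W-1 of them acknowledge."""
    new_clock = increment(context.get("clock", {}), coordinator)
    acks = sum(1 for r in replicas if send_write(r, key, value, new_clock))
    return acks >= w - 1, new_clock

# Simulated transport in which every replica acknowledges:
ok, clock = put("user:42", b"cart-v2", {"clock": {"A": 1}},
                coordinator="A", replicas=["B", "C"], w=2,
                send_write=lambda *_: True)
```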
Get request handling
- The coordinator requests all existing versions of the data for that key from the N highest-ranked reachable nodes in the preference list for that key
- If there are multiple branches that cannot be syntactically reconciled, it returns all the objects at the leaves, with the corresponding version information in the context
- The divergent versions are then reconciled, and the reconciled version superseding the current versions is written back
- A better visualisation here
Do we really need all the versions of the data?
What if too many versions accumulate and the query becomes slow?
For divergent versions, do we return the reconciled version?
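The leaf-version filtering on a get can be sketched like this. `descends` and `get_leaves` are hypothetical helpers assuming dict-based vector clocks: a version is dropped only when another version's clock causally descends from it; concurrent branches all survive and are returned to the client.

```python
def descends(a, b):
    """True if clock a causally descends from clock b
    (every counter in b is matched or exceeded in a)."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())

def get_leaves(versions):
    """Sketch of coordinator-side get: discard versions that some other
    version causally supersedes; whatever remains are the leaves of the
    version tree, all of which are returned for reconciliation."""
    return [v for v in versions
            if not any(v is not other and descends(other["clock"], v["clock"])
                       for other in versions)]

versions = [
    {"value": b"x", "clock": {"A": 1}},  # superseded by the next version
    {"value": b"y", "clock": {"A": 2}},
    {"value": b"z", "clock": {"B": 1}},  # concurrent branch, also a leaf
]
leaves = get_leaves(versions)  # keeps the two concurrent leaves
```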
State machine
Each client request results in the creation of a state machine on the node that received the client request.
The state machine contains all the logic for identifying the nodes responsible for a key, sending the requests, waiting for responses, potentially doing retries, processing the replies, and packaging the response to the client
They use a state machine to keep track of the current status of the client request; a desired output is only produced when the state machine terminates
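A toy version of such a per-request state machine for a read. The states and the `read_replica` callback are invented for illustration (the paper describes the machine only in prose); the point is that the result is produced only when the machine reaches its terminal state.

```python
from enum import Enum, auto

class State(Enum):
    SEND = auto()     # identify and contact the responsible nodes
    WAIT = auto()     # gather responses until R replies arrive
    PROCESS = auto()  # package the replies for the client
    DONE = auto()     # terminal state: output is produced

def run_read(replicas, r, read_replica):
    """Drive a single read request through the state machine,
    returning the packaged responses once R replicas have answered."""
    state, responses, result = State.SEND, [], None
    while state is not State.DONE:
        if state is State.SEND:
            targets = list(replicas)
            state = State.WAIT
        elif state is State.WAIT:
            for node in targets:
                responses.append(read_replica(node))
                if len(responses) >= r:
                    break
            state = State.PROCESS
        elif state is State.PROCESS:
            result = responses[:r]  # reply to the client with R responses
            state = State.DONE
    return result
```

Retries would fit naturally as a transition from WAIT back to SEND when too few replicas respond, which is why the paper models the request as a state machine rather than a single call.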
Read operation
To read more about versioning, see Data versioning