Interfacing

Introduction

According to the paper, the system's interface exposes only two operations, get and put. However, there are many considerations and implementation details behind these two operations that we need to take care of.

How requests are handled

Write context

There may be instances where multiple versions of the same data are present, so a write context has to be passed to the put request to determine which version of the data to write to.
The write context contains information such as the vector clock (more about the use of vector clocks in versioning here).

The context information is stored along with the object so that the system can verify the validity of the context object supplied to the put request

For concurrent writes, the coordinator node generates a new write context that subsumes all the vector clocks of the conflicting versions. This new reconciled write context is returned to the client on success.
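To make "subsumes all the vector clocks" concrete, here is a minimal sketch. It assumes a vector clock is a plain dict from node id to counter, and that the coordinator bumps its own counter after merging. The names `merge_clocks`, `reconcile` and `descends` are made up for illustration and do not come from the paper.

```python
from typing import Dict, Iterable

# Hypothetical representation: node id -> logical counter.
VectorClock = Dict[str, int]


def descends(a: VectorClock, b: VectorClock) -> bool:
    """Return True if clock `a` is at least as new as clock `b` on every node."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def merge_clocks(clocks: Iterable[VectorClock]) -> VectorClock:
    """Element-wise maximum, so the result subsumes every input clock."""
    merged: VectorClock = {}
    for clock in clocks:
        for node, counter in clock.items():
            merged[node] = max(merged.get(node, 0), counter)
    return merged


def reconcile(clocks: Iterable[VectorClock], coordinator: str) -> VectorClock:
    """Merge the conflicting clocks, then record the coordinator's own write."""
    merged = merge_clocks(clocks)
    merged[coordinator] = merged.get(coordinator, 0) + 1
    return merged


# Two divergent versions: neither clock descends from the other.
v1 = {"A": 2, "B": 1}
v2 = {"A": 1, "C": 3}
context = reconcile([v1, v2], coordinator="A")
print(context)  # {'A': 3, 'B': 1, 'C': 3}
```

The reconciled clock descends from both `v1` and `v2`, so a later write that carries this context replaces both conflicting versions.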

Execution

Important

Always routing writes through the same coordinator (the first node in the preference list) may lead to imbalanced load distribution

  • This can be solved by allowing any of the top N nodes in the preference list to be the coordinator
  • The coordinator is chosen to be the node that replied fastest to the previous read request
  • This information should be stored in the context
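
The bullets above can be sketched roughly as follows. This is only an illustration: the function names, the `fastest_reader` context field and the fallback to the first node are assumptions of mine, not something the paper specifies.

```python
from typing import Dict, List, Optional


def fastest_node(latencies_ms: Dict[str, float]) -> str:
    """Return the node that answered the previous read with the lowest latency."""
    return min(latencies_ms, key=latencies_ms.get)


def pick_coordinator(
    preference_list: List[str],
    n: int,
    fastest_reader: Optional[str],
) -> str:
    """Pick a write coordinator among the top-N nodes of the preference list."""
    top_n = preference_list[:n]
    if not top_n:
        raise ValueError("preference list is empty")
    # Prefer the node remembered in the context, if it is still eligible.
    if fastest_reader in top_n:
        return fastest_reader
    # Assumed fallback: the first node in the preference list.
    return top_n[0]


preference_list = ["n1", "n2", "n3", "n4"]
read_latencies = {"n1": 12.0, "n2": 4.5, "n3": 9.1}

# The read path stores the fastest responder in the (hypothetical) context.
context = {"fastest_reader": fastest_node(read_latencies)}
coordinator = pick_coordinator(preference_list, 3, context["fastest_reader"])
print(coordinator)  # n2
```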
Info

  • Increasing R and W improves consistency but reduces availability. Setting lower values improves availability but risks inconsistency.
  • N controls the durability and number of replicas for each data item. Higher N means more copies are stored.
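  • Typical values are N=3, R=2 and W=2. Since R + W > N, the read and write sets overlap on at least one replica, which gives a quorum-like balance between consistency and availability.
  • Setting W=1 means a write is considered successful even if it succeeds on only one node. This maximizes write availability but increases the risk of inconsistency.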
  • Setting R=1 means reads can be satisfied from any node without checking other replicas. This improves read performance.
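
A small sketch to play with these trade-offs. The `QuorumConfig` class and its method names are hypothetical helpers for these notes; the only rule taken as given is that read and write sets overlap when R + W > N.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    n: int  # number of replicas per data item
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self) -> None:
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("R and W must each be between 1 and N")

    def overlaps(self) -> bool:
        """True when every read set intersects every write set (R + W > N)."""
        return self.r + self.w > self.n

    def read_tolerates(self) -> int:
        """How many replicas can be unavailable while reads still succeed."""
        return self.n - self.r

    def write_tolerates(self) -> int:
        """How many replicas can be unavailable while writes still succeed."""
        return self.n - self.w


typical = QuorumConfig(n=3, r=2, w=2)
fast = QuorumConfig(n=3, r=1, w=1)

print(typical.overlaps(), typical.write_tolerates())  # True 1
print(fast.overlaps(), fast.write_tolerates())        # False 2
```

With R=1 and W=1 both operations survive two unavailable replicas, but a read may hit a replica that has not seen the latest write.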

Put request handling

Get request handling

Question

Do we really need all the versions of the data?
What if too many versions accumulate and queries become slow?
For divergent versions, do we return the reconciled version?

State machine

Each client request results in the creation of a state machine on the node that received the client request.
The state machine contains all the logic for identifying the nodes responsible for a key, sending the requests, waiting for responses, potentially doing retries, processing the replies and packaging the response to the client.

The state machine keeps track of the current status of the client request. The desired output is produced only when the state machine terminates.

Read operation

  • Send read requests to the nodes
  • Wait for the minimum number of required responses
  • Failed: too few replies within a given time bound
  • Otherwise, gather all data and determine which to return
  • Versioning not enabled: return data
  • Versioning enabled: syntactic reconciliation, write data, generate opaque write context
To read more about versioning, see Data versioning.
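
A rough, synchronous sketch of the read flow as a state machine, with several assumptions of mine:

  • replicas are plain callables that return a (value, vector clock) pair, or None for a timeout
  • versioning is always enabled
  • the "write data" step is left out

The state names and the `ReadStateMachine` class are invented for these notes.

```python
from enum import Enum, auto
from typing import Callable, Dict, List, Optional, Tuple

Clock = Dict[str, int]                        # node id -> counter
Version = Tuple[str, Clock]                   # (value, vector clock)
Replica = Callable[[str], Optional[Version]]  # None models a timeout or failure


class State(Enum):
    SEND = auto()
    WAIT = auto()
    GATHER = auto()
    DONE = auto()
    FAILED = auto()


def descends(a: Clock, b: Clock) -> bool:
    """True if clock `a` is at least as new as clock `b` for every node."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


class ReadStateMachine:
    def __init__(self, key: str, replicas: List[Replica], r: int) -> None:
        self.key = key
        self.replicas = replicas
        self.r = r
        self.state = State.SEND
        self.replies: List[Optional[Version]] = []
        self.versions: List[Version] = []
        self.context: Optional[Clock] = None

    def run(self) -> State:
        while self.state not in (State.DONE, State.FAILED):
            if self.state is State.SEND:
                # Send the read request to every replica.
                self.replies = [replica(self.key) for replica in self.replicas]
                self.state = State.WAIT
            elif self.state is State.WAIT:
                # Fail if fewer than R replicas answered in time.
                self.versions = [v for v in self.replies if v is not None]
                enough = len(self.versions) >= self.r
                self.state = State.GATHER if enough else State.FAILED
            elif self.state is State.GATHER:
                # Syntactic reconciliation, then build the write context.
                self.versions = self._reconcile(self.versions)
                self.context = self._merge([c for _, c in self.versions])
                self.state = State.DONE
        return self.state

    @staticmethod
    def _reconcile(versions: List[Version]) -> List[Version]:
        """Drop duplicates and versions that a strictly newer clock dominates."""
        survivors: List[Version] = []
        for value, clock in versions:
            dominated = any(
                other != clock and descends(other, clock) for _, other in versions
            )
            if not dominated and (value, clock) not in survivors:
                survivors.append((value, clock))
        return survivors

    @staticmethod
    def _merge(clocks: List[Clock]) -> Clock:
        """Element-wise maximum of the surviving clocks."""
        merged: Clock = {}
        for clock in clocks:
            for node, counter in clock.items():
                merged[node] = max(merged.get(node, 0), counter)
        return merged


replicas = [
    lambda key: ("x1", {"A": 1}),  # stale version
    lambda key: ("x2", {"A": 2}),  # newer version
    lambda key: None,              # replica that timed out
]
machine = ReadStateMachine("cart", replicas, r=2)
final_state = machine.run()
print(final_state, machine.versions, machine.context)
```

With R=2 the run ends in `State.DONE` and only the newer version survives. With R=3 the same replicas end in `State.FAILED`, because only two replies arrive.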