Introduction

#distributedSystems #Project

Dynamo

![[Dynamo.pdf]]

Introduction

This series studies the design and implementation of Amazon's Dynamo, and we will try to build a simplified version of it in Go.

Dynamo is designed for applications with a very specific use case: those that only need a simple primary-key interface with high availability. The main design goal is that the database stays accessible and writable even while nodes are being added or are failing.
To achieve high availability, Dynamo sacrifices consistency under certain failure scenarios.
Dynamo is targeted mainly at applications that need an "always writeable" data store, where no updates are rejected due to failures or concurrent writes.

Design considerations

Amazon first used Dynamo as an internal service for some use cases on its e-commerce site, and built it around the requirements of the services it serves.

System Architecture

The table below summarizes the problems Dynamo addresses, the technique used for each, and the main advantage of that technique:

| Problem | Technique | Advantage |
| --- | --- | --- |
| Partitioning | Consistent hashing | Incremental scalability |
| High availability for writes (versioning) | Vector clocks with reconciliation during reads | Version size is decoupled from update rates |
| Handling temporary failures | Sloppy quorum and hinted handoff | Provides high availability and durability guarantee when some replicas are not available |
| Recovering from permanent failures | Anti-entropy using Merkle trees | Synchronizes divergent replicas in the background |
| Membership and failure detection | Gossip-based membership protocol and failure detection | Preserves symmetry and avoids a centralized registry for membership and liveness info |
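
To make the vector clock row more concrete, here is a minimal sketch in Go of how two versions of an object could be compared. The type `VectorClock`, its methods, and the node names are my own assumptions for illustration, not Dynamo's actual code; the sketch only shows the comparison rule, not reconciliation or clock truncation.

```go
package main

import "fmt"

// VectorClock maps a node ID to the number of updates that node has
// coordinated for one object. (Hypothetical type for this series.)
type VectorClock map[string]uint64

// Increment returns a copy of vc with node's counter advanced by one.
// The receiver is left unchanged.
func (vc VectorClock) Increment(node string) VectorClock {
	next := make(VectorClock, len(vc)+1)
	for n, c := range vc {
		next[n] = c
	}
	next[node]++
	return next
}

// Descends reports whether vc is equal to or newer than other, i.e.
// every counter in other is <= the matching counter in vc.
func (vc VectorClock) Descends(other VectorClock) bool {
	for n, c := range other {
		if vc[n] < c {
			return false
		}
	}
	return true
}

// Concurrent reports whether neither clock descends from the other.
// Such versions conflict and must be reconciled by the reader.
func Concurrent(a, b VectorClock) bool {
	return !a.Descends(b) && !b.Descends(a)
}

func main() {
	v1 := VectorClock{}.Increment("nodeA") // first write, handled by nodeA
	v2 := v1.Increment("nodeB")            // later write handled by nodeB
	v3 := v1.Increment("nodeC")            // parallel write handled by nodeC

	fmt.Println(v2.Descends(v1))    // v2 supersedes v1
	fmt.Println(Concurrent(v2, v3)) // v2 and v3 are conflicting siblings
}
```

If one version descends from another, the older one can be dropped. If they are concurrent, both have to be kept and handed back to the client on read, which is what "reconciliation during reads" refers to.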

Implementations

Dynamo’s local persistence component allows different storage engines to be plugged in. The engines in use are Berkeley Database (BDB) Transactional Data Store, BDB Java Edition, MySQL, and an in-memory buffer with a persistent backing store.
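
For our Go version, one way to get the same pluggability is to hide the storage engine behind a small interface. This is only a sketch under my own assumptions: the names `StorageEngine`, `MemoryEngine`, and `ErrNotFound` are hypothetical, and the in-memory engine here has no persistent backing store.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrNotFound is returned when a key has no stored value.
var ErrNotFound = errors.New("key not found")

// StorageEngine is a hypothetical interface for the local persistence
// component. Any engine (in-memory, on-disk, ...) can implement it.
type StorageEngine interface {
	Get(key string) ([]byte, error)
	Put(key string, value []byte) error
}

// MemoryEngine is a simple map-backed engine, safe for concurrent use.
type MemoryEngine struct {
	mu   sync.RWMutex
	data map[string][]byte
}

// NewMemoryEngine returns an empty in-memory engine.
func NewMemoryEngine() *MemoryEngine {
	return &MemoryEngine{data: make(map[string][]byte)}
}

// Get returns a copy of the stored value so callers cannot mutate it.
func (m *MemoryEngine) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), v...), nil
}

// Put stores a copy of value under key, overwriting any previous value.
func (m *MemoryEngine) Put(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.data[key] = append([]byte(nil), value...)
	return nil
}

func main() {
	// The rest of the node only depends on the interface.
	var engine StorageEngine = NewMemoryEngine()
	if err := engine.Put("cart:42", []byte("book")); err != nil {
		fmt.Println("put failed:", err)
		return
	}
	v, err := engine.Get("cart:42")
	fmt.Println(string(v), err)
}
```

Because the node code only talks to `StorageEngine`, swapping the map for a disk-backed engine later should not require changes elsewhere.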

I will split the implementation plan into the sections below:

Partitioning: we need to split the key space into ranges using consistent hashing, so that each machine is responsible for a dedicated range.
Interface: how clients read from and write to the database, and the considerations behind each read and write operation.
Replication: we replicate our data to several nodes in the system, so that the data stays safe if a node fails.
Versioning: since several copies of the same data exist in the system, we need to handle the different versions of that data carefully.
Failure handling: one of the most important things to consider in a distributed system is what to do when one or more nodes fail.
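
As a preview of the partitioning and replication sections, here is a minimal sketch of a consistent hashing ring in Go. Everything in it is an assumption for illustration: the `Ring` type, the `NewRing` and `PreferenceList` names, the number of virtual nodes, and the choice to truncate an MD5 hash to 64 bits. It ignores hash collisions and node addition or removal.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"sort"
)

// Ring is a hypothetical consistent hashing ring with virtual nodes.
type Ring struct {
	tokens []uint64          // sorted positions on the ring
	owner  map[uint64]string // position -> physical node
}

// hashKey maps a string to a ring position using the first 8 bytes
// of its MD5 digest (a simplification chosen for this sketch).
func hashKey(s string) uint64 {
	sum := md5.Sum([]byte(s))
	return binary.BigEndian.Uint64(sum[:8])
}

// NewRing places vnodes virtual nodes per physical node on the ring.
// Position collisions are not handled in this sketch.
func NewRing(nodes []string, vnodes int) *Ring {
	r := &Ring{owner: make(map[uint64]string)}
	for _, n := range nodes {
		for i := 0; i < vnodes; i++ {
			t := hashKey(fmt.Sprintf("%s#%d", n, i))
			r.tokens = append(r.tokens, t)
			r.owner[t] = n
		}
	}
	sort.Slice(r.tokens, func(i, j int) bool { return r.tokens[i] < r.tokens[j] })
	return r
}

// PreferenceList walks clockwise from the key's position and returns
// the first n distinct physical nodes, which hold the key's replicas.
func (r *Ring) PreferenceList(key string, n int) []string {
	if len(r.tokens) == 0 {
		return nil
	}
	h := hashKey(key)
	// First token at or after the key's position; wraps around via modulo.
	start := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= h })
	seen := make(map[string]bool)
	var nodes []string
	for i := 0; i < len(r.tokens) && len(nodes) < n; i++ {
		node := r.owner[r.tokens[(start+i)%len(r.tokens)]]
		if !seen[node] {
			seen[node] = true
			nodes = append(nodes, node)
		}
	}
	return nodes
}

func main() {
	ring := NewRing([]string{"nodeA", "nodeB", "nodeC", "nodeD"}, 8)
	// The first node is the coordinator; the others hold replicas.
	fmt.Println(ring.PreferenceList("cart:42", 3))
}
```

Skipping virtual nodes that belong to an already chosen machine matters for replication: otherwise two "replicas" of a key could end up on the same physical node.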