Coordinator Election

Introduction

Many approaches in distributed computing involved a special machine that has a unique role or purpose within the system

Election is not Synchronisation

Bully Algorithm

The Bully Algorithm is a method used in distributed systems to elect a coordinator or leader. It ensures that there is always a coordinator available to handle requests and maintain system stability.

In the Bully Algorithm, each process in the system has a unique identifier or process ID. The process with the highest ID is considered the coordinator. However, if a lower-ranked process believes that the current coordinator has failed or become unresponsive, it initiates an election process to select a new coordinator.

The Bully Algorithm works as follows:

  1. When a process detects that the current coordinator has failed, it sends an election message to all processes with higher IDs.
  2. If a higher-ID process receives an election message, it responds by sending an OK message back to the initiating process.
  3. If no higher-ID process responds within a certain time frame, the initiating process assumes that it is the highest-ranked active process and declares itself as the new coordinator.
  4. Once elected, the new coordinator sends a coordination message to all other processes to inform them of its selection.
  5. If any lower-ranked processes receive this coordination message, they update their knowledge of the new coordinator and continue normal operations.

The Bully Algorithm ensures that there is always a coordinator available in case of failures or unresponsiveness. It also guarantees that only one process becomes the coordinator at any given time.

However, it's important to note that the Bully Algorithm does not handle situations where multiple processes initiate an election simultaneously or when network partitions occur. Additional mechanisms are needed to address these scenarios in distributed systems.

Assumptions

Failure detection

When a node did not receive a response from the coordinator

Election

within 2×T(m)+T(p) It will start a election process

Example

12345did not respondin time'2' is the coordinator'2' is the coordinatorNO!No3 is the coordinatorNo4 is the coordinatorNO BODY DISAGREEIM THE PRESIDENT

Announcement

The node that won the election will announce to the rest of the node that it is the new coordinator

Edge cases

  1. When a node won the election, but it failed before announcement is made
    1. The other node detects that it still cant communicate with the coordinator, which is 5 in this case since 4 has not yet made its announcement
    2. The other nodes will repeat the election process
  2. The old node wake up after announcement
    1. When a node wake up, it will start election
  3. The old node wake up during discovery, before announcement
    1. wait for the request and reject
    2. 4's election failed
  4. The old node wake up after discovery, before announcement
    1. Both new elected coordinator and the old node will send out a message that it is the coordinator
    2. since the old node has a higher node, it will announce to the new node, and the new node will announce to the rest that the old node is the new coordinator
    3. In this case, 5 wake up and send the announcement to 4 and 4 will broadcast to the rest of the node that 5 is the coordinator

Communication complexities

the communication complexity of the bully algorithm can be O(N^2) in the worst case scenario where every process sends an Election message to every other higher-ranking process. However, with efficient handling of responses and timeouts, this complexity can be significantly reduced in practice.

Ring Election

Processors are logically connected as a ring
Each node in the system is aware of its successor, its successor's successor

Discovery Process

  1. A node realizes that the coordinator has failed when it communicate with the coordinator
  2. After realizing that the coordinator has failed, a node passes an election message with its ID around the ring.
  3. The election message is passed to its successor
  4. However, if the successor has failed, it is skipped and the message is sent to successor’s successor  and so on.
  5. After receiving an election message, a machine appends its own ID and passes the modified election message around the ring.
  6. Once an election message is received and if the receiver machine finds its own ID in the message, then we conclude that the message had made its way all around the ring.
    12345{2}requestno response{2,3}{2,3,4}{2,3,4}{2,3,4,1}

Announcement

Adding new node

2 Loops to traverse the ring, one round to discover, one round to announce
12345{6,1,2}{6,1}{6,1,2,3}{6,1,2,3,4}{6,1,2,3,4,5}6{6}{6,1,2,3,4,5}

This process can be simplified by the new node asking for the current ring structure from a current node in the group and the new node will compute the new ring structure and start broadcasting

Important

When multiple node trying to insert at the same time, we only allow one retriever of the node structure at the same time

Edge cases

  1. 5 wakes up after 4 sent the message to 1
    1. reject and re-elect itself as the new coordinator
  2. 5 wakes up during the election
    1. still the same since its id is append to the message

Invitation Algorithm

Network partition

The Invitation Algorithm provides a protocol for forming groups of available machines within network partitions, and then creating larger groups as failed machines are returned to service or network partitions are rectified.

  1. A node realizes that the coordinator has failed, elects itself as the coordinator and broadcasts the coordinator message to all other reachable nodes
  2. A node responds to such a message and tries to elect itself as the coordinator similarly
  3. if a response is received from a higher ID machine, then the machine starting the election process retires
  4. You will have multiple partitions after the election

Partition merge

Partition 1Partition 21234i) coordinator?noii) coordinatoriii) yesiv) JOIN ME!v) YES BoSSvi) GO JOIN 4

Discover

Periodically all the coordinators of the individual partitions will check for the existence of all other coordinator by broadcasting the message, all other coordinator will respond to the message if they are the coordinator

Merging

Once the coordinators knows each other's existence and the coordinator with the high id will initiate the merge to the lower coordinator node

announcement

After the lower coordinator node agree to join the other network, it will broadcast to the members of it partition to join the new partition

Advantage

Partially solves the problem of detecting node failures

Issues