Concurrency Control in Distributed Databases


In a distributed database system, high availability and robustness are crucial goals: they ensure that the system remains functional and accessible even when failures occur.

Key Concepts

  • High Availability:

    • Goal: Ensure that the distributed database is available to users almost all the time (targeting an availability rate of 99.99%, also known as “four nines”).
    • Significance: High availability minimizes downtime, allowing users to access and interact with the database reliably.
  • Robustness:

    • Goal: Make the system resilient so it can continue to operate even when parts of it fail.
    • Requirements for Robustness:
      • Failure Detection: The system must be able to identify when a site (node) or communication link has failed. This is essential so that the system knows which components are unavailable and can adjust accordingly (a heartbeat-based sketch follows this list).
      • Reconfiguration: Once a failure is detected, the system should automatically reconfigure itself to maintain functionality. This might involve redirecting tasks to functioning nodes, redistributing data, or reallocating resources.
      • Recovery and Reintegration: When a failed site or communication link is repaired, the system should be able to reintegrate it smoothly. This process often involves synchronizing data that may have been updated during the downtime and ensuring the repaired component can operate as part of the larger system.
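To make failure detection concrete, here is a minimal sketch assuming a heartbeat scheme: sites periodically ping one another, and a site that stays silent past a timeout is marked unavailable. The `FailureDetector` class and the `HEARTBEAT_TIMEOUT` value are illustrative assumptions, not part of the notes above.

```python
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before a site is suspected (assumed value)

class FailureDetector:
    """Tracks the last heartbeat received from each site and flags silent ones."""

    def __init__(self, sites):
        now = time.monotonic()
        self.last_seen = {site: now for site in sites}
        self.available = set(sites)

    def record_heartbeat(self, site):
        # Called whenever a heartbeat message arrives from `site`.
        self.last_seen[site] = time.monotonic()
        self.available.add(site)

    def check(self):
        # Mark any site whose heartbeat is overdue as unavailable.
        now = time.monotonic()
        for site, seen in self.last_seen.items():
            if now - seen > HEARTBEAT_TIMEOUT:
                self.available.discard(site)
        return self.available
```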

Example Scenario

| Phase | Description | Example |
| --- | --- | --- |
| Failure Detection | System monitors for any site or link failures to identify components that are down. | Site B fails due to a network issue. The system detects that Site B is unresponsive and marks it as unavailable. |
| Reconfiguration | System adjusts its configuration to maintain functionality, redirecting requests as needed. | Read requests that would go to Site B are rerouted to Sites A and C, which hold copies of the data originally on Site B. |
| Recovery and Reintegration | After repairs, the system synchronizes the failed site to restore its data and bring it back online. | Once Site B is back online, it receives updates for any data changes that occurred during downtime. It rejoins normal operations. |

Reconfiguration

  • Abort Active Transactions: If a site fails, any active transactions using its resources are aborted. This is to ensure they don't hold locks that might prevent other transactions from executing.
  • Remove Failed Replicas: If a site hosting a replica fails, it’s removed from the system catalog until it recovers. This avoids attempts to access that replica, which would fail.
  • Central Server Replacement: If a central server (such as a name server or global deadlock detector) fails, an election process among other sites will select a replacement.
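A rough sketch of these reconfiguration steps, assuming a simple in-memory catalog; the `Catalog` class, its fields, and the example data are all illustrative, not part of any real DBMS API:

```python
class Catalog:
    """Illustrative system catalog: which sites hold each data item, and which sites each transaction uses."""

    def __init__(self, replica_map, active_transactions):
        self.replica_map = replica_map                  # e.g. {"x": {"A", "B", "C"}}
        self.active_transactions = active_transactions  # e.g. {"T1": {"B"}}

    def handle_site_failure(self, failed_site):
        # 1. Abort active transactions that were using the failed site,
        #    so they release their locks.
        aborted = [t for t, sites in self.active_transactions.items()
                   if failed_site in sites]
        for t in aborted:
            del self.active_transactions[t]

        # 2. Remove the failed site's replicas from the catalog so no
        #    future request is routed to it until it recovers.
        for sites in self.replica_map.values():
            sites.discard(failed_site)
        return aborted
```

For example, `Catalog({"x": {"A", "B", "C"}}, {"T1": {"B"}}).handle_site_failure("B")` would abort T1 and leave only Sites A and C serving item x.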

Network Partition vs. Site Failure:

  • It's difficult to tell whether a site is down or simply partitioned (cut off due to a network issue). To handle this, reconfiguration must ensure no duplicate central servers are elected in different partitions, and each partition avoids conflicting updates on replicated data.
  • Majority-Based Approach: Majority voting is used to avoid inconsistencies. For example, if the network is partitioned, only the sites in the majority partition can update data.

Solution: Majority-Based Approach

  • Each replica has a version number to track updates.
  • To lock a data item, a lock request is sent to more than half of the sites holding a replica; the operation proceeds only if a majority of those sites grant the lock.
  • Read Operations: Among the locked replicas, the one with the highest version number (the latest data) is read, ensuring up-to-date information.
  • Write Operations: The new version number is set to one higher than the largest version among the locked replicas, and the change is written to all locked replicas, keeping them current.
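The sketch below illustrates this read/write quorum logic under strong simplifications: real lock acquisition and messaging are omitted, and the `Replica` class and quorum arithmetic are assumptions made for illustration.

```python
class Replica:
    """One copy of a data item together with its version number."""
    def __init__(self, value=None, version=0):
        self.value = value
        self.version = version

def majority_read(replicas):
    """Lock a majority of replicas, then read the copy with the highest version."""
    quorum = len(replicas) // 2 + 1       # strictly more than half
    locked = replicas[:quorum]            # stand-in for the replicas that granted the lock
    latest = max(locked, key=lambda r: r.version)
    return latest.value, latest.version

def majority_write(replicas, new_value):
    """Lock a majority, bump the highest version seen, and write all locked replicas."""
    quorum = len(replicas) // 2 + 1
    locked = replicas[:quorum]            # stand-in for the replicas that granted the lock
    new_version = max(r.version for r in locked) + 1
    for r in locked:
        r.value = new_value
        r.version = new_version
    return new_version
```

Because any two majorities of the same replica set intersect, every read quorum contains at least one replica touched by the latest write, which is why reading the highest-numbered copy is safe.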

Alternative: Read One, Write All Available

  • Reading: Data can be read from any single available replica.
  • Writing: Changes are written to all currently available replicas. This approach breaks down during link failures or network partitions: a site that is merely unreachable is treated as failed, so its replica misses updates and may later serve outdated reads.
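A hedged sketch of the read-one/write-all-available idea; the `available` set stands in for whatever failure detector the system uses, and all names here are illustrative:

```python
def read_one(values, available):
    """Read from any single reachable replica (may be stale after a partition)."""
    for site in available:
        if site in values:
            return values[site]
    raise RuntimeError("no replica reachable")

def write_all_available(values, available, new_value):
    """Write to every replica this site currently believes is available."""
    for site in values:
        if site in available:
            values[site] = new_value
    # Replicas on unreachable sites keep their old value; if those sites were
    # only partitioned away (not failed), they can later serve outdated reads.
```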

Site Reintegration:

When a previously failed site recovers, it must catch up on updates it missed.

  • Solution 1: Halt all updates while reintegrating the site, though this can disrupt operations.
  • Solution 2: Lock all relevant replicas, update them to the latest version, and then release the locks.
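A sketch of the second strategy (lock the relevant replicas, copy the latest versions across, then unlock); the data structures are illustrative, with each copy tracked as a `(value, version)` pair:

```python
def reintegrate_site(recovered, replicas_by_item, catalog):
    """Bring a recovered site up to date, then add it back to the catalog.

    replicas_by_item: {"x": {"A": (value, version), ...}}  live copies per item
    catalog:          {"x": {"A", "C"}}                     sites serving each item
    """
    for item, copies in replicas_by_item.items():
        # Conceptually, the live replicas of this item are locked here.
        latest_value, latest_version = max(copies.values(), key=lambda vv: vv[1])
        # Copy the most recent value and version onto the recovered site's replica.
        copies[recovered] = (latest_value, latest_version)
        catalog.setdefault(item, set()).add(recovered)
        # ...then the locks are released and normal updates resume.
```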

Coordinator Selection with the Bully Algorithm

The Bully Algorithm is an election algorithm used in distributed systems to select a new coordinator (leader) when the current coordinator fails. It is designed for systems in which each process (site) has a unique identifier (ID) and the sites can communicate with one another.

Key Concepts

  • Unique ID: Each site has a unique identifier, typically a number. Higher IDs are given more "priority" to become the coordinator.
  • Coordinator Role: The coordinator oversees certain tasks in the system, such as resource allocation or managing transactions. When it fails, a new coordinator must be elected.
  • Failure Detection: A site detects a failure when it doesn't receive a response from the coordinator within a specified time, T.

Bully Algorithm

The algorithm proceeds as follows when a site detects that the coordinator has failed:

  1. Self-Election by the Detecting Site:
    • The site that detects the coordinator's failure, call it Si, initiates an election.
    • Si sends an election message to all sites with higher IDs than itself.
    • Si then waits for a response within a time period T.
  2. Responses from Higher-Priority Sites:
    • If any of the sites with a higher ID respond, they take over the election process (since they have a higher priority).
    • Si waits for a new coordinator message from one of these higher sites, indicating that one of them has taken on the coordinator role.
  3. Self-Election if No Higher Sites Respond:
    • If Si receives no response within T, it assumes all sites with higher IDs have failed.
    • Si then declares itself the new coordinator and broadcasts a coordinator message to inform all other sites.
  4. Restarting the Algorithm if a Higher Site Responds Later:
    • If a higher ID site comes back online or if another site detects the failure before 𝑆𝑖 completes the election, they may initiate their own election.
    • This can lead to a situation where multiple elections are ongoing, but ultimately, the highest ID site available will always win.
  5. Re-election after Recovery:
    • When a previously failed site with a higher ID than the current coordinator recovers, it can initiate a new election and "bully" its way to become the new coordinator.
    • The recovered site starts a new election, and because of its higher ID, it eventually becomes the coordinator.
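To tie the steps together, here is a simplified, single-threaded sketch of the election logic. Real implementations exchange asynchronous messages and rely on the timeout T; that is abstracted here into an `is_alive` callback, which is an assumption made for illustration.

```python
def bully_election(my_id, all_ids, is_alive):
    """Run a bully election from the viewpoint of site `my_id`.

    all_ids:  IDs of every site in the system.
    is_alive: callable(site_id) -> bool, standing in for "did this site
              answer our election message within time T?".
    Returns the ID of the site that ends up as coordinator.
    """
    higher = [s for s in all_ids if s > my_id]

    # Steps 1-2: send election messages to all higher-ID sites and wait.
    responders = [s for s in higher if is_alive(s)]
    if not responders:
        # Step 3: nobody higher answered, so this site declares itself
        # coordinator and would broadcast a coordinator message.
        return my_id

    # Steps 2/4: a higher-ID site takes over; modelling only the highest
    # responder is a simplification, but it yields the same winner.
    return bully_election(max(responders), all_ids, is_alive)


# Example: with sites 1-5 and only 1, 2, and 4 alive, site 4 wins.
alive = {1, 2, 4}
assert bully_election(1, [1, 2, 3, 4, 5], lambda s: s in alive) == 4
```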

© 2025 Sanjeeb KC. All rights reserved.