Commit Protocols

System Failure Modes

Unique failures of distributed systems:

Failure of a site.
Loss of messages
- Handled by TCP-IP protocols etc.
Failure of a communication link
- Handled by routing messages via alternative links
Network partition
- if it is split into two or more subsystems, without any connection
  - Note: a subsystem may consist of a single node
Network partitioning and site failures are generally indistinguishable.

Commit Protocols

Commit Protocols are essential in distributed databases to ensure atomicity—meaning a transaction must either be fully completed or entirely rolled back at all involved sites, with no partial completion. This is especially critical when a transaction spans multiple locations in a distributed environment. Here’s a breakdown of two commonly discussed commit protocols:

Two-Phase Commit Protocol (2PC)

The Two-Phase Commit Protocol (2PC) is a method used in distributed systems to ensure atomicity in transactions that span multiple sites. Atomicity means the transaction should be completed entirely across all sites or not at all, even in the event of failures.
Standalone systems implement local transactions through transaction managers (TMs). Distributed systems coordinate TMs of multiple nodes to collectively commit success or failure. Therefore, transaction coordinators (TCs) are required.
A local TM implements functions such as concurrency control of local transactions and recovery from exceptions. A TC enables transactions, divides transactions into multiple sub-transactions and distributes them to corresponding nodes for execution, and coordinates transactions until completion (collectively commits success or failure). You can either implement TM and TC in the same process or deploy them on different nodes. Here’s an in-depth explanation of the 2PC protocol, its phases, and how it handles various failures.

Overview of 2PC Protocol

Assumptions: The protocol operates under a fail-stop model, meaning that a failed site simply stops functioning rather than causing corruption. This approach assumes that once a site fails, it doesn’t perform any other actions until recovery.
Roles:
- The Transaction coordinator (TC) is responsible for initiating and managing the 2PC process across all involved sites.
- The participants - Transaction Manager (TM) are the local sites where the transaction runs, each with a transaction manager to handle its part of the transaction.

Phases of 2PC Protocol

Phase 1: PREPARE

The TC writes a local <Prepare T> log and persists it. The TC sends a "Prepare T" message to all participants.
Each participant TM receives the "Prepare T" message and decides whether to commit the transaction based on its own situation:
If a TM decides to commit the transaction, it writes a <Ready T> log and persists it, and then sends a "Ready T" message to the TC.
If a TM decides not to commit the transaction, it writes an <Abort T> log and persists it, and sends an "Abort T" message to the TC. Then, the TM enters the transaction abortion process locally.

Phase 2: Recording the Decision

When the TC has received responses from all nodes or the waiting timer times out, the TC decides whether to commit or abort the transaction.
- If all participants respond with a "Ready T" message, the TC writes a <Commit T> log and persists it, and then sends a "Commit T" message to all participants.
- If the TC receives an "Abort T" response from at least one participant, or if any participant fails to respond within the timeout period, the TC writes a <Abort T> log, and then sends an "Abort T" message to all participants.
After the participants receive the message from the TC, they write <Commit T> or <Abort T> logs and persist them.

Handling Failures

In distributed systems, failures can happen at any time, which requires mechanisms to recover transactions safely and avoid leaving data in an inconsistent state. Here’s how 2PC handles different types of failures:

Site Failure (Participant Failure)
- When a failed site (participant) recovers, it checks its log to determine the status of the transaction:
  - If it finds <commit T>, it redoes (REDO(T)) the transaction to finalize the commitment.
  - If it finds <abort T>, it undoes (UNDO(T)) the transaction to ensure consistency.
  - If it finds <ready T>, it means the participant was prepared to commit but was waiting for the coordinator's decision. In this case:
    - The participant contacts the coordinator to find out if the transaction was committed or aborted.
    - Based on the response, it either redoes or undoes the transaction.
  - If it finds no record of T, the participant was not involved in preparing to commit. Thus, it safely aborts (undoes) the transaction.
Coordinator Failure
- When the coordinator fails, participants must determine the status of the transaction:
  - If any participant has <commit T> in its log, the transaction is considered committed.
  - If any participant has <abort T>, the transaction is considered aborted.
  - If some participants have <ready T> but no commit or abort decision has been logged, they must wait until the coordinator recovers, resulting in a blocking problem where the transaction cannot proceed until the coordinator returns.
Network Partition
- In case of a network partition:
  - If the coordinator and all participants are within the same partition, the protocol proceeds normally.
  - If participants and coordinator are in separate partitions, participants may assume the coordinator has failed and start their own recovery protocol, waiting for the coordinator to become available again.
  - Meanwhile, the coordinator in its partition may assume participants are down and follow its own recovery protocols, leading to delays until network connectivity is restored.

Recovery and Concurrency Control

In cases where a site recovers after failure, in-doubt transactions (transactions where a final commit or abort decision is pending) may need special handling:

In-Doubt Transactions: These are transactions that only have a <ready T> entry but lack a final <commit T> or <abort T>. To determine the fate of these transactions, the recovering site contacts other sites.
Speeding Up Recovery:
- Instead of a simple <ready T> record, participants can add <ready T, L> to their log, where L is the list of locks held by transaction T.
- This approach allows the participant to reacquire all necessary locks during recovery, making it faster and less prone to blocking, since the participant can quickly resume and complete the transaction.

Three-Phase Commit Protocol (3PC)

The Three-Phase Commit Protocol (3PC) is an enhancement of the Two-Phase Commit (2PC) protocol, designed to prevent the blocking problem that occurs in 2PC. By adding an additional phase, 3PC ensures that the transaction’s outcome can still be determined, even if the coordinator fails, under specific conditions.

Key Assumptions of 3PC

No Network Partitioning: The protocol assumes that all sites in the network remain connected, so no network splits can occur.
Fault Tolerance: At least one site (either a participant or the coordinator) is always operational.
Limited Site Failures: The protocol can handle up to 𝐾 failures (where 𝐾 is a predefined threshold).

Phases of 3PC

Phase 1: Prepare (Same as 2PC Phase 1)

The coordinator initiates the protocol by asking all participants to prepare to commit the transaction.
Each participant, upon determining that it can commit, logs <ready T> and replies with a "ready T" message.
If a participant cannot commit, it logs <no T> and sends an "abort T" message to the coordinator.

Phase 2: Pre-Commit (Intermediate Phase)

Coordinator Action: If all participants respond with "ready T," the coordinator makes a pre-commit decision.
- It logs the decision as <pre-commit T> and sends "pre-commit T" messages to all participants, signaling that they should prepare to finalize the transaction.
- The coordinator records the pre-commit status in at least 𝐾 sites to ensure redundancy and minimize risk of blocking if it fails.
Participant Action: Upon receiving "pre-commit T," each participant logs <pre-commit T>, effectively preparing them to commit while waiting for the final decision.

Phase 3: Commit/Abort

Coordinator Action: After pre-commit records have been logged, the coordinator sends a final commit T or abort T message to all participants.
- If all participants receive the commit message, they complete the transaction by making it permanent.
- If any participant receives an abort message, it undoes any changes.
Participant Action: Participants finalize the transaction by either committing or aborting based on the coordinator’s final message.

Advantages of 3PC

Avoids Blocking: The pre-commit phase helps avoid blocking, as participants can make decisions based on the pre-commit decision even if the coordinator fails.
Fault Tolerance: As long as no more than 𝐾 sites fail, participants can still safely proceed to commit or abort, avoiding indefinite waiting (blocking problem).

Disadvantages of 3PC

Higher Overheads: Adding a pre-commit phase increases message and storage overhead due to the need for logging in multiple sites.
Assumptions May Be Hard to Satisfy: In real-world distributed systems, ensuring no network partitioning and always having at least one operational site can be difficult.

References:

https://www.alibabacloud.com/blog/tech-insights---two-phase-commit-protocol-for-distributed-transactions_597326