Distributed Databases

Explain 2PC protocols. Discuss its failure & recovery techniques.

  • The 2PC operates during the normal operation and then it describes how it handles failures and finally how it carries out the recovery and concurrency control.
  • Assume a transaction T which is initiated at site S where the transaction coordinator is C.
  • When T completes its execution that is when all the sites at which T has been executed inform C that T has been completed C starts the 2PC protocol.
Phase 1
  • 'A' record is added by C <prepare T> to the log, and it forces the log onto stable storage.
  • 'A' prepare 'T' message is sent to all sites at which 'T' executed.
  • On receiving such a message, the transaction manager at that site determines whether it is willing to commit its portion of 'T'.
  • 'A' record <no T> is added to the log if the answer is no, and is then responded by sending an abort 'T' message to 'C'.
  • 'A' record <ready T> is added to the log if the answer yes, and forces the log (with all the log records corresponding to T) onto stable storage.
  • The transaction manager then replies with a ready 'T' message to 'C'.
Phase 2
  • Whenever 'C' receives responses to the prepare 'T' message from all the sites, or when a prespecified interval of time has been elapsed since the prepare 'T' message was sent out, 'C' can determine whether the transaction 'T' can be committed or aborted.
  • If 'C' received a ready 'T' message from all the participating sites only then can transaction 'T' be committed else transaction 'T' must be aborted.
  • Depending on the verdict, either a record <commit T> or a record <abort T> is added to the log and the log is forced onto stable storage.
  • At this point, the state of the transaction has been sealed.
  • After following on to this point, the coordinator sends either a commit 'T' or an abort 'T' message to all participating sites.
  • When a site receives that message, it records the message in the log.
Handling of Failures
The types of failures are:

Failure of a participating site
  • Actions are taken if the coordinator C detects that a site has failed, it takes these actions:
  • If the site fails before responding with a ready 'T' message to 'C', it is assumed by the coordinator that it responded with an abort 'T' message.
  • If the site fails after the coordinator has received the ready 'T' message from the site, the coordinator executes the rest of the commit protocol in the normal fashion, ignoring the failure of the site.
  • When a participating site 'S' recovers from a failure, it must examine its log to determine the fate of those transactions that were in the midst of execution when the failure occurred. Let 'T' be one such transaction.
We consider each of the possible cases:
  • The site executes redo(T) when the log contains a <commit T> record.
  • The site executes undo(T) when the log contains an <abort T> record.
  • The log contains a record. Here, the site must consult 'C' to determine the fate of 'T'.
Failure of the coordinator
  • If the coordinator fails in the midst of the execution of the commit protocol for transaction T, then the participating sites must decide the fate of 'T'.
  • In certain cases, the participating sites cannot decide whether to commit or abort T, and therefore these sites must wait for the recovery of the failed coordinator.
  • 'T' must be committed if an active site contains a <commit T> record in its log.
  • 'T' must be aborted if an active site contains an <abort T> record in its log.
  • If some active site does not contain a <ready T> record in its log, then the failed coordinator 'C' cannot have decided to commit 'T', because a site that does not have a <ready T> record in its log cannot have sent a ready 'T' message to 'C'. However, the coordinator may have decided to abort 'T', but not to commit 'T'. Rather than wait for 'C' to recover, it is preferable to abort 'T'.
  • If none of the preceding cases holds, then all active sites must have a <ready T> record in their logs, but no additional control records (such as <abort T> or <commit T>).
  • Since the coordinator has failed, it is impossible to determine whether a decision has been made, and if one has, what that decision is, until the coordinator recovers. Thus, the active sites must wait for C to recover. Since the fate of T remains in doubt, T may continue to hold system resources.
Recovery
  • When a failed site restarts, recovery is performed by using the recovery algorithm.
  • The recovery procedure must treat in-doubt transactions specially while dealing with the distributed commit protocols; the in-doubt transactions are transactions for which a <ready T> log record is found, but neither a <commit T> log record nor an <abort T> log record is found.
  • By contacting the other sites recovering site must determine the commit–abort status of transactions.
  • Even if the recovery is done as just described the normal transaction processing at the site cannot begin until all in-doubt transactions have been committed or rolled back.
  • Since multiple sites have been contacted finding the status of in-doubt transactions can be slow.
  • If the coordinator has failed, and no other site has information about the commit–abort status of an incomplete transaction, recovery potentially could become blocked if 2PC is used. Due to this the site that is performing the restart recovery may remain unusable for a long period.
  • To solve this problem, recovery algorithms provide support for noting lock information in the log.
  • Instead of writing a <ready T> log record, the algorithm writes a <ready T, L> log record, where 'L' is a list of all write locks held by the transaction 'T' when the log record is written.
  • At recovery time, after performing local recovery actions, for every in-doubt transaction 'T', all the write locks noted in the <ready T, L> log record are reacquired.
  • After lock reacquisition is complete for all in-doubt transactions, transaction processing can start at the site, even before the commit–abort status of the in-doubt transactions is determined.
  • The site recovery is faster and never gets blocked as the commit or rollback of in-doubt transactions proceeds concurrently with the execution of new transactions.
  • Note that new transactions that have a lock conflict with any write locks held by in-doubt transactions will be unable to make progress until the conflicting in-doubt transactions have been committed or rolled back.

Explain concurrency control in Distributed Database.

  • It is assumed that each site participates in the execution of the commit protocol to ensure the global transaction atomicity.
  • The protocols that are described further will require updates to be done on all the replicas of a data item.
  • The updates of the data item cannot be processed if any site that contains a replica of the data item is failed.
Locking Protocols
  • There are various locking protocols that have been used.
  • The only change that needs to be done is the way the lock manager deals with the replicated data.
  • There are several possible schemas that are applicable to an environment where the data can be replicated in several sites.
The types of locking protocols are as follows:

1. Single Lock-Manager Approach
  • In this approach the system maintains a single lock manager that resides in a single chosen site.
  • All the lock and unlock requests are made at this site.
  • Whenever a data item is needed to be locked it send a lock request to this site. The lock manager will then check if the lock request can be granted or not. If the lock is granted a message is sent to the site with that effect.
  • If not the request is delayed until the grant.
  • The transaction can read the data item from any of the sites which is a replica of the data item.
Advantages
  • Simple implementation
  • Simple deadlock handling
Disadvantages
  • Bottleneck
  • Vulnerability
2. Distributed Lock Manager
  • This approach is basically a compromise between the advantages and disadvantages where the lock manager function is distributed over several sites.
  • Each site maintains a local lock manager whose function is to administer the lock and unlock requests for those data items that are stored in that site.
  • Once it is decided that a lock request is granted the lock manager will send a message back to the initiator indicating him about his request.
Advantages
  • Simple implementation,
  • Reduces the degree to which the coordinator is a bottleneck.
  • Low overhead
  • Requiring two message transfers for handling lock requests, and one message transfer for handling unlock requests.
Disadvantages
  • Deadlock handling is more complex.
3. Primary Copy
  • Whenever a system uses data replication one of the replicas can be chosen as the primary copy.
  • For each of the data item the primary copy must be residing precisely on one site which we can call as the primary site for that data item.
4. Majority Protocol
The working of this protocol is as follows
  • If a data item is replicated on different sites then the lock request is to be sent to more than the one half of the sites where the data item is stored.
  • The lock manager will determine if the lock can be granted immediately.
  • The transaction does not operate until it has obtained a lock successfully on all the replicas of the data item.
Advantages
  • Extended to deal with site failures.
  • The protocol also deals with replicated data in a decentralized manner, thus avoiding the drawbacks of central control.
Disadvantages:
  • Implementation
  • Deadlock Handling
5. Biased Protocol
  • In this approach the requests for shared locks are given more favorable treatment than the requests for the exclusive locks.
  • Shared locks - When a transaction needs to lock data item Q, it simply requests a lock on Q from the lock manager at one site that contains a replica of Q.
  • Exclusive locks - When a transaction needs to lock data item Q, it requests a lock on Q from the lock manager at all sites that contain a replica of Q.
Advantages
  • Imposing less overhead on read operations than does the majority protocol. This savings is especially significant in common cases in which the frequency of read is much greater than the frequency of write.
Disadvantages
  • Additional overhead on writes.
  • The biased protocol shares the majority protocol’s disadvantage of complexity in handling deadlock.
6. Quorum Consensus Protocol
  • This protocol is a generalization of the majority protocol.
  • It assigns each site a non-negative weight.
  • They assign a read and write operation on two integers called read quorum and write quorum.
Timestamping
  • A timestamping scheme is followed so as to give a unique timestamp which will help the system in deciding the serialization order.
  • There are two methods for generating the unique timestamps one is centralized and one is distributed.
  • In the centralized scheme a single site distributes the timestamps. This site can use a logical counter or its own local clock for this purpose.
  • In the distributed scheme each site will generate a unique local timestamp by using a logical counter or then the local clock.
  • The order of concatenation is important.
  • The site identifier is used in the least significant position to ensure that the global timestamps are generated in one site are not always greater than those generated in another site.
  • There can be a problem if one site generates timestamps faster than the other site.
  • A mechanism is needed which will ensure that the local timestamps will be generated fairly across the system.
  • If a system clock is used for generating the timestamps it will be assigned fairly provided that no site has a system clock that runs fast or slow.
  • As the clocks cannot be perfectly accurate a technique that is similar for the logical clock is to be used to ensure that no clock gets ahead or behind.