System checkpoint logic and structure of the recovery manager

System checkpoint logic:

System checkpoints may be triggered by operator commands, by timers, or by counters such as the number of bytes of log records written since the last checkpoint. The general idea is to minimize the distance one must travel in the log in the event of a catastrophe. This must be balanced against the cost of taking frequent checkpoints. Five minutes is a typical checkpoint interval.
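The three triggers can be combined in a small helper. This is only a minimal sketch: the class name, its fields, and the byte threshold are assumptions made for illustration, and only the five-minute interval comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class CheckpointTrigger:
    """Hypothetical helper that decides when a checkpoint is due."""
    max_seconds: float = 300.0             # five minutes, the typical interval
    max_log_bytes: int = 64 * 1024 * 1024  # assumed threshold, for illustration only
    last_checkpoint_time: float = 0.0
    bytes_since_checkpoint: int = 0

    def note_log_write(self, nbytes: int) -> None:
        # Count log bytes written since the last checkpoint.
        self.bytes_since_checkpoint += nbytes

    def should_checkpoint(self, now: float, operator_request: bool = False) -> bool:
        # Any one of the three triggers is enough.
        return (operator_request
                or now - self.last_checkpoint_time >= self.max_seconds
                or self.bytes_since_checkpoint >= self.max_log_bytes)

    def checkpoint_taken(self, now: float) -> None:
        # Reset both the timer and the byte counter.
        self.last_checkpoint_time = now
        self.bytes_since_checkpoint = 0
```

A smaller interval or byte threshold shortens the log that restart must scan, at the price of more frequent checkpoint work.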

Checkpoint algorithms that require system quiescence should be avoided because they imply that checkpoints will be taken infrequently, thereby making restart expensive.

The checkpoint process consists of writing a BEGIN_CHECKPOINT record in the log, then invoking each component of the system so that it can contribute to the checkpoint, and then writing an END_CHECKPOINT record in the log. These records bracket the checkpoint records of the other system components. Each component may write one or more log records so that it will be able to restart from the checkpoint. For example, the buffer manager will record the names of the buffers in the buffer pool, the file manager might record the status of files, the network manager may record the network status, and the transaction manager will record the names of all transactions active at the checkpoint.
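The bracketing described above can be sketched as follows. The log is modeled as a plain Python list, and the function name, the record layout, and the CHECKPOINT_ prefix for component records are all assumptions for illustration. Only BEGIN_CHECKPOINT and END_CHECKPOINT come from the text.

```python
from typing import Callable, Dict, List, Tuple

LogRecord = Tuple[str, object]  # (record type, payload); an assumed layout


def take_checkpoint(log: List[LogRecord],
                    components: Dict[str, Callable[[], object]]) -> int:
    """Append a bracketed checkpoint to `log` and return the log address
    (here: list index) of its BEGIN_CHECKPOINT record."""
    begin = len(log)
    log.append(("BEGIN_CHECKPOINT", None))
    # Each component contributes whatever it needs in order to restart.
    for name, contribute in components.items():
        log.append(("CHECKPOINT_" + name.upper(), contribute()))
    # The END record points back at the matching BEGIN record.
    log.append(("END_CHECKPOINT", begin))
    return begin
```

For instance, calling `take_checkpoint(log, {"buffer": lambda: ["P1", "P2"], "transaction": lambda: ["T4"]})` appends four records: the two brackets and one record per component.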

After the checkpoint log records have been written to non-volatile storage, the recovery manager records the address of the most recent checkpoint in a warm start file. This allows restart to locate the checkpoint record quickly (rather than sequentially searching the log for it). Because this is such a critical resource, the warm start file is duplexed (two copies are kept) and writes to it are alternated, so that one file points to the current checkpoint log record and the other points to the previous one.
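One possible way to realize the duplexed, alternating warm start file is sketched below. The file names, the JSON layout, and the sequence number used to tell the current copy from the previous one are assumptions, not details given in the text.

```python
import json
import os


class WarmStartFiles:
    """Two copies of the checkpoint pointer, written alternately (a sketch)."""

    def __init__(self, directory: str) -> None:
        self.paths = [os.path.join(directory, "warmstart.0"),
                      os.path.join(directory, "warmstart.1")]
        # Resume numbering from whatever survived the last run.
        self.seq = max((e["seq"] for e in self._entries()), default=0)

    def _entries(self):
        # Yield every copy that can still be read and parsed.
        for path in self.paths:
            try:
                with open(path) as f:
                    yield json.load(f)
            except (OSError, ValueError):
                continue  # missing or damaged copy: rely on the other one

    def record_checkpoint(self, log_address: int) -> None:
        # Alternate between the two files so the older pointer survives
        # if this write is interrupted.
        self.seq += 1
        with open(self.paths[self.seq % 2], "w") as f:
            json.dump({"seq": self.seq, "checkpoint": log_address}, f)
            f.flush()
            os.fsync(f.fileno())  # push the pointer to non-volatile storage

    def latest_checkpoint(self):
        best = max(self._entries(), key=lambda e: e["seq"], default=None)
        return None if best is None else best["checkpoint"]
```

If the most recently written copy is unreadable, `latest_checkpoint` falls back to the previous checkpoint address, and restart then has to scan a longer stretch of the log.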

At system restart, the programs are loaded and the transaction manager invokes each component to reinitialize itself. Data communications begins network restart, and the database manager reacquires the database from the operating system (opens the files).

The recovery manager is then given control. It examines the most recent warm start file written by checkpoint to discover the location of the most recent system checkpoint in the log, and then examines that checkpoint record. If there was no work in progress at the system checkpoint and the system checkpoint is the last record in the log, then the system is restarting from a shutdown in a quiesced state. This is a warm start, and no transactions need be undone or redone. In this case the recovery manager writes a restart record in the log and returns to the scheduler, which opens the system for general use.

Otherwise, if there was work in progress at the system checkpoint, or if there are further log records, then this is a restart from a crash (emergency restart).
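The choice between the two kinds of restart comes down to a two-part test. In this minimal sketch the function name and arguments are hypothetical, and the checkpoint is assumed to occupy a single log position.

```python
def restart_kind(log_length: int, checkpoint_index: int,
                 active_at_checkpoint: set) -> str:
    """Return "warm" for a restart after a quiesced shutdown,
    "emergency" for a restart after a crash."""
    checkpoint_is_last = checkpoint_index == log_length - 1
    if not active_at_checkpoint and checkpoint_is_last:
        return "warm"       # nothing to undo or redo
    return "emergency"      # work in progress, or records after the checkpoint
```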

The following figure helps explain the emergency restart logic:

[Figure: emergency restart logic]

Five transaction types with respect to the most recent system checkpoint and the crash point.

Transactions T1, T2 and T3 have committed and must be redone. Transactions T4 and T5 have not committed and so must be undone. Let's call transactions like T1, T2 and T3 winners, and let's call transactions like T4 and T5 losers. Then the restart logic is:

RESTART: PROCEDURE;
DICHOTOMIZE WINNERS AND LOSERS;
REDO THE WINNERS;
UNDO THE LOSERS;
END RESTART;

It is important that the REDOs occur before the UNDOs. (Do you see why? We are assuming page locking and high-water marks from log sequence numbers.)

As it stands, this entails reading every log record ever written, because redoing the winners requires going back to redo almost all transactions ever run.

Much of the complexity of the restart process is dedicated to minimizing the amount of work that must be done, so that restart can be as quick as possible. (We are describing here one of the simpler workable schemes.) In general, restart discovers a time T such that redo log records written prior to time T are not relevant to restart.

To see how to compute the time T, we first consider a particular object: a database page P. Because this is a restart from a crash, the most recent version of P may or may not have been recorded on non-volatile storage. Suppose page P was written out with high-water mark LSN(P). If the page was updated by a winner "after" LSN(P), then that update to P must be redone. Conversely, if P was written out to non-volatile storage with a loser's updates, then those updates must be undone. (Similarly, a message M may or may not have been sent to its destination. If it was generated by a loser, then the message should be cancelled. If it was generated by a committed transaction but not sent, then it should be retransmitted.)

The figure below illustrates the five possible types of transactions at this point: T1 began and committed before LSN(P), T2 began before LSN(P) and ended before the crash, T3 began after LSN(P) and ended before the crash, T4 began before LSN(P) but its COMMIT record does not appear in the log, and T5 began after LSN(P) and apparently never ended. To honour the commit of T1, T2 and T3 requires that their updates be added to page P (redone). But T4 and T5 have not committed and so must be undone.

[Figure: transaction types]

Five transaction types with respect to the most recent write of page P and the crash point.

Notice that none of the updates of T5 are reflected in this state, so T5 is already undone. Notice also that all of the updates of T1 are in the state, so it need not be redone. So only T2, T3 and T4 remain. T2 and T3 must be redone from LSN(P) forward. The updates of the first half of T2 are already reflected in page P because it has log sequence number LSN(P). On the other hand, T4 must be undone from LSN(P) backwards. (Here we are skipping over the following anomaly: if, after LSN(P), T2 backs up to a point prior to LSN(P), then some undo work is required for T2. This problem is not difficult, just annoying.)
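The case analysis for a single page can be written out as a small function. This is a sketch under stated assumptions: the function name and the integer LSNs are hypothetical, a missing COMMIT record is modeled as None, and the partial-backout anomaly mentioned above is ignored.

```python
from typing import Optional


def page_action(begin_lsn: int, commit_lsn: Optional[int], page_lsn: int) -> str:
    """What restart must do to page P for one transaction, where
    page_lsn is LSN(P), the high-water mark of P as last written."""
    committed = commit_lsn is not None
    if committed and commit_lsn <= page_lsn:
        return "nothing: all updates are already on the page"   # like T1
    if committed:
        return "redo from LSN(P) forward"                       # like T2, T3
    if begin_lsn <= page_lsn:
        return "undo from LSN(P) backward"                      # like T4
    return "nothing: no update reached the page"                # like T5
```

With LSN(P) = 50, a transaction that began at 20 and committed at 70 is redone, while one that began at 40 and never committed is undone.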

Thus the oldest redo log record relevant to P is at or after LSN(P). (The write-ahead-log protocol is relevant here.) At system checkpoint, the data manager records MINLSN, the log sequence number of the oldest page not yet written (the minimum LSN(P) of all pages P not yet written). Similarly, the transaction manager records the name of each transaction active at the checkpoint. Restart chooses T as the MINLSN of the most recent checkpoint.
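MINLSN is simply a minimum over the pages not yet written. In the sketch below, the mapping from page name to LSN(P) and the fallback to the checkpoint's own LSN when no page is waiting to be written are assumptions for illustration.

```python
from typing import Dict


def min_lsn(unwritten_pages: Dict[str, int], checkpoint_lsn: int) -> int:
    """MINLSN: the smallest LSN(P) over all pages P not yet written.
    If every page has been written, redo can start at the checkpoint
    itself (an assumed fallback)."""
    return min(unwritten_pages.values(), default=checkpoint_lsn)
```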

Restart proceeds as follows. It reads the system checkpoint log record and puts each transaction active at the checkpoint into the loser set.

It then scans the log forward to the end. If a COMMIT log record is encountered, that transaction is promoted to the winner set. If a BEGIN_TRANSACTION record is found, the transaction is tentatively added to the loser set. When the end of the log is encountered, the winners and losers have been computed. The next step is to read the log forward from MINLSN, redoing the winners. Then restart starts from the end of the log and reads the log backwards, undoing the losers.
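The three passes can be sketched end to end on a toy log. Everything named here is an assumption for illustration: the Rec record layout, the pages dictionary holding a value and an LSN per page, and the single integer value per page. One deliberate deviation is labeled in the code: the redo pass redoes every update whose transaction is not a loser, so that a transaction which committed between MINLSN and the checkpoint is also covered.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional, Set, Tuple


class Rec(NamedTuple):
    lsn: int
    txn: str
    kind: str                   # "BEGIN", "UPDATE" or "COMMIT"
    page: Optional[str] = None
    old: Optional[int] = None   # before-image, used by undo
    new: Optional[int] = None   # after-image, used by redo


def classify(log: List[Rec], active_at_checkpoint: Iterable[str],
             checkpoint_lsn: int) -> Tuple[Set[str], Set[str]]:
    """Forward scan from the checkpoint: split winners from losers."""
    losers, winners = set(active_at_checkpoint), set()
    for r in log:
        if r.lsn <= checkpoint_lsn:
            continue
        if r.kind == "BEGIN":
            losers.add(r.txn)            # tentatively a loser
        elif r.kind == "COMMIT":
            losers.discard(r.txn)        # promoted to winner
            winners.add(r.txn)
    return winners, losers


def restart(log: List[Rec], pages: Dict[str, list],
            active_at_checkpoint: Iterable[str],
            checkpoint_lsn: int, min_lsn: int) -> Tuple[Set[str], Set[str]]:
    """pages maps page name -> [value, LSN(P)] and is repaired in place."""
    winners, losers = classify(log, active_at_checkpoint, checkpoint_lsn)
    for r in log:                        # redo pass, forward from MINLSN
        # Assumption: redo anything that is not a loser (see lead-in).
        if r.kind == "UPDATE" and r.txn not in losers and r.lsn >= min_lsn:
            if r.lsn > pages[r.page][1]:         # not yet on the page
                pages[r.page] = [r.new, r.lsn]
    for r in reversed(log):              # undo pass, backward from log end
        if r.kind == "UPDATE" and r.txn in losers:
            if r.lsn <= pages[r.page][1]:        # update reached the page
                pages[r.page][0] = r.old
    return winners, losers
```

The high-water mark comparison against LSN(P) is what lets each pass skip updates that are already on the page (redo) or that never reached it (undo).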

This discussion of restart is very simplistic. Many systems have added mechanisms to speed restart by:

a) Never writing uncommitted objects to non-volatile storage (that is, preventing stealing), so that undo is never required.

b) Writing committed objects to secondary storage at phase 2 of commit (forcing), so that redo is only rarely required (this maximizes MINLSN).

c) Logging the successful completion of a write to secondary storage. This minimizes redo.

d) Forcing all objects at system checkpoint, thus maximizing MINLSN.

