Patent 1213064 Summary

(12) Patent:	(11) CA 1213064
(21) Application Number:	449475
(54) English Title:	ASYNCHRONOUS CHECKPOINTING SYSTEM FOR ERROR RECOVERY
(54) French Title:	SYSTEME DE JALONNEMENT ASYNCHRONE POUR LA CORRECTION DES ERREURS
Status:	Expired

Bibliographic Data

(52) Canadian Patent Classification (CPC):	354/222
(51) International Patent Classification (IPC):	G06F 12/16 (2006.01)
(72) Inventors :	FINLEY, RUFUS E. (United States of America)
(73) Owners :	BURROUGHS CORPORATION (Not Available)
(71) Applicants :
(74) Agent:	R. WILLIAM WRAY & ASSOCIATES
(74) Associate agent:
(45) Issued:	1986-10-21
(22) Filed Date:	1984-03-13
Availability of licence:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	No

(30) Application Priority Data:

Application No.	Country/Territory	Date
475,145	United States of America	1983-03-14

Abstracts

English Abstract

ASYNCHRONOUS CHECKPOINTING SYSTEM FOR ERROR RECOVERY
Abstract
A method of recovering from an error condition during
operation of a program that is modifying a data base
without corrupting the data base, wherein the program
includes calls to record the progress of the operation in
a table in memory. On the occurrence of an error
condition, the tables for all programs in operation are
transferred to a disk. During error recovery, the tables
are returned to memory where the information stored in
the respective tables is used by each active program to
restore operation of the particular program to a point
where the operation can be completed without corrupting
the data base. Each program is designed to interrogate
its own recovery table following the occurrence of an
error condition to restore operation at a point where the
integrity of the data base is assured.

Claims

Note: Claims are shown in the official language in which they were submitted.

The embodiments of the invention in which an exclusive
property or privilege is claimed are defined as follows:-

1. In a data base management system having a
plurality of application programs capable of concurrently
executing a plurality of tasks with respect to a data base
located in a non-volatile bulk memory, a method of providing
said system with the capability of recovering from an
unexpected interruption in system operation without
corrupting the data base, said method comprising:
allocating a task recovery area in a random access memory
during the initial processing of each task which may
modify the data base;
independently recording task recovery data during the
execution of each such task in its respective task
recovery area to the extent required to provide
recovery for the task without data base corruption
in the event of an unexpected interruption in system
operation while the task is executing, said
independently recording task recovery data occurring
in response to instructions embedded in the
application programs used for executing the task;
de-allocating each task recovery area in response to
deactivation of the task as a result of its
successful completion;
transferring recovery data in said task recovery areas
to said bulk memory in response to the occurrence of
an unexpected interruption in system operation while
a task is executing;
during system recovery operations, returning the recovery
data transferred to said bulk memory in response to
said interruption back to said task recovery areas
in said random access memory; and

then independently performing task recovery operations
for each task which was active when the unexpected
interruption occurred using the recovery data
returned to its respective task recovery area, said
task recovery operations being performed in a manner
such that the integrity of the data base is
maintained, said independently performing task
recovery operations occurring in response to
instructions embedded in the application programs
used for executing the task.

2. The invention in accordance with claim 1, wherein
the recovery data stored by each active task in its task
recovery area during task execution includes task progress
data.

3. The invention in accordance with claim 2, wherein
the step of independently performing task recovery operations
for each task includes interrogating the respective task
recovery area and in response to said progress data either
completing the task or backing out of the task and then
re-executing the task.

4. The invention in accordance with claim 1, including
the steps of recording an active task identification in an
active task table in said random access memory when a task is
activated, transferring said table to the bulk memory in
response to said interruption along with said task recovery
areas, and returning said active task table to said random
access memory along with said task recovery areas during said
system recovery operations.

5. The invention in accordance with claim 4, including
the step of interrogating said table to determine the tasks
for which the step of independently performing task recovery
operations are to be performed.

6. The invention in accordance with claim 5, wherein
the step of allocating includes storing data identifying the
locations of allocated task recording areas in said table,
and wherein the steps of independently recording task
recovery data and independently performing task recovery
operations refer to said table for determining the location of
the respective recovery area allocated for each task.

7. The invention in accordance with claim 6, further
including the steps of recording a flag in each task recovery
area indicating if the data recorded therein is ready for use
during recovery, testing the flag in response to said
interruption, and inhibiting the transfer of a task recovery
area to the bulk memory until the flag indicates that the
data is ready for use during recovery operations.

8. The invention in accordance with claim 7, further
including the steps of determining when said task recovery
tables have been stored in said bulk memory, and in response
thereto initiating system recovery operations.

9. The invention in accordance with claim 1, wherein
a task may comprise a plurality of nested activities, wherein
the step of allocating includes dividing the task recovery
area of such a task into linked activity recovery areas, and
wherein the step of independently recording for such a task
includes recording recovery data during the performance of
each nested activity in a respective activity recovery area.

Description

Note: Descriptions are shown in the official language in which they were submitted.

ASYNCHRONOUS CHECKPOINTING SYSTEM FOR ERROR RECOVERY
Field of the Invention
This invention relates to a data processing system
for controlling a data base and, more particularly, is
directed to a method and means for preventing data base
corruption as a result of an unexpected system shutdown.
Background of the Invention
In a data base management system, for example, for
storing, updating and retrieval of information, such as
data items stored in the form of records in one or more
files, unexpected system shutdown may result in corruption
of the data base and cause problems in restarting the
data base management system. One well-known technique
is to make a permanent updated record of the data base
at fixed intervals of time, such as the beginning of a
day, or the start of a new shift, or other convenient
time. If the system experiences a shutdown due to power
failure or some other problem, an uncorrupted data base
can be duplicated using the backup recording and then
repeating the operations that modify the data base from
the time of the last backup recording. Such a system has
obvious drawbacks in that duplicating a single day's or
single shift operation to update the data base at best
may involve many man-hours of effort and at worst may be
impossible to reconstruct.
The concept of checkpointing has been proposed which
provides for automatically halting processing at controlled
intervals to make a magnetic tape or disk recording of the
condition of all variables of the machine run. In the event
of an error or interruption, restart procedures make it
possible to continue processing from the last checkpoint
rather than from the beginning of the machine run. Such known
checkpointing techniques are under automatic system control
in which checkpoints are established at processing intervals
based on a certain number of items, transactions, or records
having been processed. At each checkpoint, input and output
records must be recorded along with the contents of storage
areas and memory, as well as the contents of counters and
registers in the processor. After an error or other
interruption, the accuracy of processing up to that point must
be verified and a restart procedure selected which re-enters
the main routine at that point.
According to the present invention there is provided in
a data base management system having a plurality of application
programs capable of concurrently executing a plurality of
tasks with respect to a data base located in a non-volatile
bulk memory, a method of providing said system with the
capability of recovering from an unexpected interruption in
system operation without corrupting the data base, said method
comprising allocating a task recovery area in a random access
memory during the initial processing of each task which may
modify the data base; independently recording task recovery
data during the execution of each such task in its respective
task recovery area to the extent required to provide recovery
for the task without data base corruption in the event of an
unexpected interruption in system operation while the task is
executing, said independently recording task recovery data
occurring in response to instructions embedded in the
application programs used for executing the task; de-allocating
each task recovery area in response to deactivation of the task
as a result of its successful completion; transferring
recovery data in said task recovery areas to said bulk memory
in response to the occurrence of an unexpected interruption
in system operation while a task is executing; during system
recovery operations, returning the recovery data transferred
to said bulk memory in response to said interruption back to
said task recovery areas in said random access memory; and
then independently performing task recovery operations for
each task which was active when the unexpected interruption
occurred using the recovery data returned to its respective
task recovery area, said task recovery operations being
performed in a manner such that the integrity of the data base
is maintained, said independently performing task recovery
operations occurring in response to instructions embedded in
the application programs used for executing the task.
In one embodiment there is provided an improved method
and means for error recovery to prevent data base corruption
as a result of unexpected system shutdown. The error
recovery system may be called "distributed asynchronous
checkpointing" since it is under software application control
rather than automatic system control. The application
software provides for the continuous recording of information
needed to resolve data base inconsistencies or restart in the
event of a system shutdown. The application software
determines what data is temporarily stored in memory by the
current task program and when it is to be recorded based on
the particular function the software is currently performing.
Even though several tasks may be active concurrently, each
task checkpoints itself independently. A task as used herein
is a collection of individual programs that operate as a
single transaction performing a pre-specified function. Each
task records this error recovery information in an area in
memory which is identified by a unique identification
number. If a task or particular activity within a task is
completed, its recorded error recovery information is erased.
However, if the data management system experiences an
unexpected shutdown, all currently recorded error recovery
information is transferred to the permanent storage, such as
a disk. During a subsequent error recovery operation, the
recorded information is returned to the random access storage;
each task that was currently active when the system shut down
can then interrogate the recorded information and take
appropriate action to correct any inconsistencies based on
the recorded information.
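As a rough illustration of the lifecycle just summarized, the following Python sketch models per-task recovery areas that are erased on completion, written to permanent storage on an unexpected shutdown, and reloaded for recovery. The class name RecoveryStore, its method names, the sample task identifiers, and the use of a JSON file to stand in for the disk are all assumptions made for illustration; the specification does not prescribe them.

```python
import json
import os
import tempfile


class RecoveryStore:
    """Hypothetical model of per-task recovery areas held in memory."""

    def __init__(self):
        self.areas = {}  # task id -> recovery data recorded by that task

    def record(self, task_id, **data):
        # Each task checkpoints itself independently, whenever it chooses.
        self.areas.setdefault(task_id, {}).update(data)

    def complete(self, task_id):
        # A completed task's recovery information is erased.
        self.areas.pop(task_id, None)

    def dump(self, path):
        # On unexpected shutdown, all recorded areas go to permanent storage.
        with open(path, "w") as f:
            json.dump(self.areas, f)

    @classmethod
    def restore(cls, path):
        # During recovery, the recorded information returns to memory.
        store = cls()
        with open(path) as f:
            store.areas = json.load(f)
        return store


store = RecoveryStore()
store.record("task-1", phase="allocating", file="orders")
store.record("task-2", phase="writing")
store.complete("task-2")  # finished normally, nothing left to recover

path = os.path.join(tempfile.mkdtemp(), "recovery.json")
store.dump(path)  # simulated unexpected shutdown
recovered = RecoveryStore.restore(path)
```

Only the task that was still active at the simulated shutdown has an area to interrogate afterwards, which is the property the text relies on.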
An embodiment of the present invention will now be
described by way of example, with reference to the
accompanying drawings in which:-
FIG. 1 is a block diagram of a digital data processing
system embodying means and methods in accordance with the
present embodiment;
FIGS. 2, 3 and 4 are flow diagrams illustrating the
overall operation of the data processing system of FIG. 1;
FIG. 5 is a chart of the recovery tables in memory;
FIG. 6 is a flow chart of the system operation under
normal operation;
FIG. 7 is a flow chart of system operation with error
recovery in progress;
FIG. 8 is a block diagram showing schematically the
operation of a multi-program system having error recovery
recording in memory;
FIGS. 9-19 are flow diagrams of subroutine calls used to
implement the error recovery operation; and
FIGS. 20-23 are flow diagrams showing an example of a
task using the error recovery system.

Detailed Description
The error recovery system of the present embodiment
may be incorporated in any data processing system in
which programs are executed that request block
allocations in a bulk memory used for storing a data base,
or that change data stored in the bulk memory. Such
operations, if unexpectedly interrupted before completion,
may corrupt the data base. Each program involving these
types of activities contains calls to "activity recording
services" to record sufficient information to support an
orderly recovery following any unexpected interruption.
FIG. 1 shows an example of a typical hardware system
which supports a data base stored in a bulk memory 16
such as a magnetic disk memory. The setting up of files
on the disk, the storage of data in such files, or the
changing of data in such files is under user control from
a plurality of terminals, three of which are indicated at
10, 12, and 14 in FIG. 1. These terminals allow the
users to communicate with the data base stored on the
disk memory 16 by means of keyboards 18 at each terminal.
Interaction with the user is provided by a CRT display 20
at each terminal.
Transfer of digitally coded information into and
out of the user terminals is controlled by a processor
22 over a common bus 24. The terminals are connected to
the bus through an interface 26. The processor 22 is
controlled by programs stored in a read only memory (ROM)
28. The processor uses a random access memory (RAM) 30
as a temporary storage during execution of the program.
The RAM has an auxiliary power source 32 which maintains
power to the RAM in case of power failure to the system.
Transfer of information to and from the disk memory 16
is through a disk controller 34 connected to the main bus
24. It will be seen that FIG. 1 represents conventional
architecture of a digital data processing system.
Referring to FIGS. 2, 3 and 4, the system operation
under control of programs stored in the ROM 28 may be
summarized as follows:
Once power is turned on to the system, as indicated
at 36, a check is made (see 38 on FIG. 2) to determine
whether there was a prior power failure which interrupted
operation of the system. A power failure is one of the
unexpected interruptions which causes shutdown of the
system and invokes the error recovery procedure on
restoration of power. Whenever there is a power failure
during system operation, a Power Failure flag is set in
a specific location in RAM 30 which is retained in RAM
30 by the auxiliary power source 32. The manner in
which the Power Failure flag is set is described below.
Assuming that the system is in an initial startup mode,
the Power Failure flag will be false.
The system then enters an initial startup phase,
indicated at 40, which "boots" the system and starts
reading into the RAM 30 from the disk memory 16. This
places the system in an initial operating phase.
The first operation of the system is to read two
Status flags stored at a predetermined location on the
disk and load these flags into predetermined locations
in the RAM 30. As will be described in detail below,
one flag indicates the status of the data base, namely,
good or bad, and the other flag indicates whether or not
a recovery table is stored on the disk. Following an
initial power-up, both of these flags will be false.
However, if there has been a prior unexpected interruption
in the operation of the system due, for example, to a
software detected error or a power failure, both of
these flags will be true. Once the Data Base Status flag
and the Recovery Table Status flag are stored in the RAM
30, these flags as stored on the disk are both reset to
false in preparation for a possible uncontrolled shutdown
of the system, as indicated at 44.
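The flag handling at startup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: plain dictionaries stand in for the disk and for RAM 30, and the function name read_status_flags and the key names are hypothetical.

```python
def read_status_flags(disk, ram):
    """Copy both status flags from disk to RAM, then clear the disk copies."""
    for name in ("data_base_status", "recovery_table_status"):
        # A flag missing from the disk is treated as false (initial power-up).
        ram[name] = disk.get(name, False)
        # Reset the stored copy in preparation for a possible
        # uncontrolled shutdown.
        disk[name] = False
    return ram


# Disk state left behind by a prior unexpected interruption (assumed values).
disk = {"data_base_status": True, "recovery_table_status": True}
ram = read_status_flags(disk, {})
```

After the call, the RAM copies carry what the previous run left behind, while the disk copies are false again until some later shutdown path rewrites them.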
If the Data Base Status flag transferred from the
disk to RAM 30 is true, indicating that the data base
is corrupted due to an unexpected shutdown, a warning
light is turned on at each of the user terminals, as
indicated at 46 and 48 in FIG. 2. As shown at 50 in
FIG. 3, the Recovery Table Status flag as stored in
memory 30 is then checked. If a recovery table is
present on the disk, as indicated by the Recovery Table
Status flag being true, an error recovery procedure is
required. However, assuming for the moment that the
system is in an initial power-up phase, the Recovery
Table Status flag will be false. This sets the Recovery
in Progress flag in the RAM 30 to false, as indicated at
52. The system then causes all defined application and
system tasks to be activated in anticipation of a user
request from one of the terminals, as indicated at 54.
Assuming that the Recovery in Progress flag is
false, as determined at 56 in the flow diagram of FIG. 3,
the operating program then calls an "activity anchor
initialize" subroutine. The function of this subroutine
is to set up a table of task anchors in memory for each
of the tasks performed by any application programs
called in response to user requests of a type that involve
an error recovery procedure. User requests that only
interrogate the data base do not corrupt the data base
and so do not invoke any error recovery procedure.
As seen in FIG. 5, task anchors for up to 50 tasks, for
example, are established in RAM 30 at known locations.
Each task anchor includes a forward link address and a
backward link address which later are set to point to
the first and last activity recording areas for the
associated task. These activity recording areas are used
to store error recovery data during execution of the task
programs. A task may have one or more activities which
modify the data base. If only one activity recording
area is required for a particular task, the forward link
and backward link in the task anchor will point to the
same activity recording area. The initialized task
anchors have the links pointing to the task anchor
address. A Size word in the task anchor is set to the
size of the anchor table, a Lower Status word is set to
zero, and a Checksum is calculated and stored. This is
done for all the task anchors during initialization. At
this point, the operating program indicates to the
terminal users that the system is ready to operate.
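A minimal Python sketch of the anchor initialization described above follows. The field layout, the base address, the anchor size, and the checksum algorithm (a 16-bit sum of the other header words) are assumptions chosen for illustration; the text states only that the links initially point to the anchor's own address, that the Lower Status word is zero, and that a Checksum is calculated and stored.

```python
from dataclasses import dataclass

MAX_TASKS = 50  # the text gives 50 only as an example


@dataclass
class TaskAnchor:
    address: int
    forward: int = 0
    backward: int = 0
    size: int = 0
    lower_status: int = 0
    checksum: int = 0

    def compute_checksum(self):
        # Assumed algorithm: 16-bit sum of the other header words.
        total = self.forward + self.backward + self.size + self.lower_status
        return total & 0xFFFF


def initialize_anchors(base_address=0x1000, anchor_size=10):
    """Build the anchor table with every anchor linked to itself."""
    anchors = []
    for i in range(MAX_TASKS):
        addr = base_address + i * anchor_size
        anchor = TaskAnchor(address=addr, forward=addr, backward=addr,
                            size=anchor_size)
        anchor.checksum = anchor.compute_checksum()
        anchors.append(anchor)
    return anchors


anchors = initialize_anchors()
```

An anchor whose forward and backward links both equal its own address therefore represents a task with no activity recording areas yet.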
Referring to the flow diagram of FIG. 4, as indicated
at 6~, the data processing system now functions in a
normal manner to allow the users to allocate new files in
the data base, read out data, add additional information
to the file in the data base, or initiate any other
transactions which the system is programmed to perform in
managing the data base files.
Referring to FIG. 6, normal operation of the system
with or without error recovery is summarized. Tasks 1-N
operate at the input level to provide input from the
respective terminals. Thus task 1 at the input level
causes User #1 terminal to display a menu from which the
user selects a particular option, e.g., create a file.
Task 2 does the same at User #2 terminal and task N, of
course, to the User #N terminal. Each task responds to
the particular option selected by the user and prompts
the user to enter additional information, depending on
the option selected. For example, if User #1 selects an
option to create a new file in the data base, task 1 will
then prompt User #1 to enter additional information such
as the file name, description and fields. The input task
is completed by sending the new transaction to a
transaction scheduler, which is a program task for
initiating the transactions called by the several user
terminals.
As noted above, each task is a collection of
individual programs that operate as a single transaction
performing a prespecified function. In FIG. 6, broken
lines separate the tasks to indicate that each task may
be active concurrently with other tasks. The processor
switches between tasks using interrupts to provide a
conventional multi-programming operation. The scheduler,
which is a separate task, receives the specified
transactions from the input tasks and saves them in a
transaction queue on the disk. The queue identifies
each transaction called for by a user and stores the
associated input data from the user terminal. The
scheduler selects a transaction in the queue to process
and uses the transaction to activate an appropriate task
to be executed by the processor. The scheduler removes
the transaction from the queue when the scheduled task
is completed. The scheduler causes the particular task
to be executed, as indicated at 150 of FIG. 6. Under
normal conditions, the Recovery in Progress flag is false
since the task has not yet been initiated. When a
particular transaction is complete, as indicated at 152,
the scheduler is notified and the transaction is removed
from the disk queue.
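The scheduler behaviour described above, in which a transaction stays in the queue until its task reports completion, can be sketched as follows. The class TransactionScheduler, its methods, and the create_file task are hypothetical names, and an in-memory deque stands in for the transaction queue that the text places on the disk.

```python
from collections import deque


class TransactionScheduler:
    """Hypothetical scheduler; a deque stands in for the disk queue."""

    def __init__(self, tasks):
        self.tasks = tasks    # transaction kind -> callable task
        self.queue = deque()  # pending transactions, oldest first

    def submit(self, kind, data):
        # Input tasks hand completed requests to the scheduler.
        self.queue.append((kind, data))

    def run_next(self, recovery_in_progress=False):
        # Select a transaction but leave it queued while the task runs.
        kind, data = self.queue[0]
        status = self.tasks[kind](data, recovery_in_progress)
        # Remove the transaction only after the task has completed.
        self.queue.popleft()
        return status


done = []


def create_file(data, recovery_in_progress):
    # Assumed example task: pretend to create a file in the data base.
    done.append(data["name"])
    return "complete"


sched = TransactionScheduler({"create_file": create_file})
sched.submit("create_file", {"name": "orders"})
status = sched.run_next()
```

Because removal happens after the task returns, a task that is interrupted part-way leaves its transaction in the queue, which is what allows it to be found again later.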
During normal operation in the execution of any
assigned task by the data processing system, the system
continuously monitors three conditions which result in
termination of operation. As shown in FIG. 4 at 64, if
a user turns the key off on the system, all currently
requested transactions are completed, the Data Base
Status flag is set to true and stored in the disk, as
indicated at 68, and then the power is shut off. A
second condition which is monitored is a software detected
error. The programs are written to contain ample checks
to detect common types of program interface errors,
unexpected changes to sensitive data in memory,
inconsistent links in data base storage, out of range
offset values and unexpected returns from call procedures.
If any such condition is detected, as indicated at 70,
the status of a Recovery in Progress flag is checked to
determine whether the system is currently operating in a
normal mode or in a recovery mode. Assuming that the
system is currently operating in the normal mode so that
the Recovery in Progress flag is false, the recovery
tables and the active task table in memory are transferred
to the disk, and the Recovery Table Present Status flag
on the disk is set to true, as indicated at 94 and 96.
The Data Base Status flag, which may be either good or
bad at this stage, is also stored on the disk (see 98)
and the system returns to the startup mode starting at
"A" in FIG. 2. If the system is already in a recovery
mode at the time a software error is detected, the system
sets the Data Base Status flag in memory true,
indicating that the data base status is bad. The Data
Base Status flag is then written to disk and the system
returns to the startup mode at "A".
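The two branches taken on a software detected error can be summarized in a short Python sketch. Dictionaries stand in for RAM and the disk, and the function name on_software_error and all key names are hypothetical; only the branching logic is taken from the text.

```python
def on_software_error(ram, disk):
    """Handle a software detected error; return the restart label."""
    if not ram["recovery_in_progress"]:
        # Normal mode: save the recovery tables and the active task
        # table, and mark a recovery table as present on the disk.
        disk["recovery_tables"] = dict(ram["recovery_tables"])
        disk["active_task_table"] = list(ram["active_task_table"])
        disk["recovery_table_status"] = True
    else:
        # Error during recovery itself: mark the data base as bad
        # (true means "bad" in the text).
        ram["data_base_status"] = True
    # In both cases the Data Base Status flag is written to the disk.
    disk["data_base_status"] = ram["data_base_status"]
    return "A"  # restart at entry point "A" of FIG. 2


ram = {
    "recovery_in_progress": False,
    "data_base_status": False,
    "recovery_tables": {"task-1": {"phase": 2}},
    "active_task_table": ["task-1"],
}
disk = {}
label = on_software_error(ram, disk)
```

Note that in the recovery-mode branch no tables are written, so a second failure during recovery does not overwrite whatever was saved by the first.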
A third unexpected event which is detected, as
indicated at 102 in FIG. 4, is a complete power failure.
A power failure does not result in an immediate loss of
information stored in RAM because the RAM is provided
with a backup battery power supply 32. A power failure
causes the Power Failure flag stored in the RAM to be
set to true, and the system remains in this condition
until power is restored. Once power is restored to the
system, system operation returns to the startup mode,
indicated at "A" on the flow diagram of FIG. 2.
Since the Power Failure flag is now true, the system
checks to see if a recovery was in progress. If not, as
indicated at 110 in FIG. 2, the recovery table and
active task table in RAM are stored on the disk along
with the Data Base Status flag. The Recovery Table
Present Status flag is set to true and written on the
disk as well. After the Power Failure flag is reset,
the system operation returns to the startup mode.
The startup operation is identical to that described
above in connection with FIG. 2. When system operation
reaches the point where it checks to determine whether a
recovery table is present, as indicated at 50 in FIG. 3,
if operating in a recovery mode, it will find that this
flag is true. In this case, the recovery table and the
active task table which were stored on the disk as the
result of a power failure or a software detected error,
are transferred from the disk back into RAM, as
indicated at 1~0 in FIG. 3. A Recovery in Progress flag
in memory is then set to true.

When this flag is checked, as indicated at 56 in
the flow diagram of FIG. 3, the system will now find
that the flag is true, indicating that a recovery is in
progress. This condition is displayed on all the
terminals to tell the users to stand by. The system
then schedules a task to perform application error
recovery for each active task identified in the active
task table, as indicated at 126 and described in detail
in connection with the flow diagram of FIG. 7. When
application error recovery has been completed for each
of the active tasks, the Recovery in Progress flag is
reset to false and the system is ready to resume normal
operation.
Referring to FIG. 7 in detail, if a recovery is in
progress, as determined at 56, N is set to the number of
transactions in the active task table in memory. If N
is not equal to zero, the first transaction identified in
the active task table is sent to the scheduler. The
transaction scheduler, just as in normal operation
described in connection with FIG. 6, saves the
transaction in the disk queue. The scheduler then checks
to determine whether the recovery and the transaction are
complete, as indicated at 131. If so, the complete
status causes the scheduler to remove the transaction
from the queue and returns operation to the system. The
value of N is then decremented by 1, as indicated at 133.
If N is still not zero, the next transaction in the
active task table is sent to the scheduler. If a
transaction is not complete, the transaction scheduler
selects a transaction in the disk queue to process and
sends the transaction to the appropriate task to be
processed, as indicated at 135 and 137 in FIG. 7. The
particular task receives and processes the transaction
with the Recovery in Progress flag set true, as indicated
at 139. When the task is complete, it sends the
transaction back to the scheduler with a complete status,
as indicated at 141. It will be seen that the operations
of FIGS. 6 and 7 are similar except that the transactions
are received from the active task table rather than from
the terminals, and the particular task called for by the
transaction is processed with the Recovery in Progress
flag set to true rather than being set to false.
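The recovery pass of FIG. 7 reduces to a counted loop over the active task table. The Python sketch below is a simplification under stated assumptions: the scheduler and its disk queue are collapsed into a direct call, and the names run_recovery, update_record, and the table layout are hypothetical.

```python
def run_recovery(active_task_table, tasks):
    """Replay each active transaction with the recovery flag set true."""
    state = {"recovery_in_progress": True}
    log = []
    n = len(active_task_table)  # N in FIG. 7
    index = 0
    while n != 0:
        kind, data = active_task_table[index]
        # The task runs its own embedded recovery logic because it
        # sees the Recovery in Progress flag set to true.
        status = tasks[kind](data, state["recovery_in_progress"])
        log.append((kind, status))
        n -= 1
        index += 1
    # All active tasks recovered: resume normal operation.
    state["recovery_in_progress"] = False
    return state, log


seen_flags = []


def update_record(data, recovery_in_progress):
    # Assumed example task that notes which mode it was called in.
    seen_flags.append(recovery_in_progress)
    return "complete"


table = [("update_record", {"id": 1}), ("update_record", {"id": 2})]
state, log = run_recovery(table, {"update_record": update_record})
```

The same task code serves both normal operation and recovery; only the flag passed to it differs, which mirrors the similarity between FIGS. 6 and 7 noted in the text.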
As pointed out above, each task involving modification
of the data base for processing a transaction
initiated by a user terminal incorporates its own recovery
procedures within the task program. Each program is
responsible for recording the information that is
necessary for error recovery to be successful. These
recordings occur to identify the phase in which the
program is currently processing, and the critical
information is collected and grouped by task in the error
recovery table. While this procedure can be implemented
for each task as a whole, it may be more convenient in
designing the error recovery procedure for a given task
to be divided into a number of separate nested activities
with each activity recording data in memory. If an
activity is completed successfully, the recorded recovery
data for that activity may be deleted from memory.
Thus, as shown in FIG. 8, the data base management system
uses the scheduler, indicated at 150, to activate the
tasks necessary to process the transactions initiated by
the various user terminals. Each task has a recording
area in RAM, indicated at 152, in which error recovery
data is stored. Each task may be subdivided into nested
subroutines, indicated as program A, program B, and
program C under task No. 1. Each subroutine, referred
to as an activity level, provides error recording with
an associated area in RAM.
Referring again to FIG. 5, each task has an area set
aside in memory, referred to as a task anchor. The anchor
for each task comprises a heading for storing link
addresses or pointers to first and last activity recording
areas in RAM. These activity recording areas are set
aside by each activity within a task. Thus, as shown in
FIG. 5, the first or highest level activity sets aside a
first recording area. A forward link address pointing to
the location of the first recording area is stored in the
task anchor. A backward link pointing to the same
recording area is also stored in the task anchor.
An activity header for the first recording area for each
task is now provided with a forward link and a backward
link, both of which point to the task anchor, assuming
there is only a single recording area for that task. If
there are two or more recording areas, such as would be
required for program B and program C, these additional
recording areas are provided with forward and backward
links which provide both forward and backward loops for
linking the task anchor and each of the activity recording
areas for the particular task in a "chain". The activity
header for the anchor and each activity, in addition to
the forward and backward links, includes the size of the
recording area, a Lower Status flag, which indicates
the status of the next lower activity level in the task,
and a Checksum value. In addition, a Data Ready flag
is provided in each recording area and set true when
the activity recovery data is initialized. This flag
is checked during recovery as a precaution against use
of incomplete data. A recovery identification indicating
the activity level within the task is also recorded.
Individual data items stored in an activity
recording area are set by calls embedded in the
application program at the particular activity level
to "activity set byte", "activity set word" and "activity
set double word". Each of these calls is used to assign
a value to a byte, word or double word variable in the
recovery data structure. An "activity start" call
procedure is used to mark the start of an activity for
recovery purposes. It establishes the number of bytes of
local data to be reserved for recording information for
the particular activity level and establishes a pointer
variable for the base structure which defines the local
data for the particular activity level. Two other
procedure calls for the error recovery recording services
are required, an "activity data ready" call which is used
to mark the fact that the variables in the current
recovery data structure have been assigned initial values,
and an "activity end" call which marks the exit from an
activity and operates to discard all data associated with
the current activity level. Each of the above listed
call procedures used for recording error recovery data
during execution of a task is described in more detail in
connection with the flow diagrams shown in FIGS. 9-16.
The "activity start" call, shown in detail in FIG.
9, is used to establish a recording area during execution
of an activity within a particular task. As pointed out
above, a single task may consist of a single activity
or a series of activities, as outlined above in
connection with FIG. 8. When an "activity start" call
procedure is encountered during execution of a program,
a subroutine is executed which first checks to make sure
that the Recovery in Progress flag is false, since the
recording services are not invoked during a recovery. If
a recovery is in progress, the "activity start" call
procedure immediately terminates and returns to the
activity program. However, if a recovery is not in
progress, the "activity start" program then builds a
recording activity area in memory by first calculating
the record size required. This is determined by adding
the number of bytes required for the heading of the
recording area to the amount of memory defined by the
application program that is required to store the required
error recovery data. The operating system then sets
aside a buffer area in memory of the required size, and
the memory area is initialized. The "lower status" is
initialized to a Normal status to indicate no lower level
activity has been called.
After the activity recording area is built, the
recording activity area is linked to the task anchor
through the forward link and to the next higher level
recording area through the backward link, as illustrated
in FIG. 5.
After the recording area is established for the
particular activity by the activity start call and
linked to the task anchor and to the next higher
activity level recording area, the checksums for the new
recording area, the previous recording area and the
anchor are all recalculated and stored in the respective
headings to complete the activity start operation. Also
the "lower status" in the previous or next higher
activity level heading is set to No Data to indicate no
data has yet been stored in the current activity
recording area.
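
The "activity start" steps described above can be sketched
as follows. This is an illustration under stated
assumptions, not the patented implementation: the Area
class, the HEADER_BYTES figure and the checksum algorithm
are hypothetical, and the checksum here does not cover the
link addresses.

```python
from typing import Optional

HEADER_BYTES = 16  # assumed header size; the description gives no figure


class Area:
    """Anchor or activity recording area (header fields plus local data)."""

    def __init__(self, local_bytes: int = 0) -> None:
        self.size = HEADER_BYTES + local_bytes   # record size = header + local data
        self.data = bytearray(local_bytes)
        self.forward: "Area" = self              # a lone anchor links to itself
        self.backward: "Area" = self
        self.lower_status = "normal"
        self.data_ready = False
        self.checksum = 0

    def seal(self) -> None:
        # Assumed checksum: 16-bit sum of size, status text, flag and data.
        self.checksum = (self.size + sum(self.lower_status.encode())
                         + int(self.data_ready) + sum(self.data)) & 0xFFFF


def activity_start(anchor: Area, local_bytes: int,
                   recovery_in_progress: bool) -> Optional[Area]:
    if recovery_in_progress:           # recording services are idle during recovery
        return None
    new = Area(local_bytes)            # allocated and initialized, status "normal"
    previous = anchor.backward         # current last area (the anchor if none)
    new.backward = previous            # backward link: next higher level
    new.forward = anchor               # forward link: back to the task anchor
    previous.forward = new
    anchor.backward = new
    previous.lower_status = "no data"  # nothing recorded at the new level yet
    for area in (new, previous, anchor):
        area.seal()                    # recalculate all three checksums
    return new
```

With this arrangement the anchor's forward link always
reaches the highest activity level and its backward link
the current one, which is what the later calls use to find
their recording area.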
Once a recording area is established in memory for
use by a particular activity, the application program
within that activity may store a byte, a word (two bytes)
or a double word (four bytes) into the recording area in
response to a set byte call, a set word call or a set
double word call, shown respectively in FIGS. 10, 11, and
12. In each instance, a check is first made of the
Recovery in Progress flag. If it is not true, the program
finds the activity recording area, as shown in FIG. 15,
finds the activity data item location in the data
recording area, as shown in FIG. 16, and stores the
byte, word, or double word, as the case may be, in the
item location. To find the activity record, as shown in
FIG. 15, the operating system first gets the task I.D.
from a predetermined location in memory where the
operating system stores the I.D. of the active task. The
task I.D. determines the location of the task anchor,
and the task anchor provides a backward link pointing to
the last activity recording area for the particular task.
To find the address of the activity data item where
the byte, word or double word is to be stored, the offset
of the data address is computed knowing the base address
of the recording area and the number of bytes used for
the header of the recording area. A new checksum is
determined and stored in the header after the byte, word
or double word has been stored in the memory. Thus by
using one of these three calls, an application program
can provide for error recovery by storing information
generated during the execution of the program in a
specified recording area set aside in memory.
When the application program has stored enough
information in the recording area in memory to be able to
make a recovery following a software error or power
failure, the Data Ready flag in the recording area is
set using an "activity data ready" call, as shown in FIG.
13. Again, after checking to see that the Recovery in
Progress flag has not been set true, the particular
activity recording area in memory is located and the Data
Ready flag in the heading is set true. After calculating
a new checksum, the "lower status" in the header of the
previous activity recording area in the memory is set
to Incomplete, indicating that the next current
activity level has completed the storage of recovery
data but not called "activity end". A new checksum
for the previous or higher level recording area is
recalculated before exiting the "activity data ready"
call.
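
A minimal sketch of the "activity data ready" call
follows, assuming the current and previous recording areas
have already been located. The Header class and its
checksum algorithm are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Header:
    lower_status: str = "normal"
    data_ready: bool = False
    checksum: int = 0

    def seal(self) -> None:
        # Assumed checksum: 16-bit sum of the status text and the flag.
        self.checksum = (sum(self.lower_status.encode())
                         + int(self.data_ready)) & 0xFFFF


def activity_data_ready(current: Header, previous: Header,
                        recovery_in_progress: bool = False) -> None:
    if recovery_in_progress:              # recording is idle during recovery
        return
    current.data_ready = True             # recovery may now trust this area's data
    current.seal()
    previous.lower_status = "incomplete"  # lower level has data but has not ended
    previous.seal()


parent, child = Header(lower_status="no data"), Header()
activity_data_ready(child, parent)
```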
Before exiting a completed activity program, an
"activity end" call is used to remove the associated
recovery recording area in the memory. The "activity
end" call, as shown in detail in FIG. 14, again checks to
determine whether the program is being executed in the
recovery mode. If not, the "activity end" call then
finds the activity record in memory and unlinks the
recording area from the chain which links the anchor to
the recording areas for the other activity levels. Since
this involves changing the forward and backward link
addresses in the anchor and in the previous or next
higher order activity level recording area, a new checksum
is computed for both the previous recording area
heading and for the anchor heading. The recording area
is then released so that that area of memory can be
available for other uses. Also the "lower status" of
the higher level recording area heading is changed to
an End of Data status, indicating the lower level
program routine has been completed.
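
The unlinking performed by "activity end" can be sketched
on a circular chain in which the anchor's backward link
points to the current (lowest) activity level. The Area
class and the link_last helper are hypothetical and exist
only to make the example self-contained; checksum updates
are indicated by a comment.

```python
class Area:
    def __init__(self, name: str) -> None:
        self.name = name
        self.forward = self          # an unlinked area points to itself
        self.backward = self
        self.lower_status = "normal"


def link_last(anchor: Area, new: Area) -> None:
    """Demo helper: append `new` as the lowest activity level."""
    previous = anchor.backward
    new.backward, new.forward = previous, anchor
    previous.forward = new
    anchor.backward = new


def activity_end(anchor: Area, recovery_in_progress: bool = False) -> None:
    if recovery_in_progress or anchor.backward is anchor:
        return                                    # recovery mode, or nothing to end
    current = anchor.backward                     # last area = current activity level
    previous = current.backward                   # next higher level (or the anchor)
    previous.forward = anchor                     # bypass the ending area
    anchor.backward = previous
    current.forward = current.backward = current  # released: drop its links
    previous.lower_status = "end of data"         # lower-level routine has completed
    # Checksums for `previous` and the anchor would be recomputed here.
```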
Special calls are also used during the error
recovery operation to locate the data recorded by the
application program prior to the occurrence of the error
condition. These include the "activity restart" call,
described in connection with the flow diagram shown in
FIG. 17. When the "activity restart" call is encountered,
it gets the task identification from the table in memory.
The task identification points to the task anchor and is
used to obtain the forward link address from the
corresponding task anchor in memory. The forward link
points to the first or highest activity level error
recovery recording area for the particular task, as
described above in connection with FIG. 5. The recovery
identification recorded in the header of the first
recording area is then checked to make sure there is a
match with the user recovery identification previously
stored in the header of the recording area. If there is
a match and the checksum is valid, the Data Ready flag in
the header is checked to make sure active data has been
recorded. If so, a pointer to the beginning address of
the recorded data is set and returned to the main program.
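
The sequence of checks made by "activity restart" can be
sketched as below. The Area and Anchor classes, the
dictionary of anchors keyed by task identification and the
checksum algorithm are assumptions. Returning None when a
check fails is also an assumption, since the description
does not say what the call returns in that case.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Area:
    recovery_id: int
    data: bytearray = field(default_factory=bytearray)
    data_ready: bool = False
    checksum: int = 0

    def compute_checksum(self) -> int:
        # Assumed algorithm: 16-bit sum of the ID, the flag and the data bytes.
        return (self.recovery_id + int(self.data_ready) + sum(self.data)) & 0xFFFF


@dataclass
class Anchor:
    forward: Optional[Area] = None   # first (highest-level) recording area


def activity_restart(anchors: Dict[int, Anchor], task_id: int,
                     user_recovery_id: int) -> Optional[bytearray]:
    first = anchors[task_id].forward   # task ID -> anchor -> forward link
    if first is None:
        return None                    # nothing was recorded for this task
    if first.recovery_id != user_recovery_id:
        return None                    # not the area this activity expects
    if first.checksum != first.compute_checksum():
        return None                    # header or data was damaged
    if not first.data_ready:
        return None                    # data was never fully initialized
    return first.data                  # stands in for the pointer to the data
```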
An "activity lower status" call is also provided
for use by the application program which operates to
return the lower status information stored in the header
of an activity recording area in memory. As shown in
FIG. 18, this call routine gets the task identification
and then obtains the forward link address from the anchor
header. From the highest level activity record, it
obtains the recovery identification stored in the header;
and if it matches the desired activity level, a checksum
is made and the lower status information in the header is
returned to the main program.
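
The value returned by "activity lower status" depends on
which recording calls the lower activity level had made
before the error. The sketch below replays a list of call
names to show the resulting status; the function and table
names are hypothetical, while the four status values and
their meanings are those given in the description.

```python
from enum import Enum
from typing import Iterable


class LowerStatus(Enum):
    NORMAL = "Normal"            # no lower-level activity has been called
    NO_DATA = "No Data"          # lower level started, no data stored yet
    INCOMPLETE = "Incomplete"    # lower level stored its data, has not ended
    END_OF_DATA = "End of Data"  # lower-level routine has completed


# Effect of each lower-level call on the higher level's Lower Status.
EFFECT = {
    "activity start": LowerStatus.NO_DATA,
    "activity data ready": LowerStatus.INCOMPLETE,
    "activity end": LowerStatus.END_OF_DATA,
}


def lower_status_after(calls: Iterable[str]) -> LowerStatus:
    status = LowerStatus.NORMAL            # value set when the area is built
    for call in calls:
        status = EFFECT.get(call, status)  # "activity set ..." calls change nothing
    return status
```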
An "activity restart end" call, shown in FIG. 19,
is used to erase the activity record in memory. It
functions to erase all activity levels associated with
the particular task in memory. The "activity restart
end" call gets the task identification which points to
the anchor header. From the anchor header, it gets the
forward link address which points to the header of the
first recording area for the particular task. If there
is a record, it updates the anchor links to bypass the
activity level and releases the record area in memory.
This loop is repeated until all of the recording areas
associated with a particular task have been released.
A new checksum for the task anchor is then computed,
and the call returns to the main program.
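
The release loop of "activity restart end" can be sketched
on the same kind of circular chain, working from the
anchor's forward link. The Area class is hypothetical, and
the list of released names is returned only so the effect
can be observed.

```python
class Area:
    def __init__(self, name: str) -> None:
        self.name = name
        self.forward = self     # an unlinked area points to itself
        self.backward = self


def activity_restart_end(anchor: Area) -> list:
    released = []
    while anchor.forward is not anchor:            # is there still a record?
        record = anchor.forward                    # first (highest-level) area
        anchor.forward = record.forward            # anchor now bypasses this level
        record.forward.backward = anchor
        record.forward = record.backward = record  # release the area
        released.append(record.name)
    # The anchor checksum would be recomputed here, once, after the loop.
    return released
```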
To better understand the invention, operation of
the system in executing the user task for creating a new
file in the data base is summarized by the flow diagrams
of FIGS. 20, 21, 22, and 23. As described in connection
with FIG. ~, the scheduler initiates the appropriate task
to be processed. In the example, the task to be processed
is the Create File task. When the Create File transaction
is received from the scheduler, the Recovery in Progress
flag is checked. Assuming it is false, so that the normal
operation is in progress, the "activity start" call,
described above in connection with FIG. 9, is initiated
by the task program. As a result, the recording area for
storing error recovery data is set aside in memory and
the heading information of the first or highest activity
level recording area for the particular task is
initialized in the manner described above. In the
present example, it is assumed that the Create File task
does not have any nested activity levels and so no more
than one recording area for error recovery is required.
Once the recording area is set aside in memory, linked
to the task anchor, and initialized, an "activity set
word" call is used to set a progress word in the recording
area to a null condition. An "activity set double word"
call is used to set a detail block number to zero in the
recording area, and an "activity set word" call is used
to set the user file in the recording area to zero. An
"activity data ready" call is then executed (see FIG. 13)
to set the Data Ready flag in the heading of the recording
area to true. The Create File task then proceeds to
reserve the next user file number and create a user file
description block on the disk. The block number is then
recorded in the error recovery recording area by an
"activity set double word" call. An "activity set" call
is then used to change the progress word to indicate the
progress detail by the task.
The Create File task then creates an entry in the
index block on a disk for this new user file, and an
activity set word records a progress word in the recovery
recording area to indicate that the index has been
inserted. The Create New File task then sets a catalog
entry on the disk to indicate that this new user file is
active. An "activity set word" call then records as
data in the activity recording area that the task has
progressed to setting the catalog entry to "active". An
"activity end" call is then executed (see FIG. 14). This
removes the error recovery recording area from memory and
completes the execution of the Create File task. The
transaction is returned to the scheduler as completed.
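
The order of recording calls in the Create File task can
be sketched as below. The Recorder class, the dictionary
standing in for the disk, the item names and the progress
labels other than "detail inserted" are illustrative
assumptions. The sketch also assumes that the reserve
counter holds the next free user file number and that the
user file number is recorded immediately after it is
reserved; the description does not state either point.

```python
class Recorder:
    """Hypothetical stand-in for the recording services of one activity."""

    def __init__(self) -> None:
        self.area = None             # no recording area until "activity start"
        self.data_ready = False

    def activity_start(self) -> None:
        self.area = {}

    def activity_set(self, item: str, value) -> None:
        self.area[item] = value

    def activity_data_ready(self) -> None:
        self.data_ready = True

    def activity_end(self) -> None:
        self.area, self.data_ready = None, False


def create_file(rec: Recorder, disk: dict) -> int:
    rec.activity_start()
    rec.activity_set("progress", None)       # progress word: null condition
    rec.activity_set("detail_block", 0)      # double word
    rec.activity_set("user_file", 0)         # word
    rec.activity_data_ready()

    file_no = disk["reserve"]                # reserve the next user file number
    disk["reserve"] += 1                     # assumed: counter of next free number
    rec.activity_set("user_file", file_no)   # assumed point of recording
    block_no = max(disk["blocks"], default=0) + 1
    disk["blocks"][block_no] = file_no       # description block on the disk
    rec.activity_set("detail_block", block_no)
    rec.activity_set("progress", "detail inserted")

    disk["index"][file_no] = block_no        # entry in the index block
    rec.activity_set("progress", "index inserted")

    disk["catalog"][file_no] = "active"      # catalog entry set to active
    rec.activity_set("progress", "catalog active")

    rec.activity_end()                       # recording area removed
    return file_no
```

Each change on the disk is followed by a recording call,
so that the recording area always shows the last step
known to have completed.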
It will be seen that each particular application
program for executing a particular task involving
modification of the data base has embedded in the program
calls which set up the recording area in memory for the
recovery. Then progress information is recorded in the
recording area in memory and the Data Ready flag is set.
If the task is completed without any error condition
arising, the recording area is released before the task
returns operation to the scheduler.
In the event that an error condition arises before
a task is completed, the task is again scheduled but with
the Recovery in Progress flag set to true, as described
above in connection with FIG. 7. If the Recovery in
Progress flag is true, the program calls a Create File
Recovery program, as shown in the flow diagram of FIG.
20. The recovery subroutine is shown in detail in the
flow diagram of FIGS. 22 and 23. The subroutine first
calls an "activity restart" which, as described above in
connection with FIG. 17, returns a pointer to the
associated recording area in memory where the recovery
data was recorded prior to the occurrence of the error
condition. Using the pointer, the program checks the
status of the recording area to determine whether there
is active data recorded. If no data is recorded, the
program checks to see whether the catalog entry on the
disk is active for the particular user file. If the
catalog entry on the disk is not active, this indicates
that the transaction was completed at the time the error
was encountered. The program therefore calls an "activity
restart end". This call returns a "normal" status which
causes the transaction to be sent to the scheduler as
complete. If the catalog entry on the disk is active,
the return status is set to "incomplete" to indicate the
transaction has not started. Under these conditions,
the program returns to the main program of the Create
File task, as indicated in FIG. 20, to be rescheduled.
If there has been data recorded in the recording
area, this indicates that the task had been partially
executed at the time the error condition occurred. In
this case, as shown by the flow diagram of FIG. 22, the
program checks to determine if the recorded user file
number is still zero. If so, the recovery operation
is terminated by setting the return status to "incomplete"
and returning the transaction to the scheduler. However,
if a file has been created so that the file number is
other than zero, the recorded data is used to determine
if the task reached the progress point where "detail
inserted" was completed before the error condition
occurred. If the task had reached this point, the program
branches to "C" in the flow diagram, as shown in FIG. 23.
However, assuming the task had not progressed to this
point in the program, the subroutine compares the user
file number with the highest reserve user file number.
If the reserve number is greater than the file number,
the reserve number is decremented by 1. If a description
block number has been allocated so that it is not equal
to zero, the block number is deallocated, the return
status is set to "incomplete", the "activity restart end"
call is executed, and the operation is returned to the
scheduler to be rescheduled. Thus, in effect, the error
recovery procedure has backed out of the task, leaving
the data base in the same status it was in before the
Create File task was initiated prior to the error
condition.
If the Create File task has proceeded to the point
where it had progressed to the "detail inserted" status,
the recovery program goes forward, as shown in FIG. 23,
and determines whether the program had progressed to the
point where an index had been inserted on the disk. If
it had not progressed to this point, the program creates
an entry in the index block on the disk for the user
file. The recovery program then checks to determine
if the task had progressed to the point where the catalog
name was activated. If not, the program sets the catalog
entry on the disk to an active state for this user file.
The return status is set to "normal" since the Create
File task is now a completed transaction.
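
The choice between backing out and going forward, for the
case where recovery data has been recorded, can be
sketched as a single function. The dictionaries standing
in for the recording area and the disk, the item names and
the progress labels other than "detail inserted" are
illustrative assumptions. The comparison of the reserve
number with the file number is taken literally from the
text, and the "activity restart end" call is omitted.

```python
PROGRESS = [None, "detail inserted", "index inserted", "catalog active"]


def reached(recorded, point) -> bool:
    """True when the recorded progress word is at or past `point`."""
    return PROGRESS.index(recorded) >= PROGRESS.index(point)


def recover_create_file(area: dict, disk: dict) -> str:
    """Return "normal" (completed) or "incomplete" (reschedule the task)."""
    file_no, block_no = area["user_file"], area["detail_block"]
    if file_no == 0:
        return "incomplete"                 # no file number was recorded
    if not reached(area["progress"], "detail inserted"):
        # Back out: undo the reservation and the description block.
        if disk["reserve"] > file_no:
            disk["reserve"] -= 1
        if block_no != 0:
            disk["blocks"].pop(block_no, None)
        return "incomplete"
    # Go forward: finish whichever steps had not been reached.
    if not reached(area["progress"], "index inserted"):
        disk["index"][file_no] = block_no
    if not reached(area["progress"], "catalog active"):
        disk["catalog"][file_no] = "active"
    return "normal"
```

In both outcomes the disk is left either as it was before
the task began or as it would be after the task finished,
never in between.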
From the above description, it will be seen that
each task includes procedures for recording progress data
in a specified area or areas in memory during execution.
The data stored in these areas are retained on the disk in
the event of an error that interrupts the execution of a
task. During a subsequent error operation, the data is
restored to the memory and used by the task to either
complete the task or back out of the task in a way that
does not corrupt the data base. This allows automatic
system recovery from software errors or power loss.
Recovery data is recorded in memory and then discarded
to allow new data to be recorded as task execution
progresses without an error condition occurring.

Administrative Status

Title	Date
Forecasted Issue Date	1986-10-21
(22) Filed	1984-03-13
(45) Issued	1986-10-21
Expired	2004-03-13

Abandonment History

There is no abandonment history.

Payment History

Fee Type	Anniversary Year	Due Date	Amount Paid	Paid Date
Application Fee			$0.00	1984-03-13

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
BURROUGHS CORPORATION

Past Owners on Record
None

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Drawings	1993-07-15	18	474
Claims	1993-07-15	3	141
Abstract	1993-07-15	1	25
Cover Page	1993-07-15	1	18
Description	1993-07-15	24	1,151
