Note: Descriptions are shown in the official language in which they were submitted.
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to a computer system in which two or
more computer modules can be coupled to a system bus, each of the modules
including an individual computer, a coupling memory and a working memory, and
in which the system bus comprises a control and address bus and a data bus,
and more particularly to such a system in which access can be gained to a
coupling memory either from the system bus or from an individual computer by
transfer techniques and in which only the individual computer has access to
its working memory and the system bus can be coupled to a control computer.
Description of the Prior Art
A computer system of the type briefly described above is known in
the art. This prior system operates in a three-phase operation. The first
phase consists of a control phase during which only the control computer is
operative, carries out its program and informs the individual computers of
the function which they must carry out during the following phase. The
second phase consists of an autonomous phase during which the individual
computers carry out their assigned functions simultaneously and independently
of one another without being connected to the control computer or to its
memory, and then report the execution of their function by transmitting a
"STOP" signal to the control computer. The third phase consists of a data
exchange phase which starts when the control computer has received a "STOP"
signal from all of the individual computers or from a selection of
individual computers established by the circuit, and during which, under the
control of the control computer, the data exchange is carried out between
the memories of the individual computers, and possibly the control computer.
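The three-phase operation described above can be illustrated with a minimal sketch. It is not taken from the patent: the function name run_cycle, its parameters and the use of Python dictionaries in place of coupling memories are assumptions made purely for illustration, and the modules run one after another here rather than simultaneously.

```python
def run_cycle(assignments, coupling, exchange):
    """Run one control / autonomous / data-exchange cycle (illustrative only).

    assignments: module id -> function chosen by the control computer
    coupling:    module id -> input data held in that module's coupling memory
    exchange:    callable that redistributes the results between modules
    """
    # Control phase: the control computer tells each module what to compute.
    tasks = dict(assignments)

    # Autonomous phase: modules work independently (sequentially in this
    # sketch) and each reports STOP when it has finished.
    results, stopped = {}, set()
    for module_id, task in tasks.items():
        results[module_id] = task(coupling[module_id])
        stopped.add(module_id)

    # Data exchange phase: begins only once every expected STOP has arrived.
    if stopped != set(tasks):
        raise RuntimeError("autonomous phase not finished")
    return exchange(results)
```

The value returned by the exchange callable stands for the new contents of the coupling memories at the start of the next cycle.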
For specific fields of use of data processing systems, for example
process control monitoring in nuclear power stations or navigation
systems for flying bodies, computer systems having a high degree of
reliability are required.
The reliability of data processing systems can be increased by
redundancy in construction, for example by a multiple provision of critical
components such as a central unit with a working memory, in which, in the
case of differing results, the result emitted by the majority of components
is used, or else by redundancy in the organization, for example by means of
redundant, fault-correcting codes. A fundamental requirement of the
organization is the ability to continue computation without a time loss, or
with only a small time loss, when faults occur. It is not sufficient to isolate
and replace faulty components and then to reinitiate the function being
processed from the beginning. Even where this is possible, the resulting time
loss would generally be incompatible with the requirements of real time problems.
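The majority principle mentioned above for redundantly provided components can be sketched as follows. The function name majority_vote and the requirement of a strict majority are assumptions for illustration; the text does not specify how ties are to be handled.

```python
from collections import Counter


def majority_vote(results):
    """Return the value emitted by a strict majority of redundant units.

    Returns None when the list is empty or when no value is produced by
    more than half of the units (an assumed tie-handling rule).
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None
```

With three redundant units, a single deviating result is outvoted, whereas three mutually different results yield no decision.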
SUMMARY OF THE INVENTION
The object of the present invention is to provide a computer system
which facilitates real time operation in spite of the breakdown of individual
components.
This object is achieved by means of a computer system of the type
briefly described above in that a safeguarding memory, to which the control
computer has access via the system bus, and a further memory, to which the
control computer also has access, are provided.
A high degree of reliability can be achieved with this computer
system if it is operated in such a manner that the control computer, the
further memory and a part of the existing modules are used to process the user
program. A monitoring phase is interposed at regular intervals in which all
of the individual computers are checked in respect of functioning capacity by
means of test programs stored in the working memories of the modules and
defective modules are determined and indicated. In the event that no defective
modules are recognized, the intermediate results calculated at that time are
stored in the safeguarding memory and further processing of the user program
is continued in normal fashion. In the event that one or more
defective modules are recognized, such modules are replaced by certain of the
other modules which are not used for processing the user program, for which
purpose the individual computer function of the module which is to be
replaced is loaded from the further memory, which stores the entire user
program, into each replacing module. Then, further processing is continued with the
last-safeguarded intermediate results stored in the safeguarding memory.
Advantageously, the computer system processes the user program in
a three-phase cycle.
Advantageously, the computer system is operated in such a manner
that, after as few phase cycles as possible, a monitoring phase is additionally
inserted between the autonomous phase and the next data exchange phase.
For triggering of the monitoring phases, the computer system is
advantageously provided with a pulse generator which is coupled to the control
computer and which triggers the monitoring phases with the period of its pulse
train.
In order to exchange a defective module for an intact module, it
is expedient for each module to be provided with a fixed module number and
a module number which can be modified by the control computer for character-
ization purposes. The exchange process is then expediently carried out in
that the modifiable module numbers of the defective modules are exchanged
with those of intact modules, and their fixed module numbers are used for
addressing purposes.
Advantageously, a computer system constructed in accordance with
the present invention is provided with a time monitoring device which is
coupled to the computer system and which indicates an impermissibly long
autonomous phase and immediately introduces an additional monitoring phase.
The computer system can advantageously be designed in such a manner
that each module possesses a parity production and checking unit which con-
stantly monitors the module and, on the recognition of a defect, reports this
defect to the control computer by means of a parity fault message and thus
immediately triggers a monitoring phase.
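The role of a parity production and checking unit can be illustrated with a small sketch. The text does not state which parity scheme is used, so a single even-parity bit per stored word is assumed here, and the names parity_bit, store and check are hypothetical.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of set bits."""
    return bin(word).count("1") % 2


def store(word: int) -> tuple:
    """Parity production: keep the word together with its parity bit."""
    return (word, parity_bit(word))


def check(stored: tuple) -> bool:
    """Parity checking: True if the stored word still matches its parity bit."""
    word, bit = stored
    return parity_bit(word) == bit
```

A failed check corresponds to the parity fault message that the module reports to the control computer. A single parity bit detects any odd number of flipped bits but not an even number.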
Thus, in accordance with one broad aspect of the invention, there
is provided a computer system comprising: a control computer; a system bus
system, including a control and address bus and a data bus, connected to said
control computer; a plurality of computer modules connected to said system
bus, each including an individual computer, a coupling memory and a working
memory, access to said coupling memory being had from said system bus and
from said individual computer; an information safeguarding memory connected
to said system bus for storing intermediate computed results; a further
memory connected to said control computer for storing an entire user program,
said control computer operable to monitor the performance of said individual
computers and to substitute an operable individual computer along with said
safeguarding and further memories in response to faulty operation of an
individual computer, means operable to periodically interpose a monitoring
phase in the multi-phase operation of the system; means in the individual
computers, including test program means in said working memories, to check
the functioning capacity of the respective computers; means for determining
and signaling an intact or a defective module; means for storing the
intermediate computed results in response to fault-free detection; means for
causing said further memory to load the program of a defective module into a
substitute module in response to detection of a defective module; and means
for causing the safeguarding memory to provide the intermediate results to the
substitute module and continuing of the data processing originally undertaken.
In accordance with another broad aspect of the invention there is
provided a method of operating a computer system which has a control computer,
a bus system connected to the control computer and a plurality of modules
connected to the bus system each including a working memory storing test
programs, an individual computer and a coupling memory, a safeguarding memory
and a further memory storing an entire user program, comprising the steps of:
operating the system through a control phase in which the control computer
informs the individual computers of their functions, an autonomous phase in
which the individual computers carry out their functions, and a data exchange
phase in which data is exchanged between computers; operating the system at
regular intervals through a monitoring phase in which the individual test
programs are run and contemporaneously checking the operating capabilities of
each module, storing intermediate computed results in the safeguarding
memory; continuing normal processing when faults are not found; loading the
individual function of a defective module from the further memory into a
replacement module in response to detection of a faulty module; and continuing
processing with the replacement module and the intermediate results stored in
the safeguarding memory.
BRIEF DESCRIPTION OF THE DRAWING
Other objects, features and advantages of the invention, its
organization, construction and mode of operation will be best understood from
the following detailed description, taken in conjunction with the accompanying
drawing, on which there is a single figure which is a block diagram
illustration of an exemplary embodiment of a computer system which is
constructed and operates in accordance with the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT
Referring to the drawing, the exemplary embodiment illustrated
comprises a plurality of computer modules 11, 12, 13, 15, 16 and 18 which are
coupled to a system data line. Each module comprises a coupling memory KS,
an individual computer ER and a working memory AS. In each module, only the
individual computer has access to its working memory, whereas access can be
obtained to the coupling memory selectively from the individual computer or
from the system bus. For purposes of fault recognition, each module is pro-
vided with a parity production and checking unit and possesses its own out-
put a for the parity fault message. By way of characterization, each module
possesses a fixed module number and a module number which can be modified
from the control computer. Furthermore, a control computer STR is provided
which can be coupled to the system bus 1 and has access to a further memory
GS and access via the system bus, to a safeguarding memory SS. The further
memory GS preferably consists of a high-speed large-capacity memory, for
example a disc memory. All of the individual computers are preferably micro-
processors. The safeguarding memory SS is preferably identical in construc-
tion to the coupling memory of a module. Also provided are a pulse generator
T and a time monitoring device ZU which are both coupled to the control
computer STR. The pulse train period of the pulse generator regularly
triggers monitoring phases. All of the outputs a of the computer modules are
likewise connected to the control computer STR.
In the following the cooperation of all the described components
will be explained.
It has been assumed that the modules 11--15 are used to process the
user program, whereas the modules 16--18 are redundant modules. The computer
system which processes the user program comprises the modules 11--15, the
control computer STR and the further memory GS and can simultaneously process
as many sub-functions of the user program as computer modules 11--15 are
provided.
The computer system operates in the above-described three-phase cycle.
The computer state is established following each three-phase cycle by the
individual functions stored in the modules and by the exchanged results
which are primarily intermediate results.
Whereas the individual functions are fixed and can be called up,
for example from the further memory, the intermediate results must be safe-
guarded. Safeguarding is carried out, together with a check on the computer,
in the additionally interposed monitoring phases.
The duration between two monitoring phases is determined by the
period duration of the pulse generator T. The pulse generator transmits an
interrupt request to the control computer which inserts a monitoring phase
before the next data exchange phase.
The control computer starts test programs which are provided in all
the modules and which carry out a function check of the modules. Here, it
is necessary to use test programs which, in the case of fault-free modules,
do not permanently alter the memory contents. The fault messages are stored
in the coupling memory KS. The control computer now checks whether fault
messages have been received from modules entrusted with the processing of
a sub-function. If this is not the case, for the following data exchange
phase the safeguarding memory is coupled to the system bus in order to
simultaneously receive the intermediate results with the coupling memories
of the modules entrusted with the sub-functions. The further processing of
the user program is then continued without modification. If, however,
faults occur, the defective modules are replaced by intact, previously unused
modules.
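The decision taken in a monitoring phase can be sketched as follows. This is a simplified illustration and not the patented circuit: the name monitoring_phase, the run_test callback and the dictionaries standing in for the coupling memories and the safeguarding memory SS are assumptions. In the system described, the safeguarding memory receives the results during the following data exchange phase; the sketch copies them at once for brevity.

```python
def monitoring_phase(active, run_test, coupling, safeguard):
    """Check the active modules and checkpoint results if all are fault-free.

    active:    ids of the modules entrusted with sub-functions
    run_test:  callable(id) -> True if the module's test program passes
    coupling:  id -> intermediate result held in the coupling memory
    safeguard: dict standing in for the safeguarding memory (updated in place)
    Returns the list of defective module ids (empty if none).
    """
    defective = [m for m in active if not run_test(m)]
    if not defective:
        # Fault-free: safeguard the current intermediate results.
        safeguard.clear()
        safeguard.update({m: coupling[m] for m in active})
    # On a fault the checkpoint is left untouched so that processing can
    # resume from the last safeguarded state.
    return defective
```

The key design point is that the checkpoint is overwritten only after a fault-free check, so it always holds results produced by modules that were verified as intact.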
Replacement is carried out in the following steps: the module
numbers, modifiable by the control computer, of the free and defective
modules are exchanged and addressed during this procedure by way of the
fixed module numbers; then, the missing individual functions are reloaded
from the further memory which stores the entire user program. For the
duration of the following exchange phase, the safeguarding memory is coupled
to the system bus. In contrast to a fault-free situation in which the
intermediate results have been written into the safeguarding memory, it now forms
the source of safeguarded results. These safeguarded results are read from
the safeguarding memory and transferred into the coupling memories.
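The replacement steps above can be sketched as follows. The class Module, the function replace_module and the idea of keying both the user program and the safeguarded results by the modifiable module number are assumptions for illustration only; the text does not specify how the memories are indexed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Module:
    fixed_no: int                   # fixed number, used for addressing
    logical_no: int                 # number modifiable by the control computer
    function: Optional[str] = None  # individual function currently loaded
    coupling: Any = None            # contents of the coupling memory


def replace_module(modules, defective_fixed, spare_fixed, program, safeguard):
    """Swap a defective module for a spare (illustrative sketch).

    modules:   fixed module number -> Module
    program:   logical number -> individual function (further memory)
    safeguard: logical number -> last safeguarded intermediate result
    """
    bad, spare = modules[defective_fixed], modules[spare_fixed]
    # Step 1: exchange the modifiable numbers, addressing by fixed numbers.
    bad.logical_no, spare.logical_no = spare.logical_no, bad.logical_no
    # Step 2: reload the missing individual function from the further memory.
    spare.function = program[spare.logical_no]
    # Step 3: restore the safeguarded result into the coupling memory.
    spare.coupling = safeguard[spare.logical_no]
    bad.function = None
```

After the exchange, the rest of the system can continue to refer to the same modifiable number, which now designates the intact module.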
This fulfills the conditions for the restarting of the system.
The starting point is the control phase which follows the last phase cycle
with a fault-free monitoring phase.
In addition to initiation by the pulse generator T, monitoring
phases can also be triggered by the time monitoring device ZU which indicates
an impermissibly long autonomous phase or by a parity fault message from one
of the modules which appears at the output a. In these situations, the
modules are checked immediately, and not only after the conclusion of the
autonomous phase.
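The three sources that can trigger a monitoring phase can be summarized in a short sketch. The function name monitoring_trigger, its parameters, the abstract time units and the priority order among simultaneous triggers are all assumptions; the text only names the pulse generator T, the time monitoring device ZU and the parity fault message as triggers.

```python
def monitoring_trigger(now, last_monitor, period, phase_start,
                       max_autonomous, parity_fault):
    """Decide whether a monitoring phase is due.

    Returns (source, immediate), or None if no trigger is active.
    immediate is True when the check should interrupt the autonomous phase
    rather than wait for its conclusion.
    """
    if parity_fault:
        # Parity fault message at a module's output a.
        return ("parity", True)
    if now - phase_start > max_autonomous:
        # Time monitoring device ZU: impermissibly long autonomous phase.
        return ("timeout", True)
    if now - last_monitor >= period:
        # Pulse generator T: monitoring phase before the next data exchange.
        return ("pulse", False)
    return None
```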
Although I have described my invention by reference to a particular
illustrative embodiment thereof, many changes and modifications of the inven-
tion may become apparent to those skilled in the art without departing from
the spirit and scope of the invention. I therefore intend to include within
the patent warranted hereon all such changes and modifications as may reason-
ably and properly be included within the scope of my contribution to the art.