Note: Descriptions are shown in the official language in which they were submitted.
~1~7~3Z~
This application is a divisional of copending
Canadian Patent Application Serial No. ~80,505 filed
June 14, 1977 in the name of Tandem Computers Incorporated.
This invention relates to a multiprocessor computer
system in which interconnected processor modules
provide multiprocessing (parallel processing in separate
processor modules) and multiprogramming (interleaved
processing in one processor module).
This invention relates particularly to a system
which can support high transaction rates to large on-line
data bases and in which no single component failure can
stop or contaminate the operation of the system.
There are many applications which require on-
line processing of large volumes of data at high trans-
action rates. For example, such processing is required in
retail applications for automated point of sale, inventory
and credit transactions and in financial institutions for
automated funds transfer and credit transactions.
In computing applications of this kind it is
important, and often critical, that the data processing
not be interrupted. A failure of an on-line computer
system can shut down a portion of the related business and
can cause considerable loss of data and money.
Thus, an on-line system of this kind must provide
not only sufficient computing power to permit multiple
computations to be done simultaneously, but it must also
provide a mode of operation which permits data processing
to be continued without interruption in the event some com-
ponent of the system fails.
The system should operate either in a fail-safe
mode (in which no loss of throughput occurs as a result of
failure) or in a fail-soft mode (in which some slowdown
occurs but full processing capabilities are maintained)
in the event of a failure.
Furthermore, the system should also operate in
a way such that a failure of a single component cannot
contaminate the operation of the system. The system should
provide fault-tolerant computing. For fault-tolerant
computing all errors and failures in the system should either
be corrected automatically, or if the failure or error cannot
be corrected automatically, it should be detected, or if it
cannot be detected, it should be contained and should not
be permitted to contaminate the rest of the system.
Since a single processor module can fail, it is
obvious that a system which will operate without interruption
in an on-line application must have more than one processor
module.
Systems which have more than one processor module
can therefore meet one of the necessary conditions for
noninterruptible operation. However, the use of more than one
processor module in a system does not by itself provide all
the sufficient conditions for maintaining the required
processing capabilities in the event of component failure,
as will become more apparent from the description to follow.
Computing systems for on-line, high volume, transaction
oriented, computing applications which must operate
without interruption therefore require multiprocessors as a
starting point. But the use of multiprocessors does not
guarantee that all of the sufficient conditions will be met,
and fulfilling the additional sufficient conditions for on-line
systems of this kind has presented a number of problems
in the prior art.
The prior art approach to uninterrupted data
processing has proceeded generally along two lines -- either
adapting two or more large, monolithic, general purpose
computers for joint operation or interconnecting a plurality
of minicomputers to provide multiprocessing capabilities.
In the first case, adapting two large monolithic
general purpose computers for joint operation, one conventional
prior art approach has been to have the two computers
share a common memory. Now in this type of multiprocessing
system a failure in the shared memory can stop the entire
system. Shared memory also presents a number of other
problems including sequencing accesses to the common memory.
This system, while meeting some of the necessary conditions
for uninterruptible processing, does not meet all of the
sufficient conditions.
Furthermore, multiprocessing systems using large
general purpose computers are quite expensive because each
computer is constructed as a monolithic unit in which all
components (including the packaging, the cooling system,
etc.) must be duplicated each time another processor is
added to the system even though many of the duplicated
components are not required.
The other prior art approach of using a plurality of
minicomputers has (in common with the approach of using large
general purpose computers) suffered from the drawback of
having to adapt a communications link between computers
that were never originally constructed to provide such a
link. The required links were, as a result, usually made
through the input/output channel. Connections through the
input/output channel are necessarily slower than internal
transfers within the processor itself, and such interprocessor
links have therefore provided relatively slow interprocessor
communication.
Furthermore, the interprocessor connections
required special adapter cards that added substantially to
the cost of the overall system and that introduced the
possibility of single component failures which could stop
the system. Adding dual interprocessor links and adapter
cards to avoid problems of critical single component failures
increased the overall system cost even more substantially.
Providing dual links and adapter cards between
all processors generally became very cumbersome and quite
complex from the standpoint of operation.
Another problem of the prior art arose out of the
way in which connections were made to peripheral devices.
If a number of peripheral devices are connected to
a single input/output bus of one processor in a multiprocessor
system and that processor fails, then the peripheral devices
will be unavailable to the system even though the failed
processor is linked through an interprocessor connection to
another processor or processors in the system.
To avoid this problem, the prior art has provided an
input/output bus switch for interconnecting input/output busses
for continued access to peripheral devices when a processor
associated with the peripheral devices on a particular
input/output bus fails. The bus switches have been expensive and also
have presented the possibility of single component failure
which could down a substantial part of the overall system.
Providing software for the prior art multiprocessor
systems has also been a major problem.
Operating systems software for such multiprocessing
systems has tended to be nonexistent. Where software had
been developed for such multiprocessor systems, it quite
often was restricted to a small number of processors and
was not adapted for the inclusion of additional processors.
In many cases it was necessary either to modify the operating
system or to put some of the operating system functions into
the user's own program -- an expensive, time-consuming
operation.
The prior art lacked a satisfactory standard operating
system for linking processors. It also did not provide an
operating system for automatically accommodating additional
processors in a multiprocessing system constructed to
accommodate the modular addition of processors as increased
computing power was required.
A primary object of the present invention is to
construct a multiprocessor system for on-line,
transaction-oriented applications which overcomes the problems of the
prior art.
A basic objective of the present invention is to
insure that no single failure can stop the system or significantly
affect system operation. In this regard, the system of
the present invention is constructed so that there is no
single component that attaches to everything in the system,
either mechanically or electrically.
It is a closely related objective of the present
invention to guarantee that every error that happens can be
either corrected, detected or prevented from contaminating the system.
It is another important objective of the present
invention to provide a system architecture and basic mode
of operation which free the user from the need to get involved
with the system hardware and the protocol of interprocessor
communication. In the present invention every major component
is modularized so that any major component can be removed or
replaced without stopping the system. In addition, the
system can be expanded in place (either horizontally by the
addition of standard processor modules or in most cases
vertically by the addition of peripheral devices) without
system interruption or modification to hardware or software.
Summary of the Invention
The multiprocessor system of the present invention
comprises multiple, independent processor modules and data
paths.
In one specific embodiment of the present invention
16 separate processor modules are interconnected by an
interprocessor bus for multiprocessing and multiprogramming.
In this specific embodiment each processor module supports
up to 32 device controllers, and each device controller can
control up to eight peripheral devices.
Multiple, independent communication paths and ports
are provided between all major components of the system to
insure that it is always possible to communicate between
processor modules and between processor modules and peripheral
devices over at least two paths and also to insure that a
single failure will not stop system operation.
These multiple communication paths include multiple
interprocessor busses interconnecting each of the processor
modules, multiports in each device controller, and input/output
busses connecting each device controller for access by at
least two different processor modules.
Each processor module is a standard module and
includes as part of the module a central processing unit, a
main memory, an interprocessor control and an input/output
channel.
Each processor module has a pipelined microprocessor
operated by microinstructions included as a basic instruction
set in each processor module.
The basic instruction set in each
processor module recognizes the fact that there is an
interprocessor communications link; and when an additional
processor module is added to the system, the operating system
(a copy of which resides in each processor module) is informed
that a new resource is available for operation within the
existing operating system without the need to modify either
the system hardware or software.
To increase performance and to maintain very high
transaction rates each processor module includes a second
microprocessor which is dedicated to input/output operations.
A dual port access to the main memory by both the
central processing unit and the input/output channel permits
direct memory access for the input/output transfers to also
increase performance.
Each processor module is physically constructed
to fit on a minimum number of large printed circuit boards.
Using only a few boards for each processor module conserves
space for packaging and minimizes the length of the
interprocessor bus required to interconnect all of the processor
modules. A relatively short interprocessor bus minimizes the
deterioration of the signals on the interprocessor bus and
permits high speed of communication over the interprocessor
bus.
Each interprocessor bus is a high speed, synchronous
bus to minimize overhead in interprocessor communications and
to enable the system to achieve high throughput rates.
A separate bus controller monitors all transmissions
over the bus. The bus controller includes processor select
logic for determining the priority of data transfer between
any two processor modules over the interprocessor bus. The
bus controller also includes bus control state logic for
establishing a sender-receiver pair of processor modules
and a time frame for a transfer of information over the bus
between the sender-receiver pair.
Each bus controller includes a bus clock, and each
central processing unit of each processor module has its own
separate clock. There is no master clock system subject to
a single component failure which could stop the entire
multiprocessor system.
Each processor module includes, in the interprocessor
control of the processor module, a certain amount of circuitry
on the printed circuit boards which is dedicated to communications
over the interprocessor buses.
Each interprocessor control also includes fast
buffers (inqueue buffers and an outqueue buffer) which can
be emptied and filled by the central processing unit without
interfering with the interprocessor bus. This makes it
possible to sustain a higher data rate on the interprocessor
bus than could be sustained by any single pair of processors.
Several data transfers between pairs of processor modules
can be interleaved on an apparent simultaneous basis.
Because the interprocessor bus operates asynchronously
with each particular central processing unit, each inqueue
and outqueue buffer is clocked either by the processor module
or by the bus controller, but not by both simultaneously.
Each inqueue buffer and outqueue buffer therefore
has associated with it in the interprocessor control some
logic that operates in synchronism with the bus clock and
other logic that operates in synchronism with the central
processing unit clock. Logic interlocks qualify certain
transitions of the logic from one state to another state
to prevent loss of data in transfers between the asynchronous
interprocessor buses and processor module.
The logic is also arranged so that in the event
a processor module is powering down, there will be no transient
effect on the interprocessor buses because the processor module
is losing control. The powering down of the processor module
on an interprocessor bus will therefore not disrupt any other
interprocessor bus activity.
The bus controller and interprocessor control of
each processor module coact to perform all interprocessor
bus management in parallel with processing by the central
processing units so that there is no waste of processing
power. This bus management is performed with low
protocol overhead in that it takes very few interprocessor
bus cycles to establish a bus transfer -- what processor
module is sending and what processor module is receiving --
relative to the amount of information actually transmitted.
The processor select logic of the bus controller
includes an individual select line which extends from the
processor select logic to each processor module. The select
lines are used in three ways in the protocol of establishing
a sender-receiver pair of processor modules and a time
frame for transfer of information over the interprocessor
bus between the sender-receiver pair. The select lines are
used (1) in polling to determine which particular processor
module wants to send, (2) in receiving to inquire of a receiver
processor module whether the particular processor module wants to
receive, and (3) in combination with a send command to let the
sender processor module know the time frame for sending.
The receiver processor module is qualified to
receive incoming data unsolicited by the receiver processor
module and without a software instruction.
Blocks of data between a sender-receiver pair of
processor modules are transmitted over the interprocessor
bus in packets. At the end of each packet transfer the
interprocessor control of a receiver processor module logically
disconnects from the interprocessor bus to permit the bus
control state logic to establish another sequence of a
different sender-receiver pair of processor modules and a
time frame for making a packet transfer between the other
pair of sender-receiver processor modules. Thus, as noted
above, several data block transfers between different
sender-receiver pairs of processor modules can be interleaved
on the interprocessor bus on an apparently simultaneous basis
because of the faster clock rate of the interprocessor bus as
compared to the slower memory speed of the processor modules.
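The packet interleaving described above can be illustrated with a short sketch. The packet size of 4 words, the round-robin order in which pairs are served, and the name `interleave` are assumptions made for illustration only; the text states only that blocks are sent in packets and that the bus is released after each packet.

```python
from collections import deque

PACKET_WORDS = 4  # assumed packet size; not specified in the text above


def interleave(transfers):
    """Return the order in which packets cross the interprocessor bus.

    `transfers` maps a (sender, receiver) pair to the number of words
    in its data block. After each packet the pair releases the bus, so
    the next pending pair gets a turn (round robin is an assumption).
    """
    pending = deque(transfers.items())
    schedule = []
    while pending:
        pair, words_left = pending.popleft()
        sent = min(PACKET_WORDS, words_left)
        schedule.append((pair, sent))
        if words_left > sent:
            # Block not finished: queue the remainder behind other pairs.
            pending.append((pair, words_left - sent))
    return schedule


if __name__ == "__main__":
    for pair, words in interleave({(0, 1): 10, (2, 3): 6}):
        print(pair, words)
```

In this sketch neither block transfer has to finish before the other makes progress, which is the sense in which the transfers appear simultaneous.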
Each processor module memory includes a separate
buffer for each combination of a processor module and an
interprocessor bus.
Each memory also includes a bus receive table for
directing incoming data from an interprocessor bus to a
specified location in a related buffer in the memory of a
receiver processor module. Each bus receive table provides
a bus receive table entry which contains the address where the
incoming data is to be stored and the number of words expected
from the sender processor module. The bus receive table
entry is updated by firmware in the processor module after
the receipt of each packet and is effective with the firmware
either to provide a program interrupt when the entire data
block has been successfully received or to provide an interrupt
to the software program currently executing in the processor
module in response to the detection of an error in the course
of the transmission of the data over the interprocessor bus.
Producing a program interrupt only at the completion of the
data block transfer enables the transfer of data to be made
transparent to the software currently executing in the
processor module. The interrupt in response to the detection
of an error provides an integrity check on the transmission
of data.
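The role of a bus receive table entry can be sketched as follows. This is a minimal model under stated assumptions: the names `BusReceiveEntry` and `receive_packet`, the use of a Python list for memory, and the treatment of an over-long packet as an error are all hypothetical and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass
class BusReceiveEntry:
    address: int         # where the next incoming word is to be stored
    words_expected: int  # words still expected from the sender


def receive_packet(memory, entry, packet, error=False):
    """Store one packet and update the entry, as firmware might.

    Returns "error" if a transmission error was flagged (or, as an
    assumption, if more words arrive than expected), "complete" when
    the whole block has arrived, or None while the block is still in
    progress. Only "error" and "complete" would interrupt software.
    """
    if error or len(packet) > entry.words_expected:
        return "error"
    for word in packet:
        memory[entry.address] = word
        entry.address += 1
    entry.words_expected -= len(packet)
    return "complete" if entry.words_expected == 0 else None
```

Because intermediate packets return None in this model, the running program sees nothing until the final packet arrives or an error occurs.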
The input/output subsystem of the multiprocessor
system of the present invention is constructed to insure that
no single processor module failure can impair system operation.
In addition, the input/output subsystem is
constructed to handle very high transaction rates, to
maximize throughput, and to minimize interference with
programs running in the processor modules.
As noted above, each processor module includes a
microprocessor which is dedicated to input/output operations.
The input/output system is an interrupt driven
system and provides a program interrupt only upon completion
of the data transfer. This relieves the central processing
unit from being dedicated to the device while it is transferring
data.
Each input/output channel is block multiplexed to
handle several block transfers of data from several device
controllers on an apparent simultaneous basis. This is
accomplished by interleaving variable length bursts of data
in transfers between the input/output channel and stress
responsive buffers in the device controllers.
As noted above, each device controller has multiports,
and a separate input/output bus is connected to each port
so that each device controller is connected for access by
at least two different processor modules.
The ports of each device controller are constructed
so that each port is logically and physically independent
of each other port. No component part of one port is also a
component of another port so that no single component failure
in one port can affect the operation of another port.
Each device controller includes logic which insures
that only one port is selected for access at a time so that
transmitting erroneous data to one port can never contaminate
another port.
The input/output system of the present invention
interfaces the peripheral devices in a failsoft manner. There
are multiple paths to each particular device in case of a
failure on one path. And a failure of the device or a failure
of a processor module along one path does not affect the
operation of a processor module on another path to the device.
The input/output system of the present invention
is also constructed so that any type of device can be put
on the system, and the input/output system will still make
maximum usage of the input/output channel bandwidth.
The device controllers are buffered such that all
transfers between the device controllers and the input/output
channel occur at the maximum channel rate.
The device controller may transfer between itself
and a peripheral device in bytes, but the device controller
must pack and unpack data to transfer words between itself
and the input/output channel.
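Packing and unpacking of this kind can be sketched as below. The 16-bit word size, the high-byte-first order, and zero padding of an odd final byte are assumptions chosen for illustration; the passage above does not specify them, and the function names are hypothetical.

```python
def pack_bytes(data):
    """Pack a byte string into 16-bit words, high byte first.

    An odd-length input is padded with a zero byte (an assumption).
    """
    if len(data) % 2:
        data += b"\x00"
    return [(data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)]


def unpack_words(words, length):
    """Unpack 16-bit words back into `length` bytes, high byte first."""
    out = bytearray()
    for word in words:
        out.append((word >> 8) & 0xFF)
        out.append(word & 0xFF)
    return bytes(out[:length])
```

The `length` argument lets the receiving side drop any padding byte that was added on the way in.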
Because the buffers are located in the device
controllers rather than in the input/output channel, the
present invention limits the buffering to only the buffering
required by a particular system configuration. The present
invention does not require a separate buffer for each peripheral
device in order to prevent overruns, as would be required
if the buffers were located in the input/output channel
rather than in the device controllers as had often been the
practice in the prior art.
As noted above, each buffer is a stress responsive
buffer and this provides two advantages.
First of all, each buffer can be constructed to
have an overall depth which is related to the type and number
of devices to be serviced. Each device controller can therefore
have a buffer size which is related to the kind of devices
to be controlled.
Secondly, the stress responsive buffer construction
and mode of operation of the present invention allows the
buffers to cooperate without communicating with each other.
This in turn permits optimum efficient use of the bandwidth
of the input/output channel.
The stress placed on a particular buffer is determined
by the degree of the full or empty condition of the buffer
in combination with the direction of the transfer with respect
to the processor module. Stress increases as the peripheral
device accesses the buffer, and stress decreases as the
input/output channel accesses the buffer.
Each buffer has a depth which is the sum of a
threshold depth and a holdoff depth. The threshold depth
is related to the time required to service higher priority
device controllers, and the holdoff depth is related to the
time required to service lower priority device controllers
connected to the same input/output channel.
The stress responsive buffer includes control logic for
keeping track of the stress placed on the buffer. The con-
trol logic is effective to make reconnect requests to the
input/output channel as the stress passes through a threshold
depth of the buffer.
Each buffer having a reconnect request pending is indivi-
dually connected to the input/output channel in accordance with
a polling scheme which resolves priority among all the de-
vice controllers having a reconnect request pending.
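Resolving priority among pending reconnect requests can be sketched as a simple scan in priority order. The function name `poll`, the controller names, and the representation of priority as an ordered list are assumptions for illustration; the actual polling mechanism is not detailed in this passage.

```python
def poll(pending, priority_order):
    """Return the highest-priority controller with a request pending.

    `priority_order` lists controller identifiers from highest to
    lowest priority; `pending` is the set of controllers currently
    asserting a reconnect request. Returns None if none is pending.
    """
    for controller in priority_order:
        if controller in pending:
            return controller
    return None
```

For example, with the hypothetical order `["tape", "disk", "terminal"]`, a call with `{"disk", "terminal"}` pending selects `"disk"`.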
When the device controller is connected to the input/
output channel, the data is transferred between the buffer
and the input/output channel in a burst at or near memory
speed.
Thus, because the buffers transmit data to and from the
peripheral devices at the relatively slow device speed and
can transmit the data to and from the processor modules at
or near memory speed in burst transfers, and in response to
buffer stress, the burst transfers can be time division
multiplexed so that individual bursts from several device
controllers can be interleaved to optimize efficient use of
the bandwidth of the input/output channel and also to permit
several block transfers from different device controllers
to be made on an apparent simultaneous basis.
The present invention comprises a datapath system of
the kind in which data is transferred between a computer
memory and peripheral devices through device controllers,
each of which includes a buffer. The device controllers
are connected to a channel which controls transfer of data
between the computer memory and the buffer in each of the
device controllers. The channel has substantially greater
data transfer bandwidth than the bandwidth of data transfer
between the peripheral devices and the device controllers.
Data can be transferred continuously between the buffer
in a device controller and an associated peripheral device.
Data transfer between several pairs of device controllers
and associated peripheral devices may occur concurrently.
The channel periodically "reconnects" to the buffer in each
device controller and transfers data between the computer
memory and the buffer, in accordance with the transfer of
data between that buffer and its associated peripheral
device.
The problem solved by this invention is the means by which
the channel determines which buffer should be reconnected,
and when it should be reconnected. The means used measures
a quantity called "stress" associated with the fullness
of each buffer; presents, for each device controller, a
signal called "reconnect request" to the channel when
the stress meets or exceeds a threshold level; and provides the
channel with a means of determining the relative priority of
simultaneous requests so it can respond to the highest
priority request.
Device controllers of several types may be attached to
the channel in many combinations.
Each device controller has a control for keeping track
of the degree of stress of the buffer in that device con-
troller, and the control is effective to signal a recon-
nect request to the channel when the degree of stress
increases beyond a threshold level. A means of varying
the threshold level is provided so that the threshold level
can be changed depending on the number and combination of
device controllers attached to a particular channel.
Stress is a term that may be qualitatively defined as
a measure of how soon a buffer will need to be reconnected
to the channel to transfer data from the buffer to the
channel to prevent overfilling of the buffer (if data is
being transferred from a peripheral device through the
buffer and channel to the computer memory) or to the buffer
from the channel to prevent "over emptying" of the buffer
(if data is being transferred from the computer memory
through the channel and buffer to a peripheral device).
Data transfers to or from peripheral devices are generally
measured in bytes per time unit; therefore, although stress
is associated with level of fullness of a buffer, it can
also be thought of as being a measure of time dependency.
Stress is related to buffer fullness as follows: on transfer
of data from a peripheral device through a buffer and
channel to a computer memory, stress is minimum when the
buffer is empty, maximum when it is full, and increases
as each unit of data is transferred from the peripheral
device to the buffer. Stress decreases in this case as each
unit of data is transferred from the buffer through the
channel to the computer memory. On transfers of data
from a computer memory through a channel and buffer to a
peripheral device, stress is minimum when the buffer is
full and maximum when the buffer is empty. In this case
stress increases for each unit of data transferred from
the buffer to the peripheral device, and decreases for
each unit of data transferred from the channel to the
buffer.
Stress is therefore a measure of the level of fullness
of a buffer due to transfer of data between the channel
and the buffer, and between the buffer and a peripheral
device.
Threshold level is also related to buffer fullness.
Threshold level is the value to which buffer stress will
be allowed to rise before interaction between the buffer
and the channel is initiated.
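The relationship between fullness, transfer direction, stress, and threshold can be made concrete with a toy model. The class name `StressBuffer`, its attributes, and the choice to count stress in data units are assumptions for illustration, not the circuitry of the device controller.

```python
class StressBuffer:
    """Toy model of one device controller's buffer (illustrative only)."""

    def __init__(self, depth, threshold, to_memory, fill=0):
        self.depth = depth          # capacity in data units
        self.threshold = threshold  # stress level that raises a request
        self.to_memory = to_memory  # True: peripheral -> buffer -> channel
        self.fill = fill            # data units currently held

    @property
    def stress(self):
        # Toward memory: stress grows as the buffer fills.
        # Toward the peripheral: stress grows as the buffer empties.
        return self.fill if self.to_memory else self.depth - self.fill

    @property
    def reconnect_request(self):
        # Requested when stress meets or exceeds the threshold level.
        return self.stress >= self.threshold

    def _move(self, delta):
        if not 0 <= self.fill + delta <= self.depth:
            raise ValueError("buffer overrun or underrun")
        self.fill += delta

    def device_transfer(self, units=1):
        """Peripheral side of the buffer: always raises stress."""
        self._move(units if self.to_memory else -units)

    def channel_transfer(self, units=1):
        """Channel side of the buffer: always lowers stress."""
        self._move(-units if self.to_memory else units)
```

In this model an empty buffer feeding a peripheral starts at maximum stress, so it requests reconnection at once, while an empty buffer receiving from a peripheral starts at minimum stress.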
The threshold level of each buffer can be set as needed
when device controllers are added to the data path system.
The setting must satisfy two criteria:
First, as data is transferred between a buffer and
an associated peripheral device, buffer fullness changes,
increasing stress level. When the stress level meets
the threshold level, the device controller requests recon-
nection to the channel, but reconnection may not occur
immediately because the channel may be reconnected to
another device controller. Furthermore, one or more
higher priority device controllers may request reconnec-
tion to the channel. Therefore, the first criterion to
be satisfied in setting the threshold level for the buffer
in a particular device controller is that the remaining
space in the buffer must be sufficient to allow the buf-
fer to transfer data to or from the associated peripheral
device, at the rate demanded by the peripheral device,
for a period of time long enough to allow the channel
to reconnect to one device controller lower in priority,
and all device controllers higher in priority than that
particular device controller.
Second, a reconnect request by a particular device
controller must be acted on by the channel even though there
may be device controllers higher in priority than that
particular device controller which make reconnect requests
of the channel at the same general time and which could
thereby continually prevent the reconnect request of the
particular controller from being responded to by the channel.
The threshold level setting must therefore satisfy the
following criterion: the threshold
setting of a buffer in a particular device controller must
be set such that the space in the buffer between the point
at which the stress level is minimum and the point at
which the stress level is equal to the threshold level
is sufficient to allow the buffer to transfer data to
or from an associated peripheral device, at the rate
demanded by the peripheral device, for a period of time long
enough for the channel to reconnect all device controllers
lower in priority than the particular device controller.
The effect is that, after a particular device has reconnected
to the channel, it will not request reconnection again
until there has been time for all lower priority device
controllers to be reconnected to the channel.
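The two criteria can be turned into a rough sizing calculation. This is a sketch under several assumptions: each controller is characterized by a single worst-case reconnect service time, the "one device controller lower in priority" is taken to be the slowest one, and the names `buffer_depths`, `rate`, and `service_times` are hypothetical. The returned pair uses the terms holdoff depth (space below the threshold) and threshold depth (space remaining above it).

```python
import math


def buffer_depths(rate, service_times, index):
    """Estimate (holdoff_depth, threshold_depth) for one controller.

    rate          -- data units per time unit demanded by the peripheral
    service_times -- worst-case reconnect time of every controller on
                     the channel, ordered from highest priority down
    index         -- position of the controller being sized
    """
    higher = service_times[:index]
    lower = service_times[index + 1:]
    # Second criterion: space below the threshold covers the time to
    # reconnect all lower priority controllers.
    holdoff = math.ceil(rate * sum(lower))
    # First criterion: space above the threshold covers one lower
    # priority controller (slowest assumed) plus all higher ones.
    threshold = math.ceil(rate * (sum(higher) + max(lower, default=0)))
    return holdoff, threshold
```

With a rate of 2 units per time unit and service times `[3, 4, 5, 1]`, the controller at index 1 would need a holdoff depth of 12 and a threshold depth of 16, for a total depth of 28 units.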
Multiprocessor system apparatus and methods which
incorporate the structure and techniques described above
and which are effective to function as described above
constitute further, specific objects of this invention.
According to a broad aspect of the present invention,
there is provided a stress responsive buffer for a burst
multiplexed input/output system of the kind in which
variable-length bursts of data are transferred between an
input/output channel and a device controller and are time
multiplexed with bursts from other device controllers,
said stress responsive buffer comprising,
buffer memory means for receiving and storing
data from the input/output channel and from peripheral
devices attached to the device controller,
buffer control logic means comprising logic means for keeping
track of the degree of stress placed on the buffer and effective to
make reconnect requests and disconnect requests to the input/output
channel as the stress passes through certain values,
stress varying means for varying the value of stress at which
the device controller makes a reconnect request to the input/output
channel.
According to another aspect of the present invention, there
is provided a method for buffering data in a burst multiplexed
input/output system in which variable length bursts of data are transferred
between an input/output channel and a device controller, and also are
time multiplexed with bursts from other device controllers, comprising
the steps of storing in a buffer memory data received from the
input/output channel or from peripheral devices associated with a device
controller, monitoring the level of fullness of the buffer memory, relating
the level of fullness to a predetermined threshold, requesting, in
response to the relationship between the level of fullness and the threshold,
connection or disconnection from the input/output channel, and altering
the threshold at which the device controller makes a request for
connection or disconnection to ensure adequate storage capacity of the buffer.
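The steps of the method just recited (store, monitor fullness, compare with a threshold, request connection or disconnection, alter the threshold) can be sketched in software. The sketch below is an illustration under stated assumptions, not the disclosed hardware: the class and method names are hypothetical, only the device-to-channel direction is modelled, and a disconnect is assumed to be requested when the buffer has been drained.

```python
from collections import deque


class StressBuffer:
    """Toy model of a threshold-driven buffer (all names are hypothetical)."""

    def __init__(self, capacity, threshold):
        self.capacity = capacity
        self.threshold = threshold
        self.words = deque()
        self.connected = False
        self.requests = []  # log of requests made to the channel

    def set_threshold(self, threshold):
        """Alter the fullness level at which a reconnect is requested."""
        if not 0 < threshold <= self.capacity:
            raise ValueError("threshold must lie within the buffer capacity")
        self.threshold = threshold

    def device_write(self, word):
        """Store one word arriving from the peripheral device."""
        if len(self.words) >= self.capacity:
            raise OverflowError("buffer overrun")
        self.words.append(word)
        self._check()

    def channel_read(self):
        """Hand one word to the channel during a burst."""
        word = self.words.popleft()
        self._check()
        return word

    def _check(self):
        fullness = len(self.words)
        if not self.connected and fullness >= self.threshold:
            self.connected = True
            self.requests.append("reconnect")
        elif self.connected and fullness == 0:
            # Assumption: disconnect once the buffer is empty.
            self.connected = False
            self.requests.append("disconnect")


buf = StressBuffer(capacity=8, threshold=3)
for w in (10, 11, 12):
    buf.device_write(w)
drained = [buf.channel_read() for _ in range(3)]
print(buf.requests)
```

Raising the threshold with `set_threshold` delays the next reconnect request, which is the software analogue of varying the stress value at which the device controller asks for the channel.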
The invention will now be described in greater detail with
reference to the accompanying drawings, in which:
Figure 1 is an isometric, block diagram view of
a multiprocessor system constructed in accordance with one
embodiment of the present invention. Figure 1 shows several
processor modules 33 connected by two interprocessor buses
35 (an X bus and a Y bus) with each bus controlled by a
bus controller 37. Figure 1 also shows several dual-port
device controllers 41 with each device controller connected
to the input/output (I/O) buses 39 of two processor
modules;
Figure 2 is a block diagram view showing details
of the connections of the X bus controller and the Y bus
controller to the individual processor modules. Figure 2
shows, in diagrammatic form, the connections between each
bus controller and the interprocessor control 55 of an
individual processor module;
Figure 3 is a detailed diagrammatic view of the
logic of one of the bus controllers 37 shown in Figure
2;
Figure 4 is a detailed diagrammatic view of the
logic for the shared output buffer and control 67 in the
interprocessor control 55 of a processor module as illus-
trated in Figure 2;
Figure 5 is a view like Figure 4 but showing the
logic for an inqueue buffer and control 65 of the inter-
processor control 55 for a processor module;
Fig. 6 is a state diagram of the logic 81 for
a bus controller 37 and illustrates how the logic
responds to the protocol lines going into the bus controller
and generates the protocol lines going out of the bus
controller to the processor modules;
Fig. 7 is a state diagram like Fig. 6 but
showing the logic 73 and 75 for the shared outqueue
buffer and control 67 of Fig. 4;
Fig. 8 is a state diagram like Figs. 6 and 7
but showing the logic 93 and 101 for the inqueue buffer
and control 65 of Fig. 5;
Fig. 9 is a diagrammatic view showing the time
sequence for the transmission of a given packet between
a sender processor module and a receiver processor
module;
Fig. 10 is a logic diagram of the bus empty
state logic section 75 and the processor fill state
logic section 73 of the outqueue buffer and control
67 shown in Fig. 4;
Fig. 11 is a listing of logic equations for
the logic diagram shown in Fig. 10;
Fig. 12 is a block diagram of the input/output
(I/O) system of the multiprocessor system shown in
Fig. 1;
Fig. 13 is a block diagram of the input/output
(I/O) channel 109 of a processor module. Fig. 13 shows
the major components of the I/O channel and the data
path relating those component parts;
Fig. 14 is a detailed view showing the
individual lines in the I/O bus 39 of Fig. 1;
Fig. 15 is an I/O channel protocol diagram
showing the state changes of the T bus 153 for an execute
input/output (EIO) caused by the microprogram 115 in the
CPU 105. The sequence illustrated is initiated by the
CPU 105 and is transmitted through the I/O channel 109
of the processor module 33 and on the T bus 153 to a
device controller 41 as shown in Fig. 1;
Fig. 16 is an I/O channel protocol diagram
showing the state changes of the T bus 153 for a reconnect
and data transfer sequence initiated by the I/O channel
microprogram 121 in response to a request signal from a
device controller 41;
Fig. 17 is an I/O channel protocol diagram
showing the state changes of the T bus 153 for an
interrogate I/O (IIO) instruction or an interrogate high
priority I/O (HIIO) instruction initiated by the CPU
microprogram 115. The sequence illustrated is
transmitted over the T bus 153 to a device controller 41;
Fig. 18 is a table identifying the functions
referred to by the mnemonics in Figs. 15 through 17;
Fig. 19 is a block diagram showing the
general structure of the ports 43 and a device controller
41 as illustrated in Fig. 1;
Fig. 20 is a block diagram of a port 43 shown
in Fig. 19. This Fig. 20 shows primarily the data paths
within a port 43;
Fig. 21 is a block diagram showing the data
path details of the interface common logic 181 of the
device controller 41 shown in Fig. 19;
Fig. 22 is a block diagram showing the component
parts of a data buffer 189 in the control part of a
device controller 41 as illustrated in Fig. 19;
Fig. 23 is a graph illustrating the operation
of the data buffer 189 illustrated in Figs. 22 and 19;
Fig. 24 is a timing diagram illustrating the
relationship of SERVICE OUT (SVO) from the channel 109
to the loading of data into the port data register 213
(Fig. 21) and illustrates how the parity check is started
before data is loaded into the register and is continued
until after the data has been fully loaded into the register;
Fig. 25 is a schematic view showing details
of the power on circuit (PON) shown in Figs. 19 and 21;
Fig. 26 is a logic diagram of the buffer
control logic 243 of the data buffer 189 (shown in
Fig. 22) of a device controller 41. Fig. 26 shows
how the buffer control logic 243 controls the
handshakes on the data bus and controls the input and output
pointers;
Fig. 27 is a listing of the logic equations
for the select register 173 shown in Fig. 20. These
logic equations are implemented by the port control
logic 191 shown in Fig. 20;
Fig. 28 is a timing diagram showing the
operation of the two line handshake between the I/O
channel 109 and the ports 43;
Fig. 29 is a logic diagram showing the logic
for the general case of the handshake shown in Fig. 28.
The logic shown in Fig. 29 is part of the T bus machine
143 of the input/output channel 109 shown in Fig. 13;
Fig. 30 is a block diagram of a power
distribution system. Fig. 30 shows how a plurality of
independent and separate power supplies 303 are
distributed and associated with the dual port device
controllers 41 for insuring that each device controller
has both a primary and an alternate power supply;
Fig. 31 is an enlarged, detailed view of
the switching arrangement for switching between a
primary power supply and an alternate supply for a
device controller. The switching structure shown in
Fig. 31 permits both automatic switching in the event
of a failure of the primary power supply and manual
switching in three different modes--off, auto and
alternate;
Fig. 32 is a block diagram showing details
of one of the separate and independent power supplies
303 illustrated in Fig. 30;
Fig. 33 is a block diagram view showing
details of the vertical buses and the horizontal buses
for supplying power from the separate power supplies
303 shown in Fig. 30 to the individual device controllers
41. The particular bus arrangement shown in Fig. 33
permits easy selection of any two of the individual
power supplies as the primary and the alternate power
supply for a particular device controller;
Fig. 34 is a block diagram of the memory
system and shows details of the memory 107 of a processor
module 33 shown in Fig. 1;
Fig. 35 is a block diagram showing details
of the map section 407 of the memory 107 shown in
Fig. 34;
Fig. 36 is a block diagram showing the
organization of logical memory into four logical address
areas and four separate map sections corresponding to
the four logical address areas. Fig. 36 also shows
details of the bits and fields in a single map entry
of a map section;
Fig. 37 is a block diagram showing details
of one of the memory modules 403 illustrated in Fig. 34.
The memory module 403 shown in Fig. 37 is a semiconductor
memory module;
Fig. 38 is a diagram of a check bit generator
used in the semiconductor memory module 403 shown in
Fig. 37. Fig. 38 also lists logic equations for two of
the eight bit parity trees used in the check bit register;
Fig. 39 is a diagram of a check bit comparator
used in the semiconductor memory module 403 shown in
Fig. 37. Fig. 39 includes the logic equation for the nine
bit parity tree for syndrome bit zero;
Fig. 40 is a diagram of a syndrome decoder
used in the semiconductor memory module 403 shown in
Fig. 37. Fig. 40 also lists the logic equations for
the operation of the logic section 511 of the syndrome
decoder;
Fig. 41 is a logic diagram of a bit complementer
used in the semiconductor memory module 403 shown in
Fig. 37; and
Fig. 42 shows the various states of a two
processor system running an application program which is
required to be running continuously. The diagrams
illustrate the two processors successively failing and
being repaired and the application program changing its
mode of operation accordingly.
THE MULTIPROCESSOR SYSTEM:
Figure 1 is an isometric diagrammatic view of a
part of a multiprocessor system constructed in accordance
with one embodiment of the present invention. In Figure
1 the multiprocessor system is indicated generally by the
reference numeral 31.
The multiprocessor system 31 includes individual
processor modules 33. Each processor module 33 comprises
a central processing unit 105, a memory 107, an input/out-
put channel 109 and an interprocessor control 55.
The individual processor modules are inter-
connected by interprocessor buses 35 for interprocessor
communications.
In a specific embodiment of the multiprocessor
system 31, up to sixteen processor modules 33 are
interconnected by two interprocessor buses 35 (indicated as the
X bus and the Y bus in Figure 1).
Each interprocessor bus has a bus controller 37
associated with that bus.
The bus controllers 37, interprocessor buses 35
and interprocessor controls 55 (Fig. 1), together with
associated microprocessors 113, microprograms 115 and bus
receive tables 150 (Fig. 2) provide an interprocessor bus
system. The construction and operation of this interprocessor
bus system are illustrated in Figs. 2 - 11 and 42 and are
described in more detail below under the subtitle The
Interprocessor Bus System.
The multiprocessor system 31 has an input/output
(I/O) system for transferring data between the processor
modules 33 and peripheral devices, such as the discs 45,
terminals 47, magnetic tape drives 49, card readers 51, and
line printers 53 shown in Fig. 1.
The I/O system includes one I/O bus 39 associated
with each I/O channel 109 of a processor module, and one or
more multi-port device controllers 41 may be connected to
each I/O bus 39.
In the specific embodiment illustrated, each device
controller 41 has two ports 43 for connection to two different
processor modules 33 so that each device controller is
connected for access by two processor modules.
The I/O system includes a microprocessor 119
and a microprogram 121 in the I/O channel 109 (See Fig. 12.)
which are dedicated to input/output transfers.
As also diagrammatically illustrated in Fig. 12,
the microprocessor 113 and microprogram 115 of the central
processing unit 105 and an input/output control table 140 in
the main memory 107 of each processor module 33 are operatively
associated with the I/O channel 109.
The construction and operation of these and other
components of the I/O system are illustrated in Figs. 12 - 29
and are described in detail below under the subtitle The
Input/Output System and Dual Port Device Controller.
The multiprocessor system includes a power distribution
system 301 which distributes power from separate power supplies
to the processor modules 33 and to the device controllers 41
in a way that permits on-line maintenance and also provides
redundancy of power on each device controller.
As illustrated in Fig. 30, the power distribution
system includes separate and independent power supplies 303.
A separate power supply 303 is provided for each
processor module 33, and a bus 305 supplies the power from
the power supply 303 to the central processing unit 105
and memory 107 of a related processor module 33.
As also illustrated in Figure 30, each device controller 41 is
connected for supply of power from two separate power supplies 303
through an automatic switch 311. If one power supply 303 for a particular
device controller 41 fails, that device controller is supplied with
power from the other power supply 303, and the changeover is accomplished
smoothly and without any interruption or pulsation in the power supplied
to the device controller.
The power distribution system coacts with the dual port system
of the device controller to provide continuous operation and access to
the peripheral devices in the event of a failure of either a single
port 43 or a single power supply 303.
The multiprocessor system includes a power on (PON) circuit
182 (the details of which are shown in Figure 25) in several components
of the system to establish that the power to that particular component
is within certain acceptable limits.
For example, the PON circuit 182 is located in each CPU 105,
in each device controller 41, and in each bus controller 37.
The purpose of the PON circuit is to present
a signal establishing the level of power applied to that
particular component; and if the power is not within
certain predetermined acceptable limits, then the signal
output is used to directly disable the appropriate bus
signal of the component in which the PON is located.
The power-on circuit functions in four states --
power off; power going from off to on; power on; and
power going from on to off.
The power-on circuit initializes all of the logic
states of the system as the power is brought up; and in
the present invention, the power-on circuit provides an
additional and very important function of providing for
a fail-safe system with on line maintenance. To do this,
the power-on circuit in the present invention is used in
a unique way to control the interface circuits which drive
all of the intercommunication buses in the system.
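The four PON states and the gating of the bus drivers can be pictured with a small model. This is a hedged sketch rather than the circuit of Fig. 25: the voltage limits, the sample readings, and the rule that drivers are enabled only in the settled "power on" state are assumptions made for illustration.

```python
from enum import Enum


class PonState(Enum):
    OFF = "power off"
    RISING = "power going from off to on"
    ON = "power on"
    FALLING = "power going from on to off"


# Assumed mapping: (was within limits, is within limits) -> state.
_STATE_TABLE = {
    (False, False): PonState.OFF,
    (False, True): PonState.RISING,
    (True, True): PonState.ON,
    (True, False): PonState.FALLING,
}


def pon_trace(samples, low=4.75, high=5.25):
    """Return (state, drivers_enabled) for each supply voltage sample.

    low/high are hypothetical acceptable limits for a nominal 5 V supply.
    """
    was_ok = False
    trace = []
    for volts in samples:
        is_ok = low <= volts <= high
        state = _STATE_TABLE[(was_ok, is_ok)]
        # Conservative assumption: bus drivers are enabled only when ON.
        trace.append((state, state is PonState.ON))
        was_ok = is_ok
    return trace


trace = pon_trace([0.0, 4.9, 5.0, 4.2, 0.0])
for state, enabled in trace:
    print(state.value, "| drivers enabled:", enabled)
```

With the drivers held off in every state except "power on", a component whose supply is out of limits cannot place spurious signals on a shared bus, which is the behaviour the text attributes to the PON signal.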
The construction and operation of the power
distribution system are illustrated in Figs. 30-33 and
are described in detail below under the subtitle Power
Distribution System.
The multiprocessor system includes a memory system
in which the physical memory is divided into four logical
address areas -- user data, system data, user code and
system code (See Fig. 36.).
The memory system includes a map 407 and control
logic 401 (See Fig. 34.) for translating all logical addresses
to physical addresses and for indicating pages absent from
primary storage but present in secondary storage as required
to implement a virtual memory system in which the physical
page addresses are invisible to users.
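The translation performed by the map can be illustrated as a table lookup. The following is a minimal sketch under assumptions: the page size of 1024 words, the dictionary layout, and the names `translate` and `PageAbsent` are hypothetical and are not taken from this passage.

```python
PAGE_WORDS = 1024  # assumed page size, for illustration only

AREAS = ("user data", "system data", "user code", "system code")


class PageAbsent(Exception):
    """Raised when a page is not in primary storage (a page fault)."""


def translate(maps, area, logical_address):
    """Translate a logical address in one of the four areas to a physical one."""
    page, offset = divmod(logical_address, PAGE_WORDS)
    entry = maps[area].get(page)
    if entry is None or entry["absent"]:
        # The page must first be brought in from secondary storage.
        raise PageAbsent((area, page))
    return entry["physical_page"] * PAGE_WORDS + offset


# One map section per logical address area (contents are made up).
maps = {area: {} for area in AREAS}
maps["user data"][0] = {"physical_page": 7, "absent": False}
maps["user data"][1] = {"physical_page": None, "absent": True}

physical = translate(maps, "user data", 5)
print(physical)
```

Because the program only ever supplies the logical address, the physical page number (7 in this made-up map) stays invisible to the user, and an absent page surfaces as an exception that a memory manager could service.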
The memory system incorporates a dual port access
to the memory by the central processing unit 105 and the
I/O channel 109. The I/O channel 109 can therefore access
the memory 107 directly (without having to go through the
central processing unit 105) for data transfers to and from
a device controller 41.
The construction and operation of the memory
system are illustrated in Figs. 34-41 and are described in
detail below under the subtitle Memory System.
An error detection system is incorporated in
the memory system for correcting all single bit and detecting
all double bit errors when semiconductor memory is used in
the memory system. This error detection system utilizes a
16 bit data field and a 6 bit check field as shown in Fig. 37
and includes a data bit complementer 487 as also shown in
Fig. 37 for correcting single bit errors.
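Single error correction with double error detection over a 16 bit data field and a 6 bit check field can be demonstrated with a textbook extended Hamming code. The sketch below is an assumption-laden illustration: it uses the standard Hamming bit layout (five position parity bits plus one overall parity bit), not the specific parity trees of Figs. 38 through 40.

```python
# Code word positions 1..21 hold Hamming bits; position 0 is overall parity.
DATA_POSITIONS = [p for p in range(1, 22) if p & (p - 1)]  # 16 data slots
CHECK_POSITIONS = (1, 2, 4, 8, 16)                         # 5 check slots


def encode(data):
    """Return a 22 bit code word (list of 0/1) for a 16 bit data word."""
    code = [0] * 22
    for i, p in enumerate(DATA_POSITIONS):
        code[p] = (data >> i) & 1
    for k in CHECK_POSITIONS:
        code[k] = sum(code[p] for p in range(1, 22) if p & k) % 2
    code[0] = sum(code) % 2  # sixth check bit: parity over everything else
    return code


def decode(code):
    """Return (data, status); data is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for p in range(1, 22):
        if code[p]:
            syndrome ^= p
    if sum(code) % 2:  # odd overall parity: treat as a single bit error
        if syndrome > 21:
            return None, "uncorrectable"
        code[syndrome] ^= 1  # complement the failing bit
        status = "corrected"
    elif syndrome:
        return None, "double error"
    else:
        status = "ok"
    data = sum(code[p] << i for i, p in enumerate(DATA_POSITIONS))
    return data, status


word = encode(0xBEEF)
word[9] ^= 1  # inject a single bit error
print(decode(word))
```

Complementing the bit named by the syndrome plays the role that the text assigns to the data bit complementer 487; the overall parity bit is what separates a correctable single error from a detectable double error.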
Figs. 37 through 41 and the related disclosure
illustrate and describe details of the error detection
system.
Before going into the detailed description of
the systems and components noted generally above, it should
be noted that certain terminology will have the following
meanings as used in this application.
The term "software" will refer to operating
system or user program instructions; the term "firmware"
will refer to a microprogram in read only memory; and
the term "hardware" will refer to actual electronic logic
and data storage.
The operating system is a master control program
executing in each processor module which has primary control
of the allocation of all system resources accessible to
that processor module. The operating system provides a
scheduling function and determines what process has use of
that processor module. The operating system also allocates
the use of primary memory (memory management), and it
operates the file system for secondary memory management.
The operating system also manages the message system.
This provides a facility for information transfer over
the interprocessor bus.
The operating system arrangement parallels
the modular arrangement of the multiprocessor system
components described above, in that there are no "global"
components.
At the lowest level of the software system,
two fundamental entities are implemented--processes and
messages.
A process is the fundamental entity of control
within a system.
Each process consists of a private data space
and register values, and a possibly shared code set. A
process may also access a common data space.
A number of processes coexist in a processor
module 33.
The processes may be user written programs, or
the processes may have dedicated functions, such as, for
example, control of an I/O device or the creation and
deletion of other processes.
A process may request services from another
process, and this other process may be located in the same
processor module 33 as a process making the request, or
the other process may be located in some other processor
module 33.
The processes work in an asynchronous manner,
and the processes therefore need a method of communication
that will allow a request for services to be queued
without "races" (a condition in which the outcome depends
upon the sequence of which process started first)--thus
the need for "messages" (an orderly system of interprocessor
module communication described in more detail below).
Also, all interprocessor module communication
should appear the same to the processes, regardless of
whether the processes are in the same or in different
processor modules.
As will become more clear from the description
to follow, the software structure parallels the hardware;
and different processes can be considered equivalent to
certain components of the hardware in arrangement and
function.
For example, just as the I/O channel 109
communicates over the I/O bus 39 to the device controller
41, a user process can make a request (using the message
system) to the process associated with that device controller
41; and then the device process returns status back
similar to the way the device controller 41 returns
information back to the I/O channel 109 over the I/O bus 39.
The other fundamental entity of the software
system, the message, consists of a request for service as
well as any required data. When the request is completed,
any required values will be returned to the requesting
process.
When a message is to be sent between processes
in two different processor modules 33, the interprocessor
buses 35 are used. However, as noted above, all communication
between processes appears the same to the processes,
regardless of whether they are in the same or in different
processor modules 33.
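The location transparency described above, in which a sender uses the same call whether the receiver is local or remote, can be sketched as a queue per process. This is an illustrative model only: the class `MessageSystem`, its methods, and the process names are hypothetical, and the interprocessor bus is reduced to a counter.

```python
from collections import deque


class MessageSystem:
    """Toy message system; a request is queued, never raced."""

    def __init__(self):
        self.location = {}      # process name -> processor module number
        self.queues = {}        # process name -> queued (sender, request)
        self.bus_transfers = 0  # messages that crossed between modules

    def register(self, process, module):
        self.location[process] = module
        self.queues[process] = deque()

    def send(self, sender, receiver, request):
        """Same call for local and remote receivers."""
        if self.location[sender] != self.location[receiver]:
            # In the real system this would travel over an
            # interprocessor bus 35; here it is only counted.
            self.bus_transfers += 1
        self.queues[receiver].append((sender, request))

    def receive(self, process):
        """Take the oldest queued request for a process."""
        return self.queues[process].popleft()


system = MessageSystem()
system.register("user", module=0)
system.register("spooler", module=0)
system.register("disc process", module=1)

system.send("user", "disc process", "read block 12")
system.send("user", "spooler", "print report")
print(system.bus_transfers, system.receive("disc process"))
```

The sending process never inspects `location`; only the message system decides whether a bus transfer is needed, which mirrors the statement that communication appears the same regardless of where the processes reside.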
This software organization provides a number
of benefits.
This method of structuring the software also
provides for significantly more reliable software. By
being able to compartmentalize the software structure,
smaller module sizes can be obtained, and the interfaces
between modules are well defined.
The system is also more maintainable because
of the compartmentalization of function.
The well defined modules and the well defined
interfaces in the software system also provide advantages
in being able to make it easily expandable--as in the
case of adding additional processor modules 33 or device
controllers 41 to the multiprocessor system.
Furthermore, there is a benefit to the user
of the multiprocessor system and software system in that
the user, writing his program, need not be aware of either
the actual machine configuration or the physical location
of other processes.
Just as the hardware provides multiple
functionally equivalent modules with redundant interconnects, so
does the software.
For example, messages going between processes
in different processor modules 33 may use either
interprocessor bus 35. Also, device controllers 41 may be
operated by processes in either of the processor modules
33 connected to the device controller 41.
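The use of functionally equivalent, redundant paths can be expressed as a simple selection rule. The helper below is a hypothetical sketch: the function name, the preference order, and the health flags are assumptions for illustration, and the passage does not state how the actual system chooses between paths.

```python
def first_available(candidates, is_up):
    """Return the first candidate path that is currently usable."""
    for name in candidates:
        if is_up.get(name, False):
            return name
    raise RuntimeError("no redundant path available")


# Either interprocessor bus can carry a message.
bus = first_available(["X bus", "Y bus"], {"X bus": False, "Y bus": True})

# Either processor module attached to a dual port device controller
# can run the process that operates it.
owner = first_available(
    ["processor module 0", "processor module 1"],
    {"processor module 0": True, "processor module 1": True},
)
print(bus, "/", owner)
```

A single failure removes one candidate from each list but leaves the other, so the selection still succeeds.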
The multiprocessor hardware system and software
system described above enable the user to develop a
fault tolerant application system by virtue of its
replicated modules with redundant interconnects.
THE INTERPROCESSOR BUS SYSTEM:
As pointed out above, the individual processor
modules 33 are interconnected by two interprocessor buses
35 (an X bus and a Y bus) with each bus controlled by a
related bus controller 37. Each interprocessor bus 35, in
combination with its bus controller 37 and a related
interprocessor control 55 in each processor module 33, provides
a multi-module communication path from any one processor
module to any other processor module in the system. The
use of two buses assures that two independent paths exist
between all processor modules in the system. Therefore, a
failure in one path (one bus) does not prevent communication
between the processor modules.
The bus controller 37 for each interprocessor bus
35 is a controller which is, in a preferred form of the
invention, separate and distinct from the processor modules 33.
Each interprocessor bus 35 is a synchronous
bus with the time synchronization provided by a bus clock
generator in the bus controllers 37. The interprocessor
control portions 55 of all of the modules associated with
the bus make state changes in synchronism with that bus
clock during transfers over the bus.
As will be described in more detail below, the
CPU 105 operates on a different clock from the
interprocessor bus clock. During the filling of an outqueue
or the emptying of an inqueue in the interprocessor control
55 by the CPU, the operation takes place at the CPU clock
rate. However, transmission of packets over the
interprocessor bus always takes place at the bus clock rate.
It is an important feature of the present
invention that the information transmitted over the
interprocessor bus is transferred at high transmission rates
without any required correspondence to the clock rates of
the various CPUs 105. The information transfer rate over
the interprocessor bus is also substantially faster than
would be permitted by direct memory accesses into and out
of the memory sections 107 at memory speed. This ensures
that there is adequate bus bandwidth even when a large number
of processor modules is connected in a multiprocessor system.
A benefit of using separate clocks for each CPU
105 is that a master system clock is not required, and
this eliminates a potential source of single component
failure which could stop the entire system.
The interprocessor control 55 incorporates logic
interlocks which make it possible to operate the
interprocessor buses 35 at one clock rate and each CPU 105
at its own independent clock rate without loss of data.
The information transmitted over the bus is
transmitted in multiword packets. In a preferred form
of the present invention each packet is a sixteen word
packet in which fifteen of the words are data words and
one word is a check word.
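The packet layout of fifteen data words followed by one check word can be sketched as follows. The check function is an assumption: this passage does not say how the check word is computed, so the example uses a bitwise exclusive-OR over the data words purely for illustration, and the function names are hypothetical.

```python
from functools import reduce
from operator import xor

WORD_MASK = 0xFFFF   # sixteen bit words
DATA_WORDS = 15
PACKET_WORDS = 16


def build_packet(data_words):
    """Append a check word to fifteen data words (XOR check is assumed)."""
    if len(data_words) != DATA_WORDS:
        raise ValueError("a packet carries exactly fifteen data words")
    words = [w & WORD_MASK for w in data_words]
    return words + [reduce(xor, words, 0)]


def packet_ok(packet):
    """A packet is accepted when all sixteen words XOR to zero."""
    return len(packet) == PACKET_WORDS and reduce(xor, packet, 0) == 0


packet = build_packet(list(range(100, 115)))
damaged = list(packet)
damaged[3] ^= 0x0040  # flip one bit in transit
print(packet_ok(packet), packet_ok(damaged))
```

Any check scheme with the same shape would fit the sixteen word frame; the point of the sketch is only that the receiver can validate a packet without knowing anything about the sender's clock.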
The control logic within the bus controller
37 and the interprocessor controls 55 of the individual
modules 33 follows a detailed protocol. The protocol
provides for establishing a sender-receiver pair and a
time frame for the data packet transfer. At the end of
the time frame for the transmission of the data packet,
the bus controller 37 is released for another such sequence.
The specific manner in which these functions are carried
out will become more apparent after a description of the
structural features of Figs. 3-9 below.
The X bus 35 is identical in structure to the
Y bus 35, so the structure of only one bus will be
described in detail.
As illustrated in Fig. 2, each bus 35 comprises
sixteen individual bus data lines 57, five individual
bus protocol lines 59, and one clock line 61, and one
select line 63 for each processor module 33.
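As a worked example of the line count just given, the total number of lines in one bus can be computed from the number of processor modules. The sketch assumes that the clock line is shared by the whole bus and that only the select lines are provided per processor module; the function name is hypothetical.

```python
DATA_LINES = 16      # bus data lines 57
PROTOCOL_LINES = 5   # bus protocol lines 59
CLOCK_LINES = 1      # clock line 61 (assumed to be one per bus)


def bus_line_count(processor_modules):
    """Total lines in one interprocessor bus, one select line 63 per module."""
    return DATA_LINES + PROTOCOL_LINES + CLOCK_LINES + processor_modules


print(bus_line_count(16))
```

For the sixteen module maximum mentioned earlier this gives 16 + 5 + 1 + 16 lines per bus under the stated assumption.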
As also illustrated in Fig. 2, the
interprocessor control 55 of each processor module 33 includes
two inqueue sections 65 (shown as an X inqueue section
and a Y inqueue section in Fig. 2) and a shared outqueue
section 67.
With specific reference to Fig. 4, the
shared outqueue section 67 includes an outqueue buffer 69
which performs a storage function. In a preferred form
the buffer 69 has sixteen words of sixteen bits each. The
buffer 69 is loaded by the CPU and holds the data until the
packet transmission time, at which time the data is gated
out to the bus, as will be described in more detail below.
The outqueue section 67 also includes a receive
register 71, which in a preferred form of the invention
is a four bit register. This register is loaded by the
CPU with the number of the processor module to which the
data will be sent.
The control part of the outqueue section 67
includes a processor fill state logic section 73 which
operates in synchronism with the CPU clock, a bus empty
state logic section 75 which operates in synchronism with
the X or Y bus clock, and an outqueue counter 77. During
filling of the outqueue buffer 69 by the CPU, the
outqueue counter 77 scans the buffer 69 to direct the data
input into each of the sixteen words of the buffer; and,
as the sixteenth word is stored into the outqueue buffer
69, the outqueue counter 77 terminates the fill state.
The outqueue section 67 also includes an
outqueue pointer 79 which connects the entire outqueue
section to either the X bus or the Y bus 35. The outqueue
pointer 79 allows the logic sections 73 and 75 and the
buffer 69 to be shared by the X and Y interprocessor buses 35.
As illustrated in Fig. 3, the bus controller
37 comprises a bus control state logic section 81, a
sender counter 83, a processor select logic section 85,
a receive register 87, a packet counter 89 and a bus
clock generator 91.
With reference to Fig. 5, each inqueue section
65 comprises a bus fill state logic section 93 which
operates in synchronism with the bus clock, a sender
register 95, an inqueue buffer 97, an inqueue counter 99,
and a processor empty state logic section 101 which
operates in synchronism with the CPU clock.
Fig. 6 is a state diagram of the bus control
logic 81 of the bus controller 37.
Fig. 7 is a state diagram of the logic sections
73 and 75 of the outqueue section 67.
Fig. 8 is a state diagram of the logic sections
93 and 101 of the inqueue sections 65.
With reference to Fig. 7, the processor fill state
logic section 73 has basically four states--EMPTY, FILL, FULL
and WAIT--as indicated by the respective legends. The bus
empty state logic section 75 has basically four states--
IDLE, SYNC, SEND and DONE--as illustrated by the legends.
Continuing with a description of the notation in
Fig. 7, the solid lines with arrows indicate transitions
from the present state to the next state. Dashed arrows
ending on the solid arrows indicate conditions which must
be satisfied for the indicated transition to take place.
The synchronization of state machines running
off relatively asynchronous clocks requires a careful
construction of an interlock system. These important
interlocks are noted by the dashed arrows in the state
diagrams. These interlocks perform a synchronization
of two relatively asynchronous state machines. The
dashed arrows in Fig. 7 and Fig. 8 running between the
state machines thus indicate signals which synchronize
(qualify) the indicated transitions of the state machines.
With reference to the FILL state for the
logic section 73, it should be noted that the store
outqueue condition will not cause an exit from the
FILL state until the outqueue counter 77 has advanced
to count 15 (on a count which starts with zero)
at which time the FILL state will advance to the FULL
state.
Similarly, it should be noted that the SEND
state of the logic section 75 will not terminate on the
select and send command condition until the outqueue
counter 77 reaches count 15, at which time the SEND
state advances to the DONE state.
The asterisk in the notation of Fig. 7
indicates an increment of the outqueue counter 77.
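The counter-qualified exit from the FILL state can be modelled in a few lines. This sketch covers only the FILL to FULL transition described above; the class name, the starting state, and the wrap of the counter back to zero are assumptions, and the EMPTY and WAIT states and the bus side of the logic are not modelled.

```python
class OutqueueFill:
    """Toy model of the FILL -> FULL step of the processor fill logic."""

    WORDS = 16

    def __init__(self):
        self.state = "FILL"   # assumed starting point for this sketch
        self.counter = 0      # stands in for outqueue counter 77
        self.buffer = [0] * self.WORDS

    def store_outqueue(self, word):
        """Store one word; leave FILL only when the count is at 15."""
        if self.state != "FILL":
            raise RuntimeError("store attempted outside the FILL state")
        self.buffer[self.counter] = word & 0xFFFF
        if self.counter == self.WORDS - 1:
            self.state = "FULL"
        # The asterisk of Fig. 7: increment the counter on each store.
        # Wrapping to zero is an assumption of this sketch.
        self.counter = (self.counter + 1) % self.WORDS


queue = OutqueueFill()
states = []
for n in range(16):
    queue.store_outqueue(n)
    states.append(queue.state)
print(states[14], states[15])
```

The first fifteen stores leave the machine in FILL; only the store made while the counter reads 15 advances it to FULL, matching the count-from-zero wording of the text.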
Fig. 6 shows the state diagram for logic 81
of the bus controller and illustrates that the logic
has basically four states--IDLE, POLL, RECEIVE and SEND.
The notation in Fig. 6 is the same as that
described above for Fig. 7. A solid arrow line indicates
a state transition from one state to another and a
dotted arrow line to that solid arrow line indicates a
condition which must occur to allow the indicated
(solid line arrow) transition to occur. An asterisk
on a state transition in this case indicates that
simultaneously with the indicated transition the sender
counter 83 is incremented by one.
1 The dashed arrow output lines in Fig. 6
2 indicate protocol commands issued from the bus
3 controller to the interprocessor bus.
In both Fig. 7 and Fig. 6 a dashed arrow
~` 6 leaving a state indicates a logic output from that
7 state such as a logic output signal to a protocol line
- 8 (in the case of the bus empty state logic 75) or to a
9 status line of the processor module (in the case of the
10 processor fill state logic 73).
11
Fig. 8 shows the state diagrams for the bus fill state logic section
93 and the processor empty state logic section 101.

The state diagram for the logic section 93 includes four states: SYNC,
ACKNOWLEDGE, RECEIVE and FULL.

The state diagram for the logic section 101 includes four states:
RESET, READY, INTERRUPT and DUMP.

The notation (solid line arrows and dashed line arrows) is the same as
described above for Fig. 7 and Fig. 6.

The asterisk in Fig. 8 indicates an increment in the inqueue counter
99.

Fig. 9 is a timing diagram showing the time sequence in which the
state changes given in Figs. 6, 7 and 8 occur.

The sequence shown in Fig. 9 accomplishes the transmission of a packet
from one processor module to another processor module at the bus clock
rate (assuming that the intended receiver module is ready to receive
the packet).

Fig. 9 shows the time sequences for a successful packet transfer with
individual signal representations listed from top-to-bottom in Fig. 9
and with time periods of one bus clock each shown from left-to-right
in the order of increasing time in Fig. 9.

The top line in Fig. 9 indicates the state of the bus controller, and
each division mark represents a clock period or cycle of the bus clock
generator 91 shown in Fig. 3. Each time division of the top line
carries down vertically through the various signal representations
listed by the legends at the left side of the figure.

Taking the signals in the sequence presented from top-to-bottom in
Fig. 9, the first signal (below the bus controller state line) is the
SEND REQUEST signal (one of the protocol group indicated by the
reference numeral 59 in Fig. 3) and specifically is the signal which
may be asserted by the outqueue control logic section 67 of any
processor module 33. The signal is transmitted to the bus control
state logic section 81 of the bus controller 37 (see Fig. 3).

The next signal shown in Fig. 9 (the SELECT signal) represents a
signal which originates from the processor select logic section 35 of
the bus controller 37 and which is transmitted on only one of the
select lines 63 at a time to a related processor module 33.

The next signal represented in Fig. 9, the SEND ACKNOWLEDGE signal,
may be asserted only by a particular processor 33 when that processor
is selected and when its bus empty state logic section 75 is in the
SEND state (as illustrated in the third state of Fig. 7). This SEND
ACKNOWLEDGE signal is used by the bus controller 37 to establish the
identity of a processor module 33 wishing to send a packet.

The next signal, the RECEIVE COMMAND signal, represents a signal from
the bus controller 37 transmitted on one of the protocol lines 59.
This signal does two things.

First, this signal in combination with receiver SELECT interrogates
the receiver processor module 33 to find out whether this receiver
module is ready to receive (as indicated by the ACKNOWLEDGE state in
Fig. 8).

Secondly, this signal disables the bus empty state logic section 75 of
the receiving module so that the receiving module cannot gate an
intended receiver number to the data bus should the outqueue section
of the intended receiver module 33 also have a data packet of its own
ready to send.

In this regard, during the time that the sender processor is asserting
the SEND ACKNOWLEDGE signal it is also gating the receiver number to
the bus for use by the bus controller 37. The bus 35 itself is, of
course, a non-directional bus so that the information can be gated to
the data bus 57 by any module for use either by the bus controller 37
for a control function or by another processor for an information
transfer function. It should be noted that a module 33 may gate data
to the bus only when its SELECT line is asserted and the RECEIVE
COMMAND signal is not asserted.

During the time that the RECEIVE COMMAND signal is asserted the bus
controller 37 is gating the sender number to the data bus 57 for
capture by the selected receiver processor module.

The next signal line (the RECEIVE ACKNOWLEDGE line in Fig. 9)
represents a signal which is transmitted from the selected receiving
module's bus fill state logic section 93 to the bus control state
logic section 81 of the bus controller 37 (over one of the protocol
lines 59) to indicate that the selected receiver module is in the
ACKNOWLEDGE state (as indicated by the legend in Fig. 8) and thus
ready to receive the packet which the sender module has ready to
transmit.

If the RECEIVE ACKNOWLEDGE signal is not asserted by the receiver
module, the sender SELECT, the SEND COMMAND and the time frame
transmission of the data packet itself will not occur.

If the RECEIVE ACKNOWLEDGE signal is asserted, then the sequence
indicated by the SEND COMMAND line will occur.

The SEND COMMAND line represents a signal which originates from the
bus control state logic section 81 of the bus controller 37 and which
is transmitted to the bus empty state logic section 75 of the sender
processor module 33 over one of the protocol lines 59.

In combination with a SELECT of the sender processor module the SEND
COMMAND signal enables the sender processor module to send a packet to
the receiver module during the sixteen clock cycles bracketed by the
SEND COMMAND signal.

The final line (the data/16 line) represents the information present
on the data lines 57 during the above-described sequence.

The data is gated to the bus by the selected sender processor module
and is transmitted to the receiver processor module into the inqueue
buffer 97 (see Fig. 5) during this sixteen clock cycle time frame.
This assumes that the RECEIVE ACKNOWLEDGE signal was received by the
bus controller in response to the RECEIVE COMMAND signal.

If the RECEIVE ACKNOWLEDGE signal had not been received by the bus
controller, then the SEND COMMAND signal would not have been asserted
and the bus controller 37 would have resumed the POLL state as shown
in Fig. 6.

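The signal sequence of Fig. 9 can be summarized as a short software
sketch. The Python below is illustrative only and is not part of the
described hardware: the Module class, its field names and the
transfer_once function are hypothetical, and module numbers are
assumed to equal list positions. It models just the decisions
described above (poll for a sender, interrogate the receiver, then
either run the packet frame or resume POLL).

```python
from dataclasses import dataclass, field

PACKET_WORDS = 16  # one packet frame lasts sixteen bus clocks


@dataclass
class Module:
    """Hypothetical model of one processor module's bus interface."""
    number: int
    outqueue: list = field(default_factory=list)  # packet waiting to go out
    receiver: int = 0                             # intended receiver number
    inqueue: list = field(default_factory=list)   # last packet received
    ready_to_receive: bool = True                 # bus fill logic in ACKNOWLEDGE
    sender_register: int = -1

    def send_acknowledge(self):
        # Asserted only when selected while a full packet is waiting (SEND).
        return len(self.outqueue) == PACKET_WORDS


def transfer_once(modules, start=0):
    """Run one POLL / RECEIVE / SEND pass and return a trace of signals."""
    trace = []
    n = len(modules)
    # POLL: select modules one at a time until one answers SEND ACKNOWLEDGE.
    for i in range(n):
        sender = modules[(start + i) % n]
        trace.append(f"SELECT {sender.number}")
        if sender.send_acknowledge():
            trace.append("SEND ACKNOWLEDGE")
            break
    else:
        return trace  # no module wants to send
    # RECEIVE: RECEIVE COMMAND plus receiver SELECT interrogates the receiver.
    rx = modules[sender.receiver]
    trace.append(f"RECEIVE COMMAND + SELECT {rx.number}")
    if not rx.ready_to_receive:
        trace.append("no RECEIVE ACKNOWLEDGE, resume POLL")
        return trace
    trace.append("RECEIVE ACKNOWLEDGE")
    rx.sender_register = sender.number  # sender number captured from the bus
    # SEND: sixteen clocks of SEND COMMAND plus sender SELECT.
    trace.append(f"SEND COMMAND + SELECT {sender.number}")
    rx.inqueue = list(sender.outqueue)
    sender.outqueue = []
    rx.ready_to_receive = False  # inqueue is now full
    return trace
```

In this sketch a second attempt to send to the same receiver, before
its inqueue has been emptied, ends with the controller resuming POLL
and the sender's packet left in its outqueue.
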
With reference to Figs. 2, 7, 10 and 11, a typical operation of the
outqueue buffer and control 67 of one processor module 33 will now be
described.

As illustrated in Fig. 10, the processor fill state logic section 73
includes two flip-flops A and B, and the bus empty state logic section
75 includes two flip-flops C and D.

Summarizing the state assignments as shown by the AB and CD tables in
Fig. 10, the EMPTY state is defined as A = 0, B = 0. The FILL state is
defined as A = 1, B = 0. The FULL state is defined as A = 1, B = 1;
and the WAIT state is defined as A = 0, B = 1.

Similarly, the corresponding combinations of the C and D state
variables are defined to be the IDLE, SYNC, SEND and DONE states
respectively. State assignments previously listed could also be given
in the form of logic equations. For example, EMPTY = Ā · B̄, and this
notation is utilized in the Fig. 11 logic equation listings.
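
The state assignments just listed can be written out as lookup tables.
The following Python is a minimal sketch, not taken from Fig. 10 or
Fig. 11; the table and function names are hypothetical, and the CD
table assumes that "corresponding combinations" means the same bit
patterns in the same order as the AB table.

```python
# Flip-flop encodings read from the AB and CD assignments described above.
# Keys are (A, B) or (C, D) flip-flop values.
FILL_LOGIC_73 = {(0, 0): "EMPTY", (1, 0): "FILL", (1, 1): "FULL", (0, 1): "WAIT"}
EMPTY_LOGIC_75 = {(0, 0): "IDLE", (1, 0): "SYNC", (1, 1): "SEND", (0, 1): "DONE"}


def is_empty(a, b):
    """EMPTY as a product term: (not A) and (not B)."""
    return (not a) and (not b)


def bits_changed(code_from, code_to):
    """Count how many flip-flops change between two state codes."""
    return sum(x != y for x, y in zip(code_from, code_to))


# The normal cycle EMPTY -> FILL -> FULL -> WAIT -> EMPTY.
CYCLE = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

With these codes each step of the normal cycle changes exactly one
flip-flop, which can be confirmed by applying bits_changed to
successive entries of CYCLE.
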
In operation and with specific reference to Fig. 7, the initial state
reached through power on initialization or manual reset is the EMPTY
state shown in the top left part of Fig. 7.

The EMPTY state of the processor fill state logic 73 provides a ready
signal to the central processor unit (CPU) 105 to indicate the
presence of that state, as indicated by the dashed arrow RDY shown as
leaving the EMPTY state in Fig. 7.

The CPU firmware (microprogram), in response to that ready signal,
when a transmission over the interprocessor bus is required, will
provide a store receive signal (shown by the dashed arrow incoming to
the diagram in Fig. 7). This store receive signal qualifies
(synchronizes) the transition which advances the EMPTY state to the
FILL state.

The CPU firmware, to transfer data into the outqueue buffer 69, will
provide a store outqueue signal (the dashed arrow entering the diagram
in Fig. 7) for each word to be stored in the buffer 69.

Each occurrence of this store outqueue signal will advance the
outqueue counter 77, commencing with a count of zero, until a count of
15 is reached.

On the sixteenth occurrence of the store outqueue signal a transition
from the FILL to the FULL state, as illustrated by the solid line
arrow in Fig. 7, is allowed.

The FULL state of the processor fill state logic provides a
synchronization condition to the bus empty state logic, denoted by the
dashed arrow leaving the FULL state of logic 73 and going down to the
logic 75 in Fig. 7.

The processor fill state logic 73 will remain in the FULL state until
the bus empty state logic 75 has subsequently reached the DONE state.

Now, referring specifically to the bus empty state logic denoted by 75
in Fig. 7, the initial state, IDLE, for the logic section 75 in Fig. 7
is again provided by power on initialization or manual reset.

The bus empty state logic 75 will remain in the IDLE state until the
transition to the SYNC state is allowed as shown by the dashed arrow
from the FULL state of the processor fill state logic 73.

The empty state logic 75 will proceed with no qualification required
from the SYNC state to the SEND state.

It is in the SEND state that the SEND REQUEST signal to the bus and to
the bus controller is asserted (as indicated by the dashed arrow going
down and leaving the diagram 75 from the SEND state).

In response to this SEND REQUEST signal, the bus controller logic 81
(Fig. 6) will poll processor modules successively until the sender is
identified (as discussed earlier with reference to Fig. 9).

The bus controller will issue a RECEIVE COMMAND and SELECT to the
intended receiver processor module; and upon receipt of the RECEIVE
ACKNOWLEDGE signal will proceed to the packet time frame (also
identified in Fig. 9).

During the packet time frame the bus controller asserts SELECT of the
sender processor module and also asserts the SEND COMMAND signal to
the sender processor module.

This SELECT signal and SEND COMMAND signal are shown as entering the
diagram and qualifying (synchronizing) transitions leaving and
entering the SEND state as noted in Fig. 7.

Each bus clock while SELECT and SEND COMMAND are asserted will advance
the outqueue counter 77 commencing with a count of zero.

On the sixteenth clock period of SELECT and SEND COMMAND the
transition terminating the SEND state and advancing to the DONE state
is qualified (synchronized as shown by the dashed arrow allowing that
transition).

When the empty state logic 75 has reached the DONE state, a transition
of the processor fill state logic 73 from FULL to WAIT is qualified
(as denoted by the dashed arrow leaving the DONE state).

Next, the WAIT state of the processor fill state logic 73 qualifies a
transition of the bus empty state logic 75 from the DONE state to the
IDLE state (as denoted by a dashed arrow leaving the WAIT state and
qualifying the indicated transition).

Finally, the bus empty state logic 75, being in the IDLE state,
qualifies the transition of the processor fill state logic 73 from the
WAIT state to the EMPTY state (as denoted by the dashed arrow leaving
the IDLE state).

At this point a packet has been loaded into the outqueue buffer 69 by
the processor module and transmitted over the bus 35 to the receiver
processor module, and the outqueue control processor fill state logic
73 and bus empty state logic 75 have returned to their initial states.
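
The interlocked operation of the two state machines of Fig. 7 can be
followed in a small simulation. The Python below is a sketch under
stated assumptions, not the logic of Fig. 10: the Outqueue class and
its method names are hypothetical, the outqueue counter 77 is assumed
to be a four-bit counter that wraps from 15 to zero, and both machines
are stepped by a single bus_clock call for simplicity.

```python
class Outqueue:
    """Hypothetical model of fill logic 73, empty logic 75 and counter 77."""

    def __init__(self):
        self.fill = "EMPTY"   # processor fill state logic 73
        self.empty = "IDLE"   # bus empty state logic 75
        self.counter = 0      # outqueue counter 77 (assumed to wrap at 16)

    def store_receive(self):
        # Store receive qualifies EMPTY -> FILL.
        if self.fill == "EMPTY":
            self.fill = "FILL"

    def store_outqueue(self):
        # Each store outqueue advances the counter; the sixteenth one
        # (counter at 15) also takes FILL -> FULL.
        if self.fill == "FILL":
            if self.counter == 15:
                self.fill = "FULL"
            self.counter = (self.counter + 1) % 16

    def bus_clock(self, select=False, send_command=False):
        """Advance both machines by one clock using the sampled states."""
        fill, empty = self.fill, self.empty
        if empty == "IDLE" and fill == "FULL":
            self.empty = "SYNC"
        elif empty == "SYNC":
            self.empty = "SEND"          # no qualification required
        elif empty == "SEND" and select and send_command:
            if self.counter == 15:       # sixteenth word: SEND -> DONE
                self.empty = "DONE"
            self.counter = (self.counter + 1) % 16
        elif empty == "DONE" and fill == "WAIT":
            self.empty = "IDLE"
        if fill == "FULL" and empty == "DONE":
            self.fill = "WAIT"
        elif fill == "WAIT" and empty == "IDLE":
            self.fill = "EMPTY"
```

Loading sixteen words, clocking through SYNC and SEND, and then
clocking three more times walks the pair of machines through FULL,
DONE, WAIT and back to EMPTY and IDLE in the order described above.
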

The above description relates to the transitions and qualifications
indicated in Fig. 7. The action of the logic sections 73 and 75
involved in the above description of operation of Fig. 7 will now be
noted with reference to the logic diagram of Fig. 10 and the logic
equation listing of Fig. 11.

With reference to Fig. 10, as noted above, the flip-flops A and B are
JK flip-flops and are edge triggered flip-flops in that state changes
occur only on clock transitions (as indicated by the small triangular
symbols and legends on the lefthand sides of the flip-flops A and B in
Fig. 10).

The primary significance of the logic diagram in Fig. 10 is to
illustrate the transition from one state to another in the state
machines shown in Fig. 7. Thus, to illustrate the transition from IDLE
to SYNC in the empty state logic 75, the operation proceeds as
follows.

To implement a change from the IDLE state to the SYNC state, the state
variable C must be set.

The logic equation for the J input of state variable C is as shown in
Fig. 11 and is indicated by the reference numeral 103. In this
equation the interlock (shown by the dashed arrow from the FULL state
of the fill state logic 73 in Fig. 7 to the transition) corresponds to
the quantity (A · B) or (FULL) in the equation indicated by the
reference numeral 103. The D̄ or (IDLE) in the equation indicated by
reference numeral 103 in Fig. 11 corresponds to the IDLE state shown
by the legend in Fig. 7. The J in the equation corresponds to the J
input of the C flip-flop in Fig. 10. And the (C) corresponds to the
true output of the C flip-flop in Fig. 10.
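
The setting of state variable C can be illustrated with a generic JK
flip-flop model. The Python below is a sketch and does not reproduce
equation 103 of Fig. 11: the function names are hypothetical, IDLE is
decoded here from both C and D (the figure may use a reduced term),
and the K input is assumed to be zero during this transition.

```python
def jk_next(q, j, k):
    """Next state of a JK flip-flop on an active clock edge."""
    if j and k:
        return 1 - q   # toggle
    if j:
        return 1       # set
    if k:
        return 0       # reset
    return q           # hold


def j_input_c(a, b, c, d):
    """Assumed J input of flip-flop C: FULL (A and B) qualified by IDLE."""
    full = bool(a and b)
    idle = (not c) and (not d)
    return int(full and idle)


def clock_c(a, b, c, d):
    """Value of C after one clock edge, with K assumed to be 0."""
    return jk_next(c, j_input_c(a, b, c, d), 0)
```

With the fill logic in FULL (A = 1, B = 1) and the empty logic in IDLE
(C = 0, D = 0), the next clock edge sets C, giving C = 1, D = 0, which
is the SYNC state under the assignments described earlier.
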

Other state transitions of the Fig. 7 diagram will not be described in
further detail with reference to Figs. 10 and 11 since it is believed
that these transitions as carried out by the logic diagram in Fig. 10
and the logic equations in Fig. 11 are clear from the above example of
the transition from IDLE state to SYNC state as described in detail
above.

Figs. 10 and 11 show the logic diagram and logic equations for the
state diagram of the outqueue buffer and control 67. Corresponding
logic diagrams and logic equations have not been illustrated for the
inqueue buffer and control 65 or the bus controller 37 because such
logic diagrams and equations are similar to those shown in Fig. 10 and
Fig. 11 and are easily obtainable from the state diagrams shown in
Figs. 6 and 8.

Each processor module 33 (Fig. 1) in the multiprocessor system is
connected to both interprocessor buses 35 (Fig. 1) and is capable of
communicating with any processor module including itself over either
bus. For each block data transfer, one processor module is the source
or sender and another is the destination or receiver.

Transmission of data by a processor module over one of the
interprocessor buses is initiated and accomplished under software
control by means of the SEND instruction.

In the SEND instruction the microprogram 115 (Fig. 2) and the CPU
microprocessor 113 (Fig. 2) interact with the shared outqueue section
67 of the interprocessor control 55 to read a data block from memory
101, to break it up into packets, to calculate packet check sum words,
and to transmit the block one packet at a time over a bus to the
receiving processor module. Parameters supplied to the SEND
instruction specify the number of words in the block, the starting
address of the block, which bus to use, the destination processor, and
a maximum initial timeout value to wait for the outqueue 67 (Fig. 2)
to become available.

The SEND instruction terminates only after the entire block has been
transmitted; thus sending a block is a single event from the software
viewpoint. However, the SEND instruction is interruptable and
resumable, so that response of the operating system to other events is
not impaired by the length of the time required to complete a SEND
instruction.

Receiving of data by a processor module over the interprocessor buses
is not done by means of a software instruction, since the arrival
times and sources of data packets cannot be predicted. The receiving
of data is enabled but cannot be initiated by the receiver.

The CPU microprocessor 113 takes time out from software instruction
processing as required to execute the BUS RECEIVE microprogram 115.
This microprogram takes the received data packet from one of the
inqueue sections 65 (Fig. 2) of the interprocessor control 55, stores
the data into a memory buffer, and verifies correct packet check sum.

Reassembly of received packets into blocks is accomplished using the
Bus Receive Table 150 (BRT) in memory. The BRT contains 32 two-word
entries, corresponding to the two buses from each of the sixteen
processor modules possible in one specific implementation of the
multiprocessor system. Each BRT entry corresponding to a bus and a
sender contains an address word and a count word. The address word
specifies into which buffer in the System Data area incoming data from
that sender is to be stored. The count word specifies how many data
words remain to complete the block transfer from that sender.

As each data packet is received, the CPU microprocessor 113 suspends
processing of software instructions, and the bus receive microprogram
115 is activated. This microprogram reads the address and count words
from the sender's BRT entry, stores the data packet into the specified
area, verifies correct packet check sum, and restores adjusted values
of the address and count words into the BRT entry. If the packet
caused the count to reach zero or if the packet contained incorrect
check sum, the bus receive microprogram sets a completion interrupt
flag to signal termination of the data block to the software. The CPU
microprogram then resumes software instruction processing at the point
where it left off with no disturbance except delay to the currently
executing program.
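
The use of the BRT during packet reception can be sketched in
software. The Python below is illustrative only: the BrtEntry class,
the brt_index layout (bus number times sixteen plus sender number) and
the receive_packet function are assumptions, memory is modelled as a
plain list, and the marking of an erroneous transfer in the count word
is not modelled.

```python
from dataclasses import dataclass

DATA_WORDS = 15  # data words per packet; the sixteenth word is the check sum


@dataclass
class BrtEntry:
    address: int = 0  # where the next data words go in the System Data area
    count: int = 0    # data words still expected from this sender


def brt_index(bus, sender):
    """Assumed layout: 2 buses x 16 senders = 32 entries."""
    return bus * 16 + sender


def receive_packet(brt, memory, bus, sender, data, checksum_ok):
    """Store one packet through the sender's BRT entry.

    Returns True when a completion interrupt should be flagged, that is,
    when the count reaches zero or the check sum was incorrect.
    """
    entry = brt[brt_index(bus, sender)]
    n = min(DATA_WORDS, entry.count)           # ignore zero padding
    memory[entry.address:entry.address + n] = data[:n]
    entry.address += n                         # adjusted address word
    entry.count -= n                           # adjusted count word
    return (not checksum_ok) or entry.count == 0
```

Because each call touches only the entry selected by bus and sender,
packets from different senders can arrive in any order without
disturbing one another's address and count words.
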

It is an important feature that data blocks from several senders can
all be assembled concurrently by a receiving processor module from
data packets received in any sequence. This interleaved assembly of
blocks from packets is carried on transparently to the software
executing in the receiver processor. Only successful block completions
or erroneous transmissions cause the software to be interrupted.

It is also important that a time-sharing or time-slicing of the
interprocessor bus hardware has been achieved in two areas.

First, each interprocessor bus and associated bus controller allow
packets to be transmitted between any sender and receiver as required.
The circular polling by a bus controller to identify a requesting
sender ensures that all processor modules have an equal opportunity to
send over that bus. Each bus provides a communication path which is
shared in time in an unbiased way by all processor modules.
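
Circular polling of this kind can be sketched in a few lines. The
Python below is an illustration, not the bus controller logic: the
poll function is hypothetical, and it assumes that each poll resumes
with the module following the one selected last.

```python
def poll(requests, last_selected):
    """Circular poll over a list of request flags.

    Returns the number of the first requesting module after
    last_selected, wrapping around, or None if nobody is requesting.
    """
    n = len(requests)
    for step in range(1, n + 1):
        candidate = (last_selected + step) % n
        if requests[candidate]:
            return candidate
    return None
```

Under this assumption a module that has just sent is polled last on
the next pass, so no requesting module can be passed over indefinitely
by a busier neighbor.
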

Secondly, each inqueue section 65 of the interprocessor control 55 of
a processor module is shared in time by incoming packets from several
senders. That is, the inqueue logic and storage of a processor is not
dedicated to a single sender for the duration of a block transfer.
Instead, each packet received is correctly directed into memory by the
BRT entry corresponding to its sender and bus. Data blocks from
several senders are assembled correctly in the receiver's memory
independently of the order in which the senders make use of the bus.

A processor module has two ways of controlling its ability to receive
packets over the X bus or the Y bus.

First, there is a bit in the CPU's interrupt MASK register
corresponding to each interprocessor bus. When the MASK bit is on,
micro-interrupts for that bus are allowed. Micro-interrupts
(activation of the BUS RECEIVE microprogram) occur when the Processor
Empty state logic 101 (Fig. 5) of an inqueue section 65 reaches the
MICRO-INT state after a packet has been received into an inqueue
buffer. If the MASK bit is off when a packet is received, the
micro-interrupt and subsequent processing of the packet into memory
will be deferred until the MASK bit is set on by a software
instruction.

Software operations such as changing a BRT entry are performed with
micro-interrupts disabled to avoid unpredictable results. No packets
are lost while micro-interrupts are disabled. The first packet
received will be held in the inqueue buffer until the micro-interrupt
is enabled. Subsequent packet transfers while the inqueue buffer is
full are rejected since the Bus Fill state logic 93 will be in the
FULL state and thus unable to assert RECEIVE ACKNOWLEDGE in response
to SELECT.

A second means by which a processor module controls its ability to
receive packets over the bus is the action taken by the processor
module after an X bus or Y bus receive completion interrupt
(activation of an operating system interrupt handler).

When a check sum error is detected in a received packet or when the
BRT word count remaining in a data block reaches zero as a packet is
stored into memory, the BUS RECEIVE microprogram sets the X bus or Y
bus completion interrupt flag. Otherwise, the microprogram issues the
RINT signal (see Fig. 8) to the inqueue Processor Empty state logic
101 to allow another packet to be received. When the completion flag
is set, however, the RINT signal is not issued.

It is thus the responsibility of the bus receive completion software
interrupt handler to issue the RINT signal (by means of an RIR
software instruction) to reenable the inqueue 65. Until this occurs,
the inqueue Bus Fill state logic 93 remains in the FULL state and no
additional packets will be received.
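
The two controls described above (the MASK bit and the withholding of
RINT) can be pictured with a small model. The Python below is a sketch
only: the Inqueue class and its method names are hypothetical and
collapse the Fig. 8 state machines into a single "buffer occupied"
condition.

```python
class Inqueue:
    """Hypothetical model of one inqueue section 65 and its MASK bit."""

    def __init__(self):
        self.packet = None    # None while the bus fill logic can acknowledge
        self.mask_on = True   # micro-interrupts allowed for this bus

    def offer(self, packet):
        """Return True (RECEIVE ACKNOWLEDGE) if the packet was accepted."""
        if self.packet is not None:   # FULL: cannot acknowledge
            return False
        self.packet = packet
        return True

    def micro_interrupt_pending(self):
        # A held packet raises a micro-interrupt only while the MASK bit is on.
        return self.packet is not None and self.mask_on

    def rint(self):
        """RINT re-enables the inqueue so another packet can be received."""
        self.packet = None
```

In this model a held packet is never overwritten: further offers are
refused until rint is called, whether by the microprogram after an
ordinary packet or by the interrupt handler after a completion.
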

The completion interrupt signal can therefore designate either a block
data transfer that has been sent and received without error, or a
partial transfer in which a check sum error is detected and in which
the completion interrupt is generated as a result of that check sum
error. In the latter case, the sender continues to send the data block
but the receiver discards the data block after the check sum error has
been detected. This error shows up in the bus receive table (BRT)
count word as a negative value. This will become more apparent from
the description of the operation which follows.

The SEND instruction is an instruction that requires four parameter
words in the CPU register stack.

The first of the four parameter words is a count of the number of
words to be transferred. This value must match the number expected by
the BRT in the receiver processor module if the transfer is to
complete successfully.

The second parameter word is the address, minus one, in the System
Data area in the sender processor's memory where the data to be
transferred is located.

The third parameter word is a timeout value allotted to completing a
single packet (fifteen data word) transfer. The timeout period is
restarted for each packet transferred by the SEND instruction.

The fourth parameter word specifies the bus (whether the X bus or the
Y bus) to be used and specifies the receiver processor module. The
high order bit of the parameter specifies the bus and the low order
four bits, in one specific implementation of the invention, specify
the number of the receiver processor module.
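
The layout of the fourth parameter word can be shown with a short
decoding sketch. The Python below is illustrative: the function name
is hypothetical, a sixteen-bit word is assumed (the word width is not
stated in this passage), and no assumption is made about which bit
value selects the X bus and which the Y bus.

```python
WORD_BITS = 16  # assumed word width


def decode_send_target(word):
    """Split the fourth SEND parameter word into (bus_bit, receiver)."""
    bus_bit = (word >> (WORD_BITS - 1)) & 1  # high order bit selects the bus
    receiver = word & 0xF                    # low order four bits: module number
    return bus_bit, receiver
```

Four bits are enough to number the sixteen processor modules of the
specific implementation mentioned above.
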

At the completion of a SEND instruction, there are two possible
conditions.

The first condition is that a packet timeout occurred and the
instruction was terminated at that point. In this event the remaining
packets of the block are not transmitted.

The second condition is an indication that a successful data block
transfer has been completed.

Thus, in initial summary of the SEND operation, the SEND instruction
fills the outqueue buffer 69 (Fig. 4) with fifteen data words, appends
an odd-parity check sum, and signals the bus controller 37 that it has
a packet ready for transmission. After each sixteen word packet is
transmitted, execution of the SEND instruction resumes at the point
where it left off. If the last packet of the block has fewer than
fifteen words, the remaining words are filled in with zeros. The
instruction terminates when the last packet is transmitted.

Fig. 4 shows the logic diagram and Fig. 7 shows the state diagram for
the send hardware.

The first action of the SEND instruction sequence is to issue the
S/RECEIVE signal to the processor fill state logic 73 (Fig. 4) and to
supply on the M Bus (Fig. 4) the receiver processor number to the
receive register 71. Simultaneously, the pointer of the outqueue
pointer 79 is set in accordance with the high order bit of the M Bus
to connect the outqueue 67 to either the X bus or the Y bus.

The store receive (S/RECEIVE) signal causes the processor fill state
logic 73 (which is initially in the EMPTY state as shown in Fig. 7) to
advance to the FILL state as shown in Fig. 7. This state transition
causes the receive register 71 (Fig. 4) to be loaded with the receiver
processor number.

At this point the outqueue section 67 is ready for the data packet to
be loaded into the outqueue buffer 69. Now, up to fifteen words are
read from memory and are stored, by means of the M bus (Fig. 4), into
the outqueue buffer 69. The store outqueue signal causes each word on
the M bus to be written into the outqueue buffer 69 in a location
specified by the outqueue counter 77. Each store outqueue signal also
causes the outqueue counter 77 to be advanced by one.

As the words are being read from memory, the address word is being
incremented by one, and the count of the words to be sent is being
decremented by one. If the count reaches zero before fifteen words are
read from memory, the remainder of the outqueue buffer is filled with
zeros to pad out the data packet.

In addition, as the words are being loaded into the outqueue buffer
69, the microprogram 115 (Fig. 2) is calculating a modulo-two sum of
the data words. After the fifteenth data word has been loaded, this
odd check-sum word is loaded into the sixteenth location of the
outqueue buffer 69.
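
The packet construction just described can be sketched as follows. The
Python below is a sketch under stated assumptions, not the
microprogram: the function names are hypothetical, sixteen-bit words
are assumed, and "odd check-sum" is interpreted here as the bitwise
complement of the modulo-two (exclusive-OR) sum, so that every bit
position has odd parity across the sixteen words of a packet.

```python
DATA_WORDS = 15      # data words per packet
WORD_MASK = 0xFFFF   # assumed sixteen-bit words


def make_packets(block):
    """Split a block into sixteen-word packets with zero padding."""
    packets = []
    for i in range(0, len(block), DATA_WORDS):
        data = [w & WORD_MASK for w in block[i:i + DATA_WORDS]]
        data += [0] * (DATA_WORDS - len(data))    # pad out the last packet
        xor = 0
        for w in data:
            xor ^= w                              # modulo-two sum
        packets.append(data + [xor ^ WORD_MASK])  # assumed odd-parity form
    return packets


def check_packet(packet):
    """True if every bit column of the packet has odd parity."""
    xor = 0
    for w in packet:
        xor ^= w
    return xor == WORD_MASK
```

A twenty-word block, for example, yields two packets, the second
carrying five data words, ten zero words and its check-sum word.
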

At this time the outqueue counter 77 has a value of count 15 and this
value, in combination with the store outqueue signal, causes the
processor fill state logic 73 to advance from the FILL state to the
FULL state as shown in Fig. 7.

At this point the microprogram 115 has completed loading of the data
into the outqueue 69. The microprogram now waits for the packet to be
transmitted by testing for occurrence of the ready (RDY) signal shown
in Fig. 7.

While waiting for the packet to be transmitted, the microprogram 115
increments a timer; and if the timer runs out or expires before the
ready (RDY) signal is asserted, the microprogram issues the clear
outqueue (CLOQ) signal to the processor fill state logic 73 (see Fig.
4). This causes the processor fill state logic 73 to return to the
EMPTY state as shown in Fig. 7, and the microprogram then terminates
the SEND instruction with the time out indication.
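
The wait for RDY with its timeout can be sketched as a simple loop.
The Python below is an illustration only: wait_for_ready and its
arguments are hypothetical, the timer is modelled as a count of polls,
and the two callables stand in for the RDY test and the CLOQ signal.

```python
def wait_for_ready(is_ready, timeout_ticks, clear_outqueue):
    """Poll RDY once per tick; on expiry issue CLOQ and report a timeout."""
    for _ in range(timeout_ticks):
        if is_ready():
            return "sent"
    clear_outqueue()   # CLOQ returns the fill state logic to EMPTY
    return "timeout"
```

Because the timeout applies per packet, a sender in this model would
call wait_for_ready afresh after loading each packet.
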

In normal operation, the FULL state of the processor fill state logic
73 qualifies the bus empty state logic 75 to advance from the IDLE
state to the SYNC state shown in Fig. 7. Next, the SYNC state
automatically advances to the SEND state, and this state causes the
SEND REQUEST signal to be issued to the bus controller 37. The SEND
REQUEST signal initiates a packet transfer sequence described earlier.

As described earlier, when the sender processor module has been
identified by the bus controller 37 by polling, and when the receiver
processor module has accepted the packet transfer by means of the
RECEIVE ACKNOWLEDGE signal, the data packet is gated from the outqueue
buffer 69 through the outqueue pointer 79 to one of the data buses 57
for loading into the inqueue of the receiver processor module.

As the sixteenth word is gated to the bus, the outqueue counter value
of count 15, in combination with the SEND COMMAND signal and the
SENDER SELECT signal, causes the SEND state of the bus empty state
logic 75 to advance to the DONE state.

The DONE state qualifies the FULL state of the processor fill state
logic 73 (as shown by the dashed line arrow going from the DONE state
to the indicated transition from the FULL state in Fig. 7) to advance
to the WAIT state.

Next, the WAIT state qualifies the DONE state to advance to the IDLE
state as illustrated by the state diagram in Fig. 7.

Finally, the IDLE state qualifies the WAIT state to advance to the
EMPTY state as also indicated in the state diagram of Fig. 7.

The EMPTY state of the processor fill state logic 73 provides the
READY indication to the microprogram 115.

If the packet just transmitted was the last packet in the specified
data block, the SEND instruction is terminated and the successful
block transfer indication is given.

If the packet transmitted is not the last packet in a data block, then
the sequence described above is repeated until all words in the block
have been transmitted, or until a timeout error has occurred.

The SEND instruction is interruptable and resumable; however, the SEND
instruction is only interruptable between packets; and the
interruption of the SEND instruction has no effect on the data
transmitted.

Thus, by means of a single software instruction (the SEND instruction)
a data block of up to 32,767 words is transmittable from a sender
processor module to a receiver processor module, and accuracy of the
transmission is checked by the packet check-sum. Also, the
transmission occurs at a high data transfer rate, because the
buffering provided by the outqueue buffer 69 of the sender processor
module enables the transfer to be made at interprocessor bus speed
independent of the memory speed of the sender processor module. This
allows efficient use of this communication path between a number of
processor modules on a time slicing basis.

As noted above, there is no instruction for receive.

For a processor module to receive data over an interprocessor bus, the
operating system in that processor module must first configure an
entry in the bus receive table (BRT). Each BRT entry contains the
address where the incoming data is stored and the number of words
expected.

While the sender processor module is executing the SEND instruction
and sending data over the bus, the bus receive hardware and the
microprogram 115 in the receiver processor module are storing the data
away according to the appropriate BRT entry (this occurs interleaved
with software program execution).

When the receiver processor module receives the expected number of
words from a given sender, the currently executing program is
interrupted, and that particular bus transfer is completed.

Fig. 5 shows the logic diagram and Fig. 8 shows the state diagram for
the bus receive hardware.

As previously pointed out, there are identical X and Y inqueue
sections 65 in each processor module for the X bus and the Y
bus. Only one of the inqueue sections will therefore be
referred to in the description which follows.

After initial reset of a processor module, or after a previous
receive operation, the RESET state of the processor empty state
logic 101 advances to the READY state. The READY state
qualifies the SYNC state of the bus fill state logic 93 to
advance the logic to the ACKNOWLEDGE state.

In this ACKNOWLEDGE state the inqueue section 65 returns
RECEIVE ACKNOWLEDGE to the bus controller 37 in response to a
SELECT 63 (see Fig. 2) of that processor module 33. This
indicates the readiness of the X inqueue section 65 to receive
the data packet.

In the packet transfer sequence (described in detail above) the
combination of the SELECT of that processor module and the
RECEIVE COMMAND signal qualify the ACKNOWLEDGE state of the bus
fill state logic 93 to advance to the RECEIVE state.

At this state transition the sender register 95 (Fig. 5) is
loaded with the number of the sending processor module.

In the RECEIVE state the data packet is loaded from the data
bus to the inqueue buffer 97 under control of the inqueue
counter 99.

As the sixteenth word of the packet is loaded, it causes the
RECEIVE state to advance to the FULL state (see Fig. 8).

Now the FULL state qualifies the READY state of the processor
empty state logic 101 to advance to the MICROINTERRUPT state as
shown in Fig. 8. The MICROINTERRUPT state presents an INQUEUE
FULL state to the CPU interrupt logic. This INQUEUE FULL signal
causes a microinterrupt to occur at the end of the next
software instruction if the MASK bit corresponding to that bus
is on.

The bus receive microprogram 115 activated by the interrupt
first of all issues a LOCK signal (see Fig. 5) to the processor
empty state logic 101. This causes the MICROINTERRUPT state of
the processor empty state logic 101 to advance to the DUMP
state.

The LOCK signal also selects either the X inqueue or the Y
inqueue; subject, however, to the condition that if both
inqueues are full and enabled, the X queue is selected.

Next, the microprogram 115 issues the K/SEND signal which
causes the sender register 95 contents to be gated to the K bus
(as shown in Fig. 5) to obtain the packet sender's processor
number.

Using this processor number, the microprogram 115 reads the
sender processor's BRT entry to obtain the address and count
words.

If the count word is zero or negative, the packet is discarded;
and in this case, the microprogram 115 issues a RINT signal
which causes the processor empty state logic 101 to advance
from the DUMP state to the RESET state as shown in Fig. 8. In
this event there is no further action. The microinterrupt is
terminated, and software instruction processing is resumed.

If the count is positive, the microprogram 115 reads words from
the inqueue buffer 97 to the K bus by means of the K/INQUEUE
signal as shown in Fig. 5.

With each occurrence of the K/INQUEUE signal, the inqueue
counter 99 is incremented to scan through the inqueue buffer 97.

As each data word is read from the inqueue buffer 97, the count
word is decremented, the memory address word is incremented,
and the data word is stored into memory.

If the count word reaches zero, no more words are stored in
memory, a completion interrupt flag is set, and the sender
processor number is saved in a memory location. In that event
the bus fill state logic 93 stays in the FULL state until
cleared by a software RIR instruction.

Thus, when a data block has been completely received, the count
word will contain a value between minus 14 and zero. After the
completion interrupt occurs, no further transfers to the
processor over the bus which caused the interrupt are permitted
until the inqueue is cleared with an RIR instruction.

As the data words are stored into the memory, a modulo-two sum
of packet data is calculated.

If the check sum is bad, the word count in the BRT entry is set
to minus 256, a completion interrupt flag is set, and the
sender processor number is saved in memory. As above, the bus
fill state logic 93 stays in the FULL state until cleared by an
RIR instruction.

If the count word does not reach zero, and the check sum is
good, the bus receive microprogram 115 issues the RINT signal
to the processor empty state logic as shown in Fig. 5 which
causes the DUMP state of the processor empty state logic 101 to
advance to the RESET state as shown in Fig. 8.

The RESET state of the logic 101 qualifies the bus fill state
logic 93 to advance from the FULL state to the SYNC state as
also shown in Fig. 8.

At this point, the logic has been returned to the state it was
in before the packet was received, thus enabling the receipt of
more packets.

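The handling of one packet by the bus receive microprogram, as
described above, may be pictured with the following Python
sketch. It is a simplified software model and not the
microprogram itself. It assumes fifteen data words followed by
one check-sum word per packet, a good packet being one whose
modulo-two sum over all sixteen words is zero, and a count word
that is decremented for every data word read from the inqueue
(which is one way of arriving at the stated final range of minus
14 to zero). `BrtEntry` and `receive_packet` are hypothetical
names.

```python
from dataclasses import dataclass

DATA_WORDS = 15          # assumption: data words per sixteen-word packet
CHECKSUM_ERROR = -256    # count value stated for a bad check sum


@dataclass
class BrtEntry:
    address: int   # where the next incoming word is stored
    count: int     # words still expected from this sender


def receive_packet(packet, sender, brt, memory):
    """Model one microinterrupt; returns the outcome as a string.

    'discarded' and 'continue' correspond to issuing RINT; 'complete'
    and 'error' correspond to setting the completion interrupt flag.
    """
    entry = brt[sender]
    if entry.count <= 0:
        return "discarded"
    check = 0
    for word in packet[:DATA_WORDS]:
        check ^= word                      # running modulo-two sum
        if entry.count > 0:                # store only while words are expected
            memory[entry.address] = word
            entry.address += 1
        entry.count -= 1
    if (check ^ packet[DATA_WORDS]) != 0:  # fold in the check-sum word
        entry.count = CHECKSUM_ERROR
        return "error"
    return "complete" if entry.count <= 0 else "continue"
```

With a BRT entry expecting 20 words, a first packet leaves the
count at 5 and returns "continue"; a second packet stores the
last five words, leaves the count at minus 10 and returns
"complete".
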
These packets may be from the same sender, completing that data
block, or the packets may be from some other sender.

This completes the action of the bus receive microprogram 115
and the microprocessor 113 resumes processing of software
instructions.

When a bus receive completion interrupt has occurred, the
software interrupt handler obtains the sender processor number
from the memory location where that number was saved, and the
software interrupt handler can then detect if a check sum error
occurred by examining that sender processor's bus receive table
count word.

In the case of a transmission error, the count word has been
set to minus 256. Otherwise, the count word will contain a
value between minus fourteen and zero.

As mentioned above, it is thus the responsibility of the bus
receive completion software interrupt handler to issue the RINT
signal (by means of an RIR software instruction) to reenable
the inqueue 65.

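The test applied by the completion interrupt handler may be
sketched in Python as follows. The function names, the
dictionary standing in for the bus receive table count words,
and the callback standing in for the RIR instruction are all
hypothetical; only the values minus 256 and the range minus
fourteen to zero come from the description.

```python
CHECKSUM_ERROR = -256    # count word after a transmission error
LOWEST_GOOD_COUNT = -14  # lowest count word after a good block


def completion_status(count_word):
    """Classify a BRT count word examined after a completion interrupt."""
    if count_word == CHECKSUM_ERROR:
        return "checksum error"
    if LOWEST_GOOD_COUNT <= count_word <= 0:
        return "block received"
    return "unexpected"


def handle_completion(saved_sender, brt_counts, reenable_inqueue):
    """Look up the saved sender's count word, then reenable the inqueue."""
    status = completion_status(brt_counts[saved_sender])
    reenable_inqueue()   # stands in for the RIR software instruction
    return saved_sender, status
```
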
In summary on the receive operation, just as the sending of a
data block by a sender processor module is viewed by software
as a single event, the receiving of data by a receiver
processor does not cause a software interrupt of the receiver
processor module until the entire data block has been received
or until an error has occurred. Also, the inqueues 65 serve as
buffers to allow the transmission of data to occur at bus
transmission rates while allowing the storing of data into
memory and the checking of the data to occur at memory speed.
This ability to use the high transmission rate on the bus
insures adequate bus bandwidth to service a number of processor
modules on a time slicing basis. Finally, the provision of a
check sum word in each data packet provides a means in the
receiver processor module for checking the accuracy of the data
received over the multiprocessor communication path.

Information sent over the interprocessor bus is sent under the
control of the operating system and is sent from one process in
one processor module 33 to another process in another processor
module 33. A process (as described in detail above in the
description of the Multiprocessor System) is a fundamental
entity of control in the software system; and a number of
processes coexist in a processor module 33. The information
sent over the interprocessor bus between processes in different
processor modules consists of two types of elements, control
packets and data.

The control packets are used to inform the receiving processor
module 33 about message initiations, cancellations, and data
transfers.

In this regard it should be noted that, while the
interprocessor buses 35 interconnect the processor modules 33,
a process within a particular processor module 33 communicates
with another process or with other processes within another
processor module 33 through a method of multiplexing the
interprocessor bus 35. The bus traffic between two processor
modules 33 will therefore contain pieces of interprocess
communications that are in various states of completion. Many
interprocess communications are therefore being interleaved on
an apparently simultaneous basis.

The hardware is time slicing the use of the interprocessor bus
35 on a packet level, and multiple processes are
intercommunicating both within the processor modules 33 and to
the extent necessary over the interprocessor buses 35 in
message transactions which occur interleaved with each other.
Under no circumstances is an interprocessor bus 35 allocated to
any specific process-to-process communication.

Data information is sent over the interprocessor bus in one or
more packets and is always preceded by a control packet and is
always followed by a trailer packet.

The control packet preceding the data packets is needed because
a bus is never dedicated to a specific message, and the control
packet is therefore needed to correctly identify the message
and to indicate how much data is to be received in the message.

This information transfer (control packet, data information,
trailer packet) is made as an indivisible unit once it is
started. The sender processor module sends the data block as an
individual transmission (consisting of some number of data
packets) and sends the trailer packet as an individual
transmission; and only then is the sender processor module able
to send information relating to another message.

The trailer packet serves two purposes.

First of all, if there is an error during a data transmission
(and therefore the rest of the data block must be discarded),
the trailer packet indicates the end of the block.

Secondly, if the sender attempts to send too much data (and
again the block must be discarded), the trailer packet provides
a means for recognizing that the data has been transmitted and
the data transmission has completed.

The information transmitted is either duplicated over different
paths (so that it is insured that the information will get to
the receiver) or a receiver acknowledgment is required (so that
the information is repeated if necessary). Any single bus error
therefore cannot cause information to be lost, and any single
bus error will not be seen by the two processes involved.

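The role of the control packet and the trailer packet can be
illustrated, under stated assumptions, by the following Python
sketch. The tuple layouts, the message identifiers and the
function names are hypothetical, since the description does not
give the packet formats at this point. The sketch shows only the
framing idea: the control packet announces how much data to
expect, and the trailer marks the end of the unit even when the
data block has had to be discarded.

```python
def frame_message(message_id, words):
    """Build the control, data and trailer transmissions for one message."""
    return [
        ("control", message_id, len(words)),   # identifies message and length
        ("data", message_id, list(words)),     # the data block itself
        ("trailer", message_id),               # marks the end of the unit
    ]


def receive_messages(transmissions):
    """Deliver data blocks whose length matches the announced length."""
    announced = {}
    delivered = {}
    for item in transmissions:
        kind, message_id = item[0], item[1]
        if kind == "control":
            announced[message_id] = item[2]
        elif kind == "data":
            if announced.get(message_id) == len(item[2]):
                delivered[message_id] = item[2]
            # otherwise the block is discarded (error or too much data)
        elif kind == "trailer":
            announced.pop(message_id, None)    # unit finished either way
    return delivered
```

In this model a sender that transmits more data than it
announced loses that block, but the trailer still closes the
unit, so a following message is received normally.
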
The bus receive software interlocks with the bus receive
hardware (the inqueue section 65 shown in Fig. 2) by
controlling the transfer of information from the inqueue into
the memory 107.

This allows such operations as changing the bus receive table
information to be done without race conditions
(synchronization problems).

Once the bus receive table information has been updated, the
interlock is removed by clearing the previous completion
interrupt and by reenabling the bus receive microinterrupts by
setting on the bus mask bit in the mask register.

This does two things. It allows the inqueue hardware to accept
a packet into the inqueue, and it also enables the bus receive
microprogram to transfer that information from the inqueue into
memory.

The hardware/software system is so constructed that no
information is lost on a system power failure (such as a
complete failure of AC power from the mains) or on a line
transient that causes a momentary power failure for part of the
system.

This hardware/software system coaction includes a power warn
signal (see line 337 of Fig. 3) supplied to the inqueue section
65 (see Fig. 2) so that, at most, one further packet of
information can be loaded into the inqueue after the receipt of
the power warn signal.

The software action in this event includes a SEND instruction
to force the inqueues to be full. The net effect is to insure
that no transmissions are completed after the processor module
33 has received its power warn signal, so that the state of
every transfer is known when logic power is removed.

The interprocessor buses 35 are used by the operating system to
ascertain that other processor modules in the system are
operating. Every N seconds, each of the processor modules 33
sends a control packet to each processor module 33 in the
system on each interprocessor bus 35. Every two N seconds, each
processor module 33 must have received such a packet from each
processor module 33 in the system. A processor module that does
not respond is considered down. If a processor module does not
get its own message, then that processor module 33 knows that
something is wrong with it, and it will no longer take over I/O
device controllers 41.

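The checking rule just described may be sketched in Python as
follows. This is a minimal model under stated assumptions: the
class and method names are hypothetical, time is passed in
explicitly rather than read from a clock, and the two
interprocessor buses are not distinguished.

```python
class AliveMonitor:
    """Tracks the last control packet seen from each processor module."""

    def __init__(self, own_number, module_numbers, n_seconds, start=0.0):
        self.own_number = own_number
        self.window = 2 * n_seconds          # a packet is due every two N seconds
        self.last_seen = {m: start for m in module_numbers}

    def packet_received(self, sender, now):
        self.last_seen[sender] = now

    def modules_down(self, now):
        """Modules from which no packet arrived within the last two N seconds."""
        return sorted(m for m, seen in self.last_seen.items()
                      if now - seen > self.window)

    def may_take_over_controllers(self, now):
        """A module that misses its own packet must not take over controllers."""
        return self.own_number not in self.modules_down(now)
```

For example, with N equal to one second, a module that last
heard from itself and from module 1 at time 1.0 and never heard
from module 2 reports only module 2 as down at time 2.5.
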
Fig. 42 diagrammatically illustrates how a particular
application program can run continuously even though various
parts of the multiprocessor system can become inoperative.

Each of the separate views shown in Fig. 42 illustrates a
multiprocessor system configuration which consists of two
processor modules 33 connected by dual interprocessor buses 35
(indicated as an X bus and a Y bus), a device controller 41
which controls a number of keyboard terminals, and another
device controller 41 which controls a disc.

The individual views of Fig. 42 indicate various parts of the
multiprocessor system rendered unserviceable and then
reintroduced into the multiprocessor system in a serviceable
state.

The sequence starts with the upper left hand view and then
proceeds in the order indicated by the broad line arrows
between the views. The sequence thus goes from the condition
indicated as (1) Initial State to (2) CPU 0 Down to (3) CPU 0
Restored to (4) CPU 1 Down to (5) CPU 1 Restored (as indicated
by the legends above each individual view).

In the initial state of the multiprocessor system shown in the
view entitled "Initial State" at the upper left hand corner of
Fig. 42, one copy (PA) of the application program is active.
This copy makes a system call to create the copy PB as a backup
to which the application program PA then passes information.
All of the I/O is taking place by way of the processor module 0.
In this initial state either interprocessor bus 35 may fail or
be brought down (as indicated by the bars on the X bus) and can
be then reintroduced into the multiprocessor system without
producing any effect on the application program PA.

In the next view (the view entitled "CPU 0 Down") the processor
module 0 is rendered unserviceable. The multiprocessor system
informs the application program PA that this has happened, and
the application program PA no longer tries to communicate with
the program PB. All of the I/O is switched by the
multiprocessor system to take place by way of the processor
module 1, and the application program continues to service the
terminals without interruption over the I/O bus 39 connecting
the processor module 1 with the device controllers 41 (as
indicated by the solid line arrow on the right hand I/O bus
39).

In the next state of operation of the multiprocessor system, as
illustrated in the center top view of Figure 42 and entitled
"CPU 0 Restored", the processor module 0 is now brought back
into service by way of a console command. The processor module
0 is reloaded with the multiprocessor system from the disc by
way of the processor module 1. The application program PA is
informed that processor module 0 is now serviceable and the
application program PA tells the multiprocessor system to
create another copy of the application program in the processor
module 0. This other copy is designated as PC. The terminals
continue in operation without interruption.

Next, the processor module 1 is rendered inoperative, as
illustrated in the view entitled "CPU 1 Down". The application
program PC is informed of this fact by the multiprocessor
system and the application program PC takes over the
application. The multiprocessor system automatically performs
all of the I/O by way of the processor module 0. The terminals
continue without interruption.

Finally, as indicated by the top right hand view of Figure 42
entitled "CPU 1 Restored", the processor module 1 is rendered
operable by way of a console command and is reloaded with the
multiprocessor system from the disc by way of the processor
module 0. The application program PC is informed that the
processor module is now available, and it tells the
multiprocessor system to create another copy of itself
(application program PD) in the processor module 1. All
elements of the multiprocessor system are now operable.

During the whole of this time both interprocessor buses and
both processor modules had been rendered unserviceable and
reintroduced into the system, but the application program and
the terminals continued without a break.

It is an important feature of the multi-
processor system that not only can the application
program continue while something has failed, but also
that the failed component can be repaired and/or
replaced while the application program continues. This
is true not only for the processor modules and inter-
processor buses but also for all elements of the
multiprocessor system, such as power supplies, fans
in the rack, etc. The multiprocessor system 31 thus is
a true continuously operating system.

THE INPUT/OUTPUT SYSTEM AND DUAL PORT DEVICE CONTROLLER:

The multiprocessor system 31 shown in Fig. 1 includes an
input/output (I/O) system and dual port device controllers 41
as noted generally above.

The general purpose of the I/O system is to allow transfer of
data between a processor module 33 and peripheral devices.

It is an important feature of the present invention that the
data transfer can be accomplished over redundant paths to
insure fail soft operations so that a failure of a processor
module 33 or a failure of a part of a device controller 41 will
not inhibit transfer of data to and from a particular
peripheral device.

Each device controller 41 has dual ports 43 and related
structure which, in association with two related I/O buses 39,
permit the redundant access to a peripheral device as will be
described in more detail below.

The I/O system of the present invention also has some
particularly significant features in terms of performance. For
example, one of the performance features of the I/O system of
the present invention is the speed (bandwidth) at which the
input/output bus structure operates. The device controllers 41
collect data from peripheral devices which transmit data at
relatively slow rates and transmit the collected data to the
processor modules in a burst multiplex mode at or near memory
speed of the processor modules 33.

As illustrated in Fig. 1, each processor module 33 is attached
to and handles a plurality of individual device controllers 41;
and this fact makes it possible for each device controller 41
to be connected (through dual ports 43) to more than one
processor module 33 in a single multiprocessor system.

With reference now to Fig. 12 of the drawings, each processor
module 33 includes, in addition to the interprocessor control
55 noted above, a central processor unit (CPU) part 105, a
memory part 107 and an input/output (I/O) channel part 109.

As illustrated in Fig. 12 and also in Fig. 1, each device
controller 41 controls one or more devices through connecting
lines 111 connected in a star pattern, i.e. each device
independently connected to the device controller.

In Fig. 12 a disc drive 45 is connected to one device
controller 41 and a tape drive 49 is connected to another
device controller 41.

With continued reference to Fig. 12, each CPU part 105 includes
a microprocessor 113. A microprogram 115 is associated with
each microprocessor 113. A part of the microprogram 115 is
executed by the microprocessor 113 in performing I/O
instructions for the I/O system. The I/O instructions are
indicated in Fig. 12 as EIO (execute I/O), IIO (interrogate
I/O), HIIO (interrogate high priority I/O); and these
instructions are illustrated and described in greater detail
below with reference to Figs. 15, 16 and 17.

The microprocessor 113 has access to the I/O bus 39 by way of
the I/O channel 109 by a collection of paths 117 as illustrated
in Fig. 12.

With continued reference to Fig. 12, the I/O channel 109
includes a microprocessor 119, and a microprogram 121 is
associated with the microprocessor 119.

The microprogram 121 has a single function in the
multiprocessor system, and that function is to perform the
reconnect and data transfer sequence illustrated in Fig. 16
(and described in more detail below).

The I/O channel 109 of a processor module 33 also includes (as
shown in Fig. 12) data path logic 123.

As best illustrated in Fig. 13, the data path logic 123
includes a channel memory data register 125, an input/output
data register 127, a channel memory address register 129, a
character count register 131, an active device address register
133, a priority resolving register 135 and parity generation
and check logic 137.

The path 117 shown in Fig. 12 includes two buses indicated as
the M bus and the K bus in Fig. 13.

The M bus is an outbus from the microprocessor 113 and
transmits data into the input/output data register 127.

The K bus is an inbus which transmits data from the data path
logic 123 into the microprocessor 113.

With reference to Fig. 12, a path 139 connects the data path
logic 123 and the memory subsystem 107.

This path 139 is illustrated in Fig. 12 as including both a
hardware path 139A and two logical paths 139B and 139C in the
memory subsystem 107 of a processor module 33.

Logical paths 139B and 139C will be described in greater detail
below in connection with the description of Fig. 16.

The hardware path 139A includes three branches as illustrated
in Fig. 13.

A first branch 139A-1 transmits from memory into the channel
memory data register 125.

A second path 139A-2 transmits from the channel memory address
register 129 to memory.

And a third path 139A-3 transmits from the input/output data
register 127 to memory.

With reference to Fig. 12, the input/output channel of a
processor module 33 includes a control logic section 141.

This control logic section 141 in turn includes a T bus machine
143 (see Fig. 13) and request lines RECONNECT IN (RCI) 145, LOW
PRIORITY INTERRUPT REQUEST (LIRQ) 147, HIGH PRIORITY INTERRUPT
REQUEST (HIRQ) 149 and RANK 151 (see Fig. 14).

The I/O bus 39 shown in Fig. 14 and Fig. 12 also includes a
group of channel function lines 153, 155, 157 and 159. See also
Fig. 13. The TAG bus (T bus) 153 consists of four lines which
serve as function lines, and there are three lines SERVICE OUT
(SVO) 155, SERVICE IN (SVI) 157, and STOP IN (STI) 159 which
serve as handshake lines as indicated by the legends in Fig. 14.

As shown in Fig. 14 and Fig. 12, the I/O bus 39 also includes a
group of data lines 161, 163, 165, 167 and 169.

The DATA BUS lines 161 and PARITY 163 are bidirectional and
serve as data lines and, as indicated in Fig. 14, there are
sixteen DATA BUS lines 161 and one PARITY line 163 in this
group.

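The lines END OF TRANSFER (EOT) 165, PAD OUT (PADO) 167 and PAD
IN (PADI) 169 serve as data status lines and indicate special
conditions which may occur on the data lines 161 and 163 from
time-to-time.

Finally, the I/O bus 39 includes a reset line (IORST) 171 as
also shown in Fig. 14 and in Fig. 12.

Each T bus command illustrated in Fig. 18 requires a specific
format on the data bus 161 while the T bus command is valid.
This specific data bus format is shown for the T bus functions
Load Address and Command (LAC) and Read Device Status (RDST) in
Fig. 18 for the preferred embodiment.

In the case of the T bus function LAC, the data or field
transmitted on lines 0 to 5 of the data bus 161 specify the
operation to be performed; the field transmitted on lines 8 to
12 of the data bus specify the device controller 41 (or more
precisely the port 43 of the device controller which is
attached to the data bus 161) to which the command is
addressed; and the field transmitted on data bus lines 13 to 15
specify which peripheral device attached to the device
controller 41 is to be operated by that device controller in
response to this command.

In the case of the T bus function RDST, data bus bits 0, 1, 2
and 3 indicate ownership error, interrupt pending, device busy,
and parity error respectively. Bits 4 to 15 return device
dependent status.
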
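The data bus format for the Load Address and Command function
(operation on lines 0 to 5, device controller on lines 8 to 12,
peripheral device on lines 13 to 15) may be illustrated by the
following Python sketch. It assumes that line 0 is the most
significant of the sixteen data bus lines and that lines 6 and
7 are unused; neither point is stated in this part of the
description, and the function names are hypothetical.

```python
def pack_lac(operation, controller, device):
    """Assemble a 16-bit LAC data word (line 0 taken as most significant)."""
    if not (0 <= operation < 64 and 0 <= controller < 32 and 0 <= device < 8):
        raise ValueError("field out of range")
    # lines 0-5 -> top six bits; lines 8-12 -> five bits; lines 13-15 -> low three
    return (operation << 10) | (controller << 3) | device


def unpack_lac(word):
    """Split a 16-bit LAC data word into (operation, controller, device)."""
    return (word >> 10) & 0x3F, (word >> 3) & 0x1F, word & 0x7
```
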
The functions on the T bus are transmitted in three sequences,
shown in Figs. 15, 16 and 17 and described in detail below.

Each T bus function is asserted by the channel and a handshake
sequence is performed between the channel 109 and the device
controller 41 using the handshake lines 155, 157 and 159 to
acknowledge receipt of the T bus function. Control of the T bus
and handshake is the function of the T bus machine 143 in
Fig. 13.

Fig. 28 is a timing diagram showing the operation of the
handshake between the I/O channel 109 and the ports 43.

As illustrated in Fig. 28, line 155 transmits the service out
signal (SVO) and line 157 transmits the service in signal
(SVI).

The channel clock cycle is shown in vertical orientation with
the SVO and SVI signals.

As illustrated in Fig. 28, the service in (SVI) signal is not
synchronized with the channel clock and may be asserted at any
time by the device controller in response to a service out
signal from the I/O channel 109.

Before asserting service out (SVO), the channel 109 asserts the
T bus function and, if required, the data bus.

The channel then asserts a service out signal as indicated by
the vertical rise 279 in Fig. 28; and SVO remains true until
the device controller responds with service in (SVI) (281),
acknowledging the channel command; SVI remains true until the
channel drops SVO.

When the device controller 41 asserts the service in (SVI)
signal, the channel 109 removes the service out (SVO) signal
(as shown by the vertical drop 283 in Fig. 28) in a time period
typically between one and two clock cycles; and in response,
the device controller drops service in (SVI) as shown by the
vertical drop 285 in Fig. 28.

When the device controller drops the service in (SVI) signal,
the channel 109 is free to reassert a service out signal (SVO)
for the next transfer; however, the channel will not reassert
SVO until SVI has been dropped.

The arrows 281A, 283A and 285A in Fig. 28 indicate the
responses to the actions 279, 281, 283 respectively.

The handshake is completed at the trailing edge of the vertical
drop 285 as shown in Fig. 28.

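The interlocked sequence of Fig. 28 may be pictured with the
following small Python simulation. It is a sketch only: the two
lines are modeled as Boolean values, the channel and the device
controller take turns acting, and the function names and phase
labels are hypothetical. Clock timing is not modeled.

```python
def controller_step(lines, trace):
    """Device controller side: SVI follows SVO."""
    if lines["SVO"] and not lines["SVI"]:
        lines["SVI"] = True            # rise 281: acknowledge the command
        trace.append("SVI up")
    elif not lines["SVO"] and lines["SVI"]:
        lines["SVI"] = False           # drop 285: handshake completes
        trace.append("SVI down")


def two_line_handshake(max_steps=10):
    """Run one channel-side handshake; returns (final phase, event trace)."""
    lines = {"SVO": False, "SVI": False}
    trace = []
    phase = "assert"
    for _ in range(max_steps):
        if phase == "assert" and not lines["SVI"]:
            lines["SVO"] = True        # rise 279: only while SVI is down
            trace.append("SVO up")
            phase = "await SVI"
        elif phase == "await SVI" and lines["SVI"]:
            lines["SVO"] = False       # drop 283
            trace.append("SVO down")
            phase = "await drop"
        elif phase == "await drop" and not lines["SVI"]:
            phase = "complete"
            break
        controller_step(lines, trace)
    return phase, trace
```

Each event in the trace is a response to the one before it,
which is the interlock indicated by the arrows in Fig. 28.
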
On an output transfer, the interface data register 213 of the
controller accepts the data at the leading edge of service out
(vertical rise 279) and transfers the data to the control part
187 of the device controller at the trailing edge of the
service out (the vertical drop 283).

On an input transfer the channel 109 accepts data from the
device controller at the trailing edge of service out (the
vertical drop 283).

Thus, a two line handshake is used to interlock transfer of
information between the channel 109 and its device controller
41, since they act asynchronously.

This is the general handshake condition, indicated as handshake
2L in Figs. 15, 16 and 17.

In addition, two special handshake considerations occur, when
appropriate.

First, channel commands used to select a device controller are
not handshaken by SVI, since no single device controller is
selected during this time.

These commands include (as shown in Fig. 18):
SEL - Select;
LAC - Load Address & Command;
HPOL - Hi Priority Interrupt Poll;
LPOL - Lo Priority Interrupt Poll; and
RPOL - Reconnect Interrupt Poll.

Also, commands used to terminate a sequence are not handshaken
by SVI since they cause a selected device controller to
deselect itself.

These commands include (as also shown in Fig. 18):
DSEL - De-Select;
ABTI - Abort Instruction (I/O); and
ABTD - Abort Data.

For all of the commands noted above which are not handshaken,
the channel asserts SVO (155) for a given period of time (e.g.,
two clock cycles) and then the channel removes SVO. This type
of handshake is referred to as Handshake 1L in Figs. 15, 16 and
17.

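The division of T bus commands between the two handshake types
may be summarized in the following Python sketch. The command
mnemonics are those listed above; the table and function names
are hypothetical, and treating every other command as using the
general two line handshake is an assumption drawn from the
description of handshake 2L as the general condition.

```python
# Commands that are not handshaken by SVI (Handshake 1L)
NOT_HANDSHAKEN = {
    "SEL": "Select",
    "LAC": "Load Address & Command",
    "HPOL": "Hi Priority Interrupt Poll",
    "LPOL": "Lo Priority Interrupt Poll",
    "RPOL": "Reconnect Interrupt Poll",
    "DSEL": "De-Select",
    "ABTI": "Abort Instruction (I/O)",
    "ABTD": "Abort Data",
}


def handshake_type(command):
    """Return '1L' for commands not handshaken by SVI, otherwise '2L'."""
    return "1L" if command in NOT_HANDSHAKEN else "2L"
```
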
Second, data transfer is handshaken normally except that when a
device controller wishes to signal that it does not require
further service, it returns stop-in (STI) instead of SVI. When
SVO is next dropped by the channel, the port deselects itself.
STI otherwise handshakes in the same manner as SVI.

As a further condition on all handshakes, when the channel
prepares to assert SVO, it initiates a timer (part of T bus
machine 143 in Fig. 13) which times out and posts an error if
the next handshake cycle is not initiated and completed within
the period of time set by the timer. If the timer times out, an
error is posted at the appropriate point in the sequence, and
either ABTI (EIO, IIO or HIIO sequence) or ABTD (reconnect
sequence) is sent to the device controller 41 (see discussions
of Figs. 15, 16 and 17).

Fig. 29 shows the logic for the handshake shown in Fig. 28. The
logic shown in Fig. 29 is part of the T bus machine 143 shown
in Fig. 13. The logic shown in Fig. 29 is the logic which is
effective for the general handshake condition noted above.

The logic shown in Fig. 29 includes a service out flip-flop 287
and a service in synchronization flip-flop 289. As illustrated
by the dividing lines and legends in Fig. 29, the flip-flops
287 and 289 are physically located within the channel 109.

The device controller 41 includes combinational logic 291 and a
transmitter 293 which transmits a service in signal (SVI) back
to the D input of the flip-flop 289.

The functioning of the logic shown in Fig. 29 is as follows.

The channel 109 asserts service out by turning on the J input
of the flip-flop 287; and when the next clock cycle starts, the
service out signal is transmitted by a transmitter 295 to the
device controller.

When the combinational logic 291 in the device controller is
ready it enables the transmitter 293 to return the service in
signal (SVI) to the flip-flop 289. This completes the
handshake.

Turning now to the dual port device controller, as illustrated in
Fig. 19, each of the dual ports 43 in a device controller 41 is
connected by a physical connection 179 to interface common logic 181
(shown in more detail in Fig. 21) and each of the ports 43 is also
associated through a logical connection 183 to the interface common
logic 181 as determined by an ownership latch 185.

As shown by the connecting line 180 in Fig. 19, the interface common
logic 181 is associated with the control part 187 of the device
controller 41. The control part 187 of the device controller includes
a buffer 189.

The dual ports 43 shown in block diagram form in Fig. 19 (and in more
detail in Fig. 20) are important parts of the multiprocessor system
of the present invention because the dual ports provide the fail-soft
capability for the I/O system.

The ports 43 and related system components are structured in such a
way that the two ports 43 of one device controller 41 are logically
and physically independent of each other. As a result, no component
part of one port 43 is also a component of the other port 43 of a
particular device controller 41; and no single component failure
(such as an integrated circuit failure) in one port can affect the
operation of the other port.
Each port 43 functions to interface (as indicated by the legend in
Fig. 19) a processor module 33 with a device controller, and
ultimately with a particular device, through the device controller
41. The port 43 is the entity that communicates with the processor
module and communicates with the control part 187 of the device
controller (conditional on the state of the ownership latch 185).

That is, the port itself makes the connection to a processor module
(dependent upon instructions received from the I/O channel 109 as
discussed in more detail below) by setting its select bit 173.

Each of the individual ports 43 in a particular device controller 41
can be connected independently to a processor module 33 and at the
same time as the other port in that device controller is connected
to a different module. However, the ownership latch 185 establishes
the logical connection between the control part of the device
controller and one of the dual ports 43 so that only one port has
control of the device controller at any one point in time.
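
The relationship between the per-port select bits and the single
ownership latch can be sketched as follows. This Python model is a
simplified illustration: the class name, the method names, the
returned strings and the initial owner (port 0) are assumptions made
for the example and are not taken from the drawings.

```python
class DualPortController:
    """Toy model of two ports sharing one control part via an ownership latch."""

    def __init__(self):
        self.select_bit = [False, False]  # select bit 173 of each port 43
        self.owner = 0                    # ownership latch 185 (initial value assumed)

    def connect(self, port):
        # Each port may be selected by its own channel independently.
        self.select_bit[port] = True

    def take_ownership(self, port):
        # Setting the latch toward one port logically disconnects the other.
        self.owner = port

    def command(self, port, name):
        # A command reaches the control part only through the owning port.
        if not self.select_bit[port]:
            return "not selected"
        if port != self.owner:
            return "ownership error"
        return "executed " + name
```

Both ports can be selected at once, yet only the port indicated by
the ownership latch has its commands passed on to the control part.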

The decode logic determines what function is being transmitted on
the T bus 153 at any particular time.

The control logic combines T bus functions to perform specific port
functions, for example, set select bit, clear select bit, read
interrupt status.
The functioning of the control logic is illustrated in the logic
equations set out in Fig. 27.

When a connection sequence (to be described later in reference to
Figs. 15, 16 and 17) is transmitted over the I/O bus 39, one of the
ports 43 (and only the one port 43 in a device controller 41 attached
to that I/O bus 39) connects (in a logical sense) to the bus 39 by
setting its select bit 173.

This logical connection is determined by part of the data transmitted
in that connection sequence. When connected, that particular port 43
subsequently responds to channel protocols in passing information
between the channel and the control part of the device controller.
The device address comparator 193 is the component part of the port
43 that determines the port's unique address.

The device address comparator 193 determines the unique address for
a particular port 43 by comparing the device address field on the
data bus 161 during a LAC T bus function, with device address jumpers
associated with a particular port 43. When the address transmitted
by the channel 109 matches the address determined by the jumpers on
a particular port 43, the term ADDCOMP (see Fig. 27) is generated and
the select bit 173 for that port is set (assuming that the other
conditions set out in Fig. 27 allow the select bit to be set). The
port 43 then responds to all T bus operations until the sequence
terminates by clearing the select bit.
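
The address comparison and select bit behavior can be sketched as
follows. This Python fragment is a minimal illustration: the names
are hypothetical, and the flag other_conditions_ok merely stands in
for the remaining terms of the Fig. 27 equations, which are not
reproduced here.

```python
def address_compare(d_bus_device_address, jumper_address):
    """ADDCOMP: true when the address sent by the channel matches the jumpers."""
    return d_bus_device_address == jumper_address


class Port:
    """Toy model of the select bit 173 of one port 43."""

    def __init__(self, jumper_address):
        self.jumper_address = jumper_address  # device address jumpers
        self.select_bit = False

    def on_lac(self, d_bus_device_address, other_conditions_ok=True):
        # other_conditions_ok is a placeholder for the other Fig. 27 terms.
        if address_compare(d_bus_device_address, self.jumper_address) and other_conditions_ok:
            self.select_bit = True
        return self.select_bit

    def end_sequence(self):
        # The sequence terminates by clearing the select bit.
        self.select_bit = False
```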
The abbreviations used in Fig. 27 include the following:
Add Comp - Address Compare (Device Address);
PAROKFF - Parity OK Flip-Flop;
SEL - Select;
OWN - Ownership; and
SELBIT - Select Bit.
The parity check register 177 is related to the parity generator and
check logic 137 of Fig. 13 in that on output the parity generator
logic 137 generates the parity to be checked by the parity checker
177 of the port 43, and this parity must check or the operation will
be aborted by the I/O channel 109 of the processor module 33. On
input, the interface common logic 181 generates parity to be checked
by the channel parity check logic 137 in a similar fashion.

As shown in Fig. 24, the parity check is started before data is
loaded into the register, and the parity check is continued until
after the data has been fully loaded into the register. That is, the
parity on the D bus is checked by the port parity register whenever
the channel asserts SVO with an output T bus function, and the parity
is monitored for the duration of SVO to ensure that the data on the
D bus is stable for the duration of SVO while the port transfers the
data into the data register 213.

This parity check occurs on each transaction in a T bus sequence;
and if a parity error occurred during any transaction in the
sequence, the error is returned as a status bit in response to a
T bus function during a sequence. For example, in an EIO sequence
(Figs. 18 and 15) the P bit return for RDST indicates that the port
determined a parity error during the EIO sequence.
As illustrated in Fig. 18, the parity error bit is bit number 3 on
the D bus in response to a RDST function on the T bus.

If a parity error occurs at some time other than during an EIO
sequence, the parity error is reported during the read interrupt
status (RIST) T bus function in a manner similar to that described
above for the RDST T bus function.

The parity error is cleared at the beginning of an EIO, IIO, HIIO or
reconnect sequence as shown in Fig. 24.

If a parity error is detected during any sequence it is recorded by
the parity check register to be returned on the D bus in response to
a RDST or RIST T bus function.
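
The recording behavior of the parity check register can be sketched
as a latch that is cleared at the start of a sequence and remembers
any error until it is read. The Python fragment below is illustrative
only. The text does not state the parity sense, so odd parity over a
16-bit word is assumed here, and all names are hypothetical.

```python
def parity_ok(word, parity_bit):
    """Check a 16-bit word against its parity bit, ASSUMING odd parity."""
    ones = bin(word & 0xFFFF).count("1") + (parity_bit & 1)
    return ones % 2 == 1


class ParityCheckRegister:
    """Toy model of the error latch in the parity check register 177."""

    def __init__(self):
        self.error = False

    def start_sequence(self):
        # Cleared at the beginning of an EIO, IIO, HIIO or reconnect sequence.
        self.error = False

    def transaction(self, word, parity_bit):
        # Any bad transaction is remembered until the next sequence starts.
        if not parity_ok(word, parity_bit):
            self.error = True

    def read_status(self):
        # Reported to the channel in response to RDST or RIST.
        return self.error
```

A later good transaction does not clear the latch, so one error
anywhere in a sequence remains visible when status is read.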

With continued reference to Fig. 20, the function of the enable latch
175 in the port 43 is to allow the I/O system to recover from a
certain class of errors that would otherwise render inoperative both
of the I/O buses 39 attached to a particular device controller 41.
The enable latch 175 accomplishes this by not allowing the port 43
to place any signals on the I/O bus 39.

The enable latch 175 is cleared by a specific disable command. This
is a load address and command (LAC) T bus function with a specific
operation code transmitted on the D bus 161.
Once the enable latch 175 is cleared, this enable latch cannot be
programmatically reset.

The port 43 includes a status multiplexer 195. The status multiplexer
195 returns the ownership error mentioned above if the device
controller 41 is logically connected to the other port 43 of that
device controller, to indicate that the device controller is owned
by the other port and commands to this port will be ignored.

The port 43 includes an interface transceiver 197 for each input line
(i.e., SVI, STI, Data Bus, Parity, PADI, RCI, LIRQ, HIRQ) of the I/O
bus 39 shown in Fig. 14. The transceivers 197 transmit data from the
port 43 to the I/O channel 109 when the port select bit 173 is set
and the T bus function on the T bus 153 requires that the device
controller 41 return information to the channel. The transceivers
197 pass information from the data bus 161 into the port 43 at all
times.

It is a feature of the present invention that the power on circuit
182 acts in association with the transceivers 197 to control the
behavior of the transceivers as the device controller 41 is powered
up or powered down, in a way which prevents erroneous signals from
being placed on the I/O bus while power is going up or down. This
feature is particularly significant from the standpoint of on line
maintenance.
As shown in Fig. 20, each transceiver 197 comprises a receiver 198
and a transmitter 200.

The transmitter is enabled by an enable line 202.

There are several terms which are on the enable line 202. These
include the select bit 173, a required input function on the T bus,
and a signal from the PON circuit 182.

The signal from the PON circuit, in a particular embodiment of the
present invention, is connected in a "wire or" connection to the
output of the gate which combines the other terms so that the output
of the PON circuit overrides the other terms by pulling down the
enable line 202. This ensures that the transmitter 200 (in one
specific embodiment, an 8T26A or 7438) is placed in a high impedance
state until the PON circuit detects that the power is at a sufficient
level that the integrated circuits will operate correctly. The PON
circuit output stage is designed to take advantage of a property of
the specific transceiver integrated circuit used. On this particular
type of IC, if the driver enable line 202 is held below two diode
drops above ground potential, the transmitter output transistors are
forced into the off state regardless of the level of power applied
to the integrated circuit. This ensures that the driver cannot drive
the bus.
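
At the logic level, the terms on the enable line 202 can be
summarized in a few lines of Python. This is a sketch only: the
function and parameter names are hypothetical, and the electrical
"wire or" override is reduced to a Boolean in which a PON output of
False (power not within limits) always wins.

```python
def transmitter_enabled(select_bit, input_function, pon_ok):
    """Logical state of the enable line 202 for one transmitter 200.

    The gate combines the select bit and the required T bus input
    function; a PON output of False pulls the line down regardless
    of the gate output.
    """
    gate_output = select_bit and input_function
    return gate_output and pon_ok
```

With pon_ok False the result is False for every combination of the
other terms, which corresponds to the transmitter being held in the
high impedance state while power is outside its limits.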
This particular combination of features provides a mode of operation
wherein the output of the integrated circuit is controlled as power
comes up or goes down, whereas normally the output of an integrated
circuit is undefined when power drops below a certain level.

This same circuit is used on the X and Y buses of the interprocessor
bus system to control the transceivers and control signals generated
by the interprocessor control 55. As indicated in Fig. 30, each
central processor unit (CPU) 105 has a PON circuit 182 which is
similar to the PON circuit 182 in the device controller. The PON
circuits therefore control the transmitters for all of the device
controllers 41 and all of the interprocessor controls 55.

Details of the power-on (PON) circuit are shown in Fig. 25 where the
circuit is indicated generally by the reference numeral 182.

The purpose of the PON circuit is to sense two different voltage
levels of the five volt supply.

If power is failing, the circuit senses the point at which power
drops below a certain level at which the logic in the device
controller or CPU is left in an indeterminate state or condition.
At this point the circuit supplies signals to protect the system
against the logic which subsequently goes into an undefinable state.
The second voltage level which the PON circuits will sense is a
value that is perceived when power is coming up. This second level
at which power is sensed will be greater than the first level by
roughly 100 millivolts to provide hysteresis for the system to
eliminate any conditions of oscillation.

The PON circuit stays in a stable condition after it senses one of
the voltage conditions until it senses the other voltage condition,
at which point it changes state. The state in which the PON circuit
is at any particular time determines the voltage level at which the
transition to the other state will be made.

The power on circuit 182 thus presents a signal establishing an
indication that the power is within predetermined, acceptable
operating limits for the device controller 41. If the power is not
within those predetermined, acceptable operating limits, the signal
output of the power-on circuit 182 is used to directly disable the
appropriate bus signals of the device controller 41.

The output of the PON circuit 182 is a binary output. If the output
is a one, the power is within satisfactory limits. If the output of
the PON circuit is a zero, this is an indication that the power is
below the acceptable limit.
The power-on circuit 182 shown in Fig. 25 and to be described in
detail below is used with the device controller 41 and has seven
output driver stages which are used in the application of the
power-on circuit 182 to the device controller 41. However, the same
power-on circuit 182 is also used with the CPU 105 and the bus
controller 37, but in those applications the power-on circuit will
have a lesser number of output driver stages.

As illustrated in Fig. 25, the PON circuit 182 comprises a current
source 184 and a differential amplifier 186.

The differential amplifier 186 has, as one input, a temperature
compensated reference voltage input on a line 188 and has a second
input on a line 190 which is an indication of the voltage that is
to be sensed by the power-on circuit.

The reference voltage on line 188 is established by a zener diode
192.

The differential amplifier 186 comprises a matched pair of
transistors 194 and 196.

The voltage applied on the line 190 is determined by resistors 198,
200 and 202. The resistors 198, 200 and 202 are metal film resistors
which provide a high degree of temperature stability in the PON
circuit.
The outputs on lines 204 and 206 of the differential amplifier 186
are applied to a three transistor array (the transistors 208, 210
and 212), and this three transistor array in turn controls the main
output control transistor 214.

The main output control transistor 214 drives all output drivers
that are attached. For example, in the application of the PON
circuit 182 for the device controller 41 (as illustrated in Fig. 25),
the main output transistor 214 drives output stages 216 through 228.
The output stage 216 is used to clear the logic, the output stages
218, 220 and 222 are used in combination with the interface devices
of one port 43 of the device controller 41, and the output stages
224, 226 and 228 are used in combination with the interface devices
of the other port 43 of the device controller 41.

Finally, the PON circuit 182 includes a hysteresis control 230. The
hysteresis control 230 includes resistors 232, 234 and a transistor
236.

In operation, assuming that operation is started from a power off
state to a power on condition, the power is applied through the
current source 184 to the differential amplifier 186 and to the main
output control transistor 214. At this time the voltage on the line
190 is less than the voltage on the line 188 so the differential
amplifier 186 holds the output of the main output control transistor
214 in the off state.
This, in turn, will force the output stages 216 through 228 on.

This asserts the output of the PON circuit 182 in the zero state,
the state indicating that power is not within acceptable limits.

As voltage rises, the input voltage on line 190 will increase until
it equals the reference voltage on line 188. At this point the
differential amplifier 186 drives the main output control transistor
214, turning it on. This removes the base drive from the output
stages 216 through 228, forcing these output stages off. The output
of the PON circuit 182 is then a one, indicating that the power is
within acceptable limits.

At this point the hysteresis control circuit 230 comes into play.
While power was coming on, the transistor 236 of the hysteresis
control circuit 230 was on. When the transistor 236 is on, the
resistance value of the resistor 202 appears to be less than it is
when the transistor 236 is off.

The point at which the main output control transistor 214 turns on
is the point at which the hysteresis transistor 236 turns off.
Turning off the hysteresis transistor 236 causes a slight voltage
jump in the line 190 which further latches the differential amplifier
186 into the condition where the differential amplifier 186 sustains
the main output transistor 214 in the on state.

The state of the PON circuit will remain stable in this condition
with the main output control transistor 214 on and the output drivers
216 through 228 off until the plus five volts drops below a lower
threshold point, as determined by the voltage applied on the line
190.

As the voltage on the line 190 decreases below the reference voltage
on the line 188 (because the five volt supply is going down in a
power failure condition), the differential amplifier 186 turns off
the main output control transistor 214. This, in turn, turns on the
output driver stages 216 through 228.

Since the hysteresis transistor 236 was off as power dropped, the
voltage applied to the input of the PON circuit 182 must drop
somewhat farther than the point at which the PON circuit 182 sensed
that power was within the acceptable limits during the power-up
phase of operation.

This differential or hysteresis is used to inhibit any noise on the
five volt power supply from causing any oscillation in the circuit
that would erroneously indicate that power is failing.
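
The two-threshold behavior of the PON circuit can be illustrated
with a small behavioral model. The Python sketch below is not a
circuit simulation. The class name is hypothetical and the 4.6 volt
rising threshold is an assumed value chosen for the example; only
the difference of roughly 100 millivolts comes from the description.

```python
class PowerOnSensor:
    """Two-threshold (hysteresis) power monitor; True means power is OK."""

    def __init__(self, rising_threshold=4.6, hysteresis=0.1):
        # Threshold values are illustrative assumptions, not circuit values.
        self.rising_threshold = rising_threshold
        self.falling_threshold = rising_threshold - hysteresis
        self.power_ok = False  # starts in the zero state (power off)

    def sample(self, supply_volts):
        if self.power_ok:
            # Power was good: it must fall below the lower level to trip.
            if supply_volts < self.falling_threshold:
                self.power_ok = False
        else:
            # Power was bad: it must reach the higher level to recover.
            if supply_volts >= self.rising_threshold:
                self.power_ok = True
        return self.power_ok
```

A supply value lying between the two thresholds leaves the output
unchanged, so noise inside that band cannot make the output
oscillate.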
The PON circuit 182 shown in Figure 25 provides very accurate sensing
of the two voltages used by the PON circuit to determine its state
(whether a one or zero output of the PON circuit).

In order to sense these two voltages very accurately the PON circuit
must have the capability of compensating for initial tolerances of
the different components and also the capability to compensate for
changes in temperature during operation. In the PON circuit 182, the
zener diode 192 is the only critical part that must be compensated
for because of its initial tolerance, and this compensation is
provided by selecting the resistor 198.

Temperature compensation is achieved because the zener diode 192 is
an active zener diode and is not a passive zener diode. Effective
temperature compensation is also achieved because the two transistors
in the differential amplifier 186 are a matched pair of transistors
and the resistors 198, 200 and 202 are metal film resistors.

Each port 43 includes a number of lines which are indicated by the
general reference numeral 179 in Fig. 20 and Fig. 19. This group of
lines 179 includes the individual lines 201 (sixteen (16) of which
make up the Input Bus - I Bus), device address lines 203, Output Bus
lines 205 (of which there are sixteen), a take ownership line 207
and general lines 209 which transmit such signals as parity, the
T bus, and other similar lines which are required because of the
particular hardware implementation.
These particular lines 201, 203, 205, 207 and 209 correspond to the
lines with the same numbers in Fig. 21, which is the block diagram
of the interface common logic. However, there are two sets of each
of these lines in Fig. 21 because the interface common logic 181 is
associated with each of the dual ports 43 in a device controller 41.

With reference to Fig. 21, the interface common logic 181 includes
the ownership latch 185 (see also Fig. 19). This ownership latch
determines the logical connection between the interface common logic
181 and a port 43 from which a TAKE OWNERSHIP signal has been
received over the line 207.

As noted above, the TAKE OWNERSHIP signal is derived by the port
hardware from a load address and command (LAC) T bus command (see
Fig. 18) with a particular operation code in the command field on
the D bus. When the port receives the function LAC on the T bus from
the channel, the port logic examines the command field (the top six
bits) on the D bus. Then, if the command field contains a code
specifying a take ownership command, the port hardware issues a
signal to set the ownership latch to connect the port to the
interface common logic and thence to the control part of the device
controller. If the command field specifies a kill command, the port
hardware issues a signal to clear the port's enable latch. This
operation happens only if the device address field on the D bus
matches the port's device address jumpers, and no parity error is
detected during the command. That is, no commands (including the
take ownership, kill, etc.) are executed if a parity error is
detected on the LAC.
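
The gating of the take ownership and kill commands can be sketched
as follows. This Python fragment is illustrative only: the two
operation code values are hypothetical placeholders (the actual
codes are not given in this passage), and the function names and
returned strings are assumptions. Only the position of the command
field, the top six bits of the sixteen-bit D bus word, is taken from
the description.

```python
TAKE_OWNERSHIP = 0b000001  # hypothetical operation code
KILL = 0b000010            # hypothetical operation code


def command_field(d_bus_word):
    """Top six bits of a 16-bit D bus word."""
    return (d_bus_word >> 10) & 0x3F


def decode_lac(command, device_address, jumper_address, parity_error):
    """Return the action the port hardware takes for a LAC T bus function."""
    # Nothing is executed on a parity error or an address mismatch.
    if parity_error or device_address != jumper_address:
        return "ignore"
    if command == TAKE_OWNERSHIP:
        return "set ownership latch"
    if command == KILL:
        return "clear enable latch"
    return "other command"
```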

As a consequence, the I/O channel 109 issuing the Take Ownership
command gains control of the device controller 41, and the other
port 43 is logically disconnected. Take Ownership may also cause a
hard clear of the controller's internal state.

The state of the ownership latch 185 determines which port may pass
information through the multiplexer 211. Once the ownership latch
185 is set in a given direction, it stays in that state until a Take
Ownership command is received by the other port. Assertion of the
I/O reset line (IORST) will also cause ownership to be given to the
other port after the internal state of the device controller has
been cleared.
Control signals are chosen by the state of the ownership register
185 and from the appropriate one of the ports 43 and are transmitted
by the multiplexer 211 to the control part 187 of a device controller
on a set of control lines 215. Data are selected from an appropriate
one of the ports 43 on lines 205 and are loaded into the data
register 213 and presented to the controller on an Output Bus
(O bus) 217.

Some of the control lines 215 (the lines 215A) are used to control
the multiplexer 220 in selecting information from the controller as
transmitted on lines 219, to be returned by the input bus (I bus)
201 to the ports 43 (Figure 20) and then to the channel 109 of a
processor module 33. A line 221 returns the device address from the
appropriate port 43 to the I bus 201 and thence to the I/O channel
109.

The data buffer 189 shown in Figure 19 is illustrated in more detail
in Figure 22.

In accordance with the present invention many of the device
controllers 41 incorporate a multiword buffer for receiving
information at a relatively slow rate from a peripheral device and
then transmitting that information at or near memory speed to the
processor module to maximize channel bandwidth utilization.

In the buffer design itself it is important that the device
controllers 41 be able to cooperate with each other in gaining
access to the channel 109 to avoid error conditions. In order for
the device controllers 41 to cooperate properly, the multiword
buffers 189 are constructed to follow certain guidelines.

These guidelines include the following:

First of all, when a device controller makes a reconnect request for
the channel 109 it must have enough buffer depth left so that all
higher priority device controllers 41 and one lower priority device
controller 41 may be serviced and the reconnect latency of the
reconnect request can occur without exhausting the remaining depth
of the buffer. This is called Buffer Threshold, abbreviated T in
Fig. 23.

Secondly, after the buffer has been serviced, it must wait long
enough to permit all lower priority device controllers 41 to be
serviced before making another reconnect request. This is called
Holdoff. The buffer depth (D in Fig. 23) is the sum of the holdoff
depth plus the threshold depth.

The holdoff and threshold depths are a function of a number of
variables. These include the device rate, the channel rate, the
memory speed, the reconnect time, the number of controllers of
higher priority on that I/O bus, the number of controllers of lower
priority on that I/O bus, and the maximum burst length permissible.
A controller at high priority on an I/O bus has more controllers of
lower priority associated with it on the same I/O bus than another
controller at lower priority on the same I/O bus, and therefore the
higher priority controller requires more holdoff depth than the
lower priority controller. Similarly, a controller at low priority
on an I/O bus requires more threshold depth than a controller at
higher priority. The buffer 189 in a controller is constructed to
take advantage of the fact that as holdoff requirement increases the
threshold requirement decreases, and as the threshold requirement
increases the holdoff requirement decreases. This is accomplished by
making the stress at which a reconnect request is made be variable,
the actual setting depending on the characteristics of the
controllers at higher and lower priority in a particular I/O channel
configuration. The buffer depth is therefore the maximum of the
worst-case threshold depth or worst-case holdoff depth requirement,
rather than the sum of the worst-case threshold depth and worst-case
holdoff depth. This allows the buffer depth to be minimized, and
shortens the time required to fill or empty the buffer.
A number of these parameters are graphically illustrated in Fig. 23.
In Fig. 23 time has been plotted on the horizontal axis versus words
in the buffer on the vertical axis for an output operation.

Starting at point D on the upper left hand part of Fig. 23 (and
assuming a buffer filled to the full buffer depth), data is
transferred to a device at a rate indicated by the line of slope -RD
and this data transfer continues without any reconnect signal being
generated until the buffer depth decreases to the threshold depth as
indicated by the intersection of the line of slope -RD with the
threshold depth line T at point 223.

At this point the reconnect request is made to the channel 109 as
indicated by the legend on the horizontal axis in Fig. 23.

The transfer of data continues from the buffer at the rate indicated
by the line of slope -RD and the request is held off by higher
priority device controllers 41 until point 225 at which point the
request is honored by the channel 109, and the I/O channel begins
its reconnect sequence for this device controller.

At point 227 the first data word has been transmitted by the channel
109 to the device controller buffer 189, and the channel 109 then
transfers data words at a rate indicated by the line of slope RC
into the buffer 189.
At the same time the device controller 41 continues to transfer data
words out of the buffer at the rate -RD so that the overall rate of
input to the buffer 189 is indicated by the line of slope RC - RD
until the buffer is again filled at the point 229. At 229 the buffer
is full, and the device controller disconnects from the channel 109,
and the data transfer continues at the rate indicated by the slope
line -RD.

The notation tr in Fig. 23 indicates the time required for the
polling and selection of this device controller and the transfer of
the first word. This will be discussed again below in relation to
Fig. 16.

The letter B in Fig. 23 indicates the burst time. The burst time is
a dynamic parameter. The length of any particular burst is dependent
upon the device transfer rate, the channel transfer rate, the number
of devices with transfers in progress and the channel reconnect
time. The maximum time permitted for a burst is chosen to minimize
the amount of buffer depth required while accommodating high device
transfer rates and also the number of devices that can transfer
concurrently.

Fig. 22 is a block diagram of a particular embodiment of a buffer
189 constructed in accordance with the present invention to
accomplish the holdoff and threshold requirements illustrated in
Fig. 23.

The buffer 189 shown in Fig. 22 comprises an input buffer 231, a
buffer memory 233, an output buffer 235, an input pointer 237, an
output pointer 239, a multiplexer 241, buffer control logic 243
(described in more detail in Fig. 26), a multiplexer 245 connected
to the buffer control logic 243 and a stress counter 247.

As also illustrated in Fig. 22, two groups of data input lines
(lines 217 and 249) are fed into the input buffer 231.

One group of data input lines includes sixteen device data input
lines 249.

The other group of input lines includes sixteen Output Bus lines
(O bus lines) 217.

One or the other of these two groups of input signals is then fed
from the input buffer 231 to the buffer memory 233 by a group of
lines 251. There are sixteen of the lines 251.

Data is taken from the buffer memory 233 and put into the output
buffer 235 by a group of lines 253. There are sixteen of the lines
253.

The output buffer 235 transmits the data back to the interface
common logic 181 (see Fig. 19 and Fig. 21) on a group of sixteen
lines 219 and to the devices 45, 47 (such as 49, 51, 53 shown in
Fig. 1) on a group of sixteen lines 255 as indicated by the legends
in Fig. 22.

The input and output pointers 237 and 239 function with the
multiplexer 241 as follows.

When data is being transferred from the input buffer 231 to the
buffer memory 233, the input pointer 237 is connected to the buffer
memory 233 through the multiplexer 241 to determine the location
into which the word is written.

When data is being transferred out of the buffer memory 233 into the
output buffer 235, the output pointer 239 is connected to the buffer
memory 233 through the multiplexer 241 to determine the location
from which the word is taken.

The purpose of the buffer control logic 243 illustrated in Fig. 22
and Fig. 26 is to keep track of the stress placed on the buffer 189.
In this regard, the degree of the full or empty condition of the
buffer in combination with the direction of the transfer with respect
to the processor module (whether input or output) determines the
degree of stress. Stress increases as the device accesses the buffer
and decreases as the channel accesses the buffer.

In the implementation shown in Figs. 22 and 26 the stress counter
measures increasing stress from 0-15 on an input, and decreasing
stress from 0-15 on an output. Another implementation (not shown in
the drawings) would add the direction of transfer in the buffer
control logic such that two new lines would access the pointers 237
and 239 and the stress counter would always measure increasing
stress.

With continued reference to Fig. 22, a channel request line 215 (see
also Fig. 21) and a device request line 257 (coming from the control
part 187 of the device controller) are asserted to indicate access
to the buffer 189.

The multiplexer 245 chooses one of these lines as a request to
increase the buffer fullness and chooses the other line as a request
to decrease the buffer fullness based on the direction of the
transfer (whether input or output) with respect to the processor
module.

The line chosen to increase buffer fullness is also used to load
data from the appropriate data lines 249 or 217 (see Fig. 22) into
the input buffer 231 by means of the line 259.

The channel and the device may access the buffer 189 at the same
time, and the buffer control logic 243 services one request at a
time. The buffer control logic 243 chooses one of the lines for
service and holds the other line off until the buffer control logic
243 has serviced the first request, then it services the other
request.

The servicing of a request by the buffer control logic 243 includes
the following.

First of all, it determines the direction of transfer (into or out
of) the buffer memory 233, and it asserts line 261 (connected to the
multiplexer 241) as appropriate to select the input pointer 237 or
the output pointer 239 through the multiplexer 241.
Secondly, on an input request (a transfer into the buffer
memory 233), the buffer control
logic 243 asserts line 263 which does three things.
(A) It writes the word from the input buffer 231
into the buffer memory 233 at the location determined by
the input pointer 237 and the multiplexer 241.
(B) It increments the stress counter 247.
(C) The buffer control logic 243 increments the
input pointer 237.

Thirdly, on an output transfer, the buffer control
logic 243 asserts line 265 which accomplishes the following
three operations.
(A) The buffer control logic 243 writes the word
being read from the buffer memory 233, as determined by the output
pointer 239 and multiplexer 241, into the output buffer 235.
(B) The buffer control logic 243 decrements the
stress counter 247.
(C) The buffer control logic 243 increments the
output pointer 239.
The stress counter 247 determines when the buffer
189 is full (D), or at threshold depth (T) as shown by the
output line legends in Fig. 22.

The output of the stress counter is decoded, and any
one of the decoded values may be used to specify that the buffer
is at threshold depth. In the preferred embodiment, wire jumpers
are used to select one of sixteen possible stress values, and
a reconnect request is made to the channel 109 when the stress
on the buffer 189 reaches that value.
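By way of illustration only, the servicing steps and the threshold test described above may be sketched in software as follows. This is a minimal model, not the hardware of Figs. 22 and 26. The class name StressBuffer, the method names, the default depth of 15 words, the default threshold of 8 and the wrap-around of the pointers are all assumptions made for the sketch.

```python
class StressBuffer:
    """Toy model of buffer 189 with its pointers and stress counter 247."""

    def __init__(self, depth=15, threshold=8):
        self.depth = depth                # assumed capacity in words
        self.memory = [None] * depth      # stands in for buffer memory 233
        self.input_pointer = 0            # stands in for input pointer 237
        self.output_pointer = 0           # stands in for output pointer 239
        self.stress_counter = 0           # stands in for stress counter 247
        self.threshold = threshold        # stands in for the jumper selection

    def write_word(self, word):
        """Transfer into the buffer memory (the role of line 263)."""
        if self.stress_counter == self.depth:
            raise OverflowError("buffer full")
        self.memory[self.input_pointer] = word                        # step (A)
        self.stress_counter += 1                                      # step (B)
        # step (C); wrapping to zero is an assumption of this sketch
        self.input_pointer = (self.input_pointer + 1) % self.depth

    def read_word(self):
        """Transfer out of the buffer memory (the role of line 265)."""
        if self.stress_counter == 0:
            raise IndexError("buffer empty")
        word = self.memory[self.output_pointer]                       # step (A)
        self.stress_counter -= 1                                      # step (B)
        # step (C); wrapping to zero is an assumption of this sketch
        self.output_pointer = (self.output_pointer + 1) % self.depth
        return word

    def at_threshold(self):
        """True when the counter equals the selected threshold value."""
        return self.stress_counter == self.threshold
```

In this sketch a caller would raise a reconnect request whenever at_threshold() becomes true after a word is written or read.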
The control part 187 of the device controller
uses these three signals (which correspond to the legends
in Fig. 23) to make reconnect requests and disconnect
requests on respective lines 145 (see Fig. 14 and Fig. 12)
and 159 (see Fig. 14 and Fig. 12).

The STI (stop in) signal transmitted on line 159 shown
in Fig. 14 and Fig. 12 is related to the buffer depth (D), the
full or empty conditions of the buffer and the direction of transfer;
and the RCI (reconnect in) signal on line 145 of Fig. 14 and Fig. 12
is related to the threshold depth (T) indication from the stress
counter 247 in Fig. 22. Thus, the STI signal is asserted when
the buffer 189 reaches a condition of minimum stress (full on output
and empty on input). The STI signal signals the channel 109 that
the controller 41 wishes to terminate the burst data transfer.
When the buffer passes through its threshold, it asserts the RCI
signal on line 145 to indicate to the channel 109 that the buffer
wishes to transfer a burst of data.

Fig. 26 shows details of the multiplexer 245, the
buffer control logic 243 and the stress counter 247 of the
buffer 189 shown in Fig. 22.

In Fig. 26 the multiplexer 245 is shown as two sets of
gates 245A and 245B, request flip-flops 267A and 267B, a clock
flip-flop 269, request synchronization flip-flops 271A and 271B, a
priority resolving gate 273 and request execution gates 275A and 275B.

The stress counter 247 comprises a counter section
247A and a decoder section 247B as indicated by the legends
in Fig. 26.
As illustrated in Fig. 26, the two sets of
gates 245A and 245B use the channel request signal
(line 215) and the device request signal (line 257)
and the read and write signals to determine which of the
channel or the device is putting data into the buffer
189 and which is taking data out of the buffer 189.

The request flip-flops 267A and 267B store the
requests until the control logic has serviced the request.

The clock flip-flop 269 generates a two phase
clock used by the request synchronization flip-flops 271A
and 271B and the request execution gates 275A and 275B.

The request synchronization flip-flops 271A
and 271B synchronize the request to the clock generation
flip-flop 269 and stabilize the request for execution.

The priority resolving gate 273 picks one
of the requests for execution and causes the other
request to be held off.

The request execution gates 275A and 275B
execute the requests in dependence on the synchronized
request.

Each output signal on the lines 263 and 265
performs the functions described above (incrementing
and decrementing the stress counter, updating the
buffer memory or output buffer, and updating the input
pointer or output pointer).
In addition, each signal clears the appropriate
request flip-flop through the lines 277A and 277B
illustrated in Fig. 26.

As noted above, Figs. 15, 16 and 17 show
the three sequences of operation of the I/O system.

In the operation of the I/O system, the normal
data transfer between a processor module 33 and a
particular device, such as a disc 45, includes an EIO
sequence to initiate the transfer.

The EIO instruction selects the particular
device controller and device and specifies the operation
to be performed.

The device controller 41 initiates the I/O
between the device controller 41 and the particular
device.

The device controller 41 periodically
reconnects to the channel 109 and transfers data
between the device controller 41 and the channel 109.
The periodic reconnection may be for the purpose of
either transferring data from the channel to the device
or for the purpose of transferring data from the device
to the channel.

When the transfer of data is complete the
device controller 41 interrupts the CPU 105, which
responds by issuing an IIO or an HIIO sequence.
The IIO sequence determines the identity of
the interrupting device and conditions under which the
transfer completed.

The HIIO sequence is similar to the IIO
sequence but is issued in response to a high priority
I/O interrupt.

The "Execute I/O" CPU instruction (EIO) is
defined by the T bus state changes shown in Fig. 15.

The first state shown in Fig. 15 (the state
farthest to the left) is the no-operation (NOP) or
idle state. The other states are the same as those
listed in Fig. 18 by the corresponding mnemonics: load
address and command (LAC), load parameter (LPRM), read
device status (RDST), deselect (DSEL) and abort
instruction (ABTI).

As in the state changes shown in Figs. 6, 7
and 8, the solid line arrows indicate a state change,
and a dashed line arrow indicates a condition which must
occur before a state change can occur.

The EIO instruction and execution shown in
Fig. 15 is directly under control of the microprocessor
113 (see Fig. 12) of the CPU 105.
This CPU initiation is shown as transmitted
to the state machine in Fig. 15 by the line 117; the
initiation signal is accepted only when the T bus is
in the idle state.
Once the CPU initiation signal is applied,
the T bus goes from the NOP (idle) state to the LAC
state.

In the LAC state or function a word is taken
from the top of the register stack 112 in the CPU 105
(see Fig. 12) and is put on the D bus 161 (see Fig. 14).

As described above, this word is used to
select a particular device controller 41 and a particular
peripheral device 45, 47, 49, 51 or 53 (see Fig. 1),
and the word is also used to specify the operation to
be performed.

In the next T bus cycle the T bus goes to
the LPRM state.

In the load parameter state (LPRM) the word
just below the top of the register stack in the CPU
105 (see Fig. 12) is put on the D bus 161 (see Fig.
14) by the I/O channel 109 and is passed to the device
controller 41 selected during the previous LAC state.
At the conclusion of the handshake cycle,
as shown by the dashed line arrow in Fig. 15, the T
bus goes to the RDST state. In this state the device
controller 41 returns the device status (the status of
a particular device selected and comprising the set
of signals describing the state of that device), and
the status is placed on the top of
the register stack 112 in the CPU 105.

During the load parameter and read device
status states several errors may have occurred. These
include parity error, handshake time out, and an error
indication in the status word. If an error did occur,
then the T bus machine 143 (Fig. 13) goes from the
RDST state to the abort instruction (ABTI) state.

The ABTI state instructs the device controller
41 to ignore the previous LAC and LPRM information
passed to it by the I/O channel 109 and then the T bus
(channel) returns to the NOP (idle) state.

If, after the RDST state, no error was detected
(as shown by the dashed line arrow 114 in the top branch
of Fig. 15), the T bus goes to the deselect state (DSEL).

With the T bus in the deselect state, the device
controller 41 clears its select latch 173 and responds to
the instruction issued to it (passed to it during the LAC
state) and the T bus returns to the NOP (idle) state.
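By way of illustration only, the order of T bus states in the EIO sequence described above may be written out as a short function. This is a sketch of the state order only; the function name eio_sequence and the single boolean argument (standing in for a parity error, a handshake time out or an error indication in the status word) are assumptions made for the example.

```python
def eio_sequence(error_detected):
    """Return the T bus states visited by one EIO sequence, in order."""
    states = ["NOP", "LAC", "LPRM", "RDST"]
    # An error seen during LPRM or RDST diverts the sequence to ABTI;
    # otherwise the controller is deselected with DSEL.
    states.append("ABTI" if error_detected else "DSEL")
    # Either way the T bus returns to the idle state.
    states.append("NOP")
    return states
```

Calling eio_sequence(False) gives the upper branch of the sequence and eio_sequence(True) gives the abort branch.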
In the operation of the I/O system there are
a number of device request signals that can happen
asynchronously. For example, a reconnect signal may be
generated after an EIO sequence to request that the
channel transfer data to the controller. Or the device
controller 41 may assert an interrupt request line under
a number of different conditions, e.g. to signal the
completion of an EIO sequence or to report an unusual
condition in a peripheral device.

The device request lines are common to all
device controller ports 43 attached to a particular I/O
bus 39.

The channel 109 responds to reconnect requests
made on the line RCI (145 of Fig. 14), and the CPU 105
responds to requests made on the LIRQ line 147 (see also
Fig. 14) with an IIO sequence, and to a request made on
the HIRQ line 149 with an HIIO sequence.

The first thing that the channel 109 or CPU
105 does in response to a Device Request signal is to
determine the identity of the highest priority device
controller 41 asserting a request. That is, there may
be several device controllers 41 asserting a request
to the channel 109 at one time, and the channel will
select a particular device controller in accordance
with a predetermined priority scheme.

In a particular embodiment of the present
invention up to thirty-two device controllers 41 can
be connected to a single channel 109.
The thirty-two device controllers are
connected in a star poll using the sixteen bit data
bus 161. One additional line 151 is used to divide
the thirty-two device controllers into two groups of
sixteen each. One group of sixteen device controllers
is assigned priority over the other group; and priority
is also assigned among the sixteen within each group.
The device responding on bit zero of the D bus during a
polling sequence has the highest priority within a
rank, and the one responding on bit 15 has the lowest
priority.

By way of introduction, it may be noted that
polling (which will now be described) involves the state
descriptions shown in Figs. 16 and 17 up to and including
that handshake which occurs during the select (SEL) state
in each figure.

With continued general reference to Figs. 16
and 17, the channel 109 sets the rank line to zero and
then presents the T bus function RPOL (Fig. 16) if the
response is to a reconnect request, while the CPU 105
presents an LPOL (Fig. 17) T bus function if the CPU is
responding with an IIO sequence, or an HPOL T bus function
if the CPU is responding with an HIIO sequence. This
is the only major point of difference between the showings
in Fig. 16 (the channel response) and Fig. 17 (the
CPU response) with regard to polling.
Referring specifically to Fig. 16 and the
response of the channel 109 to assertion of the RCI
line 145 (see Fig. 14), all devices with a reconnect
request pending that would respond on rank zero place
a one bit response on the D bus. That is, all these
devices assert a line of the D bus 161 corresponding
to their priority within the rank.

The channel 109 transfers the D bus response
into the priority resolve register 135 (see Fig. 13).
This priority resolve register 135 output determines
which device controller has the highest priority (in
accordance with the scheme described above) and asserts
the appropriate bit back onto the D bus 161, if there
is a bit asserted in rank zero by the attached device
controllers.

If there are one or more devices asserting a
response to the priority resolve register on rank zero,
the output of the priority resolve register is presented
to all device controllers attached, along with the select
function (SEL) on the T bus, and the device controller
whose priority on rank zero matches the output of the
priority resolve register sets its select bit 173 (see Fig.
19), and then that port will respond to subsequent states
in the sequence. This is the mode of operation indicated
by the solid line arrow going from the state indicated
by RPOL with a rank equals zero to select (SEL).
If the priority resolving register 135
determines that no device responded when the rank line
equalled zero, then the channel 109 sets the rank line
to one and reissues the RPOL T bus command. Then, if
the priority resolving register determines that a
response occurred on rank 1, the channel asserts the T
bus select function as before.

However, if the priority resolving register
135 determines that no response was made on rank 1,
the channel returns to the idle state indicated by
state NOP in Fig. 16.

This latter event is an example of a failure
which might occur in one port 43 and which would result
in the system 31 accessing that particular device
controller 41 through the other port 43.
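By way of illustration only, the two rank poll described above may be sketched as follows. The sketch models only the outcome of the priority resolution, not the priority resolve register 135 itself. The function name resolve_poll and the representation of each rank's response as a collection of D bus bit numbers (0 through 15) are assumptions made for the example.

```python
def resolve_poll(rank0_bits, rank1_bits):
    """Pick the winning (rank, bit) from the responses to a poll.

    Each argument holds the D bus bit numbers asserted by the device
    controllers responding on that rank. Rank zero is polled first, and
    within a rank bit 0 has the highest priority and bit 15 the lowest.
    Returns None when nothing responds on either rank, which corresponds
    to the channel returning to the idle (NOP) state.
    """
    for rank, bits in ((0, set(rank0_bits)), (1, set(rank1_bits))):
        if bits:
            # The lowest bit number is the highest priority in the rank.
            return rank, min(bits)
    return None
```

For example, with responses on bits 3 and 7 of rank zero the controller on bit 3 is selected, whatever is pending on rank one.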
As noted above, the action of the priority
resolving register 135 in response to an IIO or an HIIO
sequence initiated by the CPU 105 is the same as the
response of the priority resolving register 135 to a
reconnect sequence initiated by the channel in response
to a reconnect in (RCI) signal on the line 145 from a device
controller 41.

With continued reference to Fig. 16, the
reconnect sequence begins with the poll sequence described
above for reconnecting the highest priority device
controller 41 making a request.
The next step in the reconnect sequence is
to determine the actual device controller number contained
in the device address comparator 193. As noted above,
the device address comparator 193 includes jumpers to
determine a physical device controller number. These
are the same jumpers that are used on a LAC T bus
function during an EIO sequence to determine a particular
port. In the reconnect sequence the address determined
by these jumpers is returned to the I/O channel via the
D bus during the T bus RAC state to access a table defining
the buffer area for this device.

It is also necessary to determine the direction
of the transfer (whether an input or output transfer to
the processor module). To accomplish this determination
of the direction of the requested transfer and the device
address, the channel asserts the RAC T bus function and
the device controller 41 returns the device controller
address and the transfer direction.

The channel uses the device address returned
by the device controller 41 to access a two word entry
(142) in an I/O control table (IOC) 140 (Fig. 12) which
defines a buffer area 138 in the memory 107 for this
particular device controller and device.

The format of a two word entry 142 is shown
enlarged in Fig. 12 to show details of the fields of the
two words.
There is a two word entry 142 in the IOC
table 140 for each of the eight possible devices of each
of the thirty-two possible device controllers 41
attached to an I/O bus 39 associated with a particular
processor module 33, and each processor module 33 has
its own IOC table.

Each two word entry describes the buffer
location in main memory and remaining length to be
transferred at any particular time for a particular
data transfer to a particular device. Thus, as
indicated by the legends in Fig. 12, the upper word
specifies the transfer address to or from which the
transfer will be made by a burst; and the lower word
specifies the byte count specifying the remaining length
of the buffer area and the status of the transfer.

The fields representing the status of the
transfer include a protect bit P and a channel error
field CH ERR. The channel error field comprises three
bits which can be set to indicate any one of up to
seven numbered errors.

The transfer address and byte count are updated
in the IOC table 140 at the conclusion of each
reconnect and data transfer sequence (burst). The
transfer address is counted up and the byte count is
counted down at the conclusion of each burst. The
amount reflects the number of bytes transferred during
the burst.
The second word also contains (1) a field in
which any error encountered during a reconnect and data
transfer sequence may be posted for later analysis, and
(2) a protect bit to specify that the buffer area in
memory 107 may be read from but not written into.

The protect bit serves to protect the processor
memory 107 from a failure in the device controller 41.
That is, when the device controller 41 returns the
transfer direction to the channel 109 during a read
address and command (RAC) T bus function, a failure in
the device controller 41 could cause the device controller
to erroneously specify an input transfer. Then the
channel would go to the IN state and transfer data from
the device controller into memory, thus causing data in
the buffer 138 to be lost. The protect bit allows the
program to specify that the channel may not write into
this buffer area; that is, the device may only specify
an output transfer.
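By way of illustration only, a two word IOC table entry, its protect bit and its update at the end of a burst may be modeled as follows. The names IOCEntry, burst_allowed and update_after_burst are assumptions made for the sketch, as is the choice to count the transfer address in bytes; the bit layout of the two words in Fig. 12 is not modeled.

```python
from dataclasses import dataclass


@dataclass
class IOCEntry:
    """Software stand-in for one two word entry 142 of the IOC table 140."""
    transfer_address: int     # first word: next address in the buffer area
    byte_count: int           # second word: bytes remaining to transfer
    protect: bool = False     # protect bit P: buffer may not be written
    channel_error: int = 0    # CH ERR field, three bits (0 through 7)


def burst_allowed(entry, direction):
    """Reject an input transfer ("IN") into a protected buffer area.

    "IN" moves data from the device controller into memory, so it is
    the only direction that writes the buffer area.
    """
    return not (entry.protect and direction == "IN")


def update_after_burst(entry, bytes_transferred):
    """Count the address up and the byte count down after a burst."""
    entry.transfer_address += bytes_transferred  # assumes a byte address
    entry.byte_count -= bytes_transferred
```

In this sketch a failing controller that wrongly reports an input transfer for a protected entry is refused before any word reaches memory.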
The transfer address specifies the logical
path 139B (see Fig. 12).

The channel places the transfer address in
the channel memory address register 129 (see Fig. 13)
and places the byte count in the character count register
131 (see Fig. 13).
Depending upon the direction of the transfer
(which the channel retrieved from the device during the
RAC state shown in Fig. 16), the channel puts the T bus
in either the IN state or OUT state and transfers data
between the device controller 41 and memory 107 using
the channel memory address register 129 to specify the
logical path 139C (see Fig. 12). The channel memory
address register 129 and character count register 131
are updated with each word transferred during the burst
to reflect the next address in the buffer and the number
of characters yet to be transferred. At the conclusion
of a burst the contents of the channel memory address
register 129 and of the character count register 131 are
written into the IOC table 140.

In operation, for each word transferred in
from the device on an in transfer, the channel 109
accepts the word by the handshake mechanism described
above and places the word in the I/O data register 127
(see Fig. 13) and then transfers the word to the buffer
area in memory defined by the logical path 139C (see
Fig. 12).

On an out transfer the channel 109 takes a
word from the buffer area over logical path 139C and
transfers the word to the channel memory data register
125. The channel then transfers the word into the I/O
data register 127 (Fig. 13) and handshakes with the device
controller which accepts the word into its interface
data register 213.
The high speed of the I/O channel is
accomplished by pipelining where the word in the I/O
data register 127 is handshaken to the device while the
channel concurrently requests and accepts the next
word in the transfer from memory 107 and places it in
the channel memory data register 125. Since it takes
just as long to put a word out to the device as it does
to accept a word from memory for the device, the two
operations can be overlapped.

During the burst, the channel decrements the
character count register by two for every word transferred,
since there are two bytes in every word.

The burst transfer can terminate in two ways.
The burst transfer can terminate normally or
the burst transfer can terminate with an error condition.

In the normal case there are two possibilities.

In a first condition of operation, the
character count register 131 can reach a count of either
one or two bytes remaining to be transferred. In this
situation the channel puts up EOT (line 165 as shown
in Fig. 14) signifying that the end of transfer has
been reached. If the count reaches one, then the channel
asserts EOT and PAD OUT (line 167 of Fig. 14) signifying
the end of transfer with an odd byte.
If the character count reaches two, the channel
puts up EOT, but PAD OUT (PADO on line 167 of Figure 14)
is not required because both bytes on the bus are valid.

In either case, the device controller 41
responds by asserting STOP IN (STI) on line 159 (see
Figure 14), and the device controller 41 also asserts
PAD IN (PADI) on line 169 (Figure 14) if the channel
asserted PAD OUT (PADO).

In this first case of normal termination, the
transfer as a whole, not just the burst, is terminated
by the channel 109.
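By way of illustration only, the relation between the remaining character count and the EOT and PAD OUT signals may be sketched as follows. The function name burst_signals is an assumption made for the example, and the sketch ignores the case in which the device controller ends the burst early with STOP IN.

```python
def burst_signals(byte_count):
    """List the (EOT, PADO) pair raised with each word of a transfer.

    The count is examined before each word. EOT is raised when one or
    two bytes remain, and PADO is raised as well when only one byte
    remains (an odd byte). The count then goes down by two per word,
    or by one for a final odd byte.
    """
    signals = []
    remaining = byte_count
    while remaining > 0:
        eot = remaining <= 2
        pado = remaining == 1
        signals.append((eot, pado))
        remaining -= min(2, remaining)
    return signals
```

A five byte transfer therefore takes three words, with EOT and PADO both raised on the last one, while a four byte transfer takes two words and ends with EOT alone.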
The other normal completion is when the
device controller 41 ends the burst by asserting STOP
IN (STI) in response to the channel SERVICE OUT (SVO).
This signifies that the buffer 189 (see Figure 19) has
reached a condition of minimum stress (as indicated by
point 229 in Figure 23).

The STOP IN (STI) can occur on an output
transfer or on an input transfer.

On an input transfer, if the device controller
41 wishes to terminate the transfer as well as the
burst, the device controller 41 can assert STOP IN
(STI); and, to signify an odd byte on the last word,
the device controller 41 can also assert PAD IN (PADI).
As shown in Fig. 16, when the transfer is
terminated by a non-error condition (STI or EOT) on
either an output transfer or an input transfer (as shown
by the balloons OUT and IN in Fig. 16), the channel
109 updates the IOC table entries as noted above, and
returns to the idle (NOP) state shown in Fig. 16.

As noted above, the transfer can also be
terminated by an error condition.

During the burst several errors may occur
as follows.

First, the device controller 41 may request
an input transfer into a buffer whose protect bit P is
set in the IOC table as mentioned above.

Second, the device controller 41 may not
return a PAD IN (PADI) signal in response to a PAD OUT
(PADO) signal from the channel 109.

Third, the channel 109 may detect a parity
error on the D bus 161.

Fourth, the device controller 41 may not
respond to a SERVICE OUT (SVO) signal from the channel
109 within the allotted time as mentioned above in the
discussion on handshakes.
Fifth, the buffer area specified by the IOC
table entries may cross into a page whose map marks it
absent (see the discussion of the mapping scheme in
the memory system).

Sixth, a parity error may be detected in
accessing the map while accessing the memory during
the reconnect in and data transfer sequence. See the
description in the memory system relating to the parity
error check.

Seventh, the memory system may detect an uncorrectable
parity error when the channel 109 accesses
the memory. See the description of the memory system for
this parity error check.

If any of these error conditions occur,
the channel 109 goes to the abort data transfer state
(ABTD) as shown in Fig. 16. This instructs the device
controller 41 that an error has occurred and that the
data transfer should be aborted. The channel 109 then
goes back to the idle (NOP) state as shown in
Fig. 16.

When an error occurs, the channel 109 updates
the IOC table entries and puts an error number indicating
one of the seven errors noted above in the error field
of the second word of the IOC table entry as mentioned
above.
Thus, if a single error occurs, the number of
that error is entered in the error field of the IOC
table entry.

If more than one error occurs, the channel
109 selects the error from which recovery is least
likely to occur and enters only the number of that error
in the error field of the IOC table entry.

There is one other type of error that can
occur. The device controller 41 may try to reconnect
to the channel when the count word in the IOC table is
zero. In this event, the channel will not let the
device controller reconnect and the channel goes
through the sequence as described above with reference
to Fig. 16, but when the channel determines that the
count word in the IOC table is zero, the channel 109
goes directly to the abort (ABTD) state. This is an
important feature of the present invention because it
protects the processor memory from being overwritten by
a failing device.

If the byte count
in the second word of the IOC table entry 142 for a
particular device is zero, and if the device controller 41
attempts to reconnect to the channel 109, the channel
issues an abort (ABTD) to the device controller 41 as
noted above and leaves the channel error field of the
two word entry 142 at zero.
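By way of illustration only, the zero count check described above may be sketched as follows. The function name reconnect_action, the dictionary keys and the return value "TRANSFER" are assumptions made for the example; only ABTD is a name taken from the description.

```python
def reconnect_action(entry):
    """Decide how the channel answers a reconnect for one IOC entry.

    `entry` is a dict with "byte_count" and "channel_error" keys. A zero
    byte count means no transfer is outstanding, so the reconnect is
    answered with ABTD and the channel error field is left untouched.
    """
    if entry["byte_count"] == 0:
        return "ABTD"
    return "TRANSFER"
```

Because the check depends only on the table entry held by the channel, a failing device cannot bypass it by what it reports on the bus.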
In response to an abort data (ABTD) T bus
function, the device controller 41 makes an interrupt
request on the line HIRQ or LIRQ (lines 149 or 147 as
shown in Fig. 14) to the channel 109.

The device controllers 41 may at any time
request an interrupt on these two lines.

An interrupt generally indicates that a
data transfer has been completed or terminated by an
abort from the channel (an ABTD from the channel) or
by an error condition within the device controller 41
or attached device, or that a special condition has
occurred within the device controller or an attached
device. For example, when the power is applied and the
PON circuit indicates that power is at an acceptable
level, the device controller interrupts the processor
module to indicate that its internal state is Reset
because power was off or had failed and has been reset
by the PON circuit.

In response to an interrupt, the program
running within the processor module 33 issues an interrogate
I/O instruction (IIO) or an interrogate high priority
I/O instruction (HIIO) over the I/O bus 39.

The IIO instruction is issued in response to
a low priority I/O interrupt, that is, one issued on
the low priority interrupt request (LIRQ) line 147 (see
Fig. 14).
The HIIO instruction is issued in response
to a high priority I/O interrupt, that is, one requested
on a high priority interrupt request (HIRQ) line 149
(see Fig. 14).

The microprocessor 113 (see Fig. 12) executes
the EIO, IIO or HIIO instruction by taking control of
the channel control logic 141 and data path logic 123.

The sequence for these instructions is
illustrated in Fig. 17; and, as noted above, the sequence
starts with a polling sequence.

The IIO instruction polls in a sequence using
the T bus function low priority interrupt poll (LPOL)
while the HIIO instruction polls in a sequence using
the T bus function high priority interrupt poll (HPOL).

The polling sequence which is also described
above completes by selecting the appropriate device
controller 41 by using the T bus function select (SEL)
as shown in Fig. 17.

The appropriate device controller 41 selected
is that device controller which has the highest priority
and is making an interrupt request.

The sequence continues with a read interrupt
cause (RIC) T bus function as shown in Fig. 17. The
device controller 41 responds by returning device
dependent status on the D bus 161 (see Fig. 14).
The microprocessor 113 (Fig. 12) reads the
status from the D bus 161 and places the status on the
top of the register stack 112 (Fig. 12).

The sequence then continues with a read
interrupt status (RIST) T bus function as shown in
Fig. 17. The device controller 41 responds to this RIST
T bus function by returning the device controller number,
the unit number and four dedicated status bits on the
D bus.

Of the four bit status field, two of the bits
indicate, respectively, abort (ABTD) and parity error
(which parity error may have occurred during a reconnect
and data transfer sequence).

The microprocessor 113 copies the content of
the D bus (the controller number, the device number
and the interrupt status) and places that content on the
top of the register stack 112.

If no error occurred during the sequence, then
the sequence continues with the deselect (DSEL) state
which deselects the device controller 41; and then the
sequence goes into the idle (NOP) state as indicated by
the line at the top of Fig. 17.
If an error did occur (and the error can be
a parity error detected by the channel or a handshake
time out), the channel goes from the RIST state to the
abort instruction (ABTI) state as shown in Figure 17.
This deselects the device controller 41, and then the
channel 109 goes back into the idle (NOP) state as
shown by the bottom line in Figure 17.
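By way of illustration only, the order of T bus functions in the IIO and HIIO sequences described above may be written out as follows. The function name interrogate_sequence and its two boolean arguments are assumptions made for the example, and the sketch shows a single poll rather than the repeat of the poll on the second rank.

```python
def interrogate_sequence(high_priority, error_detected):
    """Return the T bus states visited by one IIO or HIIO sequence."""
    # HIIO polls with HPOL, IIO polls with LPOL.
    poll = "HPOL" if high_priority else "LPOL"
    # A parity error or handshake time out ends the sequence with ABTI;
    # otherwise the controller is deselected with DSEL.
    ending = "ABTI" if error_detected else "DSEL"
    return ["NOP", poll, "SEL", "RIC", "RIST", ending, "NOP"]
```

Both endings deselect the device controller before the channel returns to the idle state.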
As noted above, an I/O operation between a
processor module and an I/O device typically consists
of a group of sequences, e.g. an EIO followed by some
number of reconnect and data transfer sequences, terminating
with an IIO sequence. Sequences from several different
I/O operations may be interleaved, resulting in apparent
simultaneous I/O operation by several devices. Thus, a
large number of devices may be accessed concurrently;
the exact number depends on the channel bandwidth and
the actual bandwidth used by each device.

The I/O system and dual port device controller
architecture and operation described above provide a
number of important benefits.

These benefits include (a) flexibility to
interface a wide variety of devices, (b) a maximum usage
of resources, (c) a fail soft environment in which to
access peripheral devices in a multiprocessor system,
(d) on line maintenance and upgrade of the multiprocessor
system capability, and (e) maximum system throughput
(as opposed to emphasizing processor throughput or I/O
throughput exclusively) in an on line transaction system
in which a large number of concurrent transactions must
be processed by the I/O system and CPU.
Flexibility to interface a wide variety of
devices is achieved because the system of the present
invention does not presuppose any inherent characteristics
of a device type. Instead, the present invention provides
a structure and operation which can accommodate a wide
variety of device operations.

The present invention provides for a maximum
usage of resources, primarily by making a maximum usage
of memory bandwidth. Each device uses a minimum of the
memory bandwidth. This allows a relatively large number
of devices to be associated with the particular I/O bus.
Because of the inherent speed of the I/O bus, and the
buffering technique of the present invention, each
particular transfer is made at a relatively high speed
limited only by memory speed. Because the transfers are
in a burst mode, the overhead associated with each transfer
is minimized. This maximizes the use of the channel
bandwidth and also permits the use of high speed devices.

The present invention provides for failsoft
access to peripheral devices. There are redundant paths
to each peripheral device, and containment of failure
on any particular path. Failure of a particular module
in one path does not affect the operation of a module
in another path to that device.

There are comprehensive error checks for
checking data integrity over a path, sequence failures
and timing failures.
Protection features prevent a peripheral device
from contaminating its own buffer or the memory of the system.
These protection features include a separate count word in
each IOC table and a protect bit in the IOC table. The IOC
table is accessible by the channel, but not by the device.
This is a second level of protection to prevent the device
from accessing any memory not assigned to that device.
The present invention requires only a small
number of lines in the I/O bus to provide a flexible and
powerful I/O system.
The operation of the device controller is
well defined as power is turned on or off to protect the
I/O bus from erroneous signals during this time and also
to permit on line maintenance and system upgrade.
The present invention uses stress to allow the
buffers to cooperate without communicating with each other.
An on line transaction system is obtained
through overlapped transfers and processing.
Multichannel direct memory access provides
interleaved bursts to give overlapped transfers and
minimum waits for accesses to a device. Each burst
requires a minimum memory overhead and allows the pro-
cessor to make maximum use of the memory. This
combination allows maximum use of the I/O bandwidth
and minimal tie up of the processor.
POWER DISTRIBUTION SYSTEM:
The multiprocessor system of the present invention
incorporates a power distribution system that overcomes a
number of problems associated with prior art systems.
In many prior art systems it was necessary to stop the
processor system in order to perform required maintenance on a
component of the system. Also, in many prior art systems, a
failure in the power supply could stop the entire processor
system.
The power distribution system of the present invention
incorporates a plurality of separate and independent power
supplies and distributes the power from the power supplies to
the processor modules and to the device controllers in a way
that permits on-line maintenance and also provides redundancy
of power on each device controller.
In this regard "on-line" is used in the sense that when a part
of the system is on-line, that part of the system is not only
powered on, but it is also functioning with the system to
perform useful work.
The term "on-line maintenance" therefore means maintaining a
part of the system (including periodic preventative maintenance
or repair work) while the remainder of the system is on-line as
defined above.
In the present invention any processor module or device
controller can be powered down so that on-line maintenance can
be performed in a power off condition on that processor module
or device controller while the rest of the multiprocessor
system is on-line and functional. The on-line maintenance can
be performed while fully meeting Underwriters Laboratory safety
requirements.
Also, in the power distribution system of the present
invention each device controller is connected for supply of
power from two separate power supplies and by a diode switching
arrangement that permits the device controller to be supplied
with power from both power supplies when both power supplies
are operative and to be supplied with power from either one of
the power supplies in the event the other power supply fails;
and the changeover in the event of failure of one of the power
supplies is accomplished smoothly and without any interruption
or pulsation in the power supply so that an interrupt to a
device controller is never required in the event of a failure
of one of its associated power supplies.
A power distribution system for insuring both a primary supply
and an alternate power supply for each individual dual port
device controller 41 is illustrated in Fig. 30. The power
distribution system is indicated generally by the reference
numeral 301 in Fig. 30.
The power distribution system 301 insures that each dual port
device controller 41 has both a primary power supply and an
alternate power supply. Because each device controller does
have two separate and independent sources of power supply, a
failure of the primary power supply for a particular device
controller does not render that device controller (and all of
the devices associated with that controller) inoperative.
Instead, in the present invention, a switching arrangement
provides for an automatic switchover to the alternate power
supply so that the device controller can continue in operation.
The power distribution system thus coacts with the dual port
system of the device controller to provide continuous operation
and access to the devices in the event of a failure of either a
single port or a single power supply.
The power distribution system 301 shown in Figure 30 provides
the further advantage that each processor module 33 and associated
CPU 105 and memory 107 has a separate and independent power supply
which is dedicated to that processor module. With this arrangement,
a failure of any one power supply or a manual disconnection of any
one power supply for repair or servicing of the power supply or asso-
ciated processor module is therefore limited in effect to only one
particular processor module and cannot affect the operation of any of
the other processor modules in the multiprocessor system.
The power distribution system 301 shown in Figure 30 thus
works in combination with the individual processor modules and the
dual port device controllers to insure that a failure or
disconnection of any one power supply does not shut down the overall
system or make any of the devices ineffective.
The power distribution system 301 includes a plurality of
separate and independent power supplies 303, and each power supply
303 has a line 305 (actually a multiline bus 305 as shown in
Fig. 33) which is dedicated to supplying power to the CPU and
memory of a particular, related processor module.
Each device controller 41 is associated with two of the power
supplies 303 through a primary line 307 and an alternate line
309 and an automatic switch 311.
A manually operated switch 313 is also associated with each
device controller 41 between the device controller and the
primary line 307 and the alternate line 309.
The switches 311 and 313 are shown in more detail in Fig. 31.
Fig. 32 shows details of the component construction of a power
supply 303.
As shown in Fig. 32, each power supply 303 has an input
connector 315 for taking power from the mains. The input 315 is
connected to an AC to DC converter 317, and the output of the
AC to DC converter provides, on a line 319, a five volt
interruptable power supply (IPS). This five volt interruptable
power supply is supplied to the CPU 105, the memory 107 and the
device controller 41. See also Fig. 33.
The AC to DC converter 317 also provides on a second input
line 321 a sixty volt DC output which is supplied to a DC to DC
converter 323. See Figure 32.
The DC to DC converter in turn provides a five volt output on
a line 325 and a twelve volt output on a line 327.
The outputs from the lines 325 and 327 are, in the system of
the present invention, uninterruptable power supply (UPS) outputs in
that these power supply outputs are connected to the CPU and memory
when semiconductor memory is used. The power supply to a
semiconductor memory must not be interrupted because a loss of power
to a semiconductor memory will cause loss of all data stored in the
memory.
The five volt interruptable power supply on line 319 is
considered an interruptable power supply because this power is
supplied to parts of the multiprocessing system in which an
interruption of power can be accepted. Thus, the five volt
interruptable power is supplied to parts of the CPU other than
semiconductor memory and to only those parts of the memory which are
core memory (and for which a loss of power does not cause a loss of
memory) and to the device controller which (as will be described in
more detail below) is supplied with an alternate source of power in
the event of a failure of the primary power supply.
Since the power supply on lines 325 and 327 must be an
uninterruptable power supply, the present invention provides a
battery back-up for the input to the DC to DC converter 323.
This battery back-up includes a battery and charger module 329.
The module 329 is connected to the DC to DC converter 323 by a
line 331 and a diode 333.
In a particular embodiment of the present invention the battery
of the module 329 supplies power at 48 volts to the converter
323, which is within the input range of the converter 323.
The diode 333 insures that power from the battery is supplied
to the converter 323 if the voltage on the line 321 drops below
48 volts. The diode 333 also stops the flow of current from the
battery and the line 331 when the output of the AC to DC
converter on line 321 exceeds 48 volts.
Each power supply 303 also includes a power warning circuit
335 for detecting a condition in the AC power input on line 315
that would result in insufficient power out on the output lines
319, 325 and 327. The power warning circuit 335 transmits a
power failure warning signal on a line 337 to the related CPU
105.
Because of the capacity storage in the power supply 303, there
is enough time between the power warning signal and the loss of
the five volt interruptable power on line 319 for the CPU to
save its state before the power is lost.
However, the uninterruptable power supply on lines 325 and 327
must not be interrupted, even for an instant of time; and the
battery back-up provided by the arrangement shown in Fig. 32
insures that there is no interruption in the power supply on
lines 325 and 327 in the event of a power failure in the input
line 315.
One particular power supply 303 itself can fail for some
reason with the other power supplies 303 still operating. In
that event, the power distribution system 301 of the present
invention limits the effect of the failure of the power supply
303 to the loss of one particular, associated CPU and memory;
and the automatic switch 311 provides for an automatic
switchover from the failed power supply to the alternate power
supply to keep the associated device controller 41 in
operation. The device controller 41 which had been connected to
the failed power supply therefore continues in operative
association with the other processor modules and components of
the multiprocessor system, because the required power is
automatically switched in from the alternate power supply.
As best illustrated in Fig. 31, each automatic switch 311
includes two diodes--a diode 341 associated with the primary
power line 307 and a diode 343 associated with the alternate
power line 309.
The function of the diodes 341 and 343 is to permit power to
be supplied to a device controller 41 from either the primary
power line 307 and a related power supply 303 or the alternate
power line and its related power supply 303 while keeping the
supplies isolated. This prevents a failed power supply from
causing its associated alternate or primary to fail.
In normal operation each diode permits a certain amount of
current to flow through the diode so that the power to each
device controller 41 is actually being supplied by both the
primary and alternate power supplies for that device controller.
In the event that one of the power supplies fails, the full
power is supplied by the other power supply, and this transition
occurs without any loss of power at all.
Since there is a small voltage drop across the diodes 341 and
343, the voltage on the lines 307 and 309 must be enough higher
than five volts to accommodate the voltage drop across the
diodes 341 and 343 and still supply exactly five volts to the
device controller 41. The lines 305 are in parallel with the
lines 307 and 309, and the power actually received at the CPU
and memory must also be five volts; so balancing diodes 339 are
located in the lines 305 to insure that the voltage after the
diodes 339 as supplied to each CPU is exactly five volts.
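
The behavior of the two-diode automatic switch 311 can be
sketched numerically. The following is a minimal illustration
and not part of the specification: it treats each diode as an
ideal element with a fixed forward drop, and the names
`FORWARD_DROP`, `controller_voltage` and `line_voltage_needed`,
as well as the 0.3 volt figure, are assumptions chosen only for
the example.

```python
# Idealized model of the diode switch 311 (illustrative sketch only).
# Assumption: each diode has a fixed forward drop; real diodes share
# current according to their I-V curves, which this model ignores.

FORWARD_DROP = 0.3  # volts; hypothetical value for a low-drop diode


def controller_voltage(primary_line, alternate_line, drop=FORWARD_DROP):
    """Return the voltage seen by the device controller.

    Each line feeds the controller through its own diode, so the
    controller sees the higher of the two lines minus one diode drop.
    A failed supply (0 V) is simply reverse-biased out of the circuit.
    """
    best = max(primary_line, alternate_line)
    return max(best - drop, 0.0)


def line_voltage_needed(target=5.0, drop=FORWARD_DROP):
    """Voltage a supply line must carry to deliver `target` after the diode."""
    return target + drop
```

In this model the controller voltage is unchanged when either
line falls to zero, which corresponds to the switchover without
interruption described above, and each line must run one diode
drop above five volts.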
The manual switch 313 permits a device controller 41 to be
disconnected from both the primary and the alternate power
sources when the device controller needs to be disconnected for
removal and service.
Details of the construction of the switch 313 are shown in
Fig. 31. As shown in Fig. 31, the switch 313 includes a manual
switch 345, a transistor 347, a capacitor 348 and a resistor
350 and a resistor 352.
The manual switch 345 is closed to turn on the transistor 347
which then supplies power to the device controller 41.
It is important that both the turn on and the turn off of
power to the device controller 41 be accomplished in a smooth
way and without fluctuations which could trigger the PON
circuit 182 more than once. The feedback capacitor 348 acts in
conjunction with the resistor 352 to cause the required smooth
ramp build-up of power when the switch 345 is closed to turn
the transistor 347 on.
When the transistor 347 is turned off by opening the switch
345, the feedback capacitor 348 acts in conjunction with
resistor 350 to provide a smooth fall off of power.
In a preferred embodiment of the invention
all of diodes 341, 343 and 339 are Schottky diodes
which have a very low forward voltage drop, and this
reduces power dissipation.
As noted above in the description of the
I/O system and dual port device controller 41, each
device controller 41 does have a power on circuit (PON)
182 for detecting when the five volt power is below
specifications. The PON circuit 182 is shown in more
detail in Figure 25 and resets the device controller 41
to lock everything off of the device controller and
holds the device controller itself in a state that is
known when the power is turned off by the switch 313.
The PON circuit 182 also releases the device controller
and returns it to operation after the power is turned on
by switch 313 and five volt power supply at the proper
specification is supplied to the device controller 41.
Further details of the power on circuit 182
shown in Figure 25 are described above in relation to the
I/O and dual port controller system.
With reference to Figure 33, the power from
each power supply 303 is transmitted to a related CPU
by the vertical bus 305, and each vertical bus 305 is
a laminated bus bar which has five layers of electrical
conductors.
As indicated by the legends in Figure 33, each vertical bus
305 has two different conductors connected to ground.
One conductor provides the ground for both the five volt
interruptable power supply (IPS) and the five volt uninterruptable
power supply (UPS).
A separate conductor provides a ground for the memory voltage.
This separate ground for the memory voltage insures that the relatively
large fluctuations in current to the memory will not have any effect
on either the five volt IPS or the five volt UPS supplied to the CPU.
The horizontal bus 307, 309 includes the primary and alternate
power supply lines 307 and 309 (as indicated by the reference numerals
in Figure 30). In a particular embodiment of the present invention the
bus 307, 309 is actually a nine layer laminated bus which has a single
ground and eight voltage layers (V1 through V8 as indicated by the
legends and notations in Figure 33).
Each voltage layer is connected to the five volt interruptable
output of a different power supply 303. Thus, the layer V1 is
connected at 351 to the five volt IPS power for the power supply 303
and related processor module farthest to the left as viewed in Figure
33, and the layer V2 is connected at 353 to the five volt IPS power
supply 303 for the processor module at the center as viewed in Figure
33, and so on.
Since there are eight layers (V1 through V8) and a common
ground available to each device controller in the horizontal bus,
upstanding vertical taps 355 to these eight layers at spaced intervals
along the horizontal bus permit each device controller 41 to be
associated with any two of the power supplies 303 merely by connecting
the primary line 307 and the alternate line 309 to a particular set of
taps. By way of example, the device controller 41 on the lefthand side
of Figure 33 is shown connected to the taps V2 and V3 and the device
controller 41 on the righthand side of Figure 33 is shown connected to
the taps V2 and V3.
Thus, any device controller 41 can be connected to any two of
the power supplies 303 with any one of the power supplies serving as
the primary power supply and any one of the other power supplies serving
as the alternate power supply.
The power distribution system of the present invention thus
provides a number of important benefits.
The power distribution system permits on line maintenance
to be performed because one processor module or device controller can
be powered down while the rest of the multiprocessor system is on line
and functional.
The power distribution system fully meets all Underwriters
Laboratory safety requirements for doing on-line maintenance of
a powered down component while the rest of the multiprocessor
system is on-line and in operation.
Each device controller is associated with two separate power
supplies so that a failure in one of the power supplies does
not cause the device controller to stop operation. Instead, the
electronic switch arrangement of the present invention provides
such a smooth transition of power from the two power supplies
to only one of the power supplies that the device controller is
maintained in continuous operation without an interrupt.
17
18
19
21
22
23
24
26
27
28
29
162
~ - .
~47824
MEMORY SYSTEM
Each processor module 33 (see Fig. 1) in the multiprocessor
system 31 contains a memory.
This memory is indicated by the general reference numeral 107
in Fig. 1 and is shown in greater detail in Fig. 34.
The memory 107 of each processor module 33 is associated with
both the CPU 105 and the I/O channel 109 of that module. There
is a dual port access to the memory by the CPU and the channel.
That is, the CPU 105 (see Fig. 1 and Fig. 34) can access the
memory for program or data references, and the I/O channel 109
can also access the memory directly (without having to go
through the CPU) for data transfers to and from a device
controller 41. This dual access to the memory is illustrated in
Fig. 34 and will be described in greater detail below in the
description of the Fig. 34 structure and operation.
One benefit of this dual access to the memory is that CPU and
channel accesses to the memory can be interleaved in time.
There is no need for either the CPU or the channel to wait for
access to the memory, except in the case where both the CPU and
the channel are trying to access the memory at exactly the same
time. As a result, both the CPU and the channel can be
performing their separate functions simultaneously, subject to
an occasional wait by the CPU or channel if one of these units
is accessing the memory at the exact time the other unit needs
to access the memory.
The dual port access also allows background I/O operation. The
CPU 105 needs to be involved with the channel 109 only in the
initiation and termination of I/O data transfers. The CPU can
be performing other functions during the actual I/O data
transfer itself.
The memory 107 shown in Fig. 34 comprises a physical memory
which consists of up to 262,144 words of sixteen data bits
each.
In addition to the sixteen data bits, each word in memory has
an additional parity bit if the memory is a core memory or six
additional error correction bits if the memory is a
semiconductor memory.
The parity bit permits detection of single bit errors.
The six error correction bits permit detection and correction
of single bit errors and also permit detection of all double
bit errors.
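
The count of six error correction bits is consistent with a
conventional single-error-correcting, double-error-detecting
Hamming arrangement for a sixteen bit word, although this
passage does not name the code actually used. As a sketch under
that assumption (all function names are illustrative), the
arithmetic can be checked as follows:

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def sec_ded_check_bits(data_bits):
    """Hamming check bits plus one overall parity bit (double-error detection)."""
    return hamming_check_bits(data_bits) + 1


def parity(word):
    """Return 1 if the word has an odd number of one bits, else 0."""
    return bin(word).count("1") % 2
```

For sixteen data bits the first function gives five check bits
and the second gives six. The `parity` helper shows why a
single parity bit detects any single bit error: flipping one
bit always changes the parity of the word.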
The physical memory is conceptually subdivided into contiguous
blocks of 1024 words each (which are called pages). The pages
in physical memory are numbered consecutively from page zero,
starting at physical location zero. The address range of
physical memory in one specific embodiment of the present
invention, which address range is zero through 262,143,
requires eighteen bits of physical address information.
The basic architecture of the present invention is, however,
constructed to accommodate and utilize twenty bits of physical
address information, as will become more apparent from the
description to follow.
In one specific embodiment of the invention the physical
memory is physically divided into physical modules of 32,768
words. Thus, eight of these modules provide the 262,144 words
noted above.
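
The figures quoted for the physical memory can be cross-checked
with a few lines of arithmetic. This is only a worked example;
the constant names are illustrative and do not appear in the
specification.

```python
WORDS_PER_PAGE = 1024
WORDS_PER_MODULE = 32_768
MODULES = 8

physical_words = MODULES * WORDS_PER_MODULE         # eight modules
physical_pages = physical_words // WORDS_PER_PAGE   # pages of 1024 words
address_bits = (physical_words - 1).bit_length()    # bits for highest address

# Architectural limit with twenty bits of physical address, for comparison.
max_words_20_bit = 2 ** 20
```

Eight modules of 32,768 words give 262,144 words, that is 256
pages, with addresses zero through 262,143 fitting in eighteen
bits. A twenty bit physical address would cover four times as
many words.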
All accesses to memory are made to one of four logical address
areas--user data, system data, user code and system code areas.
All CPU instructions deal with these logical (as distinct from
physical) addresses exclusively. Thus, a programmer need not be
concerned with an actual physical address but can instead write
a program based entirely on logical addresses and the logical
addresses are translated by the map section of the memory
system into physical addresses.
The range of addressing in any given logical address area is
that of a sixteen bit logical address, zero through 65,535.
Thus, each logical address area comprises sixty-four logical
pages of 1024 words each.
In the memory system of the present invention there is no
required correspondence between a logical page and a physical
page. Instead, the various logical pages comprising an
operating system program or a user program need not reside in
contiguous physical pages. In addition, the logical pages need
not be in physical main memory but may be in secondary memory,
such as on a disc.
This allows implementation of a virtual memory scheme.
Virtual memory has two benefits.
First, virtual memory allows the use of a physical main memory
space which is smaller than the logical address areas would
require, because the physical memory can be supplemented by a
secondary physical memory.
Secondly, virtual memory permits address spaces of a plurality
of users (multiprogramming) to share the physical memory, and
each user does not have to be concerned with the allocation of
physical memory among the operating system, himself, or other
users.
The memory system of the present invention provides protection
between users in the multiprogramming environment by
guaranteeing that one user program cannot read from or write
into the memory space of another user program. This is
accomplished by the paging and mapping system. When one user
program is running, the map for that user program points only
to the memory pages (up to sixty-four pages of code and
sixty-four pages of data) for that particular user program.
That particular user program cannot address outside its own
logical address space and therefore cannot write into or read
from the memory space of another user program.
The fact that code pages are non-modifiable also prevents a
user program from destroying itself.
Thus, there are two levels of protection for user programs
operating in a multiprogramming environment--the fact that each
user map points only to its own pages in memory and the fact
that code pages are non-modifiable. Also, in the present
invention, this protection is achieved without the protection
limit registers or protection keys often used in the prior art.
The required translation of a sixteen bit logical address to
an eighteen bit physical address is accomplished by a mapping
scheme. As part of this mapping scheme, a physical page number
is obtained by a look-up operation within a map. This physical
page number is then combined with the address within a page to
form the complete physical memory address.
Only the page number is translated. The offset or address
within a page is never changed in the mapping.
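
The translation just described can be sketched in a few lines.
This is an illustrative model rather than the hardware logic:
it assumes the upper six bits of the sixteen bit logical address
select one of sixty-four map entries and the lower ten bits are
the offset within a 1024 word page. The names `translate` and
`page_map` are hypothetical.

```python
PAGE_BITS = 10                       # 1024 words per page
OFFSET_MASK = (1 << PAGE_BITS) - 1


def translate(logical_address, page_map):
    """Translate a 16-bit logical address through one map section.

    `page_map` is a hypothetical list of 64 physical page numbers.
    Only the page number is replaced; the offset is passed through.
    """
    if not 0 <= logical_address <= 0xFFFF:
        raise ValueError("logical address must fit in sixteen bits")
    logical_page = logical_address >> PAGE_BITS   # upper six bits
    offset = logical_address & OFFSET_MASK        # lower ten bits
    physical_page = page_map[logical_page]        # look-up within the map
    return (physical_page << PAGE_BITS) | offset
```

For example, with logical page 3 mapped to physical page 200,
logical address 3077 (page 3, offset 5) translates to physical
address 204,805 (page 200, offset 5).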
In the present invention there are four map sections. Each map
section corresponds to one of the four logical addressing areas
(user data, system data, user code and system code).
The separation of the logical address into these four separate
and distinct areas provides significant benefits.
The separation provides isolation of programs from data so
that programs are never modified. The separation also provides
isolation of system programs and data from user programs and
data, and this protects the operating system from user errors.
The four map sections are designated as follows:
Map 0--user data map. All addresses to variable user data
areas are translated through this user data map.
Map 1--system data map. The system data map is similar to the
user data map and in addition, all memory references by either
the I/O channel, the interprocessor bus handling microprogram,
or the interrupt handling microprogram specify this map. The
system data map provides channel access to all of physical
memory via only a sixteen bit address word.
Map 2--user code map. This map defines the active user
program. All user instructions and constant data are obtained
via this user code map.
Map 3--system code map. This map defines the operating system
program. All operating system instructions and constant data
are obtained via this system code map.
Each map section has sixty-four entries corresponding to the
sixty-four pages possible in each logical address area. Each
entry contains the following information.
(1) The physical page number field (which can have a value of
zero through 255).
(2) An odd parity bit for the map entry. The parity bit is
generated by the map logic whenever a map entry is written.
(3) A reference history field. The reference history field
comprises reference bits, and the high order bit of the
reference bits is set to a "one" by any use of the page
corresponding to that map entry.
(4) A dirty bit. The dirty bit is set to a "one" when a write
access is made to the corresponding memory page.
The reference bits and the dirty bit are used by the memory
manager function of the operating system to help select a page
for overlay. The dirty bit also provides a way to avoid
unnecessary swaps of data pages to secondary memory.
(5) An absent bit. The absent bit is initially set to a "one"
by the operating system to flag a page as being absent from
main memory. An access to a page with this bit set to "one"
causes an interrupt to the operating system page fault
interrupt handler to activate the operating system virtual
memory manager function. The absent bit is also used as a
protection mechanism to prevent erroneous access by a program
outside its intended logical address area for either code or
data.
Three instructions are used by the operating system in
connection with the map. These three instructions are: SMAP,
RMAP, AMAP.
The SMAP (set map entry) instruction is used by the memory
manager function of the operating system to insert data into a
map entry. This instruction requires two parameters--the map
entry address and the data to be inserted.
The RMAP (read map entry) instruction is used by the memory
manager function of the operating system to read a map entry.
This instruction requires one parameter, the map entry address,
and the result returned by the instruction is the map entry
content.
The AMAP (age map entry) instruction causes the reference
history field of a map entry to be shifted one position to the
right. This is used by the memory manager function of the
operating system to maintain reference history information as
an aid in selecting a page for overlay.
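
The map entry fields and the three map instructions can be
modelled in software as a rough sketch. Everything below is
illustrative: the entry is held as a Python object rather than a
packed word, the parity bit is omitted, the three bit width of
the reference history field (`REF_BITS`) is an assumption not
stated in this passage, and `touch` is a hypothetical helper
standing in for the hardware action taken on each page use.

```python
from dataclasses import dataclass

REF_BITS = 3  # assumption: field width is not given in this passage


@dataclass
class MapEntry:
    physical_page: int = 0   # zero through 255
    reference: int = 0       # reference history field
    dirty: bool = False      # set on a write access to the page
    absent: bool = True      # page not present in main memory


def smap(map_section, index, entry):
    """Set map entry: store `entry` at map entry address `index`."""
    map_section[index] = entry


def rmap(map_section, index):
    """Read map entry: return the content at map entry address `index`."""
    return map_section[index]


def amap(map_section, index):
    """Age map entry: shift the reference history one position right."""
    map_section[index].reference >>= 1


def touch(entry, write=False):
    """Model a page use: set the high-order reference bit, dirty on write."""
    entry.reference |= 1 << (REF_BITS - 1)
    entry.dirty = entry.dirty or write
```

Under this model a page that is used between successive agings
keeps a large reference value, while an unused page decays
toward zero, which is the kind of information a memory manager
could use when selecting a page for overlay.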
A page fault interrupt provided by the absent bit occurs when
a reference is made to a page that does not currently reside in
main memory or which is not part of the logical address space
of the program or its data. When a page fault is detected, an
interrupt through to the operating system page fault interrupt
handler occurs.
The page fault interrupt sequence includes the following
events:
1. An address reference is made to a page that is absent from
physical memory (absent bit = "one").
2. The page fault interrupt occurs. The interrupt handler
microcode places an interrupt parameter indicating the map
number and the logical page number in a memory location known
to the operating system. Then the current environment is saved
in an interrupt stack marker in memory.
3. The page fault interrupt handler executes. If the page
fault occurred because of a reference outside the logical
address space of the program, then the program is terminated
with an error condition. On the other hand, if a page fault
occurred because the logical page was absent from physical main
memory (but present in secondary memory), an operating system
process executes to read the absent page from the secondary
memory (usually disc) to an available page in primary memory.
That physical page information and a zero absent bit are
inserted into the map entry. When this memory management
function completes, the environment that caused the page fault
is restored.
4. The instruction previously causing the page fault is
reexecuted. Since the absent bit in the map entry of the
logical page has now been set to a "zero", a page fault will
not occur, the page address is translated to the physical page
just brought in from secondary memory, and the instruction
completes.
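
The four events above can be illustrated with a small software
model. This is a sketch under simplifying assumptions and not
the microcode or operating system described here: the map is a
dictionary in which a missing entry stands for an absent bit of
"one", secondary memory is a dictionary called `disc`, and the
names `PageFault`, `read_word` and `read_with_paging` are
hypothetical.

```python
class PageFault(Exception):
    """Raised when a reference hits a page whose absent bit is one."""


def read_word(address, page_map, memory):
    """Translate and read one word; a missing map entry means absent."""
    logical_page, offset = divmod(address, 1024)
    physical_page = page_map.get(logical_page)
    if physical_page is None:                  # event 1: absent page
        raise PageFault(logical_page)          # event 2: fault raised
    return memory[physical_page][offset]


def read_with_paging(address, page_map, memory, disc, free_pages):
    """Service a page fault, then re-execute the access."""
    try:
        return read_word(address, page_map, memory)
    except PageFault as fault:                 # event 3: handler runs
        logical_page = fault.args[0]
        if logical_page not in disc:           # outside logical address space
            raise RuntimeError("program terminated: invalid reference")
        physical_page = free_pages.pop()       # an available main memory page
        memory[physical_page] = disc[logical_page]
        page_map[logical_page] = physical_page  # absent bit now zero
        return read_word(address, page_map, memory)  # event 4: re-execute
```

The second call to `read_word` cannot fault again because the
map entry has just been filled in, mirroring the re-execution
of the faulting instruction.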
24
As noted above, the I/O channel has access to the memory through its
own port.

Data transfers to and from memory by the I/O channel are via the
system data map. That is, the sixteen bit logical addresses provided
by the I/O channel are translated to an eighteen bit physical address
by means of the system data map.

Thus, the mapping scheme allows I/O access to more words of physical
memory than its address counter would normally allow.

In one specific embodiment of the present invention 262,144 words of
physical memory (for an eighteen bit address) can be accessed with
only a sixteen bit logical address by going through the map. The extra
address information (the physical page information) is contained in
the map and is supplied by the operating system before each I/O
transfer is initiated.
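
The arithmetic of this translation can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `translate_io_address` and the representation of the system data map as a list of sixty-four physical page numbers are hypothetical, and the sixteen bit address is assumed to divide into a six bit logical page number and a ten bit page offset.

```python
PAGE_OFFSET_BITS = 10  # ten low order bits: offset within a page


def translate_io_address(logical_address, system_data_map):
    """Map a sixteen bit logical address to an eighteen bit physical address.

    `system_data_map` is a hypothetical list of 64 eight bit physical
    page numbers, one per logical page.
    """
    if not 0 <= logical_address < 1 << 16:
        raise ValueError("logical address must fit in sixteen bits")
    logical_page = logical_address >> PAGE_OFFSET_BITS
    offset = logical_address & ((1 << PAGE_OFFSET_BITS) - 1)
    physical_page = system_data_map[logical_page]
    return (physical_page << PAGE_OFFSET_BITS) | offset
```

With an eight bit physical page number the result ranges over 2^18 = 262,144 words, although the channel supplies only sixteen address bits.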
12
As will become more apparent from the detailed description to follow,
the present invention is also readily extendible to a twenty bit
physical address.

Fig. 34 is a block diagram showing details of the memory 107 of a
processor module 33 and showing also connections from the memory 107
to the CPU 105 and the I/O channel 109 of that processor module.

As illustrated in Fig. 34, the memory system 107 provides access ports
for both the CPU 105 and the I/O channel 109 to the memory 107, and
the I/O channel 109 therefore is not required to access the memory
through the CPU 105.

The memory 107 includes map memory control logic 401 which controls
initiation and completion of access to physical memory modules 403.

The memory 107 also includes a data path section 405 containing
registers (as indicated by the legends in Fig. 34 and described in
detail below) which supply data to be written to memory and which hold
data read from memory.

The memory 107 also includes a map section 407. The map section 407
includes logical address registers from both the CPU and the channel
and a map storage 409 from which physical page numbers are obtained.

The map section 407 thus contains a processor memory address (PMA)
register 411 and a channel memory address (CMA) register 129.

These two registers are connected to an address selector 415.

The address selector 415 is connected to the map 409 by a logical page
address bus 417, and the address selector 415 is also connected
directly to the memory modules by a page offset bus 419.
23
As indicated by the numerals 8 and 10 adjacent to the buses 417 and
419, the logical page address bus 417 transmits the eight high order
bits to the map 409 for translation to a physical page number, and the
page offset bus 419 transmits the ten low order bits (of an eighteen
bit address from the address selector 415) to the memory modules 403.

An output bus 421 supplies the physical page address to the modules
403. This output bus 421 contains the translated eight high order bits
for the address of the physical page.

The data path section 405 contains the following registers: a
processor memory data (PMD) register 423; a channel memory data (CMD)
register 425; a next instruction (NI) register 431; a memory data (MD)
register 433; and a channel data (CD) register 125.

The outputs of the PMD and CMD registers are supplied to a data
selector 427. This data selector 427 has an output bus 429 which
supplies data to be written to memory in the modules 403.

Data read out from one of the memory modules 403 is read into one of
the three data registers NI, MD and CD over a bus 437.

As illustrated in Fig. 34, the map memory control logic 401 is also
connected with each of the memory modules 403 by a bus 439. The bus
439 comprises command lines which initiate read or write operations,
completion signals from the memory modules, and error indicators or
flags.
27
With reference now to Fig. 35, the map section 407 includes, in
addition to the map 409, a map page register 441, a map output latch
443, a map memory data (MMD) register 445, a map data selector 447, a
map parity generator 449, a map parity checker 451, reference bit
logic 453, and dirty bit logic 455.

The map memory control logic 401 is shown in Fig. 35 as associated
with the map section 407 by control signal lines 457.

The map memory control logic 401 controls the loading of registers and
selection of registers by the selectors, controls (in conjunction with
map absence and parity error outputs) the initiation of memory modules
403 operations, and provides interrupts to the CPU 105 (as indicated
by the page fault and map parity error interrupt signals indicated by
the legends in Fig. 35), all as will be described in more detail
below.
17
In a particular embodiment of the invention the memory system shown in
Figs. 34 and 35 utilizes a physical page address field of eight bits
and a page offset of ten bits which combine to give a total of
eighteen bits. As noted above, the numbers 8, 10, 12, 13, 14 and 18
which are not in parentheses on certain bus lines in Fig. 34 and Fig.
35 relate to this specific eighteen bit implemented embodiment of the
present invention. However, the memory system is easily expandable to
a twenty bit implemented embodiment (with a physical page address of
ten bits), and this is indicated by the numbers (10), (12), (14),
(15), (16) and (20) which are within parentheses on the same bus lines
of Fig. 35.

Fig. 36 illustrates the organization of logical memory in four
separate and distinct logical address areas 459, 461, 463 and 465.
These four logical address areas are: user data area 459; system data
area 461; user code area 463; and system code area 465.

Fig. 36 also illustrates the four map sections corresponding to the
logical address areas.

Thus, the user data map section 467 corresponds to the logical user
data address area 459, the system data map section 469 corresponds to
the logical system data address area 461, the user code map section
471 corresponds to the logical user code address area 463 and the
system code map section 473 corresponds to the logical system code
address area 465.

As also illustrated in Fig. 36, each map section has sixty-four
logical page entries (page zero through page sixty-three), and each
map entry comprises sixteen bits (as illustrated by the enlarged
single map entry in Fig. 36).

As indicated by the legends associated with the enlarged map entry
shown in Fig. 36, each map entry comprises a ten bit physical page
number field, a single parity bit P, a reference history field
comprising three reference bits R, S and T, a single dirty bit D, and
a single absent bit A.

The physical page number field provided by the ten high order bits
provides the physical page number corresponding to the logical page
called for by the program.

The parity bit P is always generated as odd parity to provide a data
integrity check on the map entry contents.
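
As a brief illustration of odd parity, the following Python sketch computes the parity bit for a group of data bits and checks a stored group. The function names `odd_parity_bit` and `parity_ok` are hypothetical, and the sketch does not model which bit positions of the map entry are covered by the parity tree of Fig. 35.

```python
def odd_parity_bit(data_bits: int) -> int:
    """Return the parity bit that makes the total count of one bits odd."""
    ones = bin(data_bits).count("1")
    return 0 if ones % 2 == 1 else 1


def parity_ok(data_bits: int, parity_bit: int) -> bool:
    """Check a stored group: data plus parity must hold an odd number of ones."""
    return (bin(data_bits).count("1") + parity_bit) % 2 == 1
```

A single bit that changes after the entry is written makes the count of ones even, so `parity_ok` returns false for that entry.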

The reference history field bits R, S and T are used by the memory
manager function of the operating system to maintain reference history
information for selecting the least recently used page for overlaying.

The R bit is set to a one by any read or write operation to that
logical page.

The S and T bits are storage bits which are manipulated by the AMAP
(age a map entry) instruction.

The dirty bit D is set to a one by a write access to that logical
page. The operating system uses the dirty bit to determine whether a
data page has been modified since it was last brought in from
secondary memory.

The absent bit A is set to a one by the operating system to flag a
logical page which is absent from main memory but present in secondary
memory or to flag a page which is outside the logical address area of
that user.

The two high order bits for the map entry shown in Fig. 36 are not
used in the specific embodiment of the invention illustrated in the
drawings, but these two bits are used when the full twenty bit
physical addressing is used.

As noted above, three instructions are used by the operating system in
connection with the map. These three instructions are: SMAP, RMAP and
AMAP.

The SMAP instruction is used by the memory manager function of the
operating system to insert data into a map entry like that illustrated
in Fig. 36.
14
The SMAP instruction is implemented by the microprogram 115 (Fig. 12)
in the CPU 105. The microprogram interacts with the map memory control
logic 401 (see Fig. 34), first of all, to select (with the first
instruction parameter) a location in the map 409 and then, second, to
insert in that location the second instruction parameter, the new map
entry data.

In operation, and referring to Fig. 35, in the first step in the
sequence the microprogram 115 loads the new map entry data into the
processor memory data (PMD) register 423.

In the next step in the sequence, the map address, including two high
order bits for map selection, is loaded into the processor memory
address (PMA) register 411.

At this point the two instruction parameters containing the map entry
address and the data to be inserted have been loaded in their
respective registers 411 and 423.

Next, the microprogram 115 in the CPU 105 initiates a map write
operation sequence of the map memory control logic 401. This map write
operation sequence is initiated after any previous memory operations
have been completed.

The steps noted above in the operation sequence have all been
performed by the microprogram (the firmware).

The remaining actions of the SMAP instruction are performed under the
control of the map memory control logic. Thus, the remaining actions
are all performed automatically by hardware.

In the map write operation sequence, the map address is transmitted
from the PMA register through the address selector 415 over the bus
417 to the map 409. Only the eight high order bits (the map select and
map address) are used in this operation.
26
The two high order bits specify the map selection: whether user data,
system data, user code or system code.

The ten low order bits of the logical address bus from the address
selector (ASEL) 415 (which bits are the offset within a page for a
memory read or write access) are not used in this operation.

As the map is being addressed as described above, the new map data is
transmitted from the PMD register 423 through the map data selector
447 to the map parity generator 449 and to the map 409. The map parity
generator computes odd parity on the new map data and supplies this
parity bit to the map.

Now, at this point, the map memory control logic 401 generates a map
write strobe signal (on one of the lines indicated by 457 in Fig. 35)
to the map 409 which causes the new data and parity to be written into
the selected map section at the specific map entry selected by the
logical page address on the bus 417.
19
This completes the SMAP instruction sequence.

At the end of this SMAP instruction the proper map section has been
selected, the particular logical page entry on that map section has
been selected, the data and computed odd parity have been supplied to
the map, and the map write strobe has caused that data to be written
at the desired map entry.

The set map instruction (SMAP) is used by the operating system to
initialize each logical page entry in each of the four map sections as
required.

One use of the set map instruction is therefore to insert a physical
page address for a logical page to provide for translation of logical
page numbers to physical page numbers after a page has been swapped in
from secondary memory.

Another use of the set map instruction is to set on an absent bit for
a logical page swapped out to secondary memory.
14
The read map (RMAP) instruction is used by the memory manager function
of the operating system to examine the content of a map entry.

In this RMAP instruction the microprogram 115 in the CPU 105 interacts
with the map memory control logic 401 to select (with the instruction
parameter) a location in the map 409 and to return to the register
stack 112 (see Fig. 12), as a result, the content of that map entry.

In the operation of the read map (RMAP) instruction, referring to Fig.
35, the microprogram 115 loads the map address, including the two high
order bits for the map selection, into the PMA register 411. The
microprogram 115 then initiates a map read operation sequence of the
map memory control logic 401.

This sequence is then carried out by the hardware, and in this
sequence the map address is transmitted from the PMA register 411
through the address selector 415 to the map 409. Again, only the map
select and page address bits are used in this operation.

The content of the selected map entry is transmitted from the map 409
to the map parity checker 451 (see Fig. 35) and to the map output
latch 443. The map parity checker 451 compares the parity bit from the
map entry with the odd parity computed on the data.
16
If the parity is incorrect, the map address is loaded into the map
page register 441; and the map parity error signal sets an error flag
which causes a map parity error interrupt to the CPU 105.

Otherwise, in the case of correct parity, the map entry data is loaded
from the map output latch 443 into the map memory data (MMD) register
445.

Finally, the RMAP instruction microprogram returns the data in the map
memory data (MMD) register 445 to the register stack 112 (see Fig. 12)
as the result of the instruction.

At the end of the read map (RMAP) instruction the proper map section
has been selected, the particular logical page entry on that map
section has been selected, and the content of that map entry has been
read out from the map and returned as an instruction result to the
CPU's register stack.

The uses of the RMAP instruction include the following.

The main function of this read map (RMAP) instruction is to allow the
operating system to examine the reference history field and dirty bit
of a map entry (see the map entry format shown in Fig. 36) to
determine a page for overlaying (as will become more apparent from the
description of the operation to follow).

The read map (RMAP) instruction is also used in diagnostics to
determine whether the map storage is functioning properly.
21
The age map (AMAP) instruction is used by the memory manager function
of the operating system to maintain useful reference history
information in the map. This reference history information is
maintained in the map by map entries (the R, S and T bits of the map
entry format shown in Fig. 36) within a map section which are
typically "aged" after each page fault interrupt occurrence in that
map section.

This AMAP instruction has just a single parameter which is the map
address specifying the map location to be aged.

In the operation of the age map (AMAP) instruction, the microprogram
115 in the CPU 105 selects a map location with the instruction map
address parameter. The microprogram 115 loads the map address
parameter into the PMA register just as in the RMAP instruction.

At this point a map read operation sequence of the map memory control
logic 401 is initiated, and this sequence proceeds identically as in
the RMAP instruction described above.

The microprogram 115 (Fig. 12) reads the content of the map entry from
the MMD register 445 (Fig. 35), extracts the reference history field
(the R, S and T bits 10, 11 and 12 shown in Fig. 36), shifts the field
right one position, and reinserts the field to form the new map entry
data. Thus, a zero has been entered in the R bit, the R bit has been
shifted into the S bit, the S bit has been shifted into the T bit, and
the old T bit is lost.
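
The aging step can be illustrated by the following Python sketch, which operates only on the three bit reference history field. The function names `age_reference_history` and `mark_referenced` are hypothetical; the field is treated as a three bit number with R as the high order bit, an arrangement consistent with the binary values given for the field in this description.

```python
def age_reference_history(rst: int) -> int:
    """Shift the three bit R, S, T field right one position.

    R is the high order bit, so R moves into S, S moves into T,
    the old T is dropped and R becomes zero.
    """
    if not 0 <= rst <= 0b111:
        raise ValueError("reference history field is three bits wide")
    return rst >> 1


def mark_referenced(rst: int) -> int:
    """Set the R bit, as any read or write to the logical page does."""
    return rst | 0b100
```

For instance, a field of binary 100 ages to binary 010, and a later reference to the same page then yields binary 110.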
.
Now the microprogram 115 takes the modified map entry and loads this
new data into the PMD register 423 (Fig. 34) and writes the new map
entry data back into the selected map entry (similar to the SMAP
sequence).

This completes the age map (AMAP) instruction.

As a result of the age map (AMAP) instruction, a map entry has been
read from the map, its reference history field has been shifted, and
this modified entry has been reinserted into the selected map
location.

As previously noted, the R bit is set to one by any memory reference
to the corresponding logical page, so that when this bit is a one, it
is an indication that this page has been used since the last set map
(SMAP) or age map (AMAP) instruction.
14
This setting of the R bit in conjunction with the age map (AMAP)
instruction provides a means for maintaining frequency of use
information in the reference history field of the map.

The reference history fields of all of the map entries in a given map
are typically aged after a page fault interrupt. Thus, the value of
the three bit reference field in a map entry is an indication of the
frequency of access since the previous three page fault interrupts.
26
For example, a binary value of seven (all three reference bits set at
one) indicates accesses in each of the intervals between the preceding
page fault interrupts.

A binary value of four in the reference history field (the R bit set
at one and the S and T bits set at zero) indicates an access in the
interval since the last page fault interrupt and indicates that there
were no accesses in the intervals previous to the most recent page
fault interrupt.

As a final example, a binary value of zero for the three bit reference
field indicates that that logical page has not been accessed in any of
the three intervals since the last three page fault interrupts.

Thus, the higher the binary number represented by the three bit
reference history field, the higher the frequency of recent accesses
to that logical page.

This reference history information is maintained so that when it is
necessary to select a page for overlay, a page which has been
infrequently used in the recent past can be identified. A page
infrequently accessed in the recent past is likely to continue that
behavior, and that page will therefore probably not have to be swapped
back into memory after being overlaid.

This frequency of use history is used by the memory manager function
of the operating system to select infrequently used pages for overlay
so as to minimize swapping from secondary memory to implement an
efficient virtual memory system.
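
One way such a selection might be made is sketched below in Python. This is a hypothetical policy offered for illustration only: the description states that infrequently used pages are selected but does not set out the selection rule, so the function name `select_overlay_candidate`, the data layout, and the preference for a clean page on ties are all assumptions.

```python
def select_overlay_candidate(entries):
    """Pick the resident page with the lowest reference history value.

    `entries` maps a logical page number to a (rst, dirty) pair, where
    `rst` is the three bit reference history field and `dirty` is the D
    bit. Preferring a clean page on ties is an assumption of this sketch.
    """
    if not entries:
        raise ValueError("no resident pages to choose from")
    return min(entries, key=lambda page: (entries[page][0], entries[page][1], page))
```

Under this assumed rule a page whose field is zero is chosen ahead of a page whose field is four or seven.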

As noted above, memory may be accessed by the CPU or by the I/O
system.

The action of the memory system and map during a CPU memory access
sequence will now be described. The access sequence is similar for the
various CPU memory accesses such as writing data, reading data, or
reading instructions from memory.

The CPU memory access sequence is started either by the CPU
microprogram 115 or by the CPU instruction-fetch logic. In either
event, the CPU 105 loads an eighteen bit logical address into the PMA
register 411 and initiates a data read, data write, or instruction
read operation sequence of the map memory control logic 401.

The eighteen bit logical address is composed of two high order logical
address space select bits and sixteen low order bits specifying a
location within that logical address space. The two select bits may be
specified by the CPU microprogram 115 or may be automatically
generated in the CPU, based on the contents of the instruction (I) and
environment (E) registers.

The eighteen bit logical address also includes, in addition to the two
high order logical address select bits, six bits which specify the
logical page within the selected map and ten low order bits which
specify the offset within the page in the selected map.
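
The division of the eighteen bit logical address into its three fields can be sketched in Python as follows. The names `LogicalAddress` and `split_logical_address` are hypothetical, and the sketch makes no assumption about which select value corresponds to which of the four maps.

```python
from typing import NamedTuple


class LogicalAddress(NamedTuple):
    map_select: int    # two high order bits: which of the four maps
    logical_page: int  # six bits: page within the selected map
    offset: int        # ten low order bits: word within the page


def split_logical_address(address: int) -> LogicalAddress:
    """Split an eighteen bit logical address into its three fields."""
    if not 0 <= address < 1 << 18:
        raise ValueError("address must fit in eighteen bits")
    return LogicalAddress(
        map_select=address >> 16,
        logical_page=(address >> 10) & 0x3F,
        offset=address & 0x3FF,
    )
```

The first two fields together form the eight bits carried by the bus 417, and the third field forms the ten bits carried by the bus 419.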

In the data read, data write, or instruction read operation sequence
of the map memory control logic 401, after any previous map or memory
operations have completed, the eighteen bit address in the PMA
register 411 (Fig. 35) is transmitted through the address selector 415
to the buses 417 and 419 (see Figs. 34 and 35).

The bus 419 transmits the page offset portion of the address. This
page offset portion of the address is transmitted directly to the
physical memory modules 403 (Fig. 34) by the bus 419.

The bus 417 transmits the logical page address portion (which must be
translated to a physical page address) to the map 409.

The map entry selected by the logical page address is read out from
the map 409 to the map memory control logic 401 (Fig. 34), the map
parity checker 451 (Fig. 35), and the map output latch 443.

If the absent bit is a one, the logical page address is loaded into
the map page register 441, a page fault interrupt signal is
transmitted to the CPU 105, and the map memory control logic 401
terminates the memory access sequence.

Similarly, if the parity checker 451 detects incorrect parity in the
map entry, the logical page address is loaded into the map page
register 441, a map parity error signal is transmitted to the CPU, and
the memory access sequence is terminated.

Otherwise, if there is no error, the physical page address is
transmitted from the map output latch 443 over the bus 421 to the
physical memory modules 403; and the map memory control logic 401
issues a command over the bus 439 to cause the selected memory module
403 to perform a read or write operation.

In a CPU write operation the data to be written is transmitted from
the PMD register 423 through the data selector 427 to the memory
module over the bus 429.

While the memory module is performing a read or write operation, the
map memory control logic 401 causes the map entry data to be modified
and rewritten.

The map entry data, without the parity bit P or the reference bit R,
is transmitted from the map output latch 443 to the dirty bit logic
455 (see Fig. 35) and to the map data selector 447.

In this operation the physical page field of a map entry (shown in
enlarged detail in the lower righthand part of Fig. 36) and the S and
T bits of the reference field and the absent bit are always rewritten
without modification.

If a CPU data write operation is being performed, the dirty bit D
supplied to the map data selector is set to a one by the dirty bit
logic 455. Otherwise, the dirty bit is not modified.

The reference bit R supplied to the map data selector by the reference
bit logic 453 is set to a one in either a read or a write operation.

The physical page field and the S, T and A bits are not modified, as
noted above.

The map data selector 447 supplies this new map data to the parity
generator 449 and to the map 409.

An odd parity bit P is generated from the new data by the parity
generator 449 (see Fig. 35).

A map write strobe from the map memory control logic 401 then causes
the new data and parity to be written into the map entry selected by
the logical page address bus 417.

Thus, the logical page has been translated through the map entry, and
the map entry has been rewritten with updated parity, reference, and
dirty bits.
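
The access sequence just described can be summarized by the following Python sketch. It is a software illustration of behavior that the embodiment performs in hardware, and several points are assumptions: the names `MapEntry`, `PageFault`, `MapParityError` and `access` are hypothetical, the parity check is represented by a stored flag `parity_ok` rather than by a computed parity bit, and the order in which the absent and parity conditions are tested is chosen arbitrarily.

```python
from dataclasses import dataclass


@dataclass
class MapEntry:
    physical_page: int
    rst: int = 0            # reference history field, R is the high order bit
    dirty: bool = False     # D bit
    absent: bool = False    # A bit
    parity_ok: bool = True  # stands in for the map parity checker result


class PageFault(Exception):
    pass


class MapParityError(Exception):
    pass


def access(map_section, logical_page, offset, write=False):
    """Translate one access and update the R and D bits of the map entry."""
    entry = map_section[logical_page]
    if entry.absent:
        raise PageFault(logical_page)       # access sequence is terminated
    if not entry.parity_ok:
        raise MapParityError(logical_page)  # access sequence is terminated
    entry.rst |= 0b100                      # R is set by any read or write
    if write:
        entry.dirty = True                  # D is set only by a write
    return (entry.physical_page << 10) | offset
```

The physical page field, the S and T bits and the absent bit are left untouched by `access`, which mirrors the rewriting of those fields without modification.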
27
When the physical memory module 403 completes its read or write
operation, it sends a completion signal to the map memory control
logic 401 over the bus 439 (see Fig. 34).

In a read operation the memory module 403 gates the memory data to the
bus 437 (Fig. 34).

In a data read operation sequence the data is loaded into the MD
register 433 (Fig. 34) for use by the CPU 105.

In an instruction read operation sequence the data is loaded into the
NI register 431 (Fig. 34) for subsequent execution by the CPU 105.

The CPU memory accesses of data read, data write and instruction read
are thus completed as described above.

An I/O channel access to read or to write data to memory proceeds
similarly to a CPU memory access as described above except for the
following.

The channel memory address (CMA) register 129 (Fig. 34) is used to
provide the logical address, and this register always specifies the
system data map 469 (see Fig. 35).

The channel memory data (CMD) register 425 (Fig. 34) is used to supply
data to memory in a write operation.

The channel data (CD) register 125 (Fig. 34) is used to receive data
from memory in a read operation.

In an I/O channel 109 memory access, the access is always a read or
write data to memory access, and there is no instruction read access
as in the case of a CPU access.

In addition, map parity and absent conditions are transmitted to the
I/O channel 109 if they occur in an I/O channel access to memory.

As noted at several points above, either semiconductor memory or core
memory is used for the memory modules 403.
13
When the memory is core memory, errors are detected by a parity error
detection system. The parity error detection system for core memory
modules is effective to detect all single bit errors. Conventional
parity error generation and checking techniques are used, and details
of the core memory will therefore not be illustrated.

The probability of failures in semiconductor memory is great enough to
justify an error detection and correction system, and the present
invention provides a detection and correction system which
incorporates a six bit check field for each sixteen bit data word.
Figs. 37-41 and related Table 1 (set out below) illustrate details of
an error detection and correction system used when the memory modules
403 are constructed with semiconductor memory.

The six bit check field error detection and correction system of the
present invention is, as will be described in detail below, capable of
detecting and correcting all single bit errors and is also capable of
detecting all double bit errors. In addition, most errors of three or
more bits are detected.

While the error detection and correction system will be described with
reference to a semiconductor memory, it should be noted that the
system is not limited or restricted to semiconductor memory but is
instead useful for any data storage or transmission application.
14
An important benefit of the error detection and correction system of
the present invention results from the fact that not only are single
bit errors corrected but also that any subsequent double bit errors
are reliably detected after a single bit has failed.

The multiprocessor system incorporating the error detection and
correction system of the present invention is therefore tolerant of
single failures and can be operated with single bit failures in
semiconductor memory until such time as it is convenient to repair the
memory.
27
The error detection and correction system utilizes a systematic linear
binary code of Hamming distance four. In this code each check bit is a
linear combination of eight data bits (as shown in Fig. 38). Also,
each data bit is a component of exactly three check bits (as also
shown in Fig. 38). An advantage of this code is that uniform coverage
of the data bits by the check bits is obtained.

The error correction and detection system embodies a syndrome decoder
which provides the combination of fast logic speed and low parts
count.

In initial summary, the error detection and correction system of the
present invention operates to add six check bits to each data word
written into storage. When a data word is subsequently read out of
memory, the check field portion of the storage word is used to
identify or to detect the loss of information in that word since the
time it was stored.
17
In semiconductor memory there are two possible mechanisms for loss of
information (error). One is hard failure of a memory device which
makes that device permanently unable to retain information, and the
other is soft failure in which electrical noise can cause a transient
loss of information.

The detection of errors is accomplished by a check bit comparator
which produces a six bit syndrome. The syndrome is the difference
between the check field obtained from the stored word and the check
field which would normally correspond to the data field obtained from
the stored word.
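
In modulo-two arithmetic the difference of two bits is their exclusive-or, so the syndrome can be illustrated by the following Python sketch. The function name `syndrome` and the packing of the six check bits into an integer are assumptions made for illustration; the sketch does not reproduce the parity trees of the check bit comparator.

```python
def syndrome(stored_check: int, recomputed_check: int) -> int:
    """Six bit syndrome: modulo-two difference of the two check fields."""
    return (stored_check ^ recomputed_check) & 0x3F
```

A syndrome of zero means that the stored check field agrees with the check field recomputed from the stored data field.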

This syndrome is then analyzed (decoded) to determine whether an error
has occurred and, if an error has occurred, to determine what type of
correction is required.

In the case of single data bit errors, the syndrome decoder output
causes a data bit complementer to invert the bit that was in error;
and this corrected data is supplied as the output of that memory
module.

If the syndrome decoder indicates a multiple error, then the fact of
the multiple error is communicated to the map memory control section
by means of one of the control and error lines to cause an interrupt
to the CPU.
14
With reference now to Fig. 37, the memory module 403 includes a timing
and control logic section 475 and a semiconductor storage array 477.
The storage array 477 provides storage for 32,768 words of twenty-two
bits each. Each word has (as illustrated in Fig. 37) a sixteen bit
data field and a six bit check field.

Each semiconductor memory module 403 also has, as illustrated in Fig.
37, an output latch 479, a check bit generator 481, a check bit
comparator 483, a syndrome decoder 485 and a data bit complementer
487.

The memory module 403 interfaces to the rest of the system through the
signal and data paths illustrated in Fig. 37. These paths include: 429
(data to memory bus), 439 (control and error lines to the map memory
control section 401), 419 and 421 (physical address bus), and 437
(data from memory bus). These signal and data paths are also shown in
Fig. 34.

With continued reference to Fig. 37, the content of the output latch
479 is transmitted on a bus 489 to both the check bit comparator 483
and the data bit complementer 487.

The output of the check bit comparator 483 is transmitted on a
syndrome bus 491 to both the syndrome decoder 485 and the timing and
control logic section 475.

The output of the syndrome decoder 485 is transmitted on a bus 493 to
the data bit complementer 487.

Other outputs of the syndrome decoder 485 are transmitted on lines 495
and 497 to the timing and control logic section 475. The line 495
transmits a SINGLE ERROR (correctable error) signal, and the line 497
transmits a MULTIPLE ERROR (uncorrectable error) signal.

The timing and control logic 475 provides control signals on a control
bus 499 to the semiconductor storage array 477 and also to the output
latch 479.
27
The output of the check bit generator 481 is transmitted to the
storage array 477 by a bus 501.

With reference to Fig. 38, the check bit generator 481 includes six
separate eight-bit parity trees 503.

As shown in Fig. 39, the check bit comparator 483 includes six
separate nine-bit parity trees 505.

As shown in Fig. 40, the syndrome decoder 485 includes a decoder
section 507 and a six-bit parity tree 509.

With continued reference to Fig. 40, the outputs of the decoder
section 507 and six-bit parity tree 509 are combined in error
identification logic indicated generally by the reference numeral 511.

As illustrated in Fig. 41, the bit complementer 487 comprises sixteen
exclusive-or gates 513.

In operation the sixteen bit data word is supplied by the bus 429 to
the storage array 477 and also to the check bit generator 481 (see
Fig. 37).
The check bit generator 481, as best
illustrated in Fig. 38, generates six check bits C0
through C5 by means of the six eight-bit parity trees
503.

As also illustrated in Fig. 38, the eight-bit
parity tree 503 farthest to the left generates
check bit zero (C0) as specified by the logic equation
for C0 as set out at the lower part of Fig. 38. Check
bit zero (C0) is therefore the complement of the modulo-two
sum of data bits 8 through 15.

By way of further example, the check bit C3
is generated by an eight bit parity tree 503 as specified
by the logic equation for C3 set out at the lower part
of Fig. 38. Check bit three (C3) is the modulo-two
sum of data bits 0, 1, 2, 4, 7, 9, 10 and 12 as shown
by the logic equation and as also illustrated by the
connections between the eight bit parity tree and the
corresponding data bit lines in the logic diagram in the
upper part of Fig. 38.

Similarly, each of the other check bits is
generated by a modulo-two addition of eight data bits
as illustrated in the logic diagram in the top part of
Fig. 38.
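
The generation of the six check bits can be sketched in software. The following Python fragment is an illustration only and does not reproduce the circuit of Fig. 38. The data bits assigned to C0 and C3 are those stated above; the assignments for C1, C2, C4 and C5, and the complemented outputs for C2 and C4, are inferred from the syndrome codes of Table 1 and are assumptions. The names COVERAGE, COMPLEMENTED and check_bits are hypothetical.

```python
# Data bits feeding each eight-bit parity tree 503.
# Rows 0 and 3 are stated in the text; rows 1, 2, 4 and 5 are inferred
# from the data-bit syndromes of Table 1 (an assumption of this sketch).
COVERAGE = [
    (8, 9, 10, 11, 12, 13, 14, 15),   # C0
    (3, 4, 5, 6, 7, 13, 14, 15),      # C1
    (1, 2, 5, 6, 7, 11, 12, 15),      # C2
    (0, 1, 2, 4, 7, 9, 10, 12),       # C3
    (0, 2, 3, 4, 6, 8, 10, 14),       # C4
    (0, 1, 3, 5, 8, 9, 11, 13),       # C5
]
# Trees with a complemented output. C0 is stated in the text; C2 and C4
# are inferred from the all-0's and all-1's rows of Table 1 (assumption).
COMPLEMENTED = (0, 2, 4)


def check_bits(data):
    """Return [C0, ..., C5] for a list of sixteen data bits [D0, ..., D15]."""
    if len(data) != 16:
        raise ValueError("expected sixteen data bits")
    bits = []
    for i, taps in enumerate(COVERAGE):
        parity = sum(data[j] for j in taps) % 2       # modulo-two sum
        bits.append(parity ^ 1 if i in COMPLEMENTED else parity)
    return bits
```

Under these assumptions a data word of all zeros is stored with the check field 1, 0, 1, 0, 1, 0, so that a location containing twenty-two zeros does not appear to be a valid word.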
To accomplish a memory write operation, these
six check bits, as thus generated by the check bit
generator 481, and the sixteen data bits, as transmitted
on the data bus 429, are entered in a particular location
in the storage array 477. As illustrated in Fig. 37, the
six check bits and the sixteen data bits are entered
in the storage array 477 under the control of the timing
and control logic 475 and the physical address information on
the physical address bus 419, 421.

Every word stored in the storage array 477
has a six bit check field generated for that word in
a similar manner. This check field is retained with
the stored word in the storage array 477 until the time
when that location in the storage array is subsequently
accessed for a read operation.

When a particular word is to be read out of
the storage array 477, the timing and control logic 475
and the address on the physical address bus 419, 421
cause the content of the selected storage location to
be loaded into the output latch 479. The output latch
is twenty-two bits wide to accommodate the sixteen data
bits and the six bit check field.

From the output latch 479 the sixteen data
bits and the six bit check field are transmitted by a
bus 489 to the check bit comparator 483.

As illustrated in Fig. 39, the check bit
comparator 483 forms six syndrome bits S0 through S5.

Each syndrome bit is the output of a nine-bit
parity tree 505 whose inputs are eight data bits and one
check bit. Each syndrome bit is related to a correspondingly
numbered check bit. Thus, check bit zero is used
only for computing syndrome bit zero, check bit one is
used only for computing syndrome bit one, and so forth.

As an example, syndrome bit zero (S0) is the
complement of the modulo-two sum of check bit zero and
data bits 8 through 15 (as shown in the logic equation
at the bottom of Fig. 39).

Similarly, each of syndrome bits S1 through
S5 is generated from the modulo-two sum of a corresponding
check bit and eight of the data bits, as shown by the
connections to the particular data bit lines for each
syndrome bit in the logic diagram part of Fig. 39.
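
The formation of the syndrome bits may be sketched as follows. This is a minimal Python illustration, not the circuit of Fig. 39. Only the S0 equation is stated above; the data bits used for S1 through S5 and the complemented outputs for S2 and S4 are inferred from Table 1 and are assumptions, and the names COVERAGE, COMPLEMENTED and syndrome are hypothetical.

```python
# Data bits feeding each nine-bit parity tree 505 (the ninth input is
# the correspondingly numbered check bit). Row 0 is stated in the text;
# the other rows are inferred from Table 1 (assumption).
COVERAGE = [
    (8, 9, 10, 11, 12, 13, 14, 15),   # S0
    (3, 4, 5, 6, 7, 13, 14, 15),      # S1
    (1, 2, 5, 6, 7, 11, 12, 15),      # S2
    (0, 1, 2, 4, 7, 9, 10, 12),       # S3
    (0, 2, 3, 4, 6, 8, 10, 14),       # S4
    (0, 1, 3, 5, 8, 9, 11, 13),       # S5
]
# Complemented outputs: S0 is stated in the text, S2 and S4 are assumed.
COMPLEMENTED = (0, 2, 4)


def syndrome(data, check):
    """Return [S0, ..., S5] from sixteen data bits and six check bits."""
    bits = []
    for i, taps in enumerate(COVERAGE):
        parity = (sum(data[j] for j in taps) + check[i]) % 2   # nine inputs
        bits.append(parity ^ 1 if i in COMPLEMENTED else parity)
    return bits
```

With these assumptions, a word of all zeros stored with check field 1, 0, 1, 0, 1, 0 reads back with a zero syndrome, and inverting data bit 15 alone gives the syndrome 111,000 listed for D15 in Table 1.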

The presence or absence of errors and the types
of errors, if any, are identified by interpreting the
value of the six syndrome bits on the bus 491.

Table 1 enumerates the sixty-four possible
values of the six bit syndrome code and gives the
interpretation for each possible value.

                           TABLE 1
                        SYNDROME CODES

S0 S1 S2 S3 S4 S5  ERROR IN          S0 S1 S2 S3 S4 S5  ERROR IN
 0  0  0  0  0  0  (No Error)         1  0  0  0  0  0  C0
 0  0  0  0  0  1  C5                 1  0  0  0  0  1  (Double)
 0  0  0  0  1  0  C4                 1  0  0  0  1  0  (Double)
 0  0  0  0  1  1  (Double)           1  0  0  0  1  1  D8
 0  0  0  1  0  0  C3                 1  0  0  1  0  0  (Double)
 0  0  0  1  0  1  (Double)           1  0  0  1  0  1  D9
 0  0  0  1  1  0  (Double)           1  0  0  1  1  0  D10
 0  0  0  1  1  1  D0                 1  0  0  1  1  1  (Double)
 0  0  1  0  0  0  C2                 1  0  1  0  0  0  (Double)
 0  0  1  0  0  1  (Double)           1  0  1  0  0  1  D11
 0  0  1  0  1  0  (Double)           1  0  1  0  1  0  (Multi-All 0's)
 0  0  1  0  1  1  (Multi)            1  0  1  0  1  1  (Double)
 0  0  1  1  0  0  (Double)           1  0  1  1  0  0  D12
 0  0  1  1  0  1  D1                 1  0  1  1  0  1  (Double)
 0  0  1  1  1  0  D2                 1  0  1  1  1  0  (Double)
 0  0  1  1  1  1  (Double)           1  0  1  1  1  1  (Multi)
 0  1  0  0  0  0  C1                 1  1  0  0  0  0  (Double)
 0  1  0  0  0  1  (Double)           1  1  0  0  0  1  D13
 0  1  0  0  1  0  (Double)           1  1  0  0  1  0  D14
 0  1  0  0  1  1  D3                 1  1  0  0  1  1  (Double)
 0  1  0  1  0  0  (Double)           1  1  0  1  0  0  (Multi)
 0  1  0  1  0  1  (Multi-All 1's)    1  1  0  1  0  1  (Double)
 0  1  0  1  1  0  D4                 1  1  0  1  1  0  (Double)
 0  1  0  1  1  1  (Double)           1  1  0  1  1  1  (Multi)
 0  1  1  0  0  0  (Double)           1  1  1  0  0  0  D15
 0  1  1  0  0  1  D5                 1  1  1  0  0  1  (Double)
 0  1  1  0  1  0  D6                 1  1  1  0  1  0  (Double)
 0  1  1  0  1  1  (Double)           1  1  1  0  1  1  (Multi)
 0  1  1  1  0  0  D7                 1  1  1  1  0  0  (Double)
 0  1  1  1  0  1  (Double)           1  1  1  1  0  1  (Multi)
 0  1  1  1  1  0  (Double)           1  1  1  1  1  0  (Multi)
 0  1  1  1  1  1  (Multi)            1  1  1  1  1  1  (Double)

THUS (NUMBER OF 1's IN SYNDROME)
0 BITS - NO ERROR                    3 BITS - DATA BIT OR MULTI
1 BIT  - CHECK BIT ERROR             4 BITS - DOUBLE
2 BITS - DOUBLE                      5 BITS - MULTI
                                     6 BITS - DOUBLE

For example, if all of the syndrome bits S0
through S5 are zero, there is no error in either the
data field or the check field. This is the condition
illustrated at the upper left of Table 1.

The presence or absence of errors and the
type of error is summarized at the bottom of Table 1.

In this summarization, when all six syndrome
bits are zero, there is no error, as noted above.

If only one of the six syndrome bits is on,
this indicates an error in the corresponding check bit.
It should be noted at this point that check bit errors
are single bit errors which do not require correction
of the data word.

As also illustrated in the summary at the
bottom of Table 1, when two bits are on there is a
double bit error; and the two errors could be (a) one
error in a data bit and one error in a check bit or
(b) two errors in the data bits or (c) two errors in
the check bits.

When three bits are on in the six bit syndrome
code, that condition can correspond to either a single
data bit error or a multiple error.

As an example of a single bit error in a data
bit, see the syndrome code 111,000 indicating a single
bit error in data bit D-15 in the lower right hand part
of Table 1. As will be described in more detail below,
the syndrome decoder 485 (Fig. 37 and Fig. 40) will cause
the incorrect value of data bit 15 to be inverted
(corrected).

The syndrome decoder 485 provides two functions.
First, the syndrome decoder 485 provides an
input to the data bit complementer 487 (see Fig. 37) by
way of the bus 493 in the case of single data bit errors,
which input causes the erroneous bit to be inverted within
the data bit complementer 487.

Secondly, the syndrome decoder 485 provides
one of two error signals in the event of an error.

A single data or check bit error is transmitted
on the SINGLE ERROR line 495 to the timing and control
logic 475.

A multiple error indication is transmitted on
the MULTIPLE ERROR line 497 to the timing and control
logic 475.

A MULTIPLE ERROR signal is generated in the
case of all double bit errors and most three or more
bit errors. This MULTIPLE ERROR signal, as noted above,
causes an interrupt to the CPU 105 (see Fig. 34).

The construction of the syndrome decoder 485
is shown in detail in Fig. 40. The syndrome decoder 485
comprises a decoder 507, a six bit parity tree 509 and
error identification logic 511.

The decoder 507 decodes five of the six syndrome
bits (bits S1 through S5) to provide sufficient information
(thirty-two outputs) to generate both the error types
(whether single errors or double or multiple errors) and
the sixteen output lines required for inversion of data
bit errors in the sixteen data bits. These sixteen output
lines required for inversion of data bit errors are
indicated generally by the bus 493 and are identified
individually by T0 through T15 in Fig. 40.
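
The derivation of the lines T0 through T15 from syndrome bits S1 through S5 may be sketched as below. This Python fragment is an assumption-laden illustration rather than the decoder of Fig. 40: the five-bit patterns are the S1 through S5 columns of the data-bit rows of Table 1, and the sketch assumes that each T line depends on those five bits alone. The names DATA_CODES and correction_lines are hypothetical.

```python
# S1..S5 patterns for data bits D0..D15, taken from Table 1.
DATA_CODES = [
    "00111", "01101", "01110", "10011",
    "10110", "11001", "11010", "11100",
    "00011", "00101", "00110", "01001",
    "01100", "10001", "10010", "11000",
]


def correction_lines(syndrome):
    """Return [T0, ..., T15] for syndrome bits [S0, ..., S5].

    Only S1..S5 are examined, as described for the decoder 507.
    """
    code = "".join(str(bit) for bit in syndrome[1:])
    return [int(code == pattern) for pattern in DATA_CODES]
```

For the syndrome 111,000 only T15 is asserted. Under the stated assumption a double error sharing the same S1 through S5 pattern would also assert a T line, but such a syndrome has an even number of bits on and is reported as a multiple error.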

The decoder 507 outputs which are not connected
to the OR gate 512 correspond to errors in the six check
bits. Errors in the six check bits do not need to be
corrected (since the errors are not data bit errors),
and these outputs of the decoder are therefore not used.

The remaining outputs (the outputs connected
to the OR gate 512) represent double or multiple errors
and are so indicated by the legends in Fig. 40. All of
these cases are collected by the OR gate 512 and are one
component of the multiple error signal on the line 497
at the output of the error identification logic 511.

As also illustrated in Fig. 40, the syndrome
decoder 485 includes a parity tree 509 which forms the
modulo-two sum of syndrome bits S0 through S5.

The resulting even or odd output of the parity
tree 509 corresponds to the error classes shown at the
bottom of Table 1.

Thus, the EVEN output 514 corresponds to syndromes
containing no bits on, two bits on, four bits on, or six
bits on.

The EVEN syndrome corresponding to no bits on
(no error) is excluded from the MULTIPLE ERROR output signal
497 by an AND gate 515 which excludes the zero syndrome
case (the other input from decoder 507 to the gate 515).

Syndromes containing two bits on, four bits on
or six bits on are thus the only remaining EVEN syndromes
which in combination with the MULTIPLE signal constitute
multiple errors as transmitted on the output line MULTIPLE
ERROR (497).

An output is desired on the SINGLE ERROR
indicator line 495 only for single bit errors. Since
the odd output on the line 510 of the parity tree 509
corresponds to one bit on (check bit error), three bits
on (data bit error or multibit errors), or five bits on
(multibit errors) in the six-bit syndrome (as indicated in
the summary at the bottom of Table 1), the odd output on
line 510 must be qualified so that only single bit errors
are transmitted through the logic 511 to the line 495.
Those three-bit syndrome codes corresponding to multibit
errors and all of the five-bit syndrome codes must therefore
be excluded so that only the single bit errors are
transmitted on the line 495. This is accomplished by an
inverter 517 and an AND gate 519.

A SINGLE ERROR output is generated on the line
495 for syndrome codes containing a single one bit (check
bit errors) and also for those syndrome codes containing
three one bits corresponding to data bit errors. As noted
above, the odd output of the parity tree 509 indicates
syndromes containing one, three or five bits on. The
inverter 517 and the AND gate 519 exclude multiple error
three bit syndromes and all five bit syndromes. Thus,
the SINGLE ERROR output 495 includes only single check
bit errors and single data bit errors. Single check
bit errors do not need to be corrected, and single data
bit errors are corrected by the bit complementer 487.

The logic equations for MULTIPLE ERROR and
for SINGLE ERROR listed on the bottom of Fig. 40
represent the operation described above.
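
The operation of the error identification logic 511 can be paraphrased in software. The equations of Fig. 40 are not reproduced in the text, so the Python sketch below is a reconstruction from the description and from Table 1 and should be read as an assumption; the names DATA_CODES, CHECK_CODES and error_signals are hypothetical.

```python
# S1..S5 patterns of the data-bit syndromes D0..D15, read from Table 1.
DATA_CODES = {
    "00111", "01101", "01110", "10011", "10110", "11001", "11010", "11100",
    "00011", "00101", "00110", "01001", "01100", "10001", "10010", "11000",
}
# S1..S5 patterns of the zero syndrome (shared with C0) and of C1..C5.
CHECK_CODES = {"00000", "10000", "01000", "00100", "00010", "00001"}


def error_signals(syndrome):
    """Return (SINGLE ERROR, MULTIPLE ERROR) for syndrome bits [S0..S5]."""
    code = "".join(str(bit) for bit in syndrome[1:])     # decoder 507 input
    odd = sum(syndrome) % 2 == 1                         # parity tree 509
    # Decoder outputs assumed to be collected by the OR gate 512.
    multi = code not in DATA_CODES and code not in CHECK_CODES
    zero = code == "00000"                               # zero-syndrome decode
    multiple_error = (not odd and not zero) or multi     # AND gate 515, then OR
    single_error = odd and not multi                     # inverter 517, AND gate 519
    return single_error, multiple_error
```

Evaluated over all sixty-four syndromes, this sketch raises SINGLE ERROR for the twenty-two check bit and data bit codes, raises MULTIPLE ERROR for the forty-one double and multi codes, and raises neither signal for the zero syndrome.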
There are some errors of three or more bits
which are not identified as multiple errors and in fact
can be incorrectly identified as no errors or as single
bit errors (correctable errors). However, the normal
pattern of error generation is such that the deterioration
of storage is normally detected before three bit errors
occur. For example, the normal pattern of deterioration
of memory storage would first involve a single bit error
from noise or component failure, then would later involve
a double bit error from additional failure, etc.; and the
double bit errors would be detected before the three or
more bit errors could develop.
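
A concrete instance of such a misidentified error can be found from Table 1. Because every syndrome bit is a modulo-two sum, the syndrome produced by several simultaneous bit errors is the modulo-two sum of the individual syndromes. The Python sketch below is illustrative only and the name syndrome_of_errors is hypothetical.

```python
from functools import reduce
from operator import xor

# Syndromes (S0..S5, S0 as the most significant bit) of single errors
# in data bits D0..D15, copied from Table 1.
DATA_SYNDROMES = [int(s, 2) for s in (
    "000111", "001101", "001110", "010011",
    "010110", "011001", "011010", "011100",
    "100011", "100101", "100110", "101001",
    "101100", "110001", "110010", "111000",
)]


def syndrome_of_errors(bad_bits):
    """Syndrome observed when all the listed data bits are inverted."""
    return reduce(xor, (DATA_SYNDROMES[i] for i in bad_bits), 0)
```

Simultaneous errors in data bits 0, 2 and 3 give the syndrome 011,010, which Table 1 attributes to data bit 6, so that bit would be inverted in error. Errors in data bits 0, 1 and 2 give 000,100, which appears to be an error in check bit C3 alone, and errors in data bits 0, 2, 3 and 6 give the zero syndrome.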

The function of the data bit complementer 487
(see Fig. 37) is to invert data bit errors as detected
by the syndrome decoder 485.

Fig. 41 shows details of the construction of
the bit complementer 487. As illustrated in Fig. 41,
the bit complementer 487 is implemented by exclusive-or
gates 513. Each of these gates 513 inverts a given data
bit on a line 489 when a corresponding decoder output
on a line 493 is asserted.

The corrected output is then transmitted on an
output line 437 of the bit complementer 487 as the output
of that physical memory module.
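
The action of the sixteen exclusive-or gates 513 amounts to a bitwise exclusive-or of the latched data bits with the lines T0 through T15. A minimal Python sketch follows; the function name complement and the list representation of the bits are hypothetical.

```python
def complement(data, t_lines):
    """Model the sixteen exclusive-or gates 513.

    data    -- sixteen data bits from the output latch (bus 489)
    t_lines -- sixteen decoder outputs T0..T15 (bus 493)
    Bit i of the result is inverted exactly when t_lines[i] is 1.
    """
    if len(data) != 16 or len(t_lines) != 16:
        raise ValueError("expected sixteen data bits and sixteen T lines")
    return [d ^ t for d, t in zip(data, t_lines)]
```

When no T line is asserted the data pass through unchanged, which is the case for error-free words and for single check bit errors.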
This completes the description of the error
detection and correction system.

The memory system of the present invention
provides a number of significant features.

First of all, the memory map provides four
separate and distinct logical address spaces--system
code, system data, user code and user data--and provides
for a translation of logical addresses within these
address spaces to physical addresses.

The division of logical memory into four
address spaces isolates the system programs from the
actions of the user programs and protects the system
programs from any user errors. The division into four
logical address areas also provides for a separation of
code and data for both user code and data and system
code and data. This provides the benefits of non-modifiable
programs.

There are specific fields within each map
entry for this page address translation and for other
specific conditions.

One field permits translation of logical page
addresses to physical page addresses.

Another field provides an absence indication.
This field is an absence bit which allows implementation
of a virtual memory scheme where logical pages may reside
in a secondary memory.

Another field is a reference history field.
This reference history field allows frequency of use
information to be maintained for use by the memory manager
function of the operating system to make the virtual
memory scheme an efficient scheme. Frequently accessed
pages are retained in primary memory, and infrequently
used pages are selected for necessary overlaying.

A dirty bit field is maintained in each entry
of the system data map and the user data map so that
unmodified data pages can be identified. The unmodified
data pages so identified are not swapped out to secondary
memory because a valid copy of that data page is already
present in secondary memory.

The memory system includes map memory control
logic which automatically maintains the reference and
dirty bit information as CPU and I/O channel accesses
are made to memory.
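
The role of the map entry fields described above may be sketched as follows. This Python fragment is a loose illustration only. The page size, the width of the reference history field and the manner in which a reference is recorded are not given in this section and are hypothetical, as are the names MapEntry, PageAbsent, access and must_swap_out.

```python
from dataclasses import dataclass

PAGE_WORDS = 1024      # hypothetical page size, not stated in this section
HISTORY_BITS = 3       # hypothetical width of the reference history field


class PageAbsent(Exception):
    """Raised when the absence bit is set (page is in secondary memory)."""


@dataclass
class MapEntry:
    physical_page: int     # translation field
    absent: bool = False   # absence bit
    history: int = 0       # reference history field
    dirty: bool = False    # dirty bit (data maps only)


def access(entry, offset, write=False):
    """Translate one access and maintain the reference and dirty fields."""
    if entry.absent:
        raise PageAbsent()
    entry.history |= 1 << (HISTORY_BITS - 1)    # record the reference
    if write:
        entry.dirty = True
    return entry.physical_page * PAGE_WORDS + offset


def must_swap_out(entry):
    """Only modified pages need be copied back before being overlaid."""
    return entry.dirty
```

In this sketch a page that has only been read can be overlaid without a transfer to secondary memory, which is the saving the dirty bit is described as providing.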
The memory system of the present invention provides
for three CPU instructions--SMAP, RMAP and AMAP--
which are used by the operating system's memory manager
function to maintain and to utilize information in the
map.

The memory system of the present invention
includes a dual port access to the memory. The memory
can be accessed separately by the CPU and by the I/O
channel. Accesses to memory by the I/O channel do
not need to involve the CPU, and the CPU can be performing
other functions during the time that an I/O
data transfer is being made into or out of memory.

The operation of the dual port access to
the memory also involves arbitration by the map memory
control logic in the event that the CPU and the I/O
channel attempt a simultaneous access to the memory.
In the case of simultaneous access, the I/O channel
is given priority and the CPU waits until that particular
I/O channel access has completed.
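
The priority rule for simultaneous requests can be stated in a few lines. The Python sketch below is only a behavioral illustration of the rule and not of the map memory control logic itself; the function name grant and its return values are hypothetical.

```python
def grant(cpu_request, io_request):
    """Return which port receives the next memory access.

    The I/O channel wins whenever it is requesting, so a simultaneous
    CPU request simply waits for a later access.
    """
    if io_request:
        return "io"
    if cpu_request:
        return "cpu"
    return None
```

A waiting CPU request is granted on the first subsequent access for which the I/O channel is not requesting.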

Physical memory is expandible by the modular
addition of physical memory modules.

The physical memory modules incorporate, in
the case of semiconductor memory, error detection and
correction under certain conditions. Single errors
are detected and corrected so that operation of the
CPU and I/O channel can be continued even in the event
of a transient or permanent failure within the physical
memory module. The error detection and correction
system comprises a twenty-two bit word within the storage
medium. Sixteen bits represent the data and six bits
provide an error detection and correction check field.
The six bit check field allows the detection and
correction of all single errors and the detection of
all double errors.
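
The stated property of the six bit check field can be checked exhaustively against Table 1. The Python sketch below is illustrative only and the name verify_code is hypothetical. It confirms that the twenty-two single-bit syndromes are distinct and non-zero, and that no pair of errors produces a zero syndrome or the syndrome of a single error.

```python
from itertools import combinations

# Single-error syndromes (S0..S5) copied from Table 1.
CHECK_SYNDROMES = ["100000", "010000", "001000",
                   "000100", "000010", "000001"]          # C0..C5
DATA_SYNDROMES = [
    "000111", "001101", "001110", "010011",
    "010110", "011001", "011010", "011100",
    "100011", "100101", "100110", "101001",
    "101100", "110001", "110010", "111000",
]                                                         # D0..D15


def verify_code():
    """Return (singles_correctable, doubles_detectable)."""
    columns = [int(s, 2) for s in CHECK_SYNDROMES + DATA_SYNDROMES]
    singles = set(columns)
    singles_ok = len(singles) == 22 and 0 not in singles
    # The syndrome of a double error is the XOR of two single syndromes.
    doubles = {a ^ b for a, b in combinations(columns, 2)}
    doubles_ok = 0 not in doubles and not (doubles & singles)
    return singles_ok, doubles_ok
```

Both results are true for the codes of Table 1, since every single-error syndrome has an odd number of bits on and every double-error syndrome therefore has an even, non-zero number of bits on.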
The core memory includes parity for the
detection of single errors.
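
A single parity bit suffices to detect, but not to correct, a single error. The Python sketch below illustrates the principle only; the sixteen bit word width and the choice of odd parity are assumptions not stated in this section, and the names parity_bit and parity_ok are hypothetical.

```python
def parity_bit(word):
    """Parity bit that makes the total number of one bits odd.

    Odd parity over a 16-bit word is an assumption of this sketch.
    """
    ones = bin(word & 0xFFFF).count("1")
    return (ones + 1) % 2


def parity_ok(word, parity):
    """True when the stored word and its parity bit are consistent."""
    return (bin(word & 0xFFFF).count("1") + parity) % 2 == 1
```

Inverting any one bit of the stored word makes parity_ok false, whereas inverting two bits leaves it true, which is why parity alone is described as detecting single errors.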
In the overall multiprocessor system of the
present invention each processor module incorporates its
own primary memory system.

Since each processor module has its own memory
system, problems of shared memory in a multiprocessing
system are avoided.

The problems of shared memory in a multiprocessing
system include reduced memory bandwidth available to a
particular processor because of contention, and this
reduction of available memory bandwidth becomes more
severe as additional CPU's are combined with a single
shared memory.

The problems of interlocks relating to the
communication between CPU's by means of areas within a
shared memory are avoided by the present invention which
does not include shared memory and which does, instead,
provide for communication between processor modules by
an interprocessor bus communication system.

An additional problem of shared memory is
that a failure in the shared memory can result in
simultaneous failure of some or all of the CPU's in
the system. That is, in a shared memory system, a
single memory failure can stop all or part of the system;
but a memory failure will not stop the multiprocessor
system of the present invention.

The dual port access by the CPU and the I/O
channel to the memory relies on, and is made possible by,
separate address registers and separate data registers
to and from memory.

The CPU has a specific register (the NI register)
specifically for receiving instructions from memory. This
separate and specific register allows overlapped fetching
of the next instruction during execution of the
current instruction (which may involve the reading of
data from memory). As a result, at the end of a current
instruction, the next instruction can be initiated immediately
without waiting for an instruction fetch.
The map is constructed to provide significantly
faster access than the access to physical main memory.
This provides a number of benefits in the translation of
addresses through the map.

As one result, in the memory system of the
present invention, the map can be rewritten in the time
that the physical memory access is being accomplished.

Because the rewriting is so fast, the rewriting
of the map does not increase memory cycle time.

Also, the high speed at which the map can be
accessed reduces the overall time including page translation
required for a memory access.

Parity is maintained and checked in the actual
map storage itself. This provides immediate indication
of any failure in the map storage before resulting incorrect
operation in the processor module can occur.