Patent 1324678 Summary

(12) Patent:	(11) CA 1324678
(21) Application Number:	1324678
(54) English Title:	DIGITAL SIGNAL PROCESSING APPARATUS
(54) French Title:	APPAREIL DE TRAITEMENT DE SIGNAUX NUMERIQUES
Status:	Expired and beyond the Period of Reversal

Bibliographic Data

(51) International Patent Classification (IPC):	G06F 9/46 (2006.01)
(72) Inventors :	MURAKAMI, TOKUMICHI (Japan) KAMIZAWA, KOH (Japan) KINJO, NAOTO (Japan)
(73) Owners :	MITSUBISHI DENKI KABUSHIKI KAISHA
(71) Applicants :
(74) Agent:	KIRBY EADES GALE BAKER
(74) Associate agent:
(45) Issued:	1993-11-23
(22) Filed Date:	1989-02-17
Availability of licence:	N/A
Dedicated to the Public:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	No

(30) Application Priority Data:

Application No.	Country/Territory	Date
298722/88	(Japan)	1988-11-26
298723/88	(Japan)	1988-11-26
37921/88	(Japan)	1988-02-19
63695/88	(Japan)	1988-03-18

Abstracts

English Abstract

ABSTRACT
A digital signal processing apparatus which is used for
the computation of coding image signals or the like. The
apparatus comprises a plurality of signal processing means
arranged in parallel and control means which assigns loads to
the signal processing means so that the signal processing
means have even computation volumes. Alternatively, an
address generator is provided for each of the data sets
entered independently.

Claims

Note: Claims are shown in the official language in which they were submitted.

CLAIMS:
1. A digital signal processing apparatus comprising a
multiprocessor module including a plurality of signal
processors in connection, each of said signal processors
including an instruction memory which stores a sub-program
that describes a functional process of a signal processing
operation that is a combination of the functional processes
for data blocks formed of a plurality of data, said
instruction memory being accessible for writing from outside
and connected through an internal bus with a data memory
which is used for the execution of said signal processing
operation;
a digital signal processor which executes any of said
functional processes in units of data blocks in accordance
with a sub-program stored in said instruction memory;
block forming means which forms a signal processing block
of one unit by appending, for each data block, control
parameters including the type of functional process to be
executed, a block address indicative of the position in time
and spatial domains and the order in time and spatial domains
of said data block and information indicative of post-process
attributes of said data block;
activation control means which analyses said control
parameters to activate each unit of said functional process
indicated by said parameter;
an interrupt controller which controls the timing of
execution of said activation control means in response to an
external interrupt;
status indication means which indicates to the outside as
to whether said functional process is in execution;
a data input bus which reads out a unit of signal
processing block from an external data memory by way of said
internal bus for the execution of said functional process;
at least one dual-port memory capable of reading and
writing independently on both ports, with one port being
connected to said internal bus and adapted to write a unit of
signal process block resulting from said functional process,
and with another port being opened to the outside;
and an external bus controller including a bus contention
control means which connects said internal bus to a common bus
consisting of at least one data bus provided externally only
when the common bus is not used by an external device and
implementing data transfer for the unit of signal processing
block or arbitrary quantity of data;
and a transfer control means which performs data transfer
asynchronously in units of signal processing block by linking
adjoining ones of said signal processors in a serial and/or
parallel arrangement by connecting the externally-opened port
of said dual-port memory in one processor to said data input
bus of another processor; an input frame buffer of dual-port
memory which forms a digital signal into blocks and writes the
signal on one port in units of frame or block on a real time
basis and implements data input by connecting another port to
said input bus in the first-stage signal processor in said
multiprocessor module; at least one common memory which is
connected to said common bus and adapted to transact data in
units of signal processing block or arbitrary number of data
with all of said signal processors; a task table which
memorizes said status indication means in said signal
processors; an output controller which reads out the last-
processed signal processing block written in said dual-port
memory in the last-stage signal processor in said
multiprocessor module, rearranges the block in accordance with
said processing parameters so as to be in compliance with the
position in time and spatial domains and the order in time and
spatial domains, stores the rearrangement result temporarily
in a buffer memory and outputs the buffer memory contents at a
constant quantity per unit time; a data flow controller which
scans the contents of said task table at a constant interval,
determines the process assignments of said signal processors
on the basis of feedback information such as the degree of
occupancy of said buffer memory indicated by said output
controller, and activates the interrupt controller of each
said signal processor; and a writing means which newly
generates said processing parameters for each signal
processing block entered newly to said input frame buffer and
writes the parameters in corresponding positions of input
frame buffer.
2. A digital signal processing apparatus according to
claim 1, wherein said data flow controller comprises judgment
means which judges the processing status of each processor by
scanning said task table at a constant interval; and first
control means which determines on the basis of the result
provided by said judgment means as to whether each signal
processing module can process a next signal processing block,
and, if possible, issues an interrupt signal to said interrupt
controller to initiate processing or, if impossible, directs a
signal processing module, which can have a process, to
transfer said signal processing block.
3. A digital signal processing apparatus according to
claim 2, wherein said judgment means performs scanning, in
case of parallel processing in a constant period, in a time
length which is the input period of said signal processing
block multiplied by the number of parallel processings, or, in
case of serial processing, in a time length which is the input
period divided by an integer greater than or equal to one, and
implements matching with real time by being in synchronism
with input data frames.
4. A digital signal processing apparatus according to
claim 1, 2 or 3, wherein said data flow controller includes
second control means, in which a piece of image data is
divided into small rectangular blocks to form said data
blocks, the size of data block is made equal to a maximum or
minimum size dealt with by said functional processes and
positions of small blocks in said piece of image data are used
as spatial position information of said process parameters,
and, in case of inter-frame coding for a moving image, a frame
memory for storing a coded previous frame image is said common
memory and a signal processing block processed by a signal
processor unit is written in the position of the common memory
by way of said common bus thereby to form feedback data, and a
new image frame is processed by making reference to said
feedback data from another signal processing block by way of
the common bus; and third control means which, if feedback
data of the previous frame has not yet written in the position
in the common memory, dictates an execution wait for the
process.
5. A digital signal processing apparatus according to
claim 4, wherein a plurality of digital signal processors are
connected in parallel through a local common bus, a local
signal processor is formed of a local data flow controller
which performs only process activation control for said
digital signal processors, a local common memory which can be
accessed commonly by said digital signal processors, and a
plurality of digital signal processors, a plurality of local
signal processors being connected to complete said signal
processors in a hierarchical structure.
6. A digital signal processing apparatus according to
claim 5, wherein said multiprocessor module is one in number.
7. A digital signal processing apparatus according to
claim 5, wherein said multiprocessor module is more than one
in number.

Description

Note: Descriptions are shown in the official language in which they were submitted.

DIGITAL SIGNAL PROCESSING APPARATUS
This is a division of copending Canadian Patent
Application Serial No. 591,354 which was filed on
February 17, 1989.
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to a digital signal
processing apparatus which performs computational processes
for digital signals.
Prior art digital signal processing systems will be
discussed in conjunction with the drawings.
SUMMARY OF THE INVENTION
An object of the preferred embodiment of the present
invention is to provide a digital signal processing apparatus
which uses the multiprocessor parallel configuration to its
maximal processing ability.
Another object of the preferred embodiment of the present
invention is to provide a digital signal processing apparatus
which works efficiently with fewer processors and less memory
capacity, while ensuring the latitude of the signal
processing algorithm.
Still another object of the preferred embodiment of the
present invention is to provide a digital signal processing
apparatus which eliminates the need of address control for
storing the intermediate result and for transfer to the memory,
thereby executing a fast 3-input 1-output operation.
A further object of the preferred embodiment of the
present invention is to provide a motion compensative
operation method which, in constructing the motion compensator
of an image coding system with a digital signal processing
apparatus, requires fewer parallel processors,
thereby enhancing the simplicity and compactness of the
hardware structure.
In accordance with one aspect of the invention there is
provided a digital signal processing apparatus comprising a
multiprocessor module including a plurality of signal
processors in connection, each of said signal processors
including an instruction memory which stores a sub-program
that describes a functional process of a signal processing
operation that is a combination of the functional processes
for data blocks formed of a plurality of data, said
instruction memory being accessible for writing from outside
and connected through an internal bus with a data memory which
is used for the execution of said signal processing operation;
a digital signal processor which executes any of said
functional processes in units of data blocks in accordance
with a sub-program stored in said instruction memory; block
forming means which forms a signal processing block of one
unit by appending, for each data block, control parameters
including the type of functional process to be executed, a
block address indicative of the position in time and spatial
domains and the order in time and spatial domains of said data
block and information indicative of post-process attributes of
said data block; activation control means which analyses said
control parameters to activate each unit of said functional
process indicated by said parameter; an interrupt controller
which controls the timing of execution of said activation
control means in response to an external interrupt;
status indication means which indicates to the outside as to
whether said functional process is in execution; a data input
bus which reads out a unit of signal processing block from an
external data memory by way of said internal bus for the
execution of said functional process; at least one dual-port
memory capable of reading and writing independently on both
ports, with one port being connected to said internal bus and
adapted to write a unit of signal process block resulting from
said functional process, and with another port being opened to
the outside; and an external bus controller including a bus
contention control means which connects said internal bus to a
common bus consisting of at least one data bus provided
externally only when the common bus is not used by an external
device and implementing data transfer for the unit of signal
processing block or arbitrary quantity of data; and a transfer
control means which performs data transfer asynchronously in
units of signal processing block by linking adjoining ones of
said signal processors in a serial and/or parallel arrangement
by connecting the externally-opened port of said dual-port
memory in one processor to said data input bus of another
processor; an input frame buffer of dual-port memory which
forms a digital signal into blocks and writes the signal on
one port in units of frame or block on a real time basis and
implements data input by connecting another port to said input
bus in the first-stage signal processor in said multiprocessor
module; at least one common memory which is connected to said
common bus and adapted to transact data in units of signal
processing block or arbitrary number of data with all of said
signal processors; a task table which memorizes said status
indication means in said signal processors; an output
controller which reads out the last-processed signal
processing block written in said dual-port memory in the
last-stage signal processor in said multiprocessor module,
rearranges the block in accordance with said processing
parameters so as to be in compliance with the position in time
and spatial domains and the order in time and spatial domains,
stores the rearrangement result temporarily in a buffer memory
and outputs the buffer memory contents at a constant quantity
per unit time; a data flow controller which scans the contents
of said task table at a constant interval, determines the
process assignments of said signal processors on the basis of
feedback information such as the degree of occupancy of said
buffer memory indicated by said output controller, and
activates the interrupt controller of each said signal
processor; and a writing means which newly generates said
processing parameters for each signal processing block entered
newly to said input frame buffer and writes the parameters in
corresponding positions of input frame buffer.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention taken in conjunction with the
invention disclosed in copending Canadian Patent Application
Serial No. 591,354 which was filed on February 17, 1989, will
be described hereinbelow with the aid of the accompanying
drawings in which:
Fig. 1 is a block diagram showing the multiprocessor
system of a conventional digital signal processing apparatus;
Fig. 2 is a diagram explaining the assigned areas of the
processors shown in Fig. 1;
Fig. 3 is a block diagram showing the arrangement of
another conventional digital signal processing apparatus;
Fig. 4 is a block diagram showing in detail the
arrangement of the signal processing module shown in Fig. 3;
Fig. 5 is a block diagram showing the algorithm of the
high-efficiency coder for a moving image;
Fig. 6 is a block diagram showing the arrangement of a
third conventional digital signal processing apparatus;
Fig. 7 is a flowchart showing the process of 3-input
arithmetic operation using the digital signal processing
apparatus shown in Fig. 6;
Fig. 8 is a block diagram showing in brief the
arrangement of the image coding transmitter which carries out
the conventional motion compensative operation method;
Fig. 9 is a diagram used to explain the conventional
motion compensative operation method;
Fig. 10 is a flowchart showing the operational process
for detecting a motion vector in the conventional motion
compensative operation method;
Fig. 11 is a block diagram showing the digital signal
processing apparatus based on the first embodiment of this
invention;
Fig. 12 is a diagram explaining the area assignment for
the processors shown in Fig. 11;
Fig. 13 is a block diagram showing the arrangement of the
digital signal processing apparatus formed by connecting in
cascade a plurality of digital signal processors (DSP blocks)
shown in Fig. 11;
Fig. 14 is a diagram showing the concept of process of
each DSP block shown in Fig. 13;
Fig. 15 is a block diagram showing the digital signal
processing apparatus based on the second embodiment of this
invention;
Fig. 16 is a block diagram showing the internal
arrangement of the signal processor shown in Fig. 15;
Fig. 17 is a diagram explaining the concept of control
operation of the digital signal processing apparatus shown in
Fig. 15;
Fig. 18 is a diagram explaining the relation between
parameter data and processing block data in the digital signal
processing apparatus shown in Fig. 15;
Fig. 19 is a diagram showing the correspondence between
data blocks and a frame;
Fig. 20 is a block diagram of the arrangement in which a
plurality of digital signal processors are included in the
digital signal processing apparatus shown in Fig. 15;
Fig. 21 is a block diagram showing the digital signal
processing apparatus based on another embodiment of this
invention;
Fig. 22 is a flowchart showing the operational process of
the digital signal processing apparatus shown in Fig. 21;
Fig. 23 is a flowchart showing an embodiment of the
inventive motion compensative operation method using a digital
signal processing apparatus;
Fig. 24 is a diagram used to explain the method of
intermediate check for the computation of distortion in the
inventive motion compensative operation method; and
Fig. 25 is a diagram showing the arrangement of pixel
samples at sampling points in a block according to the
intermediate check method for the distortion computation.
Fig. 1 shows the multiprocessor system described in an
article entitled "A Real Time Video Signal Processor Suitable
for Motion Picture Coding Applications", IEEE, GLOBECOM '87,
p. 453. In Fig. 1, input data 1 is received by a data
transfer controller 3, and thereafter data 4 are transferred
selectively to digital signal processors 2, i.e. DSP-1 through
DSP-N, in block-1. After being processed by the respective
DSPs in block-1, resultant data 5 is transferred to block-2
and processed by respective DSPs for the next processing step.
Fig. 2(a) shows divided memory areas of the DSPs. For
the simplicity of explanation, shown here is an example of
parallel processing using three DSPs 2, to which process areas
A, B and C are assigned evenly.
In the inter-frame image coding system and the like, it
is a general convention to employ the conditional pixel
supplementary process in which only portions having at least a
certain difference between the input frame and previous frame
are coded and previous frame data is used for the remaining
portions. Accordingly, the volume of computation needed for
the process differs depending on the valid pixel rate even
though the number of pixels in the process area is constant.
The volume of computation or computation time needed is
proportional to the valid pixel rate.
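The dependence of the computation volume on the valid pixel rate can be illustrated with a minimal sketch. The function name, the threshold value and the sample pixel values below are assumptions chosen for illustration only; the cited prior art does not specify them.

```python
def valid_pixel_rate(current, previous, threshold):
    """Return the fraction of pixels whose absolute inter-frame
    difference is at least `threshold` (the "valid" pixels).

    `current` and `previous` are equally sized 2-D lists of pixel values.
    The threshold is a hypothetical parameter, not a value from the source.
    """
    total = 0
    valid = 0
    for cur_row, prev_row in zip(current, previous):
        for cur, prev in zip(cur_row, prev_row):
            total += 1
            if abs(cur - prev) >= threshold:
                valid += 1  # this pixel would be coded
            # otherwise the previous-frame pixel would be reused
    return valid / total if total else 0.0


previous = [[10, 10, 10, 10],
            [10, 10, 10, 10]]
current = [[10, 12, 30, 10],
           [10, 10, 50, 11]]

rate = valid_pixel_rate(current, previous, threshold=8)
print(rate)
```

Two areas with the same number of pixels can therefore require very different amounts of coding work, depending on how many of their pixels exceed the threshold.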
In the inter-frame image coding system or the like,
assuming that the number of valid pixels is shared by all DSPs
to have a distribution EA, EB and EC as shown in Fig. 2(b),
the computation time needed for one block of parallel DSP
configuration is determined from the process time of the DSP
which works for the area B with the largest volume of process
M, and the remaining DSPs which have finished the areas A and
C earlier have idle time.
The conventional digital signal processing apparatus
arranged as described above has its overall process time
determined from the longest process time among DSPs when the
density of information, such as the valid pixel rate, within a
frame is uneven and the distribution of information varies
with time, resulting in a degraded process efficiency per DSP
unit.
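The efficiency loss described above can be sketched numerically. This is an illustrative model only: the valid pixel counts for areas A, B and C and the assumption of a fixed processing time per valid pixel are hypothetical, not figures from the cited article.

```python
def parallel_block_time(valid_pixels, time_per_valid_pixel=1.0):
    """Model one block of parallel DSPs with a fixed area assignment.

    Each DSP's process time is assumed proportional to the number of
    valid pixels in its area. The block finishes only when the slowest
    DSP finishes, so every other DSP idles for the difference.
    """
    times = [n * time_per_valid_pixel for n in valid_pixels]
    block_time = max(times)                       # longest DSP decides
    idle = [block_time - t for t in times]        # wasted time per DSP
    # Efficiency: useful work divided by total DSP time spent.
    efficiency = sum(times) / (block_time * len(times)) if block_time else 1.0
    return block_time, idle, efficiency


# Hypothetical valid pixel counts EA, EB, EC for areas A, B and C.
block_time, idle, efficiency = parallel_block_time([200, 600, 100])
print(block_time, idle, efficiency)
```

With these sample counts the DSP assigned to area B determines the block time, and half of the total DSP time is spent idle.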
Fig. 3 is a diagram showing, as an example, the
arrangement of another digital signal processing apparatus
disclosed in an article entitled "Realtime Video Signal
Processor Module", in the proceedings of ICASSP '87, pp. 1961 -
1964, April 1987, Dallas, U.S.A. In the figure, indicated by
1 is an input terminal, 4 is an input bus for distributing
input data on the input terminal 1, 28a is a feedback bus for
distributing the result of previous process, and 20 are signal
processing modules each including an input storage 21, a
processing unit 22, an output storage 23 and a timing control
unit 24. Indicated by 25 are wired-OR circuits through which
feedback data on output ports 30 are placed on the feedback
bus 28a, 26 are wired-OR circuits through which output data on
output ports 29 are delivered to the output terminal 5 over
the output bus 5a, 27 are input ports for the input data to
the signal processing module 20, and 28 are input ports for
the feedback data to the signal processing module 20.
Fig. 4 is a block diagram showing in more detail one of
the signal processing modules in Fig. 3. In the figure,
indicated by 221 is an address generator (AGU A), 211 is an
input dual memory (MEM A) which receives data on the input
port 27 over the input bus 4, 212 is an input dual memory
(MEM B) which receives data on the feedback bus 28a by way of
the input port 28, 222 is an address generator (AGU B), 223 is
an X-bus, 224 is a Y-bus, and 225 is a pipeline arithmetic
unit (PAU) having its input terminal EX1 connected to the
X-bus 223 and another input terminal EX2 connected to the
Y-bus 224. Indicated by 226 is a data memory [MEM P(Q)]
having its output connected to the X-bus 223, 227 is an
address generator [AGU P(Q)] having its output connected to
the Y-bus 224 and data memory 226, 228 is a mode register
(MDR) having its output connected to the X-bus 223 and Y-bus
224, and 241 is a Z-bus connected to the inputs of the address
generators 221, 222 and 227, pipeline arithmetic unit 225 and
data memory 226. Indicated by 242 is a sequencer (SEQ), 243
is an instruction memory (IRAM) connected to the output of the
sequencer 242, and 245 is a decoder (DEC) connected to the
output of the instruction memory 243, with the output of the
decoder 245 being connected to the Z-bus 241 and output bus
231. The output bus 231 is connected to the input of the mode
register 228 and the Z-bus 241. Indicated by 232 is an FIFO
memory (MEM C) connected to the output bus 231, 233 is an FIFO
memory (MEM D) connected to the output bus 231, 29 is an output
port of the FIFO memory 232, and 30 is an output port of the
FIFO memory 233.
Fig. 5 is a diagram showing, as an example, the algorithm
of a typical high-efficiency coder for a moving image. In the
figure, indicated by 250 is an input terminal for the input
video signal, 251 is an input frame buffer having at least a
1-frame capacity and having the simultaneous read-write
ability, 252 is an inter-frame subtracter for evaluating the
difference, 253 is a block identifier, 254 is a coder, 255 is
a coding parameter produced by the coder 254, 256 is a
variable-length coder, 257 is a video multiplexer, 258 is a
transmission buffer memory, and 259 is an output terminal for
the coded data. Connected in cascade between the input
terminal 250 and output terminal 259 are the above-mentioned
functional blocks 251 - 254 and 256 - 258. Further indicated
by 260 is a local decoder which receives the coding parameter
255, 261 is an inter-frame adder, 262 is an in-loop filter,
263 is a coding frame memory, 264 is previous coded frame
data, 265 is a motion compensator, 266 is current frame data
fed from the input frame buffer 251 to the motion compensator
265, 267 is motion vector data, 268 is compensated previous
frame data fed from the motion compensator 265 to the
inter-frame subtracter 252 and inter-frame adder 261, 269 is a
feedback signal, and 270 is a coding controller which provides
coding control information for the video multiplexer 257, a
feed-forward signal to the input frame buffer 251, a block
identification control signal 273 to the block identifier 253,
and a coding control signal 274 to the variable-length coder
256.
Next, the operation of the conventional digital signal
processing apparatus will be described in connection with
Fig. 3. This apparatus is intended for moving image
processing and is based on the division parallel processing
system in which a frame is divided into small frames and a
signal processing module 20 is assigned to each of the divided
frame areas.

Initially, each signal processing module 20 operates
autonomously, expending one video frame time to
fetch the divided frame area assigned to it from the input data
transferred frame-wise in raster scanning over the input bus 4
and store the data in the input storage 21. At the same time,
if the process result of the previous frame is needed for the
current process, it expends one video frame time
to fetch data of the assigned area of the frame in the
feedback data from the input port 28 over the feedback bus 28a
and stores the data in the input storage 21.
Upon expiration of one video frame time, the processing
unit 22 performs the prescribed signal processing for the
input data and feedback data stored in the input storage 21,
and stores the result temporarily in the output storage 23.
The feedback data led out of the output storage 23 through the
output port 30 is timed for synchronization with other signal
processing modules 20 and, by being merged into all feedback
data by the wired-OR circuit 25, placed on the feedback bus
28a. Similarly, the output data led out of the output storage
23 through the output port 29 is timed for synchronization
with other signal processing modules 20 and, by being merged
into all output data by the wired-OR circuit 26, delivered to
the output terminal 5 over the output bus 5a.
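The merging of timed module outputs by a wired-OR circuit can be pictured with the following sketch. It rests on an assumption not stated in the source: that each module drives the bus only during its own time slots and presents zero at all other times, so that a bitwise OR of all module outputs reproduces the combined stream. The function name and data values are illustrative.

```python
def wired_or_merge(module_outputs):
    """Merge per-module output streams onto one bus by bitwise OR.

    Each stream has one word per bus time slot. A module is assumed to
    output 0 in every slot that does not belong to its assigned area,
    so the OR of all streams yields the combined frame data.
    """
    length = len(module_outputs[0])
    bus = [0] * length
    for output in module_outputs:
        for slot, word in enumerate(output):
            bus[slot] |= word
    return bus


# Three hypothetical modules, each owning two consecutive time slots.
outputs = [
    [0x11, 0x12, 0x00, 0x00, 0x00, 0x00],
    [0x00, 0x00, 0x21, 0x22, 0x00, 0x00],
    [0x00, 0x00, 0x00, 0x00, 0x31, 0x32],
]
bus = wired_or_merge(outputs)
print([hex(word) for word in bus])
```

The scheme only works if no two modules drive the same slot, which is consistent with the requirement that the modules be timed in synchronism with one another.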
Divided frame areas processed individually by the signal
processing modules 20 are combined back into a video frame.
In this way, area-division parallel processing is
realized. For the reason described above, it is necessary for
all signal processing modules 20 to commence their processes
in complete synchronism with one another. On
this account, the timing control unit 24 provides all sections
of the system with the timing of data input/output and process
commencement in synchronism with the video frame timing which
is the synchronization reference point.
Next, the operation of one signal processing module 20
will be described briefly in connection with Fig. 4. From a
video frame entered frame-wise through the input port 27 in
synchronism with the video frame sync signal, data of the
assigned area is stored in the input dual memory 211. At the
same time, from the coded previous frame data entered through
the input port 28, the portion of the assigned area and its
peripheral data are stored in the input dual memory 212.
The input dual memories 211 and 212 are each made up of a
two-sided memory device with the same structure on both sides,
and each operates such that while one side is being written,
the other side is connected to the X-bus 223 and Y-bus 224 for
reading for the coding process by the pipeline arithmetic unit
225. The read/write sides of the input dual memories 211 and
212 are switched by the above-mentioned video frame sync
signal so that input data of assigned areas on the input ports
27 and 28 are entered frame-wise uninterruptedly.
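The two-sided operation of the input dual memories resembles what is commonly called a ping-pong (double) buffer. The sketch below is a minimal software analogy under that assumption; the class and method names are hypothetical and do not come from the cited article.

```python
class PingPongMemory:
    """Two identical banks: one is written while the other is read.

    `frame_sync()` models the video frame sync signal that swaps the
    roles of the two sides.
    """

    def __init__(self, size):
        self._banks = [[0] * size, [0] * size]
        self._write_bank = 0  # index of the side currently being written

    def write(self, address, value):
        # Incoming frame data always goes to the write side.
        self._banks[self._write_bank][address] = value

    def read(self, address):
        # The arithmetic unit always reads the opposite side.
        return self._banks[1 - self._write_bank][address]

    def frame_sync(self):
        # Swap read and write sides at the frame boundary.
        self._write_bank = 1 - self._write_bank


mem = PingPongMemory(4)
for addr, pixel in enumerate([1, 2, 3, 4]):   # frame n arrives
    mem.write(addr, pixel)
mem.frame_sync()                               # frame boundary
mem.write(0, 9)                                # frame n+1 starts arriving
frame_n = [mem.read(a) for a in range(4)]      # frame n is still intact
print(frame_n)
```

Because writes and reads never touch the same side within a frame period, input can continue without interruption while the previous frame is being processed.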
The data read out to the X-bus 223 and Y-bus 224 are
those stored at data memory addresses indicated to the input
dual memories 211 and 212 by the address generators 221 and
222, which are controlled by signals that the decoder
245 produces by decoding 80-bit horizontal microcode read
out from the address of the instruction memory 243
indicated by the sequencer 242. The data placed on the X-bus
223 and Y-bus 224 are entered in parallel to the pipeline
arithmetic unit 225, which implements a series of signal
processing including coding and local decoding and outputs the
result to the Z-bus 241. Among the process outputs placed on
the Z-bus, the coded output is stored in the FIFO memory 232
and the local decoded output is stored in the FIFO memory 233
by way of the output bus 231.
The FIFO memories 232 and 233 are buffer memories of FIFO
configuration. Feedback data consisting of the output data
and local decoded data are read out of the output ports 29 and
30 at the read control timing for the assigned area produced
from the video frame signal, and a piece of video frame local
decoded data and coded output data in compliance with the
scanning order are produced.
The data memory 226 which is controlled by the output of
the address generator 227 is used as a work memory which is
necessary for the process of the pipeline arithmetic unit 225
and as a table which stores constants. The mode register 228
consists of a register file including registers for loading
immediate values from the decoder 245.
This digital signal processing apparatus is principally
based on the foregoing area division parallel processing, and
is intended such that each signal processing module 20 deals
with a divided frame area independently on a realtime basis.
When the digital signal processing apparatus is intended for
the achievement of a coder as shown in Fig. 5, only portions
excluding the variable-length coder 256, video multiplexer
257, transmission buffer 258 and coding controller 270 can be
realized. Namely, it is not suitable for a continuous process
in one video frame, and is limited to the inter-frame coding
loop process ranging from the input frame buffer 251 to the
block identifier 253, coder 254, local decoder 260, coding
frame memory 263, and to the motion compensator 265 useful for
data completely divisible within a frame.
Since each signal processing module 20 implements the
same process for each frame, the processing program stored in
the instruction memory 243 can be a single program. When a
frame is divided into M areas (M is an integer greater than or
equal to 1), the number of process cycles Nc per pixel which
can be dealt with on a realtime basis by one signal processing
module 20 is given by the following calculation.

Nc = (Mc x Tf) / (Mp x Np)   (clocks/pixel)

where Mc is the frequency of machine cycle (Hz), Tf is the
frame period (sec), Mp is the number of horizontal pixels in
the assigned area, and Np is the number of vertical pixels in
the assigned area.
On this account, if a frame is divided into four areas,
for example, each assigned to one signal processing
module 20, the number of process cycles Nc available per pixel
increases fourfold, and it becomes possible for video signal
processing, which is required to be very fast, to be dealt
with on a realtime basis by an increased number of relatively
slow signal processing modules 20.
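The relation between area division and the cycle budget per pixel can be checked with a short calculation. The following is a minimal sketch only; the function name and the numeric figures (machine cycle frequency, frame rate, frame size) are assumptions chosen for illustration and do not come from the specification.

```python
def cycles_per_pixel(machine_hz, frame_period_s, width_px, height_px):
    """Nc = Mc * Tf / (Mp * Np): machine cycles available per pixel."""
    return machine_hz * frame_period_s / (width_px * height_px)

# Hypothetical figures: 10 MHz machine cycle, 30 frames/s, 352 x 288 frame.
MC_HZ = 10_000_000
TF_S = 1 / 30

whole_frame = cycles_per_pixel(MC_HZ, TF_S, 352, 288)
# The same frame divided into four equal areas, one per module.
quarter_area = cycles_per_pixel(MC_HZ, TF_S, 176, 144)
gain = quarter_area / whole_frame
```

With the area halved in both directions, `gain` comes out as four (up to floating-point rounding), which matches the fourfold increase stated in the text.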
The conventional digital signal processing apparatus
arranged as described above has the following problems in
processing video signals.
(a) For the achievement of very fast processing, a frame
must be divided into numerous small areas; however, certain
signal processing algorithms do not allow independent processing
of areas below a certain minimal division size. Therefore,
realtime processing cannot be achieved by increasing the
parallelism.
(b) Because the load is distributed to the signal
processing modules in a fixed manner, the process time must be
set to meet the longest one when the signal processing modules
have different process times. Therefore, the system has an
unnecessarily high parallelism relative to the processing
capacity.
(c) Data input and data processing each take one frame
time, and data input and output each need a 1-frame buffer
memory, resulting in a longer time lag and an increased memory
capacity. Therefore, the system involves a significant loop
delay in feedback control and the like, and it is difficult to
realize the coding controller 270 in Fig. 5, for example.
(d) Since the system is intended for completely parallel
processing, it cannot perform a process such as scanning the
entirety of one frame horizontally.
Fig. 6 is a block diagram of the conventional digital
signal processing system disclosed in the proceedings
(No. S10-1) of the 1986 annual convention of the communication
department of The Institute of Electronics and Communication
Engineers of Japan. In the figure, indicated by 31 is a
dual-port internal data memory (will be termed 2P-RAM) capable
of reading and writing two sets of data simultaneously, 32 is
an address generator which calculates the address of read data
or write data, 33 is a data bus used for the internal transfer
of data related to computation, 34 and 35 are selectors which
select data in the 2P-RAM 31, 36 is a register which holds
computation data selected by the selector 34, 37 is a register
which holds computation data selected by the selector 35, 38
is a multiplier, 39 is a register which holds the output of
the multiplier 38, 40 is a selector which selects the output
of the register 36 or accumulators (ACC0 - ACC3) 44, 41 is a
selector which selects the output of the register 39 or 37,
42 is an arithmetic/logic unit which performs computations on
the outputs of the selectors 40 and 41, and 43 is a selector
which selects the output of the arithmetic/logic unit 42 or
data in an external data register 46. The accumulators 44 are
used to hold the output of the arithmetic/logic unit 42 for
cumulative computations. The external data register 46
holds data from an external data memory 47. Indicated by 45 is
an external address register which holds address data provided
by the address generator 32 and transfers it to the external
data memory 47.
Next, the operation will be described. This signal
processing system based on a digital signal processor performs
command fetching and decoding for the preset microprogram,
data reading, computation, and computation result writing, in
a parallel pipeline processing mode. The following describes
the operation of 3-input-1-output computation.
The arithmetic/logic unit, multiplier, address generator,
data memories and selectors are controlled in the micro
command mode.
Arithmetic operations for two inputs, including addition,
subtraction, maximum evaluation, minimum evaluation, etc., are
expressed generically by a ⊕ b, and a multiplication
operation for two inputs is expressed generically by a × b,
where a and b are independent data.
The arithmetic operations and multiplication are combined
to form 3-input-1-output operations, and they are defined by
the following expressions:

$$Z_i = (a_i \oplus b_i) \times c_i \qquad (1)$$
$$Z_i = (a_i \times b_i) \oplus c_i \qquad (2)$$

where i = 1 to N, and ai, bi and ci are sets of independent
data stored in the 2P-RAM 31.
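Expressions (1) and (2) can be illustrated with a minimal sketch. The function names `op_then_mul` and `mul_then_op` and the sample data are assumptions for illustration only; the generic two-input arithmetic operation is passed in as an ordinary two-argument function (addition, subtraction, maximum, minimum).

```python
import operator

def op_then_mul(a, b, c, op=operator.add):
    """Expression (1): Zi = (ai op bi) * ci for each i."""
    return [op(ai, bi) * ci for ai, bi, ci in zip(a, b, c)]

def mul_then_op(a, b, c, op=operator.add):
    """Expression (2): Zi = (ai * bi) op ci for each i."""
    return [op(ai * bi, ci) for ai, bi, ci in zip(a, b, c)]

# Three independent data sets of N = 3 elements (illustrative values).
a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 9]
z1 = op_then_mul(a, b, c)            # generic operation chosen as addition
z2 = mul_then_op(a, b, c, op=max)    # generic operation chosen as maximum
```

Here `z1` is `[35, 56, 81]` and `z2` is `[7, 10, 18]`. The sketch evaluates each result in one step, whereas the conventional system described in the text needs two 2-input passes.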
Fig. 7 shows the sequence of processing for implementing the
3-input operation of the form of expression (1) by the digital
signal processing system shown in Fig. 6, for example.
The data address generator 32 sets up the starting
addresses for two data sets A and B, and selects the simple
incremental mode. Then the two data sets A and B are loaded
through the selectors 34 and 35 into the registers 36 and 37.
The selectors 40 and 41 select the registers 36 and 37,
respectively, so that the arithmetic/logic unit 42 implements
the arithmetic operation ai ⊕ bi. The selector 43 selects the
arithmetic/logic unit 42 to hold the operation result
temporarily in one of the accumulators (ACC0 - ACC3) 44, and the
resultant data is sent over the data bus 33 and through the
external data register 46 and stored in the external memory 47,
whose addressing mode is the simple incremental mode because
it is linked to one of the addresses for the 2P-RAM 31
in the address generator 32.
In the subsequent step ST3, the data address generator 32
sets up the starting addresses of the data set C and data set
ai ⊕ bi, and ci data is read out of the 2P-RAM 31 to the
register 36. The selector 35 selects the data bus to load the
data of ai ⊕ bi in the external memory 47 into the register
37. In this case, in order to make the timing of reading
coincide for the data set C and data set ai ⊕ bi, step ST4
needs to expend two cycles of useless command reading for the
external memory in advance.
The two sets of data are multiplied by the
multiplier 38 in step ST5, and the result is stored in the
register 39. In the next cycle, the resultant data is passed
through the arithmetic/logic unit 42 and, after being held
temporarily in one of the accumulators (ACC0 - ACC3) 44,
transferred over the data bus 33 to the 2P-RAM 31.
These operations are carried out in parallel on the basis
of the pipeline process, and the operations from the reading
of the 2P-RAM 31 until the storing of the process result in the
external memory 47 for N data sets will take N + 3
machine cycles in the case of an arithmetic operation.
The steps of operations are listed in the following
Table 1 and Table 2. Table 1 is for the operation of ai ⊕ bi
and the transfer of the result to the external memory 47, and
Table 2 is for the reading of the resultant ai ⊕ bi from the
external memory 47, the operation of (ai ⊕ bi) × ci, and the
transfer of the result to the 2P-RAM. In both tables, symbol
"x" represents an indefinite value. Storing in the external
data register 46 completes in machine cycle N + 3 in both
tables, and the external data register 46 is read uselessly in
machine cycle 0 (two machine cycles) in Table 2.
[Table 1: machine-cycle steps for the operation of ai ⊕ bi and
the transfer of the result to the external memory 47; table
body not recoverable]
[Table 2: machine-cycle steps for reading ai ⊕ bi from the
external memory 47, the operation of (ai ⊕ bi) × ci, and the
transfer of the result to the 2P-RAM; table body not
recoverable]
Next, after two useless reading cycles of the external
memory 47 for timing purposes, multiplication is carried out
for N data sets and the results are stored in the
2P-RAM 31. These operations take N + 3 machine cycles, to which
two command cycles for address initialization are added,
and a total of 2N + 10 cycles are expended. An operation of
expression (2) also takes 2N + 10 cycles. Accordingly, it
will be appreciated that if a 3-input-1-output operation is
conducted for N data sets using a processor with the
ability of 2-input operation at most, it will take about 2N
machine cycles (provided that N is sufficiently large).
The following describes the cumulative operation for the
results of the foregoing 3-input-1-output computation.

$$S = \sum_{i=1}^{N} (a_i \oplus b_i) \times c_i \qquad (3)$$
$$S = \sum_{i=1}^{N} (a_i \times b_i) \oplus c_i \qquad (4)$$

In the case of expression (3), the multiplication result
for ai ⊕ bi and ci (output of register 39) and the
intermediate cumulative value are entered to the
arithmetic/logic unit 42, and the result of summation is
entered back to the same accumulator 44 through the selector
43. Thereby, the process takes 2N + 10 cycles unchanged.
In the case of expression (4), the data sets (ai × bi) ⊕ ci
which have been stored temporarily in the 2P-RAM 31 are read
out sequentially and summed by the arithmetic/logic unit 42,
and therefore the process needs another N cycles, resulting in
a total of 3N + 10 cycles.
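The cycle totals quoted above can be tabulated to see how the fixed overhead fades as N grows. This is a minimal sketch that simply encodes the totals stated in the text (2N + 10 for expressions (1) to (3), 3N + 10 for expression (4)); the function names are hypothetical.

```python
def conventional_cycles(n, expression):
    """Machine cycles stated in the text for N data sets."""
    if expression in (1, 2, 3):
        return 2 * n + 10
    if expression == 4:
        return 3 * n + 10
    raise ValueError("expression must be 1, 2, 3 or 4")

def cycles_per_data_set(n, expression):
    """Average machine cycles spent on each data set."""
    return conventional_cycles(n, expression) / n

n = 1000
total_1 = conventional_cycles(n, 1)
total_4 = conventional_cycles(n, 4)
ratio_1 = cycles_per_data_set(n, 1)
ratio_4 = cycles_per_data_set(n, 4)
```

For N = 1000 the totals are 2010 and 3010 cycles, i.e. close to 2 and 3 cycles per data set, which is the "about 2N" behaviour the text describes for large N.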
The conventional digital signal processing system is
formed as described above, and therefore for a
3-input-1-output operation on three independent data sets, it
must perform a 2-input-1-output operation twice. In addition,
the process time is further extended by address control, memory
transfer and other processes.
Fig. 8 is a diagram showing in brief the image coding
transmitter which implements the conventional motion
compensatory operation method disclosed in an article entitled
"Dynamic Multistage Vector Quantization for Images", journal
of The Institute of Electronics and Communication Engineers of
Japan, Vol. J68-B, No. 1, pp. 68 - 76, Jan. 1985. In the
figure, indicated by 1 is an input signal of image data formed
of a plurality of consecutive frames on the time axis, 52 is a
motion compensator which produces a prediction signal on the
basis of the resemblance computation of correlation between
the current frame represented by the input signal 1 and the
previous frame represented by a previous frame signal 53 which
is the previous reduced signal 1, 54 is motion vector
information provided by the motion compensator 52 indicative
of the position of a prediction signal block, 55 is a
prediction signal produced by the motion compensator 52, 56 is
a coder which codes the difference between the input signal 1
and prediction signal 55, 57 is a decoder which decodes the
signal coded by the coder 56, and 58 is a frame memory which
stores data reproduced through the summation of the signal
from the decoder 57 and the signal from the motion compensator
52.
The performance of the foregoing arrangement will be
described in connection with Fig. 9. The motion compensation
process is to calculate, for the input signal 1, the amount of
distortion between an l1-by-l2 block located in a specific
position in the current frame shown in Fig. 9(A) and M
blocks in the search range S in the previous frame shown in
Fig. 9(B), to evaluate the position of the block y providing a
minimal distortion relative to the position of the input
block, i.e., motion vector V, and to recognize the signal of
the minimal distortion block as a prediction signal.
The number of motion vectors V under search within the
search range S in the given frame is assumed to be M (an
integer greater than 1). The amount of distortion at the
position of a specific motion vector V between the previous
frame block and the current input block is calculated as a
sum of absolute values of differences as follows:

$$d_i = \sum_{h=1}^{K} \lvert y_{ih} - x_h \rvert \qquad (5)$$

where the input vector x = {x1, x2, ..., xK}, the search object
blocks yi = {yi1, yi2, ..., yiK}, i = 1, 2, ..., M, and M and
K are fixed values. The motion vector V is evaluated as
follows.

$$V = V_i \ \{\min d_i \mid i = 1, 2, \ldots, M\} \qquad (6)$$
Fig. 10 shows the sequence of operations for detecting
the motion vector V. Step ST11 calculates a distortion di over
the K sampling points on the basis of expression
(5), and the next step ST12 compares the di with the minimal
distortion D at position I, and, if di < D, the variables are
replaced to be D = di and I = i. These operations are
repeated for the number of search vectors, i.e., the
operational process of expression (6), to determine the final
minimal distortion D and its position I.
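The search loop of steps ST11 and ST12 can be sketched as follows. This is an illustrative sketch, not the patented implementation: blocks are flattened into lists of K samples, and the function names, the candidate blocks and the vector list are assumptions made for the example.

```python
def distortion(x, y):
    """Expression (5): sum of absolute differences over the K samples."""
    return sum(abs(yh - xh) for xh, yh in zip(x, y))

def find_motion_vector(x, candidates, vectors):
    """Expression (6): keep the candidate with the minimal distortion."""
    best_d = float("inf")   # minimal distortion D so far
    best_i = None           # its position I
    for i, y in enumerate(candidates):
        d = distortion(x, y)        # step ST11
        if d < best_d:              # step ST12: compare and update
            best_d, best_i = d, i
    return vectors[best_i], best_d

# Illustrative data: K = 4 samples, M = 3 search positions.
x = [10, 20, 30, 40]
ys = [[12, 18, 33, 41], [10, 21, 30, 40], [0, 0, 0, 0]]
vs = [(-1, 0), (0, 0), (1, 0)]
v, d = find_motion_vector(x, ys, vs)
```

With these values the distortions are 8, 1 and 100, so the search returns the vector `(0, 0)` with distortion 1. The inner sum runs K times for each of the M candidates.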
These operations must be completed within the period of
each frame entered successively, and therefore a high-speed
digital signal processor is required.
As an example, the digital signal processing system shown
in Fig. 6 is used to carry out the motion compensation
process. In this case, the multiplication-sum operation takes
place K x M times for each input block, and the number of
machine cycles is the total time expended by M times of
processes including comparison and updating. Generally, the
number of cycles for comparison and updating is small enough
as compared with that of the multiplication-sum operation, and
the volume of motion compensation operation for one block is
virtually equal to K x M machine cycles.
However, since the time available for these operations is
determined by the period of frames entered
successively, parallel processing will be needed for the mass
multiplication-sum operations to be performed in a short time,
depending on the operation process cycle time of a particular
digital signal processor.
The conventional motion compensation scheme is
implemented as described above, and in order to secure the
operation time for an enormous volume of operations when
carried out using a digital signal processor, the processor
needs to perform parallel processing, resulting in an increased
complexity and scale of hardware structure.
Specific embodiments of the present invention will now be
described with reference to the drawings.
Fig. 11 shows, as an embodiment of this invention, an
example of the image coder of the digital signal processing
apparatus. In the figure, input data 1 is entered to
first through third input memories 6. A task controller 7 estimates
the number of valid pixels on the basis of the contents of the
input memories 6, determines the distribution of the coding
process among first, second and third DSPs 2, and issues control
signals as address control signals 8 to the DSPs 2. Upon
receiving the address control signals 8, the first, second
and third DSPs 2 issue addresses 9 to the respective
first, second and third input memories 6 to fetch data
10 assigned for processing, and implement the coding
processes based on the preset program. Upon completion
of the processes, the first, second and third DSPs 2 store
processed data in an output memory 11, which, after
reading the whole data of the DSP block, sends the
processed data to the next DSP block.
In this case, each DSP 2 is controlled by the task
controller 7 so that all DSPs 2 have even numbers of valid
pixels assigned, and therefore the image coding process
time is controlled so that the difference of process
times among the DSPs 2 is minimal. Namely, in case of
coding an image with numbers of valid pixels as shown in
Fig. 12(b), an area A having a relatively small number
of valid pixels is enlarged to A', an area C having a
relatively large number of valid pixels is also enlarged
to C', and an area B having a larger number of valid
pixels is reduced to B', as shown in Fig. 12(a), by the
task controller 7. The task controller 7 issues the
address control signals 8 corresponding to the assignment
distribution to the first, second and third DSPs 2.
For example, in response to the issuance of the
address control signal 8 for coding the image data of
area A to the first DSP 2, it produces the address 9 for
the area A' in the first input memory 6 to fetch data and
implements the image coding process by following the
prescribed program. Similarly, the second and third
DSPs 2 are directed to carry out the image coding
processes for the areas B' and C', respectively.
Consequently, the first, second and third DSPs 2 have
their numbers of valid pixels EA', EB' and EC' for
coding virtually made even, i.e., the same quantity of
image data to be processed, as shown in Fig. 12(b). As
a result, the maximum volume of process M' dealt with by
the inventive apparatus becomes sufficiently less than
the maximum volume M of the conventional apparatus, and the
process time required for each DSP block is reduced.
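The effect of adjusting the area boundaries can be sketched as follows. The specification does not state how the task controller 7 chooses the boundaries, so the greedy cumulative split below, the division of the frame into contiguous groups of rows, and all names and sample counts are assumptions made purely for illustration.

```python
def balance_areas(valid_per_row, n_dsps):
    """Split rows into n_dsps contiguous areas with near-equal valid-pixel totals."""
    total = sum(valid_per_row)
    bounds = [0]
    running = 0
    for row, count in enumerate(valid_per_row):
        running += count
        # Close the current area once its cumulative share of the total is reached.
        while len(bounds) < n_dsps and running >= total * len(bounds) / n_dsps:
            bounds.append(row + 1)
    bounds.append(len(valid_per_row))
    return [(bounds[k], bounds[k + 1]) for k in range(n_dsps)]

def load(valid_per_row, area):
    """Number of valid pixels inside one area given as (first_row, end_row)."""
    return sum(valid_per_row[area[0]:area[1]])

# Hypothetical valid-pixel counts per row; activity is concentrated mid-frame.
rows = [1, 1, 1, 1, 8, 8, 1, 1, 1, 1, 1, 1]
fixed = [(0, 4), (4, 8), (8, 12)]          # equal-sized areas
adjusted = balance_areas(rows, 3)          # areas resized by valid-pixel count
max_fixed = max(load(rows, a) for a in fixed)
max_adjusted = max(load(rows, a) for a in adjusted)
```

In this example the equal-sized split leaves one DSP with 18 valid pixels, while the adjusted split `[(0, 5), (5, 6), (6, 12)]` lowers the maximum to 12, which corresponds to the reduction from M to M' described above.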
Fig. 13 shows the inter-frame coder constructed by
a serial connection of DSP blocks in three stages. Each
DSP block performs the process shown in Fig. 14. The
first DSP block 12 receives the input data 1 and,
after producing a differential signal, implements the
valid/invalid judgment, evaluates the distribution of the
numbers of valid pixels in the image data, and sends the
information to the task controller 7. Based on the
information, the task controller 7 issues address control
signals 8 for dictating such address adjustment that the
DSPs in the second DSP block 13 have even assignments of
data. Each DSP in the second DSP block 13 implements the
process by adjusting the read address as described above.
The third DSP block 14 is designed to operate identically.
Although in the foregoing embodiment the DSP process
assignment areas are controlled on the basis of the valid
pixel distribution among areas in image data, the present
invention is not confined to this scheme, but feedback DSP
assignment control based on the general quantity distribution
of transmitted information is also possible, for example.
A second embodiment of this invention will be explained
with reference to the drawings. Fig. 15 shows an example of
the configuration of a digital signal processing apparatus
according to the second embodiment of this invention. In the
figure, 301 is a data flow control section (D F C) working as a
control means; 302 are control parameter data output from the
data flow control section 301; 303 is a common memory (C M)
which stores feedback data, large-capacity data, tables, etc.;
304 is a task table (T B) which stores the processing status of
each signal processor element (P E) 318; 305 is a common bus
(C-BUS) which functions as a status communicating means
consisting of at least a bus connected to the common memory
303, the task table 304 and each signal processor element 318;
306 is a video frame synchronizing signal (F p) which
indicates the starting point of a video frame and is
supplied to the data flow control section 301 in the case of
inputting video signals etc.; 307 are feedback data (F b),
output from an output control section 308 described later,
which inform the data flow control section 301 of the
occupancy status and data quantity of a sending buffer etc. and
of the finishing of one frame of data processing etc.; 308 is
an output control section (O C) provided with a buffer memory
for outputting data at a certain constant speed while
restructuring processed blocks output from a plurality of
signal processor elements (P E) 318, for example, into the
scanning order in a video frame; 309 is an input terminal of
analog signals; 310 is an A/D converter; 311 are digitized
input data; 312 is a parameter memory (P M) consisting of dual
port memories; 313 is an input frame buffer consisting of dual
port memories functioning as a block formation means by
memorizing input data 311 temporarily; 314 is a bus connecting
the parameter memory 312 to the signal processor elements 318;
315 is a bus connecting the input frame buffer 313 to the
signal processor elements 318 in order to supply data in a
block unit; 316 is a common bus input/output port connected to
the common bus 305; 317 is an interruption control port for
sending/receiving timing control signals to/from the data flow
control section 301; 318 are individual signal processor
elements (P E), which are provided with software which
functions as a starting means and are mutually connected with
buses 314 and 315, the last stage signal processor element 318
and the output control section 308 also being connected with
buses 314 and 315; 319 is an output terminal through which data
are output at a certain constant speed and timing from the
output control section 308; 320 is a multiprocessor module
comprising the parameter memory 312, the input frame buffer 313
and a plurality of signal processor elements 318 connected in
series through the buses 314 and 315, for example.
The data flow control section 301 has a judgment means
which scans the task table 304 at a certain constant cycle and
judges the processing conditions of individual signal
processor elements 318. The data flow control section 301
also has a control means which, based on the result of the
judgment means, decides if each signal processing module can
process the next signal process block; when the processing
is found to be possible, it starts the process by sending out
an interruption signal to the interruption control port 317,
and when the processing is found to be impossible, it instructs
the transfer of the signal process block to another signal
processing module which can process the block. When
parallel processing is to be done, the constant cycle at which
the task table 304 is scanned shall be the degree of
parallelism times the input cycle of the
signal process block, and when series processing is to be
done the scanning period shall be 1/n of the input cycle; thus,
by synchronization with the input data frame (for example
a video frame), matching with real time can be
maintained.
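The judgment and control steps of the data flow control section can be sketched in a few lines. This is a simplified sketch under stated assumptions: the task table is modelled as a dictionary mapping a hypothetical element id to the strings "idle" or "busy", and marking an element busy stands in for sending the interruption signal; none of these encodings come from the specification.

```python
def dispatch(task_table, preferred_pe):
    """One scan of the task table for the next signal process block.

    Starts the block on the preferred element if it is idle; otherwise
    transfers the block to another idle element. Returns the id of the
    element that was started, or None if every element is busy.
    """
    if task_table[preferred_pe] == "idle":
        task_table[preferred_pe] = "busy"   # stands in for the interruption signal
        return preferred_pe
    for pe_id, status in task_table.items():
        if status == "idle":
            task_table[pe_id] = "busy"      # block transferred to this element
            return pe_id
    return None

table = {0: "busy", 1: "idle", 2: "idle"}
first = dispatch(table, preferred_pe=0)    # element 0 is busy, so the block moves
second = dispatch(table, preferred_pe=2)   # element 2 is idle and takes the block
third = dispatch(table, preferred_pe=1)    # nothing is idle any more
```

Here `first` is 1, `second` is 2 and `third` is `None`. A fuller model would also clear the busy status when an element reports completion in the task table.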
Fig. 16 shows an example of the internal constitution of
the signal processor elements 318 shown in Fig. 15. In the
figure, 330 is a terminal to which the common bus input/output
port 316 is to be connected; 331 is a terminal to which the
interruption control port 317 is to be connected; 332 is a
terminal to which the buses 314 and 315 are to be connected;
333 is similarly a terminal which connects the buses 314 and
315 between the adjacent signal processor elements; 334 is an
external bus control section (BUS-CONT) which functions as a
competitive control means to control the make/break of the
common bus 305 through the bus 316; 335 is a bus for loading a
writable control storage (W C S) 336, which memorizes a signal
processing program, from the external bus control section 334
at an initial time; 337 is a BUSREQ which requests the
external bus control section 334 to connect the common bus
305; 338 is a BUSACK which denotes the permission for
the BUSREQ 337; 339 are command codes which are successively
read out from the writable control storage 336 according to
the signal processing program; 340 is a digital signal
processor (D S P) which executes data processing; 341 is an
INTACK by which the digital signal processor 340 informs an
interruption control section (INTR-CONT) 345 of the reception
of an interruption; 342 is, on the contrary,
an INTREQ which informs the digital signal processor 340 of
the request for an interruption; 343 is a bus which connects an
internal bus 344 to the common bus 305 through the external
bus control section 334, and the internal bus 344 is directly
connected to the digital signal processor 340; 345 is the
interruption control section (INTR-CONT) which processes an
interruption signal from the data flow control section 301;
346 is a bus which writes the parameter of a processed data
block on a dual port memory 349 through the internal bus 344;
347 is similarly a bus which writes a processed block on the
dual port memory 349; 348 is a bus which connects a work
memory in the dual port memory 349 and the internal bus 344;
349 is a dual port memory provided with a parameter memory,
data memory and work memory which outputs data to the adjacent
signal processor element 318 through the terminal 333 and
buses 314 and 315.
Fig. 17 explains the internal control operation of the
digital signal processing apparatus shown in Fig. 15, and the
same parts as those shown in Fig. 15 are given the same
symbols; the explanation of them is therefore omitted.
In the figure, 351 is a block which shows the analytical
operation on a parameter inside the signal processor element
318; 352, 353, 354 are blocks which show the operation of
individual signal processing subroutines A, B and C according
to the parameter of each of them; 355
is a block which shows the contents of a parameter
memorized in the dual port memory 349; 356 is a block
which shows the contents D of processed block data
memorized in the dual port memory 349.
Fig. 18 explains an example of the relation between
the parameter data and process block data until a data
block is successively given a series of function
processes and an output result is obtained through
series and parallel processes of block units executed in
the digital signal processing apparatus shown in Fig. 15.
In the figure, 360 is a block address (B A D) showing
the position of an input block in a frame; 361 is a
processing number (PN) showing the kind of process to
be given to said block; 362 is a flag (PFLG) which
discriminates the result of the process; 363 is a data
block in which, for example, eight subblocks are combined
to form a block.
Fig. 19 shows an example of correspondence between
the data block 363 shown in Fig. 18 and one video frame
when a picture coding process is performed in this
system. In the figure, 365 is one video frame; 366 is a
data block when a picture is divided into 16 lines x 16
pixels; 367 is a subblock which is obtained when the
block is further divided into 8 blocks of 4 lines x 4
pixels.
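The control parameter attached to each data block can be pictured as a small record. The following is a hypothetical sketch: the field types, the use of a (row, column) pair as the block address, the default values and the 288 x 352 frame size are all assumptions for illustration; only the three fields B A D, PN and PFLG and the 16 x 16 block size come from the text.

```python
from dataclasses import dataclass

@dataclass
class ControlParameter:
    block_address: tuple        # B A D 360: (block_row, block_col) in the frame
    processing_number: int = 0  # PN 361: kind of process to apply next
    process_flag: int = 0       # PFLG 362: result of the previous process

def frame_to_parameters(lines, pixels, block=16):
    """Create one initial control parameter per 16 x 16 data block."""
    params = []
    for top in range(0, lines, block):
        for left in range(0, pixels, block):
            params.append(ControlParameter((top // block, left // block)))
    return params

# Hypothetical frame of 288 lines x 352 pixels.
params = frame_to_parameters(288, 352)
block_count = len(params)
```

For this frame size the sketch yields 18 x 22 = 396 parameter records, addressed in raster order from `(0, 0)` to `(17, 21)`.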
An explanation of the operation based on Fig. 15 is
given in the following. Input data 311 digitized by an
A/D converter 310 are memorized in an input frame buffer 313,
being scanned in a raster form in synchronization with a video
frame synchronizing signal 306, for example. Initial
parameter data 302 are attached, block by block, to the input
data 311 memorized in the input frame buffer 313 by the data
flow control section 301, and the parameter data 302 are
memorized in the parameter memory 312. The parameter memory
312 and input frame buffer 313 consist of dual port memories,
and writing/reading is simultaneously possible between two
independent ports.
Data blocks are read from the input frame buffer 313, and
the parameter is read in a data block unit from the parameter
memory 312. Data blocks and parameters are sent through the
buses 314 and 315 to the signal processor element 318 where
they are given the first process of a series of functional
processes in a block unit. Next, the results and the
rewritten parameters are written in the dual port memory 349
in the signal processor element 318. It is the basic function
of a processor module 320 to execute processes successively
between the adjacent signal processor elements 318 and to
execute pipeline processing for each block unit.
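The block-unit pipeline between adjacent signal processor elements can be modelled with one output slot per element. This is a toy model, not the apparatus itself: each stage is an assumed function taking and returning a (parameter, data) pair, each slot stands in for the dual port memory 349 of one element, and one loop iteration stands in for one block period.

```python
def run_pipeline(stages, blocks):
    """Push (parameter, data) pairs through a chain of stages, one step per tick."""
    slots = [None] * len(stages)   # slot k holds the latest output of stage k
    pending = list(blocks)
    done = []
    ticks = 0
    while pending or any(s is not None for s in slots):
        # The output of the last stage leaves the module.
        if slots[-1] is not None:
            done.append(slots[-1])
        # Update back to front so each stage reads its neighbour's previous output.
        for k in range(len(stages) - 1, 0, -1):
            slots[k] = stages[k](*slots[k - 1]) if slots[k - 1] is not None else None
        slots[0] = stages[0](*pending.pop(0)) if pending else None
        ticks += 1
    return done, ticks

# Three illustrative stages; each rewrites the parameter and the data.
stages = [
    lambda p, d: (p + 1, d + 1),
    lambda p, d: (p + 1, d * 2),
    lambda p, d: (p + 1, d - 3),
]
done, ticks = run_pipeline(stages, [(0, 1), (0, 5)])
```

The two blocks leave the chain as `(3, 1)` and `(3, 9)` after 5 ticks. While the second stage works on the first block, the first stage already works on the second block, which is the overlap that pipeline processing provides.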
When processing is executed for each block unit, if
feedback data such as coded previous frame data are to be
referred to, the feedback data are input to the common memory
303 connected to the common bus 305 and memorized. The process
of a new video frame is performed in such a manner that a
signal processor element 318 other than the one which has
written the data through the common bus 305 refers to the
common memory 303.
If the writing of the feedback data of the previous frame is
not completed in the proper position in the common memory 303,
the execution time of the process shall be specified.
When the processing of a unit (block processing) is
finished, each signal processor element 318 memorizes the
status showing the completion of the present processing in the
task table 304, and waits for the next processing. The data flow
control section 301 scans the task table 304 and, when the
processing of the former stage signal processor element 318 is
completed, it sends out an interruption signal to said signal
processor element 318 and starts the next processing. By
repeating the operation, the operation
of each signal processor element 318 is controlled.
To conduct parallel processing in a block unit for each
processor module 320, the data processing condition in the
input frame buffer 313 of each processor module 320 is
detected with the status information of the initial stage
signal processor element 318, and individual block data are
distributed by proper load distribution and input to each
multiprocessor module 320.
These results are shown by the control parameter data of
the initial stage, and the signal processor element 318
discriminates the processing for the block by deciphering the
above results and executes the proper processing. Among these
processings there are, for example, functional processes such
as a block identifier 253, a coder 254, a local decoder 260,
an inter-frame subtracter 252, a motion compensator 265, an
inter-frame adder 261 and a variable length coder 256, and
besides them a processing which performs only load
distribution, such as a processing of transferring block data,
is included.
In the data flow control section 301, it is possible to
make an arbitrary signal processor element 318 undertake an
arbitrary processing by controlling the first stage parameter;
owing to this, the load can be
so distributed to the signal processor elements 318 as to make
them work as efficiently as possible.
The output control section 308 reconstitutes processed
blocks, which are output at random times, into, for example,
the scanning order of an input video frame, produces a
resultant output for an output terminal 319, and also produces
feedback data 307 to inform the data flow control section 301
of these data.
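The reconstitution of blocks into scanning order can be sketched as a small reorder buffer. This is a minimal sketch under assumptions: block addresses are taken to be consecutive integers in scanning order, and the function name and sample data are hypothetical.

```python
def reorder_blocks(arrivals):
    """Release blocks in block-address order although they arrive out of order."""
    waiting = {}        # buffer memory: address -> processed block
    next_address = 0    # next block due in scanning order
    ordered = []
    for address, data in arrivals:
        waiting[address] = data
        # Release every block that is now contiguous with the output so far.
        while next_address in waiting:
            ordered.append((next_address, waiting.pop(next_address)))
            next_address += 1
    return ordered

# Blocks finish at random times, so they arrive out of scanning order.
arrivals = [(2, "c"), (0, "a"), (1, "b"), (3, "d")]
ordered = reorder_blocks(arrivals)
```

The result is `[(0, "a"), (1, "b"), (2, "c"), (3, "d")]`. The number of entries held in `waiting` at any moment corresponds to the buffer occupancy that the feedback data 307 would report.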
The output control section 308 takes charge, for example,
of a video multiplex section 257 and a transmitting buffer 258
shown in Fig. 5, and it outputs a feedback signal 269 from the
transmitting buffer 258 to a coding control section 270, whose
role is taken by the data flow control section 301 shown in
Fig. 15. The data flow control section 301 takes charge of
the functions of the above-mentioned load distribution and the
coding control section 270 shown in Fig. 5, and determines the
block identification control signal 273 and coding control
signal 274 and multiplexes them in the control parameter data
for the execution of the whole characteristic control. Referring
to Fig. 16, the processing of a single signal processor
element 318 is started by the interruption from the data flow
control section 301, and the contents of the parameter memory
312 are input to it through an internal bus 344. On the basis
of the discrimination result of the contents, the processing
of one unit of block data is performed by a digital signal
processor 340.
The result and rewritten parameters are written in a dual
port memory 349, and the status is set in the task table 304
through an external bus control section 334; thus the
preparation for the next process is completed. An interruption
control section 345 interfaces the interruption from the data
flow control section 301 with the digital signal processor
340. The parameter and the data written in the dual port
memory 349 are read by an adjacent signal processor element
318 which is connected to a terminal 333, and the next stage
process is given.
Fig. 17 shows the flow of these processes performed by
the data flow control section 301. It shows the relation
between the control of writing and referring to feedback data
in the common memory 303, the control of status writing in the
task table 304 by the data flow control section 301 through
the common bus 305, and the start processing control in the
signal processor element 318 by a parameter analyzer 351.
Fig. 18 shows the rewriting of the contents of the control
parameter data 302, which are attached to the corresponding
input block data 363, and the flow of these processes. The
control parameter data 302 contain a block address 360, which
shows for example the position of a block in a frame or its
time-sequential order, and a flag 362, which indicates the
kind and the contents of the next process. The block address
360 is used to discriminate a special process in certain
cases, for example at an end point in a picture, or to
restructure the data in the output control section 308 when a
process is finished. The flag 362 shows for example the
results of the coding control information 271, the block
identification control signal 273, the coding control signal
274, and the block identifier 253 shown in Fig. 5. The input
block data 363 are set to the minimum size handled in a unit
process. The motion compensator 265 shown in Fig. 5 handles
blocks of 16 x 16 size, and after the block identifier 253
blocks of 4 x 4 size are
handled. Where the block size differs for each unit process
in this way, the block sizes are arranged so that a
maximum-size block matches the subblocks contained in it.
In this case, sixteen 4 x 4 blocks are combined to
constitute a 16 x 16 block. When a picture is coded, this
block corresponds to one of the small square picture
elements made by dividing an ordinary frame.
Fig. 19 shows an example where one video frame 365 is
divided into blocks 366 and subblocks 367.
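The division of a frame into blocks and subblocks can be sketched as below. The 64 x 48 frame size and the helper name tile are assumptions chosen only for illustration; the 16 x 16 and 4 x 4 sizes are those mentioned for the motion compensator 265 and the stages after the block identifier 253. Since 16/4 = 4 subblocks fit along each side, one 16 x 16 block holds 4 x 4 = 16 subblocks of 4 x 4 size.

```python
def tile(width, height, size):
    """Return the top-left (x, y) corners of size x size tiles, raster order."""
    return [(x, y)
            for y in range(0, height, size)
            for x in range(0, width, size)]


# 16 x 16 blocks of a hypothetical 64 x 48 frame.
blocks = tile(64, 48, 16)

# For every block, the frame coordinates of its 4 x 4 subblocks.
subblocks = {
    (bx, by): [(bx + sx, by + sy) for sx, sy in tile(16, 16, 4)]
    for bx, by in blocks
}
```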
In the above embodiment, a signal processor element 318
having a single digital signal processor 340 is shown, but
when higher-speed processing is preferable, a hierarchical
structure combining a plurality of digital signal
processors can be used. The constitution of the signal
processor element 318 in the case of the hierarchical
structure is shown in Fig. 20. In this case, because the
load on the data flow control section 301 increases, a local
data flow control section 370, a local common memory 371, and
a local task table 372 are provided inside the signal
processor element 318 in order to execute the optimum load
distribution locally inside the signal processor element.
The data flow of the digital signal processors 340 connected
to a local common bus 373 is the same as that shown in
Fig. 15, except that the operation is executed inside the
signal processor element 318.
In the above embodiment, a series/parallel structure is
adopted, but in some cases a completely parallel or
completely series structure is effective, depending on
the purpose of the signal processing, and real-time
processing can become possible.
Another embodiment of this invention is explained
with reference to Fig. 21. In Fig. 21, 420, 421 and 422
are address generators for readout data; 423 is an
address generator for writing data; 424, 425 and 426 are
data memories, and address data generated by the address
generator 423 are input to these memories; 427, 428 and
429 are data buses which transfer readout data from the
data memories 424, 425 and 426; 430, 431 and 432 are
registers for holding data transferred from the data buses
427, 428 and 429; 433 is a register to hold the output
of the register 432; 434 is a selector to select the
output of the register 430 or that of the register 433;
435 is a selector to select the output of the register
431 or that of the register 441; the selector 434 and
the selector 435 constitute a first selector group; 436
is a selector to select the output of the register 430
or the output of a register 439; 437 is a selector to
select the output of the register 431 or the output of
the register 433; the selector 436 and the selector 437
constitute a second selector group; 438 is an operator
which operates on the outputs of the selectors
434 and 435; 440 is a multiplier which multiplies
the outputs of the selectors 436
and 437; the register 439 holds the output
of the operator 438; a register 441 holds
the output of the multiplier 440; 442 is an output selector
which selects the input from the register 439 or the
input from the register 441 and outputs it; 443 is an
adder which adds the output of the output selector 442
and the output of an accumulator 444 and outputs the sum to
said accumulator 444; 445 is a data bus to transfer output
data of the accumulator 444 and the output selector 442;
446 is an interface circuit which performs output/input
of data to/from external circuits; 451 - 453,
461 - 463, 471 - 473 denote signal lines which output
the outputs of the data memories 424, 425 and 426 to the
data buses 427, 428 and 429.
The operation is explained below.
In Fig. 21, assume that data series with N elements,
A = (ai | i = 1 to N), B = (bi | i = 1 to N), C = (ci | i = 1
to N), are previously stored in the data memory 424, the
data memory 425, and the data memory 426 respectively.
Under these conditions, the operation performed for a
three-input, one-output computation is described
below. The operation processing flow is shown in
Fig. 22.
To begin with, at a step ST31, the top addresses of the
three series of input data and of the memory storing the
output result are initially set by the address generators
420, 421 and 422. After that, the address generators are
assumed to take simple increment actions.
The data memory 424 corresponds to the address
generator 420; the data memory 425 corresponds to the
address generator 421; the data memory 426 corresponds
to the address generator 422. The individual data memories
424, 425 and 426 read out data based on the addresses from
the address generators 420, 421 and 422.
Data are input to the three data buses 427, 428 and 429
(X-BUS, Y-BUS, Z-BUS) from the data memories
424, 425 and 426. For each data bus, only one of the
three signal lines that can drive it is controlled to be
effective, and the other two are held in a
high-impedance state. The data on each data bus are
thus limited to those from the one line that is made
effective. For example, when the A data series is to
be input to the register 430, the A series data are
output to the signal line 451, and the signal lines 461
and 471, which output data from the other data memories 425
and 426 to the data bus 427, are in a high-impedance
state. The same applies to the other data buses.
Each of these data series is set in the registers
430, 431 and 432 respectively. Each of the three data
buses 427, 428 and 429 can select data from any of the
three data memories 424, 425 and 426, so that 3^3 = 27
kinds of data-set combinations can be supplied to the
registers 430, 431 and 432.
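The count of combinations follows from each of the three buses choosing its source memory independently, giving 3 x 3 x 3 = 27 possibilities. A short enumeration, in which the string labels are used purely for illustration, confirms this:

```python
from itertools import product

memories = ("424", "425", "426")   # data memories
buses = ("427", "428", "429")      # X-BUS, Y-BUS, Z-BUS

# One source memory is chosen independently for each of the three buses.
combinations = list(product(memories, repeat=len(buses)))
```

Each tuple lists the memory driving bus 427, bus 428 and bus 429 in that order; a memory may appear more than once because it can drive several buses at the same time.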
Two expressions are defined below as three-input
operations, and the processing method for each is then shown:
(ai ⊕ bi) × ci ........ (7)
(ai × bi) ⊕ ci ........ (8)
where (x ⊕ y) expresses an arithmetic and logic operation
that finds the sum, difference, maximum value or minimum
value of two input data x and y, and (x × y)
expresses multiplication. The operation processing flow of
expression (7) is explained in Table 3.
The mark 'x' in the table represents an unknown.
[Table 3: operation processing flow of expression (7)]
At a step ST32, the selector 434 selects the side of the
register 430 and the selector 435 selects the side of the
register 431. Using these two selected data (ai and bi), the
operation (ai ⊕ bi) is performed by the operator 438, and the
result is stored in the register 439. This value is output
from the register 439 in the next step.
The data ci in the register 432 are delayed by one step by
the register 433. In the next step, the selector 436
selects the side of the register 439 and the selector 437
selects the side of the register 433. Using these two
data, (ai ⊕ bi) is multiplied by ci by the multiplier 440
and the result (ai ⊕ bi) × ci is stored in the register 441.
This value is output from the register 441 in the next step.
With the output selector 442 selecting the register 441, the
data (ai ⊕ bi) × ci are sent through the data bus 445 to one
of the data memories 424, 425 and 426, at the address given
by the address generator 423.
In this invention, readout of data, execution of the
operation and writing of data are executed continuously by
pipeline processing, so that the sections operate in
parallel. Therefore, if the three-input, one-output
operation is executed for a data series with N elements,
(N + 3) cycles are required from the time when the first
datum is read out until the time when the processing result
of the last datum is written into a memory.
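The (N + 3)-cycle figure can be checked with a small register-level simulation. This is a sketch under stated assumptions, not the circuit of Fig. 21 itself: the variable names mirror the register numerals, None stands for the unknown value marked 'x', one loop iteration stands for one cycle, and the function name run_expression_7 is hypothetical.

```python
import operator


def run_expression_7(a, b, c, op=operator.add):
    """Clock a register pipeline computing (a[i] op b[i]) * c[i].

    Returns (results, cycles), counting cycles from the first read
    to the last write.
    """
    n = len(a)
    r430 = r431 = r432 = None   # input registers fed from the data buses
    r433 = None                 # one-step delay for c[i]
    r439 = None                 # holds the output of operator 438
    r441 = None                 # holds the output of multiplier 440
    results = []
    cycle = 0
    while len(results) < n:
        # Write stage: selector 442 passes register 441 to the memory.
        if r441 is not None:
            results.append(r441)
        # Next-state values are all derived from the current contents,
        # then loaded together, as clocked registers would be.
        next_441 = r439 * r433 if r439 is not None else None
        next_439 = op(r430, r431) if r430 is not None else None
        next_433 = r432
        if cycle < n:
            next_in = (a[cycle], b[cycle], c[cycle])
        else:
            next_in = (None, None, None)    # pipeline is draining
        r430, r431, r432 = next_in
        r433, r439, r441 = next_433, next_439, next_441
        cycle += 1
    return results, cycle


# N = 4 elements with addition as the operation: 4 + 3 = 7 cycles.
results, cycles = run_expression_7([1, 2, 3, 4], [10, 20, 30, 40], [2, 3, 4, 5])
```

Datum i is read in cycle i and written three cycles later, so the last of N data is written in cycle N + 2, which gives N + 3 cycles in total.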
The operation processing flow of expression (8) is
explained in Table 4. The mark "x" in
Table 4 represents an unknown.
[Table 4: operation processing flow of expression (8)]
The operation in which three input data are read out
to the registers 430, 431 and 432 is the same as that in the
case of expression (7). When the operation of expression
(8) is executed, the selector 436 selects the side of
the register 430 and the selector 437 selects the side
of the register 431, and the operation (ai × bi) is
performed by the multiplier 440 and the result is set in
the register 441.
In the next step, the selector 434 selects the side
of the register 433 and the selector 435 selects the
side of the register 441, and the operation (ai × bi) ⊕
ci is executed by the operator 438 and the result is set
in the register 439. In the next step, with the selector
442 selecting the side of the register 439, the
selected result is written into one of the data
memories 424 to 426.
Thus the operation of expression (8) proceeds in the
same way as that of expression (7), so the
total processing time is likewise (N + 3) cycles.
In the case of a two-input, one-output operation,
the value of (ai ⊕ bi) can be obtained by the
following procedure: the selector
434 selects the side of the register 430 and the
selector 435 selects the side of the register 431, and
after the operation is executed by the operator 438, the
side of the register 439 is selected by the selector 442
in the next step. The value of (ai × bi) can be obtained
by the following procedure: the
selector 436 selects the side of the register 430 and
the selector 437 selects the side of the register 431,
and after the multiplication is executed by the
multiplier 440, the selector 442 selects the side of the
register 441 in the next step.
The processing speed in the case of three inputs and one
output is about (2N + 10)/(N + 7) times that of the prior
art; that is, the processing time is almost halved if N is
a large number.
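The behaviour of this ratio for growing N can be tabulated directly. Because 2N + 10 = 2(N + 7) - 4, the ratio equals 2 - 4/(N + 7) and therefore approaches 2 from below. The function name speedup in this short sketch is hypothetical.

```python
def speedup(n):
    """Ratio (2N + 10) / (N + 7) quoted in the text for an N-element series."""
    return (2 * n + 10) / (n + 7)


# The ratio rises towards 2 as the series gets longer.
ratios = {n: round(speedup(n), 3) for n in (1, 10, 100, 1000)}
```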
When a cumulative value is to be found in the three-input,
one-output operation, a cumulative value up to an
intermediate point, or an initial value, is stored in the
accumulator 444, and each of the successive operation
results is added to the cumulative value in the
accumulator 444 by the adder 443, and the sum
is stored in the accumulator 444 again. These processes
are performed repeatedly. The number of processing cycles
is therefore not increased by the cumulative operation.
Fig. 23 shows a flow chart of a method for the
motion compensative operation according to an
embodiment of this invention. Fig. 24 is a drawing
explaining the intermediate check method in the
distortion quantity operation of this invention. Fig.
25 shows the disposition of pixel samples at the
sample points in a block in the intermediate check method
for the distortion operation of this invention.
Before the operation process, for the first of the
M candidate blocks to be searched in the
previous frame data, the distortion quantity of all the
pixels in the block is measured; this distortion
quantity is defined to be the minimum
distortion D. The differential absolute value sum
(sum of absolute differences) is adopted as the
distortion quantity. In the distortion quantity operation
for the second and subsequent blocks, the differential
absolute values of all pixels need not be calculated: if
the intermediate distortion quantity exceeds a certain
value at an intermediate check point, it is judged that the
ultimate distortion quantity of the block cannot be
smaller than the minimum distortion D, and the distortion
quantity operation for the remaining part is stopped.
A block which gives the minimum distortion is
detected by calculating the degree of
approximation between patterns, using the
difference and accumulation of pixels in the respective
M blocks selected from the present input
frame and the previous input frame (M is a positive
integer). The number of pixels used for the calculation
of the degree of approximation is K at the maximum (K is
an integer greater than or equal to one and smaller than
or equal to the total number of pixels in
one block). During the calculation of the degree of
approximation, while the number of pixels referred to
is less than K, intermediate checks are
performed four times, with an intermediate check point
provided at each 1/4 of the sample points. Fig. 25
shows examples of sample points used for the distortion
quantity operation, with a distinct mark for each of the
first to fourth passes: the mark O expresses a first-time
sample point, the mark x a second-time sample point,
and two further marks express the third-time and
fourth-time sample points for the distortion
quantity operation.
In Fig. 24, when the total number of sample points
is assumed to be K, express the threshold levels at the
first, second and third intermediate check points as d1',
d2' and d3'; then put
d1' = D/4 + th1 ........ (9-1)
d2' = D/2 + th2 ........ (9-2)
d3' = 3D/4 + th3 ....... (9-3)
where th1, th2 and th3 can be set independently. Express
the distortion quantity at the first, second and third
intermediate check points as di1, di2 and di3.
In this case, di1 expresses the value of the first-time
distortion quantity in Fig. 25; di2 expresses a
cumulative value, di1 plus the second-time distortion
quantity; di3 expresses a cumulative value, di2 plus the
third-time distortion quantity. Therefore, the
cumulative value in which the fourth-time distortion
quantity is also accumulated becomes the distortion
quantity in which all sample points are included.
On the basis of the distortion quantity judgment at
an intermediate check point, if a block is estimated to
have a large distortion quantity, the checking of the
block is canceled before the block reaches the last
check point, to save useless operation processes. In
other words, if the distortion quantity di1 obtained
by the distortion calculation at the 1/4 K sample
point in a step ST41 is found to be di1 > d1' by the
judgment in the next step ST42, the block is canceled;
if not, the operation continues to the next step ST43,
where the distortion quantity di2 is obtained
by the distortion calculation at the 1/2 K
sample point. If it is found that di2 > d2' by the
judgment in the next step ST44, the block is canceled;
if not, the operation continues to the step ST45.
There the distortion quantity di3 is obtained
by the calculation at the 3/4 K sample point, and if it is
found that di3 > d3' by the judgment in the next step
ST46, the block is canceled; if not, the operation
continues to the step ST47, where the distortion quantity
di at the K sample point is calculated for
comparison and renewal.
As shown above, if the processing is
performed up to the last step, the same result is obtained
as with the conventional method in which
all the pixels are used for the distortion operation. If the
distortion quantity di is in this case smaller than the
minimum distortion D, the value of the minimum distortion D is
renewed to di and the motion vector index I is renewed to the
index i. The final minimum distortion D and the
vector index I which shows the movement giving D can be
obtained by repeating the above operating processes
a number of times corresponding to the number of
search vectors, until the process reaches the Mth
block.
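The search procedure of steps ST41 to ST47 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the embodiment: blocks are flat pixel sequences of equal length K, assumed to be a multiple of 4 and already ordered so that each quarter is one sampling pass of Fig. 25; the function names sad_with_checks and search are hypothetical, and the offsets th default to zero.

```python
def sad_with_checks(cur, ref, best, th=(0, 0, 0)):
    """Differential absolute value sum with three intermediate checks.

    cur, ref: pixel sequences of equal length K.
    best: current minimum distortion D.
    Returns the full distortion, or None if the block was canceled.
    """
    k = len(cur)
    # Threshold levels d1', d2', d3' of expressions (9-1) to (9-3).
    limits = (best / 4 + th[0], best / 2 + th[1], 3 * best / 4 + th[2])
    checkpoints = (k // 4, k // 2, 3 * k // 4)
    total = 0
    for i, (p, q) in enumerate(zip(cur, ref), start=1):
        total += abs(p - q)
        for point, limit in zip(checkpoints, limits):
            if i == point and total > limit:
                return None         # cancel the rest of this block
    return total


def search(cur, candidates, th=(0, 0, 0)):
    """Return (index, distortion) of the best of the M candidate blocks."""
    # The first candidate is always measured over all pixels.
    best = sum(abs(p - q) for p, q in zip(cur, candidates[0]))
    best_index = 0
    for index, ref in enumerate(candidates[1:], start=1):
        d = sad_with_checks(cur, ref, best, th)
        if d is not None and d < best:
            best, best_index = d, index     # renew D and the index I
    return best_index, best


cur = [10] * 8
candidates = [[12] * 8, [10, 10, 10, 10, 10, 10, 10, 11], [50] * 8]
best_index, best = search(cur, candidates)
```

In this made-up example the third candidate is canceled at the first check point after only two of its eight samples, while the second candidate runs to completion and renews the minimum.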
In the above embodiment, an example is shown in which the
differential absolute value sum is used for the distortion
quantity operation, but the differential square value sum
can also be used.
In the above embodiment, the case of the motion
compensative operation is explained, but an
inner product vector quantization operation can also be
executed, with the same effect. In that case, when an
operation result is compared with a threshold value at an
intermediate check point, the relation in magnitude is
opposite to that in the above embodiment.

Representative Drawing

A single figure which represents the drawing illustrating the invention.

Administrative Status

Event History

Description	Date
Inactive: CPC assigned	2003-04-17
Time Limit for Reversal Expired	1997-11-24
Letter Sent	1996-11-25
Grant by Issuance	1993-11-23

Abandonment History

There is no abandonment history.

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
MITSUBISHI DENKI KABUSHIKI KAISHA

Past Owners on Record
KOH KAMIZAWA
NAOTO KINJO
TOKUMICHI MURAKAMI

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

List of published and non-published patent-specific documents on the CPD .

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Claims	1994-07-16	5	186
Drawings	1994-07-16	21	510
Cover Page	1994-07-16	1	20
Abstract	1994-07-16	1	14
Descriptions	1994-07-16	55	1,848
Representative drawing	2002-05-03	1	8
Fees	1995-10-20	1	70
PCT Correspondence	1993-08-27	1	34
