Patent 1270954 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 1270954
(21) Application Number: 535863
(54) English Title: APPARATUS FOR ARITHMETIC PROCESSING
(54) French Title: APPAREIL DE TRAITEMENT ARITHMETIQUE
Status: Deemed expired
Bibliographic Data
(52) Canadian Patent Classification (CPC):
  • 354/151
(51) International Patent Classification (IPC):
  • G06F 7/52 (2006.01)
(72) Inventors :
  • HASEBE, ATSUSHI (Japan)
(73) Owners :
  • HASEBE, ATSUSHI (Not Available)
  • SONY CORPORATION (Japan)
(71) Applicants :
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate agent:
(45) Issued: 1990-06-26
(22) Filed Date: 1987-04-29
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data:
Application No. Country/Territory Date
104021/86 Japan 1986-05-07
104020/86 Japan 1986-05-07
103443/86 Japan 1986-05-06
103442/86 Japan 1986-05-06
100036/86 Japan 1986-04-30

Abstracts

English Abstract


PATENT
503267
ABSTRACT OF THE DISCLOSURE

In a high-speed arithmetic processor,
a first number that has an absolute value that can
exceed one is multiplied by a second number that has
an absolute value not exceeding one. If the first
number exceeds one it is divided into an integer and a
part having a value less than one. The second number
is accumulated as an addend a number of times equal to
the integer to produce a sum. The second number and
the part of the first number having a value less than
one are supplied to a multiplier to produce a partial
product. An adder adds the partial product to the
sum, thereby obtaining a final product of the first
and second numbers. The multiplication is thereby
performed in a number of steps which is minimized and
never varies, regardless of whether the absolute value
of the first number is, for example, less than one, at
least one but less than two, or at least two but less
than three. This speeds up the arithmetic processing
and simplifies the programming therefor.



Claims

Note: Claims are shown in the official language in which they were submitted.


WHAT IS CLAIMED IS:
1. An arithmetic processor for multiplying
a first number that has an absolute value that can
exceed one by a second number that has an absolute
value not exceeding one; said processor comprising:
a multiplier;
control means responsive to an absolute
value of said first number exceeding one for dividing
said first number into an integer and a part having a
value less than one;
accumulating means for accumulating said
second number as an addend a number of times equal to
said integer to produce a sum;
storage means for supplying said second
number and said part to said multiplier to produce a
partial product; and
adder means for adding said partial product
to said sum, thereby obtaining a final product of said
first and second numbers.



2. An arithmetic processor according to
claim 1; wherein said second number represents data
and said first number represents a coefficient of said
second number.



3. An arithmetic processor according to
claim 1; wherein said accumulating means comprises a
selector connected to receive outputs from said

multiplier and said storage means and selectively to
pass said partial product or said second number to
said adder means and a storage register responsive to
the output of said adder means.

4. An arithmetic processor according to
claim 1; wherein said storage means comprises an input
register and a work memory and further comprising a
second selector connected to receive outputs from said
input register and said adder means and selectively
pass said second number or an output of said adder
means to said work memory.

5. An arithmetic processor according to
claim 4; further comprising a register to receive an
output from said work memory and to supply an output
to said second selector, thereby enabling calculation
by said multiplier and adder means and recirculation
of data through said work memory to proceed
simultaneously.



6. An arithmetic processor according to
claim 1; wherein said storage means comprises a
coefficient memory and a register, said coefficient
memory being connected to supply an output to a first
input of said multiplier and to said register and said
register being connected to supply an output to a
second input of said multiplier, whereby the same
coefficient can be supplied to two inputs of said
multiplier for calculating the square thereof.




7. An arithmetic processor according to
claim 1; wherein said storage means comprises an input
register and said arithmetic processor further
comprises an additional register connected to receive
data from said input register; said storage register
being connected to supply an output to a first input




of said multiplier and said additional register being
connected to supply an output to a second input of
said multiplier, whereby the same data from said input
register is supplied to two inputs of said multiplier
for calculating the square thereof.

8. An arithmetic processor according to
claim 1; wherein said multiplier, control means,
accumulating means, storage means and adder means form
a first part of said arithmetic processor; further
comprising:
a second multiplier; second control means,
second accumulating means, second storage means and
second adder means respectively corresponding in
structure and function to said multiplier, control
means, accumulating means, storage means and adder
means and forming a second part of said arithmetic
processor connected in parallel with said first part;
said first and second parts respectively
and simultaneously operating on real and imaginary
parts of complex numbers.



9. An arithmetic processor comprising:
a multiplier capable of multiplying two
numeric values each having an absolute value not
exceeding one;
means operative in response to one of
said two numeric values exceeding one for dividing
said one value into an integer and a part having a
value less than one;




means for supplying the other of said
two numeric values and said part to said multiplier to
form a partial product; and

means for taking said other numeric
value as an addend a number of times equal to said
integer to form a sum and adding said sum to said
partial product, thereby obtaining a final product of
said two numeric values.



10. An arithmetic processor comprising:
an input register;
an arithmetic section connected to
said input register for performing arithmetic
operations on data supplied to said input register;
a work memory having a write input
and
a selector connected between said
input register and said write input for selectively
supplying data from said input register to said write
input, whereby data from said input register can be
transferred to said work memory, thus reducing the
required capacity of said input register.



11. An arithmetic processor comprising:
an input register;

an arithmetic section connected to
said input register for performing arithmetic
operations on data supplied to said input register;
a work memory having a write input and
an output, said output also being connected to said
arithmetic section; and




a selector connected between said
input register and said write input and between said
output and said write input;
data from said input register and said
output being selectively supplied to said write input
via said selector;
whereby data from a first address in
said work memory can be supplied to said arithmetic
section and simultaneously recirculated through said
selector for storing in said work memory at a second
address shifted with respect to said first address.

12. An arithmetic processor comprising:
a multiplier having two input
terminals;
a coefficient memory producing an
output; and
means connected to said multiplier and
said coefficient memory for supplying said coefficient
memory output to both of said input terminals for
calculating the square thereof.



13. An arithmetic processor comprising:
a multiplier having two input
terminals;
an arithmetic logic unit producing a
logic output; and
two selectors respectively connected
to said two input terminals and responsive to said
logic output;





whereby said logic output is supplied to
both of said input terminals for calculating the square
thereof.
14. An arithmetic processor comprising:
a multiplier having two input terminals;
a coefficient memory producing an output;
means connected to said multiplier and
said coefficient memory for supplying said coefficient
memory output to both of said input terminals of said
multiplier so that said multiplier calculates the square
thereof; and
an arithmetic logic unit provided with the
square from said multiplier and for producing a logic
output.

15. An arithmetic processor comprising:
a multiplier having two input terminals;
an arithmetic logic unit producing a logic
output; and
two selectors respectively connected to
said two input terminals and responsive to said logic
output;
whereby said logic output is supplied to
both of said input terminals of said multiplier and for
calculating the square thereof.





Description

Note: Descriptions are shown in the official language in which they were submitted.




BACKGROUND OF THE INVENTION



Field of the Invention
This invention relates to arithmetic
processors and, in particular, to a novel and highly
effective arithmetic processor adapted for use in
video image processors and in other high-speed data
processors and able to process data more rapidly than
arithmetic processors heretofore conventional in such
apparatus.



Description of the Prior Art
Video image processing apparatus must
process data at high speed. In commercial television,
for example, 25 or 30 frames (depending on the system)
are displayed per second, each frame including
hundreds of lines and each line including hundreds of
pixels (picture elements). In advanced image
processing apparatus, signals produced by a television
camera are typically converted to digital form, stored
in an input image memory, processed in a position
stationary processor, stored in an output image
memory, converted back to analog form, and then
recorded by a VTR and/or displayed on a television
monitor. Apparatus such as a position variant
processor, a control processor and a host computer are
provided for controlling data flows, controlling the
execution and stopping of processes, and controlling
the entire video image processing apparatus.




In such apparatus, the position stationary
processor includes a number of arithmetic units that
process signals consisting of data signals and a
coefficient. Each data signal is multiplied by its
coefficient to produce an output. Depending on the
magnitude of the signals, the multiplication requires
in conventional practice a different number of steps,
for example three to five. It is difficult to write a
program that infallibly takes the difference in the
number of processing steps (and hence in processing
time) into account. Typically, therefore, the program
allows for the maximum number of steps that may be
required, for example five steps. This means that
time is wasted in any case where only three or four
steps are required for the multiplication. While the
time wasted is short in any given instance, the wasted
time is accumulated over and over and is quite
significant in the aggregate.



OBJECTS AND SUMMARY OF THE INVENTION
An object of the invention is to remedy the
problems of the prior art outlined above.
Another object of the invention is to
provide a high-speed arithmetic processor that can
multiply two numbers in the same (minimum) number of
steps regardless, within limits, of the magnitude of
the numbers.
More particularly, an object of the
invention is to provide an arithmetic processor for
multiplying a first number such as a coefficient that
has an absolute value that can exceed one by a second
number such as a data number that has an absolute
value not exceeding one, the multiplication requiring
a number of steps that is minimized and always the
same, regardless of whether the absolute value of the
first number is, for example, less than one, at least
one but less than two, or at least two but less than
three.
The foregoing and other objects are
attained in accordance with a first aspect of the
invention by providing an arithmetic processor for
multiplying a first number that has an absolute value
that can exceed one by a second number that has an
absolute value not exceeding one; the processor
comprising: a multiplier; control means responsive to
an absolute value of the first number exceeding one
for dividing the first number into an integer and a
part having a value less than one; accumulating means
for accumulating the second number as an addend a
number of times equal to the integer to produce a sum;
storage means for supplying the second number and the
part to the multiplier to produce a partial product
thereof; and adder means for adding the partial
product to the sum, thereby obtaining a final product
of the first and second numbers.
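By way of illustration only, the arithmetic of this first aspect may be sketched in Python as follows. The function names, the use of exact fractions, and the treatment of a negative first number (truncating toward zero and accumulating the negated addend) are assumptions made for the sketch and are not taken from the disclosure.

```python
from fractions import Fraction


def split_first_number(a):
    """Divide a into an integer and a part whose absolute value is less than one."""
    integer = int(a)      # truncates toward zero (assumed convention)
    part = a - integer    # same sign as a, |part| < 1
    return integer, part


def unit_multiplier(p, q):
    """Stand-in for a multiplier whose operands must not exceed one in magnitude."""
    if abs(p) > 1 or abs(q) > 1:
        raise ValueError("multiplier operands must have absolute value <= 1")
    return p * q


def multiply(a, x):
    """Form a * x for |x| <= 1 by accumulation plus one partial product."""
    integer, part = split_first_number(a)
    addend = x if integer >= 0 else -x   # assumed treatment of negative integers
    total = Fraction(0)
    for _ in range(abs(integer)):        # accumulate x a number of times equal to the integer
        total += addend
    partial = unit_multiplier(part, x)   # the only pass through the multiplier
    return total + partial


product = multiply(Fraction(5, 2), Fraction(1, 4))   # coefficient 2.5, data 0.25
```

In this sketch the multiplier only ever receives operands within its permitted range, whatever the magnitude of the first number, which is the property the summary relies upon.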
In accordance with a second aspect of the
invention, an arithmetic processor comprises a
multiplier capable of multiplying two numeric values
each having an absolute value not exceeding one; means
operative in response to one of the two numeric values
exceeding one for dividing the one value into an
integer and a part having a value less than one; means
for supplying the other of the two numeric values and
the part to the multiplier to form a partial product;
and means for taking the other numeric value as an
addend a number of times equal to the integer to form
a sum and adding the sum to the partial product,
thereby obtaining a final product of the two numeric
values.
In accordance with another aspect of the
invention, an arithmetic processor comprises an input
register; an arithmetic section; a work memory having
a write input; and a selector connected between the
input register and the write input for selectively
supplying data from the input register to the write
input.
In accordance with another aspect of the
invention, an arithmetic processor is provided
comprising an input register; an arithmetic section; a
work memory having a write input and an output; and a
selector connected between the input register and the
write input and between the output and the write
input; data from the input register and the output
being selectively supplied to the write input via the
selector.
In accordance with another aspect of the
invention, an arithmetic processor is provided
comprising a multiplier having two input terminals; a
coefficient memory producing an output; and means
connected to the multiplier and the coefficient memory
for supplying the coefficient output to both of the
input terminals for calculating the square thereof.
In accordance with another aspect of the
invention, an arithmetic processor is provided
comprising a multiplier having two input terminals; an
arithmetic logic unit producing a logic output; and
two selectors respectively connected to the two input
terminals and responsive to the logic output; whereby
the logic output is supplied to both of the input
terminals for calculating the square thereof.

BRIEF DESCRIPTION OF THE DRAWINGS
A better understanding of the objects,
features and advantages of the invention may be gained
from a consideration of the following detailed
description of the preferred embodiments thereof, in
conjunction with the appended drawings, throughout
which a given reference character always indicates the
same element or part, and wherein:
Fig. 1 is a conceptual drawing showing the
whole of an image processing apparatus to which the
apparatus of the present invention is applicable;
Fig. 2 is a block diagram showing an
example of a main portion of the image processing
apparatus of Fig. 1;
Fig. 3 is a block diagram of an earlier but
not publicly disclosed arithmetic unit for use in the
apparatus of Fig. 2;
Figs. 4-9 are block diagrams of respective
preferred embodiments of arithmetic units in
accordance with the invention that can be substituted
for the arithmetic unit of Fig. 3;
Fig. 10 is a block diagram showing the
incorporation of the structures of Figs. 4-9 to form a
pair of arithmetic units for use in the apparatus of
Fig. 2; and
Fig. 11 is a flowchart illustrating the
operation of a preferred embodiment of an arithmetic
unit in accordance with the invention.



DESCRIPTION OF THE PREFERRED EMBODIMENTS
Typical Apparatus Employing Arithmetic Processor
Figs. 1 and 2 show image processing
apparatus of a type disclosed in a copending
application of Hasebe et al. serial No. 06/932,277,
filed November 19, 1986, and assigned to the assignee
of the present application. Arithmetic processors
according to the present invention are especially
adapted for use in apparatus as shown in Figs. 1 and
2.
Fig. 1 shows an example of video image
processing apparatus for achieving high-speed data
processing. The apparatus comprises an input/output
portion 1 (hereinafter called an IOC), a memory
portion 2 (hereinafter called a VIM) consisting of an
input image memory 2A (hereinafter called a VIMIN) and
an output image memory 2B (hereinafter called a
VIMOUT), a data processing portion 3 consisting of a
position stationary processor 3A (hereinafter called a
PIP) mainly for calculating picture element values and
a position variant processor system 3B (hereinafter
called a PVP) for controlling data flows as by
controlling addresses and for adjusting processes to
coincide in timing, and a processor 4 (hereinafter
called a TC) as a total controller for controlling
execution and stopping of processes and
exchange of programs. The TC 4 is provided with a
host computer 5 (hereinafter called an HC) for
controlling the entire video image processing
apparatus.
The IOC 1 makes A/D (analog-to-digital)
conversion of video signals coming from a video camera
or VTR 6, for example, to provide digital image data,
writes the digital image data in the VIMIN 2A, reads
out processed image data from the VIMOUT 2B, and makes
D/A (digital-to-analog) conversion of the processed
image data to restore analog video signals, so that
they may, for example, be recorded in a VTR 7 or
supplied to a monitor receiver 8 to enable monitoring
of the video image.
In the present case, the signals supplied
as input and output are video signals of the NTSC
system or the R-G-B system, and either of these
systems is specified by the TC 4. A picture element
is provided, for example, by 8-bit data.
The writing and reading of image data into
and out of the VIM 2 is performed in large blocks of
image data, for example in blocks of a field or a
frame. Therefore, each of the VIMIN 2A and the VIMOUT
2B is made up of a plurality of sheets of memories, each
having enough capacity for the image data of a field
or a frame. For example, 12 sheets of 768 x 512 bytes
may be employed as frame memories. In the present
example, the use of these 12 sheets of frame memories
is not fixed but can be flexibly allocated to either
the VIMIN 2A or the VIMOUT 2B according to the purpose
of the processing or the picture image as the object
of the processing. Two sheets are used as one set, so
that when one sheet is written, the other can be read,
whereby processing from outside the VIM 2 by the IOC 1
and processing within the VIM 2 by the PIP 3A and the
PVP 3B are performed in parallel.
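The frame memory figures and the two-sheet arrangement described above may be illustrated, purely as a sketch, in Python. The class name `SheetPair`, its methods, and the assumption that the two sheets exchange roles once per frame are hypothetical and serve only to show the write-one, read-the-other principle.

```python
SHEET_WIDTH = 768    # picture elements per line, from the example above
SHEET_HEIGHT = 512   # lines per sheet, from the example above
SHEET_COUNT = 12     # sheets of frame memory, from the example above

BYTES_PER_SHEET = SHEET_WIDTH * SHEET_HEIGHT   # one 8-bit picture element per byte
TOTAL_BYTES = SHEET_COUNT * BYTES_PER_SHEET


class SheetPair:
    """Two sheets used as one set: one is written while the other is read."""

    def __init__(self):
        self._sheets = [bytearray(BYTES_PER_SHEET), bytearray(BYTES_PER_SHEET)]
        self._write_index = 0

    @property
    def write_sheet(self):
        return self._sheets[self._write_index]

    @property
    def read_sheet(self):
        return self._sheets[1 - self._write_index]

    def swap(self):
        """Exchange the roles of the two sheets (assumed to occur once per frame)."""
        self._write_index = 1 - self._write_index


pair = SheetPair()
pair.write_sheet[0] = 200       # the writer fills one sheet
pair.swap()                     # roles exchange at the frame boundary
first_pixel = pair.read_sheet[0]  # the reader now sees what was written
```

Because the writer and the reader never address the same sheet at the same time, neither has to wait for the other within a frame.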
A control mode signal determining whether
the plurality of sheets of frame memories of VIM 2
should come under the control of the IOC 1 or under
the control of the PVP 3B is issued from the IOC 1 and
supplied to the ~
The data processing portion 3 comprises a
processor, reads image data stored in the VIMIN 2A
according to its program, processes the data in
various ways, and writes the processed data in the
VIMOUT 2B.
The data processing portion 3 is made up of
the separated systems PIP 3A and PVP 3B operating in
parallel; by virtue of such separated arrangement, the
processing time consumed in the data processing
portion is determined only by whichever is longer of
the processing times taken by the two systems. In
contrast, in the data processing portions of earlier
image processing apparatus, the total processing time
was determined by the sum of the processing times. In
the present example, data processing is performed at
such high rates that video data can be processed on a
real-time basis.
The processing portion 3 is made up of one
sheet or a plurality of sheets of processors, and the
microprograms in their microprogram memories can be
exchanged when the scope of the processing is
enlarged.


The program exchange is carried out in this
way: the microprograms are supplied from the HC 5 to
the TC 4 in advance and stored, for example, in a RAM
provided therein. Thereafter, when, for example, the
user has made a request for exchanging some programs
(by turning a switch on), the TC 4 supplies the
programs to each of the processors.
The PIP 3A and the PVP 3B are basically of
the same architecture. Each comprises an independent
processor having a control unit, arithmetic unit,
memory unit, and input/output port. Each is arranged
in a multiprocessor structure made up of a plurality
of unit processors and is constructed so that
high-speed processing is achieved chiefly by adoption
of a parallel processing technique.
The PIP 3A comprises, for example, 60
sheets of PIP processors and several sheets of
subprocessors and processes image data coming from the
VIM 2 or generates image data within the PIP 3A
itself.
The PVP 3B comprises, for example, 30
sheets of processors and controls flows of image data
inward from the VIM 2 such as allocation of the
picture element data to the PIP 3A.
More particularly, the PVP 3B generates
address data and control signals for the VIM 2 and
supplies them to the VIM 2. It also generates
input/output control signals and other control signals
for the PIP 3A and supplies them to the PIP 3A.
The image data processing is not always
conducted in such a manner that the data from a single
sheet of a frame of the VIMIN 2A are processed and the
processed data are written in the VIMOUT 2B, but
sometimes data coming from a plurality of sheets of
frame memories and extending over a plurality of
sheets of frames are processed together.
The PIP 3A and PVP 3B employ 16-bit
processing as a standard, and a speed is achievable
that will enable the arithmetic processing of the
image data of one frame within the time period of one
frame, namely that will enable real-time processing.
As a matter of course, there are also some processes
that require longer processing time than one frame.
In the present case, the image data
processing by the PIP 3A and PVP 3B is performed in
synchronism with the video frames. Therefore, a
process start timing signal PS in synchronism with
each frame is supplied from IOC 1 to the PVP 3B. The
signal PS is ordinarily at a high level and it is
brought to a low level at the processing start time.
On the other hand, a signal OK indicating that a
process has been finished is supplied from the PVP 3B
to the IOC 1. This signal OK is supplied by a
processor at the core of the PVP 3B that provides
timing control. The process start timing signal PS is
generated in the IOC 1 based on a frame start signal
indicating the first line of each frame and the
process end signal OK.
When the processing is performed on a real
time basis, since the signal OK is always obtained at
the end of each frame, the signal PS becomes the same
signal as the frame start signal.
On the other hand, when the processing time
is longer than one frame, the signal PS does not
coincide with the frame period but is obtained at the
start of a frame after a signal OK has been supplied
as an output.
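The relationship between the frame start signal, the signal OK and the signal PS described above may be sketched as follows. This is a simplified timing model under stated assumptions: time is measured in frame periods, each process begins exactly at a frame start, and the function name `process_start_frames` is hypothetical.

```python
import math


def process_start_frames(processing_times):
    """Return the frame index at which PS is brought low for each process.

    processing_times holds the duration of each process in frame periods.
    """
    starts = []
    frame = 0
    for duration in processing_times:
        starts.append(frame)
        # OK is produced when the process ends; PS is then issued at the
        # next frame start, so at least one frame period always elapses.
        frame += max(1, math.ceil(duration))
    return starts


real_time = process_start_frames([1.0, 0.9, 1.0])    # every process fits in one frame
slow = process_start_frames([1.5, 0.8, 2.0, 1.0])    # some processes overrun a frame
```

In the real-time case the computed start frames are consecutive, so PS coincides with the frame start signal; when a process overruns, the following PS is deferred to the first frame start after OK.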
When the processor at the core of the PVP
3B detects that the process start timing signal PS
from the IOC 1 has been brought to the low level, this
processor starts to run, and outputs, according to its
controlling program, timing signals to other
processors (including the PIP 3A), supplies addresses
to the VIM 2, reads the image data from the VIM 2 and
causes the same to be processed in the PIP 3A. When
the processing has been finished, the same processor
generates the signal OK and stops, waiting for
issuance of the next process start timing signal PS.
In this case, only the image signal
portion, excluding the synchronizing signal and burst
signal, is taken as the object of processing, and the
data read out from the VIM 2 does not include the
synchronizing signal and burst signal. Therefore, the
IOC 1 is provided with a ROM generating the
synchronizing signal, burst signal, and the vertical
blanking signal, and in the case of the NTSC signal,
the data from the VIMOUT 2B (after being rearranged,
if necessary) are transferred to the D/A converter of
the IOC 1 together with the synchronizing signal,
burst signal, and vertical blanking signal.
Also in the case of the three primary color
signals, an outer synchronizing signal becomes
necessary. This signal is generated also in the IOC 1
and supplied to the monitor and other apparatus.
In this parallel processing system by the
use of multiprocessors, the TC 4 effects synthetic
control according to the three modes mentioned below.
Execution of processes, stopping, and program transfer
(exchange) are thus carried out consistently. Also,
the transfer and execution are effectively conducted
by using a slow clock and a fast clock at the times of
the program transfer and the program execution,
respectively.
Fig. 2 shows a concrete structure of the
PIP 3A. Although the PIP 3A has, in reality, a large
number (60 sets, for example) of processors arranged
in parallel, only two sets of them are shown in the
drawing. In this drawing, digital data from the VIM 2
are supplied to input registers 31-1 to 31-n
(hereinafter called the FRA) provided for each of the
n processors 30-1 to 30-n, and these registers are
controlled by the PVP 3B in accordance with the
address read out of the VIM 2 and stored with a
predetermined amount of data necessary for each
processor.
The data written in these registers 31-1 to
31-n are supplied to arithmetic units 32-1, 33-1 to
32-n, 33-n, respectively. Each of the arithmetic
units is provided with an adder/subtractor,
multiplier, coefficient memory, data memory, etc., and
makes linear and nonlinear data conversion
calculations according to a control signal from the
control units 34-1 to 34-n. Results of the
calculations are obtained at the arithmetic units 33-1
to 33-n, and the arithmetic units 33-1 to 33-n are
controlled by the PVP 3B according to write addresses
of the VIM 2, whereby the results of the calculations
are written in necessary portions in the VIM 2.
The control signals from the control units
34-1 to 34-n are formed according to the microprogram
written in the microprogram memories (MPM) 35-1 to
35-n. The microprogram is written from outside
through program change controls 36-1 to 36-n.
If the microprogram is formed by the host
computer (HC) 5 (Fig. 1), etc., the transfer rate from
the HC 5 to each MPM 35-1 to 35-n is limited by the
capacity of the line. It is possible to transfer the
program only at the rate, for example, of 500
Kbytes/sec or so, and it takes a considerable amount
of time for the rewriting in all of the MPMs 35-1 to
35-n. Since processing in the PIP 3A, etc., is
impossible during that time, substantial drawbacks are
experienced. And, since the transfer cannot be
performed until the processing in the PIP 3A, etc.,
has been finished, the HC has to wait until it is
finished, and the efficiency of usage of the HC is
considerably lowered.



Earlier Arithmetic Processor
In the apparatus described above, each
arithmetic unit 32, 33 of each processor section 30
constituting the PIP 3A is provided with a so-called
multiplier.
FIG. 3 shows a primary portion of an
earlier arithmetic unit known to the inventor but not
publicly disclosed and not claimed herein, in which
data from the FRA 31 and data from a work memory 41 to
be described later are supplied via a selector 42 to
an input of a multiplier 43, and data from a
coefficient memory 44 and data from an arithmetic
logic unit (ALU) 46 to be described later are supplied
via a selector 45 to another input of the multiplier
43. Output data from the multiplier 43 is supplied to
an input of the ALU 46, which delivers output data
therefrom to the work memory 41 and via a register 47
to another input of the ALU 46.
In a case where the work memory 41 is not
provided, the selector 42 is unnecessary; and, in a
case where the output from the ALU 46 is not supplied
to the multiplier 43, the selector 45 is unnecessary.
In general, the multipliers used for this
kind of digital operation require that the absolute
values of each of two numbers to be multiplied be less
than 1. Of course, data to be supplied to the FRA 31
can be adjusted to have an absolute value less than 1,
for example, by setting the dynamic range to less than
1. However, the coefficient to be multiplied is
required to be at least 1 in some cases.
To cope with this situation, in Fig. 3, a
coefficient having a value of at least 1 is subdivided
into a plurality of coefficients each having a value
less than 1. Each coefficient is then multiplied by
the input data, and the results are added to obtain
the product as a total. For example, in FIG. 3, the
data from the FRA 31 and the coefficient less than 1
from the coefficient memory 44 are supplied to the
multiplier 43. The resultant product is supplied to
an input of the ALU 46, which operates as an adder,
and an output from the ALU 46 is delivered via the
register 47 to another input of the ALU 46.
In this circuit, assuming the input data
and the coefficient to be x and a (|a| < 1),
respectively, the arithmetic processing is executed in
such a way that the input data x and the coefficient a
are supplied to the multiplier 43 in the first step,
the product ax is obtained and is loaded in the output
register of the multiplier 43 in the second step, and
then the product is extracted via the ALU 46.
Consequently, when the absolute value of the
coefficient is less than 1, the product can be
obtained in three steps.
In contrast, when the absolute value of the
coefficient is at least 1 and less than 2, the
arithmetic operation is conducted with the
coefficients (a + b : |a|, |b| < 1). In this case,
four steps are required to obtain the product: the
input data x and the coefficient a are supplied to the
multiplier 43 in the first step; the input data x and
the coefficient b are supplied to the multiplier 43
immediately after the product ax is supplied to the
output register of the multiplier 43 in the second
step; the product bx is supplied to the output
register of the multiplier 43 immediately after the
product ax is supplied from the output register via
the ALU 46 to the register 47 in the third step; and
the ALU 46 adds the bx in the output register of the
multiplier 43 to the ax in the register 47, thereby
obtaining (a + b)x, in the fourth step. Since four
steps are required to obtain the product when the
absolute value of the coefficient is at least 1 and
less than 2, the required period of time is greater by
the time of one step than the period of time required
when the absolute value of the coefficient is less
than 1.




- 16 -

~7~
PA~E~T
S0~267
It can easily be seen that, if the absolute
value of the coefficient is at least 2 and less than
3, five steps are required by the apparatus of Fig. 3
to obtain the product.
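
The step counts given above for the apparatus of Fig. 3 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `fig3_schedule` and the choice of splitting the coefficient into equal parts are hypothetical and are not taken from the disclosure, which requires only that each part have an absolute value less than 1.

```python
import math


def fig3_schedule(coefficient):
    """Return (parts, steps) for a Fig. 3 style multiplication.

    Assumption (not from the disclosure): the coefficient is split into
    equal parts. The text only requires each part to be below 1 in
    absolute value.
    """
    # A coefficient with n <= |c| < n + 1 needs n + 1 parts below 1.
    n_parts = math.floor(abs(coefficient)) + 1
    parts = [coefficient / n_parts] * n_parts
    # One step to load the first operands, one step per partial product
    # reaching the multiplier output register, and one final step in the
    # ALU 46, following the walk-through in the text.
    steps = 2 + n_parts
    return parts, steps


for c in (0.5, 1.5, 2.25):
    parts, steps = fig3_schedule(c)
    print(c, parts, steps)
```

Under these assumptions the step count grows by one for every additional part, which is the variation in processing time that the following paragraphs identify as the problem.
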
In a case where the processing time
required for a given kind of arithmetic operation
varies depending on the numbers to be subjected to the
operation, the processing program is designed to
accommodate the operations that require the greatest
period of time for their execution. This causes some
of the time during the execution of other operations
to be wasted. In addition, it is not easy to design
the processing program to take the variations of the
processing time into account. The interval of time
required to perform one step described above is quite
short; however, such an operation is repeated a
tremendous number of times in graphic processing and
the like. In such a case, the short interval of time
is accumulated over and over, which results in a
substantial delay.
In the technique described above, many
arithmetic processing steps are required to perform a
multiplication with a coefficient of which the
absolute value is at least 1, thereby leading to the
problem that a substantial delay is caused.
In the apparatus described above, the
output result of the arithmetic operation of the ALU
46 is supplied also to the work memory 41, and
thereafter arithmetic processing is executed in some
cases by using the data written in the work memory 41
and the data latched in the register 47. The amount
of data necessary for the processing varies depending
on the content of the processing, and the amount of
data to be written in the FRA 31 is greatly changed
especially when the apparatus is used as a
general-purpose processing system. In ordinary
processing, it is unnecessary to allocate the capacity
of the write data of FRA 31 according to the maximum
amount of the required data; however, the efficiency
of the read/write operations may deteriorate in some
cases.
In a case where so-called shading
processing of a solid spherical image is performed by
the apparatus described above, an inner product is
calculated from the unit vector of the light source
and the normal vector at any given point on the
surface of the image to obtain the brightness at that
point. In order to obtain the normal vector in this
case, it is necessary to perform processing such as
look-up table (LUT) processing and squaring of the
data from the coefficient memory 44 and the FRA 31.
When a squaring of the coefficient is performed in the
arithmetic sections 32-1 to 32-n and 33-1 to 33-n of
each processor section 30-1 to 30-n (Fig. 2)
constituting the PIP 3A (Fig. 1) described above, the
coefficient from the coefficient memory 44 (Fig. 3) is
supplied to the work memory 41 through the selector
45, the multiplier 43, and the ALU 46, and then the
coefficient stored in the work memory 41 is supplied
via the selector 42 to an input of the multiplier 43.
At the same time, the coefficient from the coefficient
memory 44 is supplied via the selector 45 to another
input of the multiplier 43, and the obtained product
(square of the coefficient) is supplied as an output
by the ALU 46.
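
The brightness calculation mentioned at the start of this paragraph can be sketched as follows. This is a simplified illustration and not the processing of the disclosure: the function name `sphere_brightness`, the unit-sphere coordinates, and the use of `math.sqrt` in place of a hardware look-up table are all assumptions.

```python
import math


def sphere_brightness(px, py, light):
    """Brightness at the point (px, py) of a unit sphere viewed along z.

    `light` is assumed to be a unit vector (lx, ly, lz) pointing toward
    the light source. All names here are illustrative.
    """
    # Squaring of the data: needed to recover the z component of the normal.
    r2 = px * px + py * py
    if r2 > 1.0:
        return 0.0  # the point lies outside the outline of the sphere
    # In hardware this square root would typically come from a look-up table.
    nz = math.sqrt(1.0 - r2)
    # Inner product of the light vector and the surface normal (px, py, nz).
    dot = light[0] * px + light[1] * py + light[2] * nz
    return max(0.0, dot)  # surfaces facing away from the light stay dark
```

The sketch shows why both squaring and table look-up appear in the same operation: the squares give the distance from the center, and the table supplies the remaining component of the normal.
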
When a squaring operation is to be
performed on data, the data from the FRA 31 is
supplied to the work memory 41 and the register 47
through the selector 42, the multiplier 43, and the
ALU 46, and then the data stored in the work memory 41
is supplied via the selector 42 to an input of the
multiplier 43. At the same time, the data from the
register 47 is delivered to another input of the
multiplier 43 through the ALU 46 and the selector 45,
and then the resultant product (squared value) is
supplied as an output by the ALU 46.
In the apparatus of Fig. 3, however, the use of
the work memory 41 for intermediate processing in the
arithmetic operation complicates the address
generation, and the operating efficiency may
deteriorate when, for example, a coefficient is
squared in LUT processing.



Arithmetic Processor According to the Invention
In accordance with the present invention,
the processing time required for the arithmetic
operations described above is significantly reduced.
FIG. 4 shows one preferred embodiment of
apparatus constructed in accordance with the
invention. In the apparatus of Fig. 4, data x from
the FRA 31 and a coefficient a from the coefficient
memory 44 (assumed for the moment to have a value less
than 1) are supplied to the multiplier 43, and the
resultant product ax is delivered to a first input of
a selector 48, which may be formed of tri-states.
Data from the FRA 31 is sent directly to a second
input of the selector 48, and the data from the FRA 31
is further supplied via a delay register 49 to a third
input thereof. The data selected by the selector 48
is supplied to an input of the adder 46. The output
of the adder 46 is supplied via the register 47 to
another input of the adder 46.
As indicated above, it is assumed that the
input data and the coefficient are x and a,
respectively. The arithmetic processing is then
performed as follows: the input data x and the
coefficient a are supplied to the multiplier 43 in the
first step; the (partial) product ax is stored in the
output register of the multiplier 43 in the second
step; and the (final) product ax is obtained through
the selector 48 and the adder 46 in the third step.
Consequently, when the absolute value of the
coefficient is less than 1, the product is obtained in
three steps, just as in the case of the apparatus of
Fig. 3.
In contrast, when the absolute value of the
coefficient is at least 1 and less than 2, the
apparatus of Fig. 4 requires fewer steps than the
apparatus of Fig. 3 to complete the calculation. In
this case, the calculation is performed with the
coefficient (a + 1 : |a| < 1). The input data x and the
coefficient a are supplied to the multiplier 43, and,
at the same time, the input data x is supplied to the
delay register 49 in the first step; the partial
product ax is supplied to the output register of the
multiplier 43, and, at the same time, the data x
supplied to the delay register 49 is delivered to the
register 47 via the selector 48 and the adder 46 in
the second step; and then the adder 46 adds the
partial product ax received from the output register
of the multiplier 43 to x received from the register
47, thereby obtaining the final product (1 + a)x in
the third step. Consequently, when the absolute value
of the coefficient is at least 1 and less than 2, the
final product is obtained also in just three steps, in
contrast to the four steps required by the apparatus
of Fig. 3.
When the absolute value of the coefficient
is at least 2 and less than 3, the arithmetic
operation is effected with the coefficient (a + 2 :
|a| < 1). In this case, the input data x and the
coefficient a are supplied to the multiplier 43, and,
at the same time, the input data x is delivered to the
delay register 49 and via the selector 48 and the
adder 46 to the register 47 in the first step; the
partial product ax is supplied to the output register
of the multiplier 43, and, at the same time, the sum x
+ x = 2x produced by the adder 46 is delivered to the
register 47 in the second step; and the partial
product ax of the output register of the multiplier 43
and the sum 2x stored in the register 47 are added in
the adder 46, thereby obtaining the final product (2 +
a)x, in the third step. Consequently, when the
absolute value of the coefficient is at least 2 and
less than 3, the final product is obtained also in
just three steps, in contrast to the five steps
required by the apparatus of Fig. 3.
Thus, unlike the apparatus of Fig. 3, the
apparatus of Fig. 4 described above can obtain the
final product in just three steps (one step for the
input and two steps for the processing) whenever the
absolute value of the coefficient is at least 1 and
less than 3. In many applications of the invention,
this encompasses all of the cases of interest. As a
consequence, no provision need be made for additional
delay time when the absolute value of the coefficient
is within that range, and the arithmetic processing
time can be minimized and held constant. In addition,
the processing program can also be quite easily
created.
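
The three-step operation of the apparatus of Fig. 4 can be traced with a small software model. The sketch below is an illustration only: the function name `fig4_multiply`, the restriction to non-negative coefficients, and the use of floating-point values are assumptions and do not come from the disclosure. The variable names mirror the reference numerals in the text.

```python
def fig4_multiply(x, coefficient):
    """Step through a Fig. 4 style multiplication; return (product, steps).

    Assumptions (not from the disclosure): the coefficient is non-negative
    and below 3, and is split as a + k with 0 <= a < 1 and k in (0, 1, 2).
    """
    if not 0 <= coefficient < 3:
        raise ValueError("sketch covers 0 <= coefficient < 3 only")
    k = int(coefficient)   # integral portion, handled by the bypass path
    a = coefficient - k    # fractional portion, sent to the multiplier

    # Step 1: x and a enter the multiplier 43; x also enters the delay
    # register 49. For k == 2, x additionally passes through the selector 48
    # and the adder 46 into the register 47.
    delay_49 = x
    reg_47 = x if k == 2 else 0.0

    # Step 2: the partial product reaches the multiplier output register.
    # For k >= 1, the delayed x is added into the register 47.
    mult_out = a * x
    if k >= 1:
        reg_47 = reg_47 + delay_49

    # Step 3: the adder 46 adds the partial product to the register 47.
    product = mult_out + reg_47
    return product, 3
```

For example, `fig4_multiply(4.0, 2.5)` forms 0.5 x 4 in the multiplier and 2 x 4 on the bypass path, and the step count is the same for every coefficient in the supported range.
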
If a detection circuit (not shown) in the
selector 48 is supplied with the output of the
coefficient memory 44, including the integral portion
thereof, the selector 48 can select the input data
automatically.
Practically, however, the data contents of
the coefficient memory 44 (including only the
respective a portions of the coefficients) are stored
simultaneously when the program for the processor is
stored in the microprogram memories 35-1 to 35-n (Fig.
2). Since only the a part of each coefficient is
stored, the selector is controlled by the program.
The same is true of the other selectors described
below.
In the apparatus of Fig. 4 described above,
for arithmetic processing with a coefficient of which
the value is 1, the input data x from the FRA 31 can
be directly obtained through the selector 48 and the
adder 46 in the first step, and the processing time
can be greatly reduced as compared with the
conventional case where the input data x is obtained
through the multiplier and the arithmetic operation is
executed with the coefficient (0.5 + 0.5).
FIG. 11 is a flowchart illustrating the
operation of the embodiment of Fig. 4 where the
coefficient is at least two (if the coefficient is
less than 2, the flowchart can be shortened). The
selector 48 is stepped to its left position and the
coefficient from the memory 44 and data from the FRA
31 are supplied to the multiplier 43. The
multiplication output of the multiplier 43 is supplied
to the adder 46, and the output of the adder 46 is
accumulated a first time in the register 47. The
selector 48 is stepped to its center position, and the
adder 46 and register 49 receive data from the FRA 31.
The adder 46 receives the output of the register 47
and adds it to the data from FRA 31 to produce a sum
that is stored in the register 47. The selector 48 is
stepped to its right position, and the adder 46
receives the output of the registers 49 and 47 and
produces a sum that is accumulated a third time in the
register 47. In the light of Fig. 11, those skilled
in the art will be able to prepare a flowchart for the
other embodiments of arithmetic processors according
to the invention on the basis of the description below
of their structure and function.
If the absolute value of the coefficient
can be restricted to have a value less than 2, the
register 49 can be omitted, as in FIG. 5. In this
case, when the program is designed to cause the output
register of the multiplier 43 to be "transparent", the
partial product derived by the multiplier 43 can be
immediately supplied to the adder 46 in the second
step, so that the arithmetic operation can be
performed in just two steps (one step for the input
and one step for the processing).
According to the present invention, a
bypass is established around the multiplier and hence
multiplication with a numeric value of which the
absolute value is at least 1 can be quite easily
performed.
FIG. 6 shows another embodiment of
apparatus constructed in accordance with the
invention. In this figure, the data from the FRA 31
and the data from the work memory 41 to be described
later undergo a selection in the selector 42 so as to
be supplied to an input of the multiplier 43. At the
same time, the data from the coefficient memory 44 is
delivered to another input of the multiplier 43.
Output data from the multiplier 43 is supplied to an
input of the arithmetic logic unit (ALU) 46. The
output of the ALU 46 and the data from the FRA 31 are
supplied to the selector 50, and the selected data is
delivered to a write input of the work memory 41. The
output of the ALU 46 is delivered via the register 47
to another input of the ALU 46.
The data supplied to the FRA 31 in this
apparatus is supplied via the selector 42 to the
multiplier 43 and is then multiplied by a coefficient
from the coefficient memory 44. The resultant data is
supplied to the ALU 46. The data is further subjected
to an operation such as addition to the data from the
register 47, and the resultant output of the operation
is extracted. At the same time, the output is
delivered to the work memory 41 (via the selector 50)
and to the register 47, and thereafter the arithmetic
processing is executed by use of the data written in
the work memory 41 and the data latched in the
register 47.
In this apparatus, the data supplied to the
FRA 31 is supplied via the selector 50 to the work
memory 41.
Consequently, in this apparatus, when the
amount of the input data exceeds the capacity of the
FRA 31, the excess data can be supplied via the
selector 50 to the work memory 41 so as to be stored
therein. Even when a great amount of data is to be
processed, the FRA 31 need have only a small capacity,
since the excess data can be written in the work
memory 41. A great amount of data can thus be handled
without lowering the efficiency of the FRA 31, which
facilitates efficient processing regardless of the
amount of data.
The read/write operations in the work
memory 41 can be effected in concurrence with the
arithmetic operation such as multiplication, and hence
the efficiency of the processing does not deteriorate.
According to the present invention, when
the input data exceeds the capacity of the input
register, the excess data can be written in the work
memory, and hence the data can be effectively
processed with a small input register regardless of
the amount of data.
In this apparatus, the data supplied to the
FRA 31 is delivered via the selector 50 to the work
memory 41 so as to be written therein. Moreover, the
data read from the work memory 41 and the data from
the FRA 31 are delivered via the selector 42 to the
multiplier 43, which effects a multiplication with the
coefficient from the coefficient memory 44, and the
resultant data is supplied to the ALU 46. The
obtained data and the data from the register 47 are
subjected to an operation such as addition to obtain
the output of the arithmetic operation, and the output
is supplied to the work memory 41 via the selector 50
and to the register 47. Thereafter, the arithmetic
operation is performed by using the data written in
the work memory 41 and the data latched in the
register 47.
In the apparatus described above, when an
operation such as so-called filter processing or
convolution processing is to be executed, a part of a
series of data or a partial series of data is written
in the work memory 41, and this data and the
coefficient from the coefficient memory 44 are
subjected to multiplication and addition by use of the
multiplier 43 and the ALU 46. In this case, however,
a predetermined period of time is necessary to write
in the work memory 41 the partial series of data
required in the filter processing, and the arithmetic
processing cannot be carried out at the same time. As
a result, the processing efficiency deteriorates.
In so-called filter processing, the partial
series of data to be used in the arithmetic operation
is sequentially processed in the overlapped state;
consequently, in many cases, an arbitrary portion of
the series of data is repetitively used by shifting
the sequence of the series of data when the processing
is next executed.
In the embodiment of Fig. 7, the data from
the FRA 31 and the data from the work memory 41 to be
described later are subject to a selection in the
selector 42 and the selected data is supplied to an
input of the multiplier 43. At the same time, the
data from the coefficient memory 44 is delivered to
another input of the multiplier 43. The
multiplication output of the multiplier 43 is
delivered to an input of the ALU 46, and the output of
the ALU 46 is supplied via the register 47 to another
input of the ALU 46. The data from the work memory 41
is supplied to the register 51. The data from the
register 51, the data from the FRA 31, and the output
of the ALU 46 are supplied to the selector 50. The
selected data from the selector 50 is delivered to the
write input of the work memory 41.
The data supplied to the FRA 31 in this
apparatus is supplied via the selector 50 to the work
memory 41 so as to be written therein. The data read
from the work memory 41 and the data from the FRA 31
are delivered via the selector 42 to the multiplier
43, which effects a multiplication with the
coefficient from the coefficient memory 44. The
resultant data is delivered to the ALU 46. The output
of the multiplier 43 and the data from the register 47
are subjected to an operation such as addition, and
the obtained output is supplied to the work memory 41
(via the selector 50) and to the register 47.
Thereafter, arithmetic processing is effected by using
the data written in the work memory 41 and the data
latched in the register 47.
The data read from the work memory 41 is
fed to the register 51, and the data from the register
51 is rewritten in the work memory 41 via the selector
50.
Consequently, in this apparatus, the data
written in the work memory 41 is read and is subjected
to an arithmetic operation. At the same time, the
data can be rewritten in the work memory 41 via the
selector 50. Thus, any data in the partial series of
data to be used also in the next processing is latched
in a register and the latched data is rewritten at an
address that has undergone a necessary shift; that is,
the amount of data to be written can be reduced and
hence the time required to write the data is
minimized.
For example, in a case where a one-address
shift for the next processing is to be executed, while
the data is read and is subjected to an arithmetic
processing, the data is latched in the register 51;
and when the next data is read after the processing,
the data of the register 51 is rewritten at the
address from which the readout has been effected. As
a result, the data is shifted and is rewritten. At
the same time, the system constituted by the register
51 and the work memory 41 is separated from the
arithmetic section; consequently, the rewrite
operation can be accomplished in concurrence with the
arithmetic processing, which greatly increases the
processing efficiency.
According to the present invention, since
the data written in the work memory can be rewritten
through the sequential shift operation, the necessary
portion of the partial series of data can be rewritten
for storage and thus the amount of data written in the
respective processings is reduced, thereby minimizing
the write time and improving the processing
efficiency.
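
The one-address shift performed through the register 51 can be modelled in software as a filter that rewrites each sample one address along while it is being read for the arithmetic. The following is a minimal sketch under stated assumptions: the function name `filter_output`, the convention that address 0 holds the newest sample, and the sample values are hypothetical and are not taken from the disclosure.

```python
def filter_output(work_memory, coefficients, new_sample):
    """Compute one filter output and shift the stored samples by one address.

    Assumptions (not from the disclosure): address 0 holds the newest
    sample, and one new sample is written per output.
    """
    work_memory[0] = new_sample      # the only fresh write for this output
    accumulator = 0.0
    register_51 = None               # latch for the value read previously
    for address, coefficient in enumerate(coefficients):
        value = work_memory[address]          # read for the arithmetic
        accumulator += coefficient * value    # multiply and accumulate
        if register_51 is not None:
            # Rewrite the latched value at the address just read, which
            # moves it one address along for the next output.
            work_memory[address] = register_51
        register_51 = value
    return accumulator


memory = [0.0, 0.0, 0.0]
taps = [0.5, 0.25, 0.25]
outputs = [filter_output(memory, taps, s) for s in (4.0, 8.0, 12.0, 16.0)]
print(outputs)
```

In this model only one new sample is written per output; the remaining samples are repositioned as a side effect of the reads, which corresponds to the reduction in write time described above.
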
In the embodiment of Fig. 8, the data from
the FRA 31 and the data from the work memory 41 are
supplied to the selector 42, and the selected data is
delivered to an input of the multiplier 43. The data
from the coefficient memory 44 is supplied to another
input of the multiplier 43. The data from the
coefficient memory 44 is supplied also to a register
52. The output of the multiplier 43 is fed to an
input of the ALU 46, which delivers output data
therefrom to the work memory 41 and via the register
47 to another input of the ALU 46. The register 52
may alternatively be connected to the other input
terminal of the multiplier 43.
In a case where the square of a coefficient
is to be calculated by this apparatus, the coefficient
from the coefficient memory 44 is supplied to the
register 52, and the data from the register 52 is
delivered via the selector 42 to an input of the
multiplier 43. At the same time, the same coefficient
from the coefficient memory 44 is fed to another input
of the multiplier, and the obtained product (square of
the coefficient) is supplied as an output by the ALU
46.
Since the output of the coefficient memory
44 is supplied to both inputs of the multiplier 43,
the square is quite simply calculated. With this
provision, operations such as the square of a
coefficient and the multiplication of two coefficients
can be simply accomplished, for example in LUT
processing, which considerably increases the
efficiency of the arithmetic operation.
According to the present invention, the
provision of a circuit for applying the output of the
coefficient memory 44 to both inputs of the multiplier
43 facilitates such operations as the squaring of a
coefficient in LUT processing and the like.
In the embodiment of Fig. 9, the data from
the FRA 31 and the data from the work memory 41 are
supplied to the selector 42, and the selected data
from the selector 42 is supplied to an input of the
multiplier 43. The data from the coefficient memory
44 and the output of the ALU 46 are supplied to the
selector 45, and the selected data therefrom is
supplied to another input of the multiplier 43. The
output of the multiplier 43 is fed to an input of the
ALU 46, and the output from the ALU 46 is supplied to
the selector 45, the register 53, and the work memory
41 and is further supplied via the register 47 to
another input of the ALU 46 at the same time. The
register 53 may alternatively be connected to the
other selector 45.
In a case where the square of data from the
FRA 31 is to be calculated, the data from the FRA 31
is fed to the registers 47 and 53 via the selector 42,
the multiplier 43, and the ALU 46. Next, the data
from the register 53 is delivered via the selector 42
to an input of the multiplier 43; at the same time,
the data from the register 47 is fed to another input
of the multiplier 43 via the ALU 46 and the selector
45. The obtained product (square of the data) becomes
the output of the ALU 46.
Since the output from the ALU 46 can be
supplied to both inputs of the multiplier 43, the
squaring operation can be quite easily performed. In
addition, since the output of the ALU 46 can be
supplied to either input of the multiplier 43, the
output from the ALU 46 can be arbitrarily multiplied
by the coefficient from the memory 44, the output of
the ALU 46, the data from the FRA 31, or the data from
the work memory 41, thereby considerably improving the
efficiency of the arithmetic operation.
According to the present invention, there
are provided respective routes for supplying the
output from the ALU to two inputs of the multiplier,
which greatly facilitates arithmetic operations such
as the squaring of numeric data.
Fig. 10 shows a preferred embodiment of
apparatus according to the invention applied to the
arithmetic sections 32-1 to 32-n and 33-1 to 33-n
(Fig. 2) of the PIP 3A (Fig. 1) of the digital signal
processing system.
In Fig. 10, the arithmetic section of the
PIP comprises two systems including parts A (on the
left side of the figure) and B (on the right side of
the figure). Each part comprises a coefficient
memory, a work memory, a multiplier, an ALU and a
register to perform the basic arithmetic operations
necessary to effect the signal and graphics
processing.
Each of the coefficient memories A CM and B
CM includes 1024 x 16 bits, and the memory contents
can be exchanged through the program change control
36-1 to 36-n (Fig. 2) of the PIP. However, the
contents cannot be read from apparatus on the PIP.
The coefficient memory is disposed to store data such
as coefficients necessary for the processing. For
example, the coefficients of a digital filter, sine
and cosine values of FFT (fast Fourier transform), and
addresses of the A CM and B CM are commonly used.
However, no problem arises, because the contents of
the A CM and B CM can be independently supplied by the
TC 4. The output from the A CM is supplied as an
input to the A1 MUX or A1 REG, and the output from the
B CM is supplied as an input to the B1 MUX or B1 REG.
The contents of the A1 REG and B1 REG are delivered to
the respective outputs at the next clock pulse CLK.
Each of the multipliers A MPY and B MPY is
a 16 bit x 16 bit parallel multiplier. Input x of the
A MPY is supplied with the output value of the A CM
selected by the A1 MUX or the output value of the A
ALU, whereas input y is supplied with one of the
output values of the A1 REG, PL REG, A6 REG, B7 REG,
or FRA selected by the A2 MUX. The PL REG is a
register circuit in which the PL value of the
microprogram is stored. (Refer to a manual of
Advanced Micro Devices AM2910. The micro instructions
are stored with condition or jump addresses and can
also be the stored data itself.) The A6 REG and B7
REG are register circuits to store the outputs from
the work memories A TM and B TM, respectively. The
FRA 31 comprises a group of shift registers having a
variable structure and being controlled by the
processors (PVP 3B and TC 4) other than the PIP 3A and
is used as an external input port of the PIP 3A. The
structure can be changed according to the processing
and can be shifted when necessary. The output from
the multiplier A MPY includes 32 bits. From the
output, the 16-bit MSB and the 16-bit LSB can be
respectively extracted in different cycles. The
16-bit LSB may be obtained from the [illegible] input. The A1
REG is disposed to enable a squaring of the contents
of the A CM and a multiplication of the different
contents. Part B is nearly the same as part A.
However, the output of the PL REG cannot be selected
by the B2 MUX, which has only four inputs instead of
the five of the A2 MUX. Since the FRA 31 has two
ports, the same data can be read from parts A and B at
the same time.
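
The extraction of the upper and lower halves of a 32-bit product can be sketched as follows. The sketch assumes signed two's-complement operands, which the text above does not state, and the function name `multiply_16x16` is hypothetical.

```python
def multiply_16x16(x, y):
    """Multiply two signed 16-bit integers; return (msb16, lsb16).

    The two halves are returned as unsigned 16-bit patterns of the
    32-bit two's-complement product. The number format is an assumption.
    """
    if not (-32768 <= x <= 32767 and -32768 <= y <= 32767):
        raise ValueError("operands must fit in 16 signed bits")
    product = (x * y) & 0xFFFFFFFF    # keep the 32-bit two's-complement pattern
    msb16 = (product >> 16) & 0xFFFF  # upper 16 bits
    lsb16 = product & 0xFFFF          # lower 16 bits
    return msb16, lsb16
```

If the operands are further read as fractions with an absolute value below 1 (an additional assumption), `0x4000` stands for 0.5, and the upper half of `multiply_16x16(0x4000, 0x4000)` carries the leading bits of 0.25.
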
Each A ALU and B ALU is an arithmetic logic
unit in which logical operations such as addition,
subtraction, OR, and AND can be performed. The A ALU
is supplied with the output of the A MPY, the
selection output of the A2 MUX, the output of the A2
REG, or the output of the A3 REG. The B ALU is
supplied with the output of the B MPY, the selection
output of the B2 MUX, the output of the B2 REG, or the
output of the B3 REG. More strictly, the MUX
selection results in a selected output or no
selection. The A2 REG and the B2 REG are employed
because neither the A MPY nor the B MPY can perform a
multiplication on an input having a value equal to or
more than one. For example, in a case where a
coefficient of 1.5 is multiplied by an input from the
FRA 31, the multiplier multiplies the input by 0.5.
At the same time, the data is sent to the A2 REG or
the B2 REG, thereby accomplishing a multiplication
with a coefficient equal to or more than one. The A3
REG and the B3 REG link part A to part B. For
example, these registers are used in a case where an
operation to obtain a sum of products in a digital
filter is performed in parts A and B and each output
is used to obtain a final result. The output from the
A ALU is fed to the A4 MUX, the A1 MUX, and the B3
REG, whereas the output from the B ALU is delivered to
the B4 MUX, the B1 MUX, and the A3 REG. The A4 MUX is
used to select one of the outputs from the A ALU, the
IN REG, and the FRA 31.
The IN REG is an external input port. The
output selected by the A4 MUX is supplied to the A4
REG, the OUT1 REG, the OUT2 REG, and the B4 MUX. The
A4 REG is used to store the input to the work memory A
TM. The OUT1 REG and the OUT2 REG are output ports of
the PIP and are controlled so that data can be
independently sent thereto. The B4 MUX is used to
select one of the outputs of the B ALU, the A4 MUX,
and the C ALU.
The outputs of the A4 REG and the A5 REG
undergo a selection by the A5 MUX, and the selected
output is stored in the A TM, the A6 REG, the A7 REG,
and the A5 REG. The data can naturally be stored in
any one thereof. The A TM has a bidirectional
input/output function. When an output is effected by
the A TM, neither of the outputs from the A4 REG and
the A5 REG is selected by the A5 MUX, and the output
of the A TM is stored in the A5 REG, the A6 REG, and
the A7 REG. The A5 REG serves to shift the address of
the A TM. More concretely, the delay processing of
the digital filter can be effectively performed. The
A7 REG is a register to send data from part A to part
B. The output of the A7 REG is delivered to the B2
MUX. This provision is effective for a shading
operation in which data is squared in part A and the
resultant data is multiplied by a value in part B.
Since this applies also to part B, the description
thereof will be omitted.
The C ALU is located at an intermediate
point between the arithmetic section and the control
section. The data selected by the A3 MUX is supplied
as an input to the C ALU and, after undergoing an
arithmetic operation in the C ALU, is transmitted to
the CM REG, the TM REG, the VECT REG, and the B4 MUX.
The arithmetic function of the C ALU is the same as
that of the A ALU and the B ALU. The CM REG is a
register circuit to store the addresses of the
coefficient memories A CM and B CM, and the TM REG is
a register circuit to store the addresses of the work
memories A TM and B TM. The VECT REG is a register
circuit to store the iteration count of a program loop
and the jump destination to be used in the program
controller (PRGCNT) of the control section. Through
the bus to the B4 MUX, the result of an arithmetic
operation in the C ALU can be returned to the
processing section. This enables use of the C ALU
also as an auxiliary apparatus for the A ALU and the B
ALU.
With the provision of the CM REG and TM
REG, the data of the processing section can be used as
addresses of the coefficient memory and the work
memory, and hence look-up table processing is
facilitated. In a case where FFT (fast Fourier
transform) processing is to be effected, butterfly
operation is achieved by use of the A MPY, the A ALU,
the B MPY, and the B ALU, and the addresses of the A
TM and the B TM storing data and the addresses of the
A CM and the B CM containing coefficients (sin, cos)
are computed by use of the C ALU. For butterfly
operation, the real part and the imaginary part of
each complex number are processed simultaneously in
parts A and B, respectively. Since the arithmetic
operations of the real and imaginary parts can be
accomplished at the same time, the load of the
addressing operation for the data and coefficients can
be reduced; consequently, the overall processing
efficiency is improved and the processing speed is
increased. This is an effect obtained by the
provision of two systems including parts A and B. The
TM REG and the CM REG comprise four registers, and
hence the same address need not be calculated in the C
ALU, which increases the efficiency thereof.
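
The division of a butterfly operation into a real half and an imaginary half can be sketched as follows. The sketch assumes the common radix-2 form (p + w·q, p − w·q); the disclosure states only that the real and imaginary parts are processed in parts A and B, so the function name `butterfly` and the exact form of the operation are assumptions.

```python
def butterfly(p, q, w):
    """Radix-2 butterfly on (real, imag) pairs: returns (p + w*q, p - w*q)."""
    pr, pi = p
    qr, qi = q
    wr, wi = w   # cos and sin values, as would come from the coefficient memories

    # Part A: real component of w*q and of both outputs.
    tr = wr * qr - wi * qi
    upper_r, lower_r = pr + tr, pr - tr

    # Part B: imaginary component, computed independently of part A.
    ti = wr * qi + wi * qr
    upper_i, lower_i = pi + ti, pi - ti

    return (upper_r, upper_i), (lower_r, lower_i)
```

The two halves share their inputs but not their intermediate results, which is what allows them to proceed at the same time in two separate arithmetic systems.
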
Although, in this example, because of
physical restrictions such as the size of the circuit
board, parts A and B are not symmetrical, the circuits
may be made symmetrical.
Many modifications of the preferred
embodiments of the invention disclosed above will
readily occur to those skilled in the art upon
consideration of this disclosure. All such
modifications are intended to be included within the
invention, and the invention is limited only by the
appended claims.





Administrative Status

Title                   Date
Forecasted Issue Date   1990-06-26
(22) Filed              1987-04-29
(45) Issued             1990-06-26
Deemed Expired          1994-12-26

Abandonment History

There is no abandonment history.

Payment History

Fee Type                                  Anniversary Year  Due Date    Amount Paid  Paid Date
Application Fee                           -                 -           $0.00        1987-04-29
Registration of a document - section 124  -                 -           $0.00        1987-10-02
Maintenance Fee - Patent - Old Act        2                 1992-06-26  $100.00      1992-06-12
Maintenance Fee - Patent - Old Act        3                 1993-06-28  $100.00      1993-06-11

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
HASEBE, ATSUSHI
SONY CORPORATION
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

Document Description    Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Drawings                1993-09-22         8                212
Claims                  1993-09-22         6                184
Abstract                1993-09-22         1                28
Cover Page              1993-09-22         1                33
Representative Drawing  2002-03-05         1                12
Description             1993-09-22         35               1,331
Fees                    1993-06-11         1                32
Fees                    1992-06-12         1                29