Note: Descriptions are shown in the official language in which they were submitted.
wo 91/10188 1;;~ 53 Pcr/usso/07034
METHOD AND PROCESSOR FOR
HIGH-SPEED CONVERGENCE FACTOR
DETERMINATION
Traditional implementations of floating-point convergent
division algorithms and square root algorithm determinations
are hampered by the lack of efficient computation of
l 0 convergence factors within both algorithm~. Convergence
algorithms utilize repeated multiplication to achieve results of
a predetermined accuracy. A feasible range of a modified
input value, X, has been shown to be O ~ X < 2. Further
limitation of such a rangc would expedite algorithmic
l 5 determinations.
In add tion, determination of a convergence factor for
convergent di ision generally requires a normalization step, a
subtraction operation requiring a carry-propagate step, and
another normalization stcp. Determination of a convergence
0 factor for a square root computation, in addition~ requires a
scaling computation. Evaluation of a convergence factor for
both thc convergent division and the square root determination
thus requires more computation cycles than a f!oating-point
multiplication computation. Because the convergence factors
2 5 occur serially in both computations, the inefffciency of the
computation of the convergence factors has directly reduced
the efficiency of computation within both the convergent
division and square root determination algorithms.
The necd exists for a more efficient method of
30 determining convergencc factors for convergent division ~nd
for square root determination to facilitate these two over-all
processes within a digital platform~
. :
i
WO 91tlO188 . PCr/US9~)/07034
~:~35~7~ ~3
2 ;
Summarv of the In~ention
The present invention enhances the efficiency of
determining convergence factors for floating-point conver~ent `.
division and square root determination in a numeric processor,
such as a digital signal processor, by expediting the
convergence factor determinations via combinational logic and
by assuming a limited range for a modified input value for a
convergence factor determination. This approach avoids a
subtraction and carry-propagation operation, thereby allowing
both convergent algorithm computations to be limited
primarily by a multiplier latency time factor.
~ri~f DescriDtion of the Drawin~s
FIG. I is a block diagram depicting one embodiment of
the invention.
FIG. 2 is a block diagram depicting one embodiment of a
determination of a new normalized significand for enhancing
- convergence factor de~ermination in a convergent division
computation .
FIG. 3 is a block diagram depicting one embodiment of a
determination of a new normalized significand for enhancing
convergencc factor determination in a square root
determination.
FIG. 4 is block diagram of a computer hardware
2 5 implementation of the invention.
~est Mode For CarrvinQ out the ln~en~ion
FIG. 1, gcnerally depicted by the numeral 100, is a flow
chart setting forth one embodiment of the present invention
3 0 that utilizes determination of a ncw normalized significand (f)
via an expedient combinational logic, as opposed to multiple
operations, to determine a new value for X (116), being X', and
-~ obtains a conYergence factor utilized in a mathematical
;~ determination, being convcrgent division or square root
WO 91/10188 PCl/US90/07034
53
determination, that is iterated until a solution with a
predetermined degree of accuracy is obtained ( I 18, 120).
To commence convergent division and square root
S determination utilizing floating-point arithmetic in accordance
with one embodiment of the invention, a numeric processor
(such as a digital signal processor) utilizes a processing
platform to check an input X to determine whether or not X is
equal to + ~, + O, or Not-a-Number (NaN) (102). If X is equal
to +oo, +0, or NaN, an error-check mechanism bypasses
convergence factor determination according to a rationale tha
utilizes a range limitation of X to O.S < X < B (122, B being set
out more specifically below). If X is other than + oo, +0, or
NaN~ X is modified by multiplication by a suitable seed value S,
1 S S - l/X being a workable value, to obtain an X in a range of
0.5 < X < B such that B is assigned 1.5 for convergent division
and 2.0 for square root determination (103, 104)~ Set X to X =
2e ~ f, provided that 1 < f < 2 (lOS). The processing platform
then checks whether or not 0.5 < X < 1.0 (106). If O.S < X <
1.0, the platform sets e = -1 (110). If it is not true that O.S < X
< 1.0, then 1.0 X < B (108), and e is set to O (112). When e
has been set (110, 112), f' is determined (114; additional detail
:: ~ ; regarding the determination of f will be provided below), and
X' is also~determined (116)~ where X' = 2e~ f l`he platform
: 25 utilizcs X' to compute a convergence factor for the
:: ~ mathematical detetmination being invoked, being convergent
: division: or square root determination (118). Then a
convcrgencc algorithm is computed for the mathematical
determination being invoked, convergent division or square
3 0 root determinatic . and is itcrated until a solution with a
predetcrmined degree of accuracy is obtained ( 120).
` :: In one cmbodiment of a convergence factor
~: determination for a convergent division determination, set
forth in FIG. 2 and depicted generally by the numeral '~00,
-
:,
wo gl/10188 ;~ ,3~ Pcr/US90/07034
after e has been set ( 1 10, 1 12), the processing platform biases
the exponent e in accordance with the IEEE 754-1985 floating-
point standard using an odd bias, yielding a biased exponent
with a least significant bit (lsb)(206). If the least signi~lcant bit r
5 of the biased exponent is 1 (214J, a binary f value of l.wxyzO is
selected (212, 213). If the least significant bit of the biased
exponent is 0, a binary f value of l .1 vwxy is selected (212,
215). The selected value of f to the right of the binary decimal
,~ is inverted (216) to yield a binary number of 1.0v'w'x'y' for f
if the related least significant bit is 0 and 1.w'x'y'z'1 for f if the
related least significant bit is 1 (218). In addition, the least
significant bit of the biased exponent e is inverted (208~ to
form a final biased exponent for f (21Q).
In one embodiment of a convergence factor
determination for a square root determination, set forth in FIG.
3 and depicted generally by the numeral 300, after e has been
set (112, 114), the proccssing platforrn biases the exponent e in
accordance with thc IEEE 754-1985 floating-point standard
using an odd bias, yielding a biased exponent with a least
significant bit (306). If the least significant bit of the biased
~- ~ exponent is 1 (3l4), a binary f value of l.vwxyz is selected
`~ (312, 313). If thc least significant bit of the biased exponent is
` ~ 0 (314), a binary f valuc of l.llvwx is selected (312, 3I5).
The selected value of f to thc rigbt of the binary decimal is
2 5 then invertcd (316) to yield a binary number of 1.00v'w'x' if
the related Icast signîficant bit is 0 and l.v'w'x'y'z' if the
relatcd least significant bît is 1 (318). In addîtion, the least
sîgnificant bit of thc bîased exponent e is inverted (308) to
form a final biased cxponent for f' (310?.
3 0 It should be noted that, both in convergent divîsion and
square root determînation, convergence factor determination
wîll incur one lsb of error, The cause of the onc lsb of error is
due to the usc of the one's complement rather than the two's
complement opcratîon. While the two's complement wîll not
~'
WO 91/10188 PCT~/US90/07034
35~
give the one lsb of error, the two's complement requires a
carry propagation which slows down the convergencc factor
determination. The one's complement avoids this carry ~
propagation and is preferred because of its speed. The one lsb
S of error is negligible in convergence algorithms since the
computation is typically performed using extended precision
hardware and is then rounded to a lower precision.
In the case of X = 1, the onc lsb ot error will cause the
result to be slightly less than one, speci~lcally, one lsb less than
10 one, and the exponent lsb changes accordingly.
FIG. 4 depicts a hardware implementation of the present
invention, generally depicted by the numeral 400. A computer
program for implementation of the present invention may be
stored in the program memory (404), other memory (412), or
15 may be embodied in hardware in the arithmetic logic unit
(ALU) (406) by allocation of data storage means and data
manipulation means of a numeric processor. In one
embodiment, the program contro I unit (402) utilizes the bus
(410) to select a computer program to implement the present
0 invention, and the status register determines whether X = + ~.
+ O, or NaN (408). If X - + ~, :i: 0, or NaN, a processing
platform processes an error-check mechanism, and processing
- stops. If X is other than ~ 0, or NaN, the ALU converts X
to a value X such that O.S S X < B, where multiplication by
25 -1/X is a workabl`e conversion and ~ is selected as 1.5 for
convergent division and as 2.0 for square root determination
(406).
Por convergent division a primary selecting means of the
ALU selects a valuc of negative one for e where O.S S X c 1.0,
30 and a secondary selecting means of the ALU selects a value of
zero for e where 1.0 S X ~ 1.5 (406). For square root
determination a tertiary selecting means of the ALU selects a
value of ncgative one for e where 0.5 S X < 1.0, and a
quaternary means of the ALU selec~s a value of zero for e
wo 91/101~8 Pcr/usso/0703
where l.0 S X < 2.0 (406). This exponent e is biascd with an
odd bias value according to the IEEE 754-1985 floating-point
standard. The ALU computes a value of f, where f = X/2
provided that 1 ~i f < 2, and outputs f as a series of binary bits
consisting of a most significant bit, being one, to the left of a
point and remaining binary bits to the nght of the binary point.
generally descnbed here as l.vwxyz (406). For convergent
division and 0.5 S X ~ l.0 such that the lsb of the biased
exponent is equal to zero, the ALU selects an output of a most
signi~lcant binary bit of one to the left of the binary point and
of one followed by the binary bits to the right of the binary
point for the new binary bits to the right of the binary point
(406). For convergent division and l.0 < X < l.5 such that the
lsb of the biased exponent is equal to l, the ALU selects an
l 5 output of a most significant binary bit of one to the left of the
binary point and of the initially second and all following binary
~: bits to the right of the binary decimal point together with a
zero for thc new binary bits to the right of thc binary point
(406).
For squarc root determination and 0.5 c X < l.0 such
that the lsb of biased exponent is equal to zero, the ALU selects
an output of a most significant binary bit of one to thc left of
the binary point and of two ones followed by all bits initially to
the right of thc binary decimal for the new binary bits to the
'7 S right of thc binary point (406). For square root detcrmination
and l.0 S X < 2.0 such that thc lsb of the biased exponent is
equal to one, the ALU sclccts an output of a most significant
binary bit of onc to thc left of the binary point and of all bits
initially to the right of the binary point for thc ncw binary bits
3 0 to the right of thc binary point (406). Thè ALU then inverts all
bits to the right of thc binary decimal point for all
mathcmatical dcterminations described abo-le, the output
being f (406). The ALU also complements the lsb of the biased
exponent to dcterminc the new biased exponent (406).
~0 91/10188 PCI/US90/07034
3~J~3
For all such determinations, the ALU determines X',
where X' = 2e ~ f (406). For convergent division the ALU
determines a convergence factor of 2.0 - X' and for square~ root
determination the ALU deaérmines a convergence factor of 1.5
5 - O.5X' (406~. The ALU utilizes one or more data manipulation
and storage devices for computing a convergent division
algorithm or a square root determination employing a related
convergence factor until a solution with a predetermined
degree of accuracy is obtained (406). :~
What is claimed is: ;