Patent 2082641 Summary

(12) Patent Application:	(11) CA 2082641
(54) English Title:	HIERARCHICAL ENTROPY CODED LATTICE THRESHOLD QUANTIZATION ENCODING METHOD AND APPARATUS FOR IMAGE AND VIDEO COMPRESSION
(54) French Title:	METHODE ET APPAREIL DE CODAGE A QUANTIFICATION DE SEUIL HIERARCHIQUE POUR LA COMPRESSION DE DONNEES VIDEO
Status:	Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication

Bibliographic Data

(51) International Patent Classification (IPC):	H4N 7/12 (2006.01) G6T 9/00 (2006.01)
(72) Inventors :	BAKER, RICHARD L. (United States of America) BERNSTEIN, JEFFREY (United States of America) GIROD, BERND (United States of America) YUAN, XIANCHENG (United States of America) THOMPSON, EDMUND (United States of America)
(73) Owners :	PICTURETEL CORPORATION
(71) Applicants :	PICTURETEL CORPORATION (United States of America)
(74) Agent:	RICHES, MCKENZIE & HERBERT LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date:	1991-05-10
(87) Open to Public Inspection:	1991-11-12
Availability of licence:	N/A
Dedicated to the Public:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	Yes
(86) PCT Filing Number:	PCT/US1991/003286
(87) International Publication Number:	US1991003286
(85) National Entry:	1992-11-10

(30) Application Priority Data:

Application No.	Country/Territory	Date
522,602	(United States of America)	1990-05-11

Abstracts

English Abstract

ABSTRACT OF THE DISCLOSURE
A method and apparatus for encoding interframe
error data in an image transmission system, and in
particular in a motion compensated image transmission system
for transmitting a sequence of image frames from a
transmitter to a receiver, Employ hierarchical entropy coded
lattice threshold quantization to increase the data
compression of the images being transmitted. The method and
apparatus decimate an interframe predicated image data and
an uncoded current image data, and apply hierarchical
entropy coded lattice threshold quantization encoding to the
resulting pyramid data structures. Lossy coding is applied
on a level-by-level basis for generating the encoded data
representation of the image difference between the predicted
image data and the uncoded original image. The method and
apparatus are applicable to systems transmitting a sequence
of image frames(or other pattern data, such as speech) both
with and without motion compensation.

Claims

Note: Claims are shown in the official language in which they were submitted.

WO91/18477 PCT/US91/03286
-48-
What is claimed is:
1. A method for encoding interframe error data,
in an image transmission system, for transmitting a
sequence of image frames, said method comprising the
steps of
decimating an interframe predicted image
data for a current image frame for generating a
prediction pyramid data structure having a plurality
of decimation levels,
decimating an uncoded current image data
representing the uncoded current image frame for
generating a current image pyramid data structure
having said plurality of decimation levels, and
applying hierarchical entropy coded, lattice
threshold quantization encoding to said prediction
and current image pyramid data structures on d
selected level by level basis for generating an
encoded data representation of the difference between
the predicted image data and the uncoded current image
data.
2. A method for encoding interframe error data,
in an image transmission system, for transmitting a
sequence of image frames, said method comprising the
steps of
decimating an interframe predicted image
data for a current image frame for generating a
prediction pyramid data structure having a plurality
of decimation levels,
decimating an uncoded current image data
representing the uncoded current image frame for
generating a current image pyramid data structure
having said plurality of decimation levels, and

WO91/18477 PCT/US91/03286
-49-
applying hierarchical lattice threshold
quantization encoding to said prediction and current
image pyramid data structures on a selected level by
level basis for generating an encoded data
representation of the difference between the predicted
image data and the uncoded current image data,
3. The method of claim 1 wherein said applying
step comprises the steps of
applying said hierarchical encoding to said
data structures on a block-by-block basis, and
blurring blocks of the difference
representation when a predicted image data fails to
adequately represent a block portion of said current
image at a pyramid structure level.
4. The method of claim 1 wherein said applying
step comprises the step of
employing arithmetic coding for generating,
in part, said encoded representation.
5. The method of claim 1 further comprising the
steps of
applying said encoding to said data
structures of a level on a block by-block basic, and
shifting block location boundaries from
frame to frame of said sequence of image frames for
improving the encoding efficiency.
6. The encoding method of claim 1 further
comprising the step of
applying an E-8 lattice for said lattice
threshold quantization.

WO91/18477 PCT/US91/03286
-50-
7. The method of claim 1 wherein said applying
step comprises the steps of
subtracting, at a top level, the predicted
image data structure from the current image data
structure for generating a top level output image,
forming a warped interpolation error data
structure at the lower levels by taking the difference
between the predicted image data at the lower level
and the interpolated predicted image data at the next
higher level, and
forming a prediction image at each lower
level by combining an interpolated output image of the
next higher level with a warped interpolation error
data structure at the lower level.
a. The method of claim 7 wherein said
prediction forming step further comprises the step of
interpolating the output image from the next
higher level for generating said interpolated output
image.
9. The method of claim 7 further comprising the
step of
applying said lattice quantization and
coding only to the prediction image error at a bottom
plurality of levels of the pyramid data structures,
transmitting only blur information at the
next to highest level, and
transmitting only a scalar quantization data
at the top level of processing.

WO91/18477 PCT/US91/03286
-51-
10. An apparatus for encoding interframe error
data in an image transmission system for transmitting
a sequence of image frames, said apparatus comprising
means for decimating an interframe predicted
image data for a current image frame for generating a
prediction pyramid data structure having a plurality
of decimation levels,
means for decimating an uncoded current
image data representing the uncoded current image
frame for generating a current image pyramid data
structure having said plurality of decimation levels,
and
means for applying a hierarchical entropy
coded, lattice threshold quantization encoding to said
prediction and current image pyramid data structures
on a selected level by level basis for generating an
encoded data representation of the difference between
the predicted image data and the uncoded current image
data.
11. The apparatus of claim 10 wherein said
applying means further comprises
means for applying said hierarchical
encoding to said data structures on a block-by block
basis, and
means for blurring blocks of the difference
representation when a predicted image data fails to
adequately represent a block portion of said current
image at a pyramid structure level.
12. The apparatus of claim 10 wherein said
applying means further comprises
arithmetic coding means for generating, in
part, said encoded representation.

WO91/18477 PCT/US91/03286
-52-
13. The apparatus of claim 10 further comprising
means for applying said coding to said data
structures of a level on a block-by-block basis, and
means for shifting block location boundaries
from frame to frame of said sequence of image frames
for improving encoding efficiency.
14. The encoding apparatus of claim 10 wherein
said applying means further comprises
means for using an E-3 lattice for said threshold
quantization encoding.
15. The apparatus of claim 10 wherein said
applying mean comprises
means for subtracting, at a top level, the
predicted image data structure from the current image
data structure for generating a top level output
image,
means for forming a warped interpolation
error data structure at the lower levels by taking the
difference between the predicted image data at the
lower level and the interpolated predicted image data
at the next higher level, and
means for forming a prediction image at each
lower level by combining an interpolated output image
of the next higher level with a warped interpolation
error data structure at the lower level.
16. A method for encoding interframe error data,
in a motion compensation image transmission system,
for transmitting a sequence of image frames, said
method comprising the steps of
decimating an interframe predicted image
data for a current image frame for generating a

WO91/18477 PCT/US91/03286
-53-
prediction pyramid data structure having a plurality
of decimation levels,
decimating an uncoded current image data
representing the uncoded current image frame for
generating a current image pyramid data structure
having said plurality of decimation levels, and
applying hierarchical entropy coded, lattice
threshold quantization encoding to said prediction and
current image pyramid data structures on a selected
level by level basis for generating an encoded data
representation of the difference between the predicted
image data and the uncoded current image data.
17. The method of claim 16 wherein said applying
step comprises the steps of
applying said hierarchical encoding to said
data structures on a block-by-block basis, and
blurring blocks of the difference
representation when a predicted image data fails to
adequately represent a block portion of said current
image at a pyramid structure level.
18. The method of claim 16 wherein said applying
step comprised the step of
employing arithmetic coding for generating,
in part, said encoded representation.
19. The method of claim 16 further comprising
the step of
applying said encoding to said data
structures of a level on a block-by-block basis, and
shifting block location boundaries from
frame to frame of said sequence of image frames for
improving the encoding efficiency.

WO91/18477 PCT/US91/03286
-54-
20. The method of claim 16 wherein said applying
step comprises the steps of
subtracting, at a top level, the predicted
image data structure from the current image data
structure for generating a top level output image,
forming a warped interpolation error data
structure at the lower levels by taking the difference
between the predicted image data at the lower level
and the interpolated predicted image data at the next
higher level, and
forming a prediction image at each lower
level by combining an interpolated output image of the
next higher level with a warped interpolation error
data structure at the lower level.
21. The method of claim 20 wherein said
prediction forming step further comprises the step of
interpolating the output image from the next
higher level for generating said interpolated output
image.
22. The method of claim 20 further comprising
the step of
applying said entropy coded, lattice
threshold quantization and coding only to the
prediction image error at a bottom plurality of levels
of the pyramid data structures,
transmitting only blur information at the
next to highest level, and
transmitting only a scalar quantization data
at the top level of processing.

WO91/18477 PCT/US91/03286
-55-
23. An apparatus for encoding interframe error
data in a motion compensation image transmission
system for transmitting a sequence of image frames,
said apparatus comprising
means for decimating an interframe predicted
image data for a current image frame for generating a
prediction pyramid data structure having a plurality
of decimation levels,
means for decimating an uncoded current
image data representing the uncoded current image
frame for generating a current image pyramid data
structure having said plurality of decimation levels,
and
means for applying a hierarchical entropy
coded, lattice threshold quantization encoding to said
prediction and current image pyramid data structures
on a selected level by level basis for generating an
encoded data representation of the difference between
the predicted image data and the encoded current image
data.
24. The apparatus of claim 23 wherein said
applying means further comprises
means for applying said hierarchical
encoding to said data structures on a block-by-block
basis, and
means for blurring blocks of the difference
representation when a predicted image data fails to
adequately represent a block portion of said current
image at a pyramid structure level.
25. The apparatus of claim 23 wherein said
applying means further comprises
arithmetic coding means for generating, in
part, said encoded representation.

WO91/18477 PCT/US91/03286
-56-
26. The apparatus of claim 23 further comprising
means for applying said coding to said data
structures of a level on a block-by-block basis, and
means for shifting block location boundaries
from frame to frame of said sequence of image frames
for improving encoding efficiency.
27. The apparatus of claim 23 wherein said
applying means comprises
means for subtracting, at a top level, the
predicted image data structure from the current image
data structure for generating a top level output
image,
means for forming a warped interpolation
error data structure at the lower levels by taking the
difference between the predicted image data at the
lower level and the interpolated predicted image data
at the next higher level, and
means for forming a prediction image at each
lower level by combining an interpolated output image
of the next higher level with a warped interpolation
error data structure at the lower level.
28. A method for encoding interframe error data,
in an image transmission system, for transmitting a
sequences of image frames, said method comprising the
steps of
forming a difference image representing, on
a pixel-by-pixel basis, the difference between a
predicted image data for a current image frame and an
uncoded current image data representing the uncoded
current image frame,

WO91/18477 PCT/US91/03286
-57-
decimating said difference image for
generating a difference image pyramid data structure
having a plurality of decimation levels, and
applying hierarchical entropy coded, lattice
threshold quantization encoding to said difference
image pyramid data structure on a selected level by
level basis for generating an encoded data
representation or the difference between the predicted
image data and the uncoded current image data.
29. The method of claim 28 further comprising
the step of
forming said predicted image data using
interframe motion compensation.
30. The method of claim 28 wherein said applying
step comprises the steps of
applying said hierarchical encoding to said
data structures on a block-by-block basis, and
blurring blocks of the predicted image
representation when a predicted image data fails to
adequately represent a block portion of said current
image at a pyramid structure level.
31. The method of claim 28 wherein said applying
step comprises the step of
employing arithmetic coding for generating,
in part, said encoded representation.
32. The method of claim 28 further comprising
the step of
applying said encoding to said data
structures of a level on a block-by-block basis, and

WO91/18477 PCT/US91/03286
-58-
shifting block location boundaries from
frame to frame of said sequence of image frames for
improving the encoding efficiency.
33. The method of claim 28 wherein said applying
step comprises the step of
forming an interpolation error data
structure at the lower levels by taking the difference
between the difference image data at the lower level
and an interpolated reconstructed difference image
data at the next higher level.
34. A method for encoding interframe error data,
in an image transmission system, for transmitting a
sequence of image frames, said method comprising the
steps of
forming a difference image representing, on a
pixel-by-pixel basis, the difference between a
predicted image data for a current image frame and an
uncoded current image data representing the uncoded
current image frame,
decimating said difference image for generating a
difference image pyramid data structure having a
plurality of decimation levels, and
applying hierarchical lattice threshold
quantization uncoding to said difference image pyramid
data structure on a selected level by level basis for
generating an uncoded data representation of the
difference between the predicted image data and the
uncoded current image data.
35. An apparatus for encoding interframe error
data in an image transmission system for transmitting
a sequence of image frames, said apparatus comprising

WO91/18477 PCT/US91/03286
-59-
means for forming a difference image
representing, on a pixel-by-pixel basis, the
difference between predicted image data for a current
image frame and an uncoded current image data
representing an uncoded current image frame,
means for decimating said difference image
for generating a difference image pyramid data
structure having a plurality of decimation levels, and
means for applying a hierarchical entropy
coded, lattice threshold quantization encoding to said
difference image pyramid data structure on a selected
level by level basis for generating an encoded data
representation of the difference between the predicted
image data and the encoded current image data.
36. The apparatus of claim 35 further comprising
means for forming aid predicted image data
using interframe motion compensation.
37. The apparatus of claim 35 wherein said
applying means further comprises
means for applying said hierarchical
encoding to said data structures on a block-by-block
basis, and
means for blurring blocks of the predicted
image representation when a predicted image data fails
to adequately represent a block portion of said
current image at a pyramid structure level.
38. The apparatus of claim 35 wherein said
applying means further comprises
arithmetic coding means for generating, in
part, said encoded representation.

WO91/18477 PCT/US91/03286
-60-
39. The apparatus of claim 35 further comprising
means for applying said coding to said data
structures of a level on a block-by-block basis, and
means for shifting block location boundaries
from frame to frame of said sequence of image frames
for improving encoding efficiency.
40. The apparatus of claim 35 wherein said
applying means comprises
mean for forming an interpolation error
data structure at the lower levels by taking the
difference between the difference image data at the
lower level and an interpolated reconstructed
difference image data at the next higher level.
41. An apparatus for encoding interframe error
data in an image transmission system for transmitting
an sequence of image frames, said apparatus comprising
means for forming a difference image
representing, on a pixel-by-pixel basis, the
difference between predicted image data for a current
image frame and an uncoded current image data
representing an uncoded current image frame,
means for decimating said difference image for
generating a difference image pyramid data structure
having a plurality of decimation levels, and
means for applying a hierarchical lattice
threshold quantization encoding to said difference
image pyramid data structure on a selected level by
level basis for generating an encoded data
representation of the difference between the predicted
image data and the encoded current image data.

WO91/18477 PCT/US91/03286
-61-
42. A method for encoding interframe image data,
in an image transmission system, for transmitting a
sequence of image frames, said method comprising the
steps of
applying entropy coded, lattice threshold
quantization encoding to said image data on a block-
by-block basis, and
shifting block location boundaries from
frame to frame of said sequence of image frames for
improving encoding efficiency.
43. The method of claim 42 wherein said shifting
step comprises the step of
shifting said block boundaries a plurality
of picture elements in each of a plurality of axial
directions defining an image plane.
44. An apparatus for encoding interframe image
data, in an image transmission system, for
transmitting a sequence of image frames, said
apparatus comprising
means for applying entropy coded, lattice
threshold quantization encoding to said image data on
a block-by-block basis, and
means for shifting block location boundaries
from frame to frame of said sequence of image frames
for improving encoding efficiency.
45. The apparatus of claim 44 wherein said
shifting means comprises
means for shifting said block boundaries a
plurality of picture elements in each of a plurality
of axial directions defining an image plane.

WO91/18477 PCT/US91/03286
-62-
46. An apparatus for encoding interframe error
in an image transmission system for transmitting a
sequence of image frames, said apparatus comprising
means for decimating an interframe predicted
image data for a current image frame for generating a
prediction pyramid data structure having a plurality
of decimation levels,
means for decimating an uncoded current image
data representing the uncoded current image frame for
generating a current image pyramid data structure
having said plurality of decimation levels, and
means for applying a hierarchical lattice
threshold quantization encoding to said prediction and
current image pyramid data structures on a selected
level by level basis for generating an encoded data
representation of the difference between the predicted
image data and the uncoded current image data.
47. A method for encoding a sequence of
multidimensional data vectors comprising the steps of
applying lattice threshold quantization encoding
to said sequence of multidimensional data vectors for
generating a sequence of closest lattice points to
said data vectors,
entropy encoding lattice point identifying data
representing said sequence of closest lattice points,
and
delivering said entropy encoded data to a
communication channel.
48. The method of claim 47 wherein said applying
step comprises for each data vector, the steps of
employing an E-8 lattice for said threshold
quantization encoding,

WO91/18477 PCT/US91/03286
-63-
summing, over all said dimensions, the magnitude
of the round-off error for a vector coordinate,
determining the oddness or evenness of the sum of
candidate lattice points for a vector, and
determining the closest lattice point to the data
vector.
49. The method of claim 47 further comprising
the steps of
indexing the identity of each lattice point
within a selected range of shells, and
entropy encoding said indices for transmission
over said communication channel.
50. The method of claim 49 wherein said indexing
step comprises the steps of
pairing lattice points to form 4x4 blocks,
identifying whether both lattice points fall in
Shell 0 of the lattice,
encoding the shell numbers for all points not in
Shell 0, and
identifying the index of each lattice point not
within Shell 0 using symmetries of the lattice index
identifiers.
51. The method of claim 50 wherein said index
identifying step comprises
identifying, for groups of lattice points, a
parent centroid index, and
using spatial symmetries to identify variations
of the parent centroid index.

WO 91/18477 PCT/US91/03286
-64-
52. The method of claim 51 further comprising
the step of
explicitly transmitting lattice point coordinates
for points lying in Shell 3 or a higher level shell.
53. Apparatus for encoding a sequence of
multidimensional data vectors comprising
means for applying lattice threshold quantization
encoding to said sequence of multidimensional data
vectors for generating a sequence of closest lattice
points to said data vectors,
means for entropy encoding lattice point
identifying data representing said sequence of closest
lattice points, and
means for delivering said entropy encoded data to
a communication channel.
54. The apparatus of claim 53 wherein said
applying means comprises, for each data vector,
means for employing an E-8 lattice for said
threshold quantization encoding,
means for summing, over all dimensions, the
magnitude of the round-off error for a vector
coordinate,
means for determining the oddness or evenness of
the sum of candidate lattice points for a vector, and
means for determining the closest lattice point
to the data vector.
55. The apparatus of claim 53 further comprising
means for indexing the identity of each lattice
point within a selected range of shells, and
means for entropy encoding said indices for
transmission over said communication channel.

WO91/18477 PCT/US91/03286
-65-
56. The apparatus of claim 55 wherein said
indexing means comprises
means for pairing lattice points to form 4x4
blocks,
means for identifying whether both lattice points
fall in Shell 0 of the lattice,
means for encoding the shell numbers for all
points not in Shell 0, and
means for identifying the index of each lattice
point not within Shell 0 using symmetries of the
lattice index identifiers,
57. The apparatus of claim 56 wherein said index
identifying means comprises
means for identifying, for groups of lattice
points, a parent centroid index, and
means for using spatial symmetries to identify
variation of the parent centroid index.
58. The apparatus of claim 57 further comprising
means for explicitly transmitting lattice point
coordinates for points lying in Shell 3 or a higher
level shell.

Description

Note: Descriptions are shown in the official language in which they were submitted.

WO Ylil8q77 PCr/1,'$~1/032~6
A ~IERARCHICAL ENTROPY CODED LA~TICE
~RESHOL~ QUANTIZATION ENCODING ME~HOD
AND APPARATUS POR IMAGE AND VIDEO COMPRESSION
9ackqround of the rnvention
The invent~.on relates generally to data
communication and signal processing methods and
apparatus, and in particular to a method and apparat~s
for reliably and efficiently encodin~ and decoding
sequences Oe image data, for example, that transmitted
over a telephone communications channel.
The transmission o~ sequences of images, and i.~ -
particular sequences of naturally occurrin~ images
such as those represented by a television si~nal,
continues to be the subject Oe a significant amount of
investigation. Typically, investigators have relied
upon the highly redundant nature of successive images
in the sequence and have often modeled the image data
as a Markov process with a correlation coefficient
close to unity. The three-dimensional Markov model
provides a motivation for utilizing differential
pulse-code-~odulation tDpcM) and trans~orm coding
techniques to take account of the interframe
~edundancy.
~ y analyzing the nature of typical moving video,
it is easy to become convinced that the principal
change occur~ing between successive ~rames is the
inhomogeneou~ motion o the objects within the fram@.
It has also been rscognized that an accu~ate apparatus
and method of estimating and compensating for this
spatially d~pendent motion enables the construction of
an interfra~e data compression method and apparatus
which can have substantially better performance t~an

W091/~ PC~/~S~/03~&6
-2- ~2~
can be achieved by sending a signal representative
merely of the difference between successive frames.
As a result, various motion compensating coding
methods and apparatus have been developed. These
systems typically are either ~eceiver-based motion
compensation systems or transmitter-based motion
compensation systems. In the receiver-based motion
compensation system, the receiver makes a prediction
as to the motion and compensates the previous rame
~or the expected motion. The transmitter, operating
in the same manner, then sends only an error signal
describing what must be done at the :eceiver in order
to correct the receiver predicted frame. The error
signal is typically coded to reduce its bandwidth.
For a transmitter-based motion compensa~ion
system, the motion estimation process occurs only at
the transmitter. Displacement vectors are generally
determined over various regions of the image and this
data is then transmitted to the receiver along with an
error inÇormation data signal. At the receiver, the
compensation process is performed on the preYiously
coded image first using the motion information
provided by the transmitter. The error signal data
provided by the transmitter i5 then added to the thus
compensated receiver i~age in order to maintain
picture quality.
There is thus typically provided for a
tran3~itt~r-b~sed motion compensation systPm a
plurality of displacement vectors, and in at least one
preferred embodiment, each vector is associated with a
specific region or block o the image. The bloeks are
typically non-overlapping and have, for example, a
size o~ eight picture elements (pixels) by eight

u O 9l/l~77 PCT/~59l/03~K
-3~
picture element3. Yariou method~ have been employed
for encoding the motion compensation data associated
with each cf the blocks.
Many methods have also been employed for
encoding the error information data signal in a
transmitter-based motion compensation system. For
example, in Hinman, U.S. Patent 4,727,422, a lossy
compression method and apparatus are disclosed. While
these method3 are highly advantageous, and provide
excellent results, it is nevertheless desirable to
further improve the compression of the data
information and thereby enable high quality image
reproduction u~ing still less channel bandwidth. '~
is further de~irable to provide better control over
the data transmission by controlling, ~or example, the
bit rate associated with the image.
Often, during a scene change, for example, there
exists substantial information to be transmitted, so
tha~ during a ~ingle frame time, insufficient
bandwidth is ~vailable to transmit all of the
information. Accordingly, varlous method~ have been
implemented to selectively limit the number o~ bits of
infor~ation transmitted over the channel. One of
these methodq, de~cribed in Ericsson, U.S. Patent No.
4,816,9140 filed January 7, 1987, entitled "A Method
and Apparatus fo~ Efficiently Coding and Decoding
Imaqe Sequence~,~ e~ploys ~uad-tree coding in
connection with the tran mission o poreions of a
tranqorm coe~ficient data set. The quad-tree coding
advantageou~ly provides a more graceful degradation of
the image during heavy motion or scene changes.
Other m~thods, such as hierdrchical vecto~
quantization, disclosed in Ericsson, United States
Patent 4,849,810 provide other advantage~ in the
encod~ng proce~.

W091/184~' P~T/US9l/03286
~4~ 2~8~
Other encoding methods have been used for
encoding source data. In particular, lattices have
been studied for some time in the literature ~or
channel and source coding applications. Much of that
work has been summarized in an encyclopedic monograph
on lattice theory and applications. While most of the
current wor~ in data compression uses Generalized
Lloyd Vector Quanti2ers (VQ), several researchers have
recently labored to revive intere~t in lattice
quantization ILQ)-
Both lattice and vector quantization map amultidimen-~ional space into a finite or countable set
o~ points. In vector quantization, as described in
Ericsson, U.S. Patent 4,849,810, a codebook containing
a finite set o~ points is constructed by some training
method or analysis. Each source vector is quantized
by locating the "closest" point (or codevector) in the
codebook, as measured by some distortion criterion.
~or example, a ~quared error criterion (L-2 norm), can
be used, althoug~ other measures are equally valid.
The encoder trans~its the index of this codevector tO
the receiver, which then approximates the source
vector during reconstruction, with the codevector.
A lat~ice i-~ defined as a finite set o~ points
and all pos~ible linear translate~ o~ it, yielding a
(typically) countable but in~inite set of points. A
source v~ctor can be quantized in accordance with
lattice.quantization by locating the closest point to
i~, that i~ contained in the lattice, as measured by
some distortion criterion (for example a squared error
criterion3. Assuminq the lattice can b~ indexed
~counted3 by some method, the lattice point's index is
transmitted and the receiver approximates the cource
vecto~ with that lattice point.

W~91tl~ P~ S~1/03~
- 5 - 2 ~ L
It is therefore an object of the pre~ent
invention to transmit sequences o~ images over a
communications channel using lattice quanti2ation
encoding, achieving relatively low bandwidth, and
providing high reliability and fidelity. Another
ob~ect of the invention is to control the number of
bit~ employed to transmit each image of a sequence o~
images while providing ~or grace~ul degradation o~ the
image during a scene change or during periods of heavy
motion. Other objects Oe the invention are a motion
compensation encoding and decoding method and
apparatus which reliably transmit and receive an
accurate estimate of the displacement error in a
scanned image in a sequence, and an improved motion
estimation error encoding and decoding method and
apparatus which enable real-~ime, accurate
determination o~ regional displacement in an image
transmis~ion device.
Summary of _he Invention
The invention relates to a method and apparatus
for encoding interframe error data in an image
transmi~sion system, and in paEticular, in a motion
compensation i~age transmission system, for
transmitting a sequence of image frames from a
transmitter station to a receiver s~ation. ~he method
feature~ the step~ of decimating an interfram~
predicted image data representing a predie~ion o~ the
cur~ent image frame for generating a prediction
pyramid dat~ tructure representing the curren~ image
predic~ion and having a plurality of decimation
levels; decimating an uncoded current image data
representing the current uncoded image ~rame for
generating ~ current image pyra~id data structure
represen~ing the current imaqe and having ~he

~.
' 'WO 91/1~17 PCT/~'S~1/03~86
-6~ 2~
plurality of decimation levels; and applying an
entropy coded, lattice threshold quantization encoding
method to the difference of the prediction and current
image pyramid data st~uctures, on a level by level
basis, for generating an encoded data representation
of the dif~erence between the predicted image data and
the uncoded current image data.
In other aspects, the method features the steps
o~ applying the lattice encodinq method to the data
str~cture~ o~ a level on a block-by-block basis using
an E-8 lat~ice structure.
In another aspect of the invention, the method
features the steps of orming a difference image
representing, on a pixel-by-pixel basis, the
difference between a predicted image data for a
current image frame and an uncoded current image data
represen~ing the uncoded current image frame. The
method further ~eatures desimating the difference
image for generating a difference image pyramid data
structure having a plurality of decimation levels and
selectively applying the lattice quantization encodins
to the difference image pyramid data structure on a
level-by-level basis for generating an encoded data
representation of the difference between the predicted
image da~a and the uncoded currene image data. In a
p~rticular a~pect, the method ~eatures forming the
predicted imag~ data usinq interframe motion
compensation.
The apparatus of the invention featur~s
circuitry for decimatin~ the interframe predicted
image d~ta for a current image frame for generating
the prediction pyramid data structure having a
plurality of decimation levels, circui~ry for
decimdting the uncoted curren~ image data representlng
the current uncoded image frame for generating a

w09l/l~7, PCT/US~1/03286
7~ ~32~1
current image pyramid data structure having the
plurality of decimation levels, and circuitry for
applyinq entropy coded lattice threshold quantization
encoding to the prediction and current image pyramid
data structures on a level-by-level basis for
generating an encoded data representation of the
di~ference between the predicted image data and the
encoded current image data. The apparatus further
features circuitry for applying the lattice encoding
method to the data structures of a level on a
block-by-block basi~ using an E-8 lattice.
In yet another aspect of the invention, the
apparatus ~or encoding the interframe error data in an
image transmission system for transmiteing a sequence
of imaqe frames ~eatures circuitry for forming a
difference image representing, on a pixel-by-pixel
basi~, the difference between predicted image data for
a current i~age frame and an ~ncoded current image
data representing an uncoded image frame. Decimation
circuitry i3 provided for decimating the difference
image; for generating a difference image pyramid data
~tructure having a plurali~y of decimation levels; and
for applying entropy coded, lattice th~eshold
quantization encoding to the difference imaqe pyramid
data ~tructure on a level-by-level basis for
generating an encoded data repre~entation of the
di~fer~nce between the predicted image data and the
encoded current i~age da~a.
Brie~ Description of the Drawinqs
Other objects, feature~, and advantage of the
invention will appear from ~he ollowing de~cription
of particular preferred embodiments taken together
with the drawing~ in which:

W091/1~77 PCT/U5~1/032g6
2 ~
Figure 1 is an electrical block diagram o~ a
typical image communications system in accordance With
the claimed invention;
~ igure 2 is an electrical block diagram of the
transmitter Oe a motion-compensated imaqe encoding
apparatus employing the inventions
Figure 3 is an electrical block diagram of the
receiver of a motion-compensated image encoding system
for receiving the channel signals ~rom the transmitter
o~ Figure 2:
Figure 3A is a block diagram of the lossy
compressor 28 according to the invention:
Figure 4 is a block diagram of the lossy
compre~.~or 46 according to the invention;
Figure S is a diagrammatic rep~esentation o~ a
one dimen~ional decimation procesq;
Figure 6 iq a detailed electrical block diagram
of lo~sy compre~-qor 46 according to the invention;
Figur~ 7 is a general block diagram of an
alternate embodiment of lossy compressor according to
one aspect o~ the invention:
Figure 8 is a diagrammatic representation of the
relative locations of adjacent blocks used in
predicting a next block value;
~ igure 8~ i3 a diagram~atic representation o~
the relative location~ of adjacent motion vectors used
in the linear predictor;
Pigure 9 is a probability density function
divided into stepped re~ions and showing centroid
locations;
Figure 10 is an electrical block diagram of a
coding apparatus according to the invention; and
~ igur~ 11 is an electrical block diagram of an
entropy coded~ lattice threshold quantization encoding
system according to the invention.

UO91/~7~ PC~/~S91/032&6
_9~ 2 ~ ~ ~
DescriDtion of Particular Preeerred Embodiments
.
Referring to Figure 1, a communications system 6
has a transmitter 8 which, in accordance with a
preferred embodiment of the invention, has a camera !0
for providing a video signal to an analog-to-digital
converter and frame buffer 12. The frame buffer
portion of the analog-to-digital converter and frame
bu~fer 12 is capable of storing a full frame of the
video, sampled to, for example, eight bits across a
256x240 pixel raster.
The entire coding and motion compensation
process takes place in the digital domain. The
transmitter has an error signal circuitry 14 and a
motion e-~timation and coding circuitry 16. A channel
encoder 18 channel encodes the outputs of the error
circuitry 14 and motion estimation and coding
circuitry 16 and passes the thus encoded data onto a
channel 20 Çor transmi~sion to a receiver 21.
The illustrated motion estimation and coding
circuitry 16 of Figure 1, in accordance with a
preferred embodiment of the invention, and ~e~er~lng
to Figure 2, compares a present original input frame
image, available over lines 22, with the previous
original input image, available in thi~ illustra~ed
e~bodiment from a fram~ buff~r 24. A ~otisn estimator
circuitry 26 generate~ a measure of the motion
di~placement between the frames input ther2to, and can
be any of a number of motion estimation devioes as ase
well known in the art. In a preferred embodiment to
be described her@inafter, the motion estimation device
uses an adaptive steepest descent @rror minimization
method to generate the motio~ displacement measures ac
de~cribed in ~inman, U.S. Patent No. 4,661,B49~

WO 91/1~477 P~T/I;'S91/03286
-10~
The output of the motion es~imator 26 is a field
o~ motion vectors which, as noted above, provide a
measure o the motion displacement beeween input
frames. This vectoc field provides a description o~
how to map a previous input ~rame or input image erom
bu~er 24 into the best approximation of the present
input frame or image over lines 22. (Preferably
buffer 24 contains only 32x30 and 64x60 decimated
copies of the previous image, while a buffer 44 is
used for deriving a 128xlZ0 copy of the previously
~estimated) image.) By "best" is meant an error
meeric such as, or example, a mean-squared-error
error measure. Typically, and in the illustrated
embodimen~, the motion estimator uses a region
matching technique between non-overlapping blocks of
the previous and present input imaqes. Should motion
occur for a region in the present i~age, the estima~or
will deeermine which bloc~ in the previous image is
the be~t match for the block in the present image, and
the value of the displacement is the dieference
between a new coordinate pair for the block in the
present image and the original coordinate pair for the
block in the earlier i~age. That determination de~ines
the ~otion vector to be associated with th~ block in
the presen~ i~age.
Since scenes are ~enerally composed of several
large object~ mo~ing uniformly over tim~, there is a
high deqree of correlation in the motion vector ield.
~o avoid tran3mitting redundant information, and to
reduce th~ da~a bit requirementq, th~ preferred
embodiment o~ the invention modifie~ the motion vector
field, thereby losing some information, ~o faeilitate
the oompression of the motion representins data. In
the illustra~ed embodiment, thi~ operation is
represented by a "lossy compressor" 28 which reduces

' WO9l/l~77 PCT/~591/0328b
2~2~
the amount of data, and hence the bandwidth, requi~ed
to repre~ent the motion vector field. Noting the
simila~ity between motion vecto~ field and natural
images, predictive, transform, or interpolative coding
of the two independent components of the vector eleld
can be employed by the lossy compressor 28.
Thus the lossy compressor circuitry 28 is
employed ~or coding the motion vector field available
over lines 32, and provide~, over lines 30, a coded
motion signal representative of the motion vectors.
This output of the lossy compressor, as noted above,
will not, upon decoding, exactly reproduce the signals
over lines 32 ~which provide the measure o~ motion
displacement) and, therefore, have some error signal
associated with them. Nevertheless, the reduction in
the data requirements of a lossy compressor, when
compared to, for example, a PCM exact coding method,
are so sub~tantial, that the use of a loscy
compres~or, is a significant advance in the art. One
preferred lossy compression cir~uitry employs adaptive
predictive pulse coded modulation (~DPCM).
Referring to ~igure 3A, the lossy compressor
circui~ry 28, in the illustrated embodiment of the
invention, provideq for a linear predicto~ 28a of the
motion vector ield and includes additional circuitry
or reducing the number of bits (the bandwidth)
required to de~cribe the predictor and prediction
error of the motion vector field. Referring to the
raster scanning pattern as shown in ~ig. 8A, and to
Figure 3A, the linear p~edictor 2Ba predicts the
cur~en~ motion vector ~marked X) using the ~our
previou~ly coded neighbo~ Imarked O) (previously
available sve~ line~ 30). Then additional circuitry
lembodied in switch circui~ry 28b) makes a decision
regarding thr~e possible choice~:

WO 91/18417 PC f/1,'591/03286
--12--
~2~
a~ Re~et the estimated motion vector to zero
~the signals over line L88) and send it to the
receiver;
b) Reset the estimated motion vec~or tover
line 28c) to the predicted value and send it to the
receiver: or
c) Quantize the prediction error (the
di~ference between the original estimated motion
vector and the predicted motion vector) in quantizer
28d and send the quantization index to the receiver.
Associated with each choice, there is an
incurred cost lthe bits N used to transmit that
decision) and an error E (the mean squared error o~
the difference image block generated by using chosen
motion vector). The additional circuitry 28b makes
this decision using an entropy constrained error
measure, that is, a weighted combination of the used
bits and error (for example, a measure equal to (aN
bE) where "a" and "b" are constant~ experimentally
determined for the apparatu-~. The result that
produce~ the smallest entropy constrained error is
chosen and transmitted. The encoding process starts
by sending one bit of information about whether or noe
to reset the estim~ted motion ve~tor to zero. If it
i5 r~set to 2~ro~ nothing more is sent for this motion
vector. Otherwi3e, a second bit of information is
sent to tell the receiver whether to use only the
predicted motion vec~or or whether additional
correctional information (the quantized error) will be
sent. Finally, i correctioAal infoc~ation is needed,
the quantization index of the predlction error will b@
sent. In order to further reduce the number o~ bits
for encoding, the lossy compressor employ~ arithmetic
coding for the firs~ two step~ and ~uffman codinq for
the la5t step.

W091/1~77 PCT/~$91/032~6
-13- 2~
The output of the 109sy compressor circuit~y
over lines 30, a~ noted above, is passed to the
encoder la. In addition, those signals are employed
by the erro~ circuitry l~ for determining what the
receiver would have seen, absent any e~rors in the
channel, and thereby providing the mechanism for
determining the prediction error signal, that is, the
signal representing the difference between what the
receiver would have predicted based upon the coded
motion signal representation over lines 30, and the
true image input.
The output of the lossy compressor over lines 30
is used by a reconstructo~ circuitry 34 for producing~
at its output, a signal repre~entative of the measure
o~ motion displacement, the motion vectors, on lines
32. The difference between the signals over lines 36,
the output o~ the reconstruction circui~ry, and the
signals over lines 32, representR ~he coding error
introduced by the lossy compression apparatuc 28. ~he
output of the reconstruction apparatus 3~, over lines
36, is directed to a motion field interpolation
circuitry 38 which operates in the spatial domain to
associate with each picture element a motion
displacement vector. Thus, while the input signals
over line~ 36 represent motion displacements for
groups or r~gions of elements, or example, the
piCturQ elements o a 4x~ block, the motion field
int~rpolator, as described in more detail below,
resolves that data so that there i~ associated with
each picture element, a motion displacemen~ vector.
~he resulting output of the motion field int~rpolator,
over lines 40, is designated the motion reconstrue~ion
signal.
The motion reconstruc~ion signal is applied to a
motion compensation appa~atus 42 which forms part of

WO~1/1~77 PC~/~591/~32~6
-14~
an error reconstruction loop 43. The error
recon~truction loop include~ a fram~ buffer 44, a
105sy compression circuitry 46, and a reconstructio~
circuitry q8. The input~ to the lossy compression
circuitry 46, over lines 22 and 51 respectively, are
the original input image for the current frame and the
estimated receiver signal, that is, the signal which,
absent any further data, the receiver w~ll reconstruc~
and display. The lossy compre~qor 46 provides the
receiver with further encoded data, the error
reconstruction signal, ~or reducing, and in principle
eliminating, the difference between the original ir.put
imaqe and the estimated receiver signal. That
difference ie coded to ~educe its bandwidth and the
resulting signal, the error reconstruction signal cver
line 52, is delivered to the channel encoder 18. ~he
lossy compressor 46 in the above referenced Ericsson
patent is a two-dimen3ional block encoder which
employs a gain/shape vector quanti2ation: and the
output of the block transform can be advanta~eously
further reduced in bandwidth and encoded according ~o
the processes described above in connection with the
lossy compressor 28. ~owever, in the preferred and
illustrated embodiment of the invention a hierarchlcal
entropy encoded, lattice thresholt quantization
encoding method and apparatu~ are advantageously
employed in implementing the lossy co~pressor 46.
~ he error recon~tructio~ signal is also sent to
the r~construction apparatus 48 which provides an
operation which is the inverse to tha~ imposed by ~he
lo~sy compressor 46. There ~e~ults, ~herefore, at the
output of the reconstruction apparatu~ 48, an error
reconstruction image over lines 54. The error
reconst~uction imag~ i~ added to the expected output
of the motio~ compensator, (which i~ the estimated

~ W091/l~77 P~r/~S4l/03286
-15~ 2 ~ ~ ~
receiver image over lines 51) and the resulting
signal, an estimated previous receiver image (the
predicted receiver image for the previous ~rame)~ is
stored in the frame buffer 44.
As noted above, the input to the ~rame buffe~ 44
is the estim~ted previous ceceiver image. This
receiver image, which takes into account all data
received by the receiver, corre~ponds to the
reconstructed receiver image for a frame. The image
outpu~ from the frame buffer over lines 64 is the
image which the motion compensation circuitry 42
modifies in accordance with the output of the motion
field interpolator 38 over line~ 40. The output o~
motion compensator 42 thus represents the predicted
receiver image as a result of reconstructing the
output data from lossy compressor 28.
At the receiver 21, referring to ~igure 3, the
data from the channel i~ decoded by a channel decoder
circuitry 70 and the re~ulting receiver error
reconstruction signal over lines 72 and receiver coded
motion ~ignal repre~entation over lineY 74 are
delivered to reconstruction circuitry 76, motion
compensator 99, and reconstruction circuitry 78
respectively. The reconstruction circuitries 76 and
78 each provide for decoding the codes employed by ehe
tran~l~itter ~o effect the operations performed by
reconstruction circuitrie~ 48 and 34, respectively, of
the trans~itter, a~ described in more detail
hereinafter. The outpu~ o th~ error reconstruction
circuitry 76 is delivered to a recove~y loop 80 in
which motion compensa~ing ~ignals over line 82 are
added to ~he error image ~epresentation over lines 8q
to produc~ a recons~ructed receiver signal over lines
86. That siqnal i deliver~d to a digital-to-analog

U~09~ t~ PCr/~'S~I/03~K
-16~
circuitry 90 and from there to a monitor 92 for
viewing.
Motion reconstruction ~ignals are generated by a
motion field interpolator 96 corresponding to the
motion field interpolator 38 of the ~igure 2. The
motion field interpolator, as noted above, provides a
motion vecto~ for each pictur~ element of the image
and hence allows the frame interpolator to accurately
predict what the image would have been at any selected
time between received ~rames. The reconstructed
receiver images over line~ 86 are successively stored
in a frame buffer 98 and are delivered to a motion
compensator 99 which also receives signals from the
motion field interpolator 96 The output of the
motion compensator, representing the expected receiYer
imaqe in the absence of an error correction,
corresponds to the signal over line 51 in the
transmitter, and i9 delivered to the adder 100 for
combination with the output of the error
reconstruction circuitry over lines 84.
The transmitter and receiver circuitries of
~igures 2 and 3 can be modified in a number of ways as
described, ~or example, in United States Patent~
Nos. 4,727,422 and 4,816,914 referred to above. While
thes~ alte~nate embodi~ents of transmitter and
receiver structure are applicable in diferent
communications configura~ions, th~ invention described
and clalmed herein relating to the entropy coded,
lattice threshold quantization encoding system i not
dependent upon which of tho~ p~rticular transmitter
configurations is employed and will therefor~ be
described solely in connection with the ~ypical
transmitter and receiver configuration set forth
hereina~ov~.

W091/1~77 P~T/U59~/032&6
-17- 2 ~ ~ 2
The motion field interpolator ~38, 96) of the
tranQmitter and receiver circuitries and the lo~sy
compres~or 28 o~ the transmitter circuitry, are
described in detail in Ericsson, U.S.
Patent 4,a~9,810, issued July 18, 1989. The lossy
comp~essor 46, however, which is also described in the
Ericsson patent, i9 modiÇied by it5 use of the entropy
encoded, l~ttice threshold quantization encoding
method. Accordingly, the 10~5y compressor is
described in detail herein. Further, the motion
estimator and an adaptive filter which can be
advantageously u~ed in accordance with the present
invention, are also described in detail in the above
identified Ericsson, U.S. Patent 4,849,310.
The ~ossy Compressor (46)
As noted above, the lossy compressor 46 receives
as inputs the original uncoded siqnal over line 22 and
signals representing the estimated receiver image over
lines 51. The lo sy compressor 46 uses those signals
~or encodinq the difference between them, and outputs
the encoded error reconstruction signal over lines 52.
This signal correc~s ~or most errors not properly
compensated for by the motion compensation system.
Referring now to ~igure~ ~ and 5, the estimated
receiver im2ge over line 51 (often referred to as the
"warped" i~ge) and the original uncoded image over
line~ 22 are decimated (that i~, filtered and
subsampled as described below) by decimation
circuitrieq 502, 50~, respectively, four times. At
e~ch decimation stage, the image is subsampled by a
factor o~ two both hori20ntally and vertically. Thus,
five level~ of images for the luminance i~age are
available at resolutions of, in the illustrated
embodiment, 256x240, 128x120, 64x60, 32x30~ and 16x15
picture elemen~ for the luminance. The set of

wosl~l~77 PCT/~S91/~3286
-18~
image~, at the di~ferent image resolutions, i5
commonly referred to a~ a "resolution pyramid." The
base of the pyramid i~ the ~ull resolution image while
the top of the pyramid is, in the illustrated
embodiment, the 16x15 pixel image.
Similar resolution pyramids are formed ~or the
"I" and "Q" chrominance components of a color image.
However, for the discussion below, only the luminance
component of the image shall be discussed. The same
apparatus and proce~sing steps are equally applicable
to the chrominance components o~ the image.
In accordance with the lattice threshold
quantization sy~tem, encoding of the image difference
between the warped image and the original uncoded
image is performed by an encoding circuitry 506 on a
level by level basis, from the top level to the bottom
level of the resolution pyramids. ~he process
terminate~ at that re~olution when no additional bits
are available for video transmission. Thus, during a
moderate motion, the system will typically reach the
bottom or ba3e level o~ 256x240 pixels while during a
heavy motion the encoding may stop at the 12~x120
level. Typically, during a scene change, the
apparatus will run out of transmission bit~ earlier in
the pyramid or divide the available lists among the
several pyramid level~. Thus, in general, large
changes of i~age or scenes are typically described
first at th~ hiqh~r levels with the details being
filled in latez frames.
More pa~icula~ly, in accordance with a
pre~erred hierarchical coding system using entropy
code~, lattice threshold quan~ization (EC-LTQ~,
encodinq begin~ a~ the top level, that is, the 16x15
image. The 16xlS version of the warped image or a
background i~age is used as the prediction. Recall

WO 91/1847~ PCI/1~'~91/03286
2'~
that this cor~esponds to the image ldecimated) that is
created at the receiver absent any additional
information. Referring to Figure 6, this top level
prediction i5 subtracted from the 16xlS decimated top
level image oÇ the oriqinal image. The difference
image, representing the error at that top level~ is
quantized and the quantized information is directed to
the encoder 18 for transmission to the receiver.
Thereafter, the quantized difference image is added to
the prediction image, at the 16x15 level, to form a
16xlS reconstructed image which the receiver will also
create.
At the lower levels, the prediction version o~
the image is formed in a different fashion. In
accordance with the invention, the prediction is
derived from the higher level reconstructed image and
~rom the current leveI warped image as follows.
First, an interpolation error image is derived
by interpolating the higher level warped image and
subtracting it from the curren~ level warped image.
The resulting warped interpolation error image thus
essentially extracts the spatially higher frequencies
of the warped image, thae i~, information not present
in the higher level image. The higher level
reconstructed image i~ then interpolated to form an
int~rpolated, reconstruction image at the current
level. Finally, the warped interpola~ion e~ro~ image
o~ the background imaqe is selectively added to the
interpolated reconstsuction image to generate the
prediction imag~. As described in detail in the
copending application 07/521,976, the warped
interpolation error image o~ the background image is
used where i~ improves the predic~ion but not
otherwise. Thi~ i~ decided on a block-by-block ba~is,

w09~/1~77 P~T/~9~/03286
-20~
and the decisions are transmitted to the receiver as
"side" infor~ation.
Thereafter, the steps ~or generatinq the
difference signal at this lower level are the same a~
those at the top level, that is, the current level
prediction image is subtracted from the current level
original image and that difference is quantized and
transmitted to the receiver. Thereafter the quantized
difference is added to the prediction image at that
level to form a new reconstruction image. This
procedure is repeated through the resolu~ion pyramid
until the bottom level is reached. The reconstructed
image at the bottom level is the output image a~ the
level, and it i that image that is displayed by the
decoder. That image is also used as de cribed above
to form a warped image ~or the next frame. The warped
image reconstruction at the transmitter is, as noted
above, performed by the reconstruction circuity 48.
If all of the available bits have been u~et
before the bottom level is reached, the predictions at
the lower levels are still generated in the same
manner; how~ver, no coding, that is, no quantized
differenc~ information is sen~ to the receiver.
Instead, the prediction at the lowest level3 will be
used directly ~ the output or ~econstruc~ion imag0 at
tbat level and a the error reconstruction image over
lin~s 5~ froDo reconstructor cirFuitry 48.
Th r es ~
~ eferri~g to ~igu~e 5, the resolution pyramid is
formed, as noted above, by decimating four times, in
this illustrated embodiment, the highest resolution
level o the imag~. In the one dimensional
, ~ ,

W091/1~77 PCT/~S91/~32~6
-21~
relationship illustrated in Figure 5, each pair Oe
pixels at a lower level are averaged to form a single
pixel at an upper level. ~he situation is the same
- both horizontally and vertically so that each higher
level picture element is located at the center of a
2x2 pixel group of the lower level. The codinq method
also provides for generating, using an interpolation
procedure, the pixels at a lower level from a higher
leveL. The interpolation process is applied, for
example, to the warped and reconstructed imaqes to
obtain images for processing at the next lower level
and is effected by a bilinear interpolation. The
interpolation factors are 0.75 and 0.25.
In the illustrated embodiment of the invention,
arithmetic coding ie employed for both coding of
information for transmission from the lossy compressor
28 as well as, and dS will be discussed in more detail
below, the coding of scalar data from lossy compressor
46. Arithmetic coding is well known to those skilled
in the art. In particula~, it can be applied
advantageously to describing the locations of non-2ero
transform or other a~ray variables. The symbol
probabilities are changed depending upon previously~
transmitted values and ~he sequence position o the
coefficient. Presto~ed pro~abilities are employed
~ince on-line adaptation does not, in the experience
of the inventor, provide significant improvement in
thi~ appllcation.
Con~idering the encoding of the resolution
pyramid~ in more detail, and ~eferring to ~igure 6,
the original and warped images have, at the top le~el,
a resolution of lSx16 pixels for the luminance and 8x8
pixQls for the chrominance, respectively. Figur~ 6
describ~s the processing o~ the luminance component;
and the proceRsing of the chrominance component (not

WO91/l~77 PCT/~S91/032
22 ~ ~ g
shown) can be ~imila~ly illustrated. ~he prediction
image consi~ts of the top level warped image that was
o~tained originally by four decimations of the warped
luminance and chrominance images, respectively. The
prediction error is generated by subtracting the
prediction image 510 from the original, uncoded, top
level decimated image 512. The image dif~erences ovec
line 51q are quantized by a scalar ~uantizer 516
having a ~ixed step size. The quantized information
over line 518 is encoded separately for each
component, the Y, the I, and the Q components, using
the same arithmetic encoder 520 which is also employed
for the motion vector transform coefficients. Encoder
520 uses a Markov Model ~or encoding the non-zero data
locations. Th~ encoder has sixteen stateq dependin~
upon whether the al~eady encoded four nearest
neighbors corresponding to the four nearest neighbors
illustrated in Figure 8 are zero or non-zero. The
non-zero values are encoded by a memory-less coder
that encodes the eight bit quaneization indices into
the bit stream. The quantized difference image is
added to the prediction a~ noted above, and the result
is th~ output or reconstruction i~age tover lines 522)
at the top level.
The scalar quanti~er 516 used in connection with
the top level prediction error is a uniform quanti2er
ha~in~ a dead-zone around zero. It codes the sign
with a 1 bit and the magnitude with 7 bit~ in the
illu~trated embodiment. Th~ thresholds (T(i)) for the
magnitude ar~ located at:
T(il - i~T
i=1,2,...,N (Equation 1)
:

WV~1/1~77 P~T/~S91/~86
-23- 2~
The reconstruction levels (R~i)) are defined by:
~(0) = O
Rti) ~ ~i+Delta~R) i=1,2,...,N (Equation 2)
There~o~e, a value of X, where X is greater than
T(k) but less than T(k~l) is assigned a quantizer
index value Oe k and is reconstructed at the receiver
as having a value R~k). The quantizer is also
symmetric around zero and sets all values with a
magnitude less than ~(l) equal to zero.
~ or the lower levels of the resolution pyramid,
the prediction image is generated by combining the
output image ~rom the next higher level with the
warped image of the same level. Then, the prediction
error i9 formed by taking the difference of the
original image at the current level. The difference
image is coded using a lattice threshold quantizer and
the quantized difference is added to the prediction to
obtain a new output image at the current level. The
Y, I, and Q components are treated as three separate
images.
Considering the lower level~ in more detail, the
prediction image is generated by combining the warped
image at the current level with the output and warped
images f~om th~ next higher level. Specifi~ally, the
interpolation error oE the warped image i9 generated
using the warped image 524 at the ~urrent level and an
inte~polated version o~ the warped image from th~ nexe
higher level (interpolated by circuitry 526j. That
interpolation error i5 thu the dif~erence between the
current level ~arped image and the sa~e image that has
been decimated and interpol~ted. A~ noted above, it
contains the details o the warped i~age that were
lost in the d~ ation to form the next higher level
image. The output image ~ro~ the next higher level i~

u 09~ P~r/~'S91/032g~
-24 2~
then interpolated at interpolation circuitry 527 to
obtain an imaqe at the current level. Thereafter, the
warped interpolation error over line 528 or the
background image is conditionally added by adder 530
to the interpolated ou~put imdge to form the
prediction. That is, for each block of 8x8 pixels,
the ~quared error i9 determined between the original
image stored at 53~ and three possible predictions,
that is, between the interpolated output image from
the next higher level with and without the inclusion
o~ the warped interpolation error, and also that with
the inclusion Oe the background image.
~ he elimination of the warped interpolation
error is e~uivalent to low pass filtering the warped
image for the prediction. This effec~ive ~iltering
process is performed in all blocks where it provides a
significant decrease in the prediction error, that is,
in those blocks wherein motion compensation was not
succe~s~ul. The result oP the filtering process,
termed "blurring," is effected if the "blurred error,"
multiplied by a weighting factor, such as 1.5 in the
illustrated embodiment, i5 less than the error using
the warped interpolation error. The use of the
background image i~ e~uivalent to a long term memory
such that a ~epresentative of the non-moving object
can be retained by the decoder even when the~e objects
are briefly occluded. This background ~ra~e ~hould
only be used when a 3ignificant gain i~ achieved ove~
other choices. Therefore, its weighting factor is 20%
- 25~ greater than the "blurring" weight.
The blur and background information generate a
one or two bit (3 states) word for each 8x8 block.
This i5 similar to the method used in ~otion vector
field coding, and these two bit~ "answer" the
following two questions:

W091/1~~ P~ 'S91/~3286
~25- 2 ~
a) Doe3 the system blur the current (8x8)
block?
b) IF blurring is not preformed, should warped
prediction or the background image be used?
For example, a one indicates blurring and a zero
indicates no blurring. The informaeion is encoded
using an arith~etic coder 534 such as that noted
earlier, and since each word contains only one bit,
there is no need to encode the non-zero values once
the "blur location map" has been encoded.
The particular arithmetic encoder 534 for the
blur information uses six binary variables to ~elect
one of thirty-two states with corresponding
probabilities. The binary variables are the four
previously encoded blur words for neighboring blocks
at the same leYel and the blur word for the higher
level neighbor, that is, the block at the next higher
level that corresponds ~o the current block. Thus,
the encoder does not make explicit use o~ the fact
that blurring at one level propagates to lower leve's
and instead this relationship i5 reflected in the
probabilitie~ for the variou~ states having a non-zero
higher level neighbo~.
~ he prediction errors themselves are coded by
the lattice quantizer 536. ~hu~, at each level, the
Y, I, and 0 components are treated a3 thr~e separate
image3. ~ach different image, generated for ea~h
levsl, is thu~ divided into blocks o~ 4x2 pixels.
Each block then becomes ~he "vector" and i coded by
the lattice quantization by firs~ deter~ininq a
closest point and thereafter indexing the point as
described in greater detail below.
The remaining levels can be encoded using the
procedure~ applied at the (30x32) level, and using
equivalents o~ elemen~s 52~, 526, 527, an adder 538,

WO91tl~77 p~r/l-s9l/o3286
-26~
dnd elements 530, 534, and 536, but with the exception
that the i~age data will be encoded, in the preferred
embodiment, as described below.
The entropy coded, lattice threshold
quantization described herein replaces the vector
quantization described in ~ricsson, ~.S. Patent
4,849,810. In comparing vector quantization (VQ)
encoding with entropy coded, lattice thr~shold
quantization encoding, we know, ~rom Shannon ~heory,
;hat as the size and dimen~ion of an optimally
designed VQ codebook increases, its performance
approaches the operational distortion-rate bsund,
without the need for entropy coding. Complexity in
~he VQ process, however, also grows without bound, and
clever method must be used to obtain ~ood performance
with only moderate complexity. Lattice quantize~s
that are optimum in the Shannon sense were, prior to
the invention herein, known for only a handful o
source and dimensions. For example, the hexagonal
lattice is optimum for an independent, identically
distributed uniform source in two dimensions. It has
also been shown tha~ a rectangular lattice works well
~or independent, identically distributed Laplacian
source ~ rn gene~al, one can show th~t lattice~
having high sphere packing densities work well for
independent, identically distributed ~ource~, provided
the coding rat~ i~ high and an efficient entropy code
is u~ed ~or ~r~nsmitting the indicies. Achievinq good
performance at low rate and for sources with memory,
such as that described in this application, has been
an elusive goal.

WO gl/18477 P~ 1/03286
-2 7-
Entropy Coded LattiCe Thre~hold Ouantizat~on
In accordance with the invention, an optimum
per~ormance over a wide range of data rate~ ~or a
variety of data sources can be obtained for the fi~st
time using multiple dimension lattices. The preferred
encoding structure is termed the Entropy Coded Lattice
Threshold Quantizer (EC-LT~) illustrated in Figure 10.
The preferred embodiment of the invention uses an
eight-dimensional lattice having the highest known
sphere packing density, the E-8 lattice.
The vector quantizer referred to in Ericsson,
U.S. Patent 4,849,810 i5 designed for a given (though
Dossibly unmeasured) source probability density
~unction (pdE) by successively optimizing during a
training period, its encoder for its decoder, and its
decoder for its encoder, etc., resulting in a locally
(though not globally) optimal design that typically
does not require further coding, such as entropy
coding. On the other hand, most lattice quantizers
work best only for memoryless probability density
functions. ~owever, it has been shown that the
simplest one-dimen~ional lattice (the Z-l, or uniform
quantizer) can n~t only approach, but can even attain
the scalar ~u~ntizer's Gish Pierce performance bound.
The excellen~ per~ormance i~ obtained by u~in~ the
centrold of ea~h Voronoi reqion for the quantizer's
reconstruction value, (Pigure 9) and by coupling the
uni~orm thre~hold quantizer with an ef~icient entropy
code (Figure 10). ~he particular distortion-rate
operating poin~ is fixed by the quantizer's step ~ize.
The Entropy Coded ~attice Threshold Quantizer of
the pre ent invention generalizes the unifor~
thresbold quantizer to the vector case. According to
th~ invention, as not~d abov~, the E-8 lattice is used
to encode the sourc~, and a separate, optimal decoder

WO 91/18477 PC~l,'$91/032~6
-28-
i~ de~igned ~or it. The encoder output i9 coded using
a near-optimum entropy code. The decoder consists o~
a countable set of centroids, one centroid for each
lattice point'q Voronoi region, given the ~ource
probability density ~unction and distortion measure.
The lattice threshold quantizer's performance
lies between the scalar quantizer's Cish-Pierce bound
and Shannon's Opera~ional Distortion-Rate Bound.
While an entropy coded, lattice threshold quantizer
can be built for any lattice, entropy coded lattice
threshold quantization performance is relatively
independent Oe the particular lattice employed, so
even the simplest n-dimensional lattic~, the n-
dimensional cubic (or Z-n) lattice, can work about as
well as the more complex, but denser, E~8 or 24-
dimensional Leech lattices.
There are ~everal reason~ for preferring entropy
coded lattice thre~hold quantization over vector
quantization, even i~ entropy coded lattice threshold
quantization could not attain the v~ctor
quantization's distortion-rate performance. First,
since a conventional vector quantization codsbook is
finite, there exists a hypersphere which contains all
codevector~ in it. Source vectors lying out~ide the
hypersphere will typically have low probabili~y and
thu~ add littl~ to the vector quantization's average
di~tortion, but they neverthele~ incur substantial
instantaneous distQrtion, appearinq a~ th~ decod~r's
output a~ infrequent, but easily perceived and
annoying artifacts. ~or example, in image coding,
vectsr quantization overload can re~ult in an ea~ily
perc~ived localized blur o~ gra~ularity: and in speech
coding, it can cause a momentary bu~ audible
distortion. Lattice quanti2ers have no ove~load
region, since even a ~ource vector having arbitrarily
,

W091/1~7 P~T/~'S91/03~K
-29- 2~$2~
large energy will still be located near a point in the
lattice. A lattice quantiz~r can thus improve
perceived per~ormance of source coding systems,
especially those needing occasional instantaneous high
rates.
Second, fast encoding methods for lattices exist
and others are continually being sought, e~pecially
for ldttices having high sphere packing den~ities
(e.g., the E-8 and the 24-dimensional Leech lattice).
The~e methods perform optimum encoding by locating the
unique point in the lat~ice that i9 closest to a given
source vector. In contrast, most ~ast encoding
~ethods for vector quantization either perform a sub-
optimum search of the vector quantization's finite
codebook or, to attain optimum encoding, typically
require substantially more memory than is needed to
store the codebook itself.
Third, a~suming one has a qood predictor that
excises much of the memory ~rom the source,
~ran3forming the quantizer's source model in~o a
peaked (though not necessarily i.i.d.~ probability
density function, excellent distortion-rate
peror~ance can be obtained ~rom the entropy coded
lattice thre~hold quantizer by expending memory and
complexi~y only to opti~ally encode the mos~ probable
source vector~, leaving simpler sub-optimal encoding
techniques Eor the less frequent outliers. This is
important for practical implementations of entropy
coded lattice ~hreshold quanti2ation, such the real
time video compre ~ion described here. The preferred
embodiment of the invention shows one novel way to
accomplish thiC for th~ ~-8 lattice.
While in the preerred embodiment o~ the
i~v~ntion, entropy coded lattice threshold
quantization ls applied to a closed-loop pyramid image

wo91/1~7 PCT/~S91/03286
_30_ 2S~ 4
compression method, the entropy coded lattice
thre~hold quantization technique5 herein described are
equally applicable to open-loop pyramid image
compression, conventional subband and transform imaqe
and speech coding, and virtually any other compression
technique using vector quanti~ation.
Entropy Coded Lattice Threshold Quantization Por The
E-8 Lattice
The preeerred embodlment of entropy coded
lattice threshold quantization ~or hierarchical
compression in accordance with the invention, uses the
E-8 lattice and the apparatus and method of the
invention provide methods to efficiently implement
entropy coded lattice threshold quantization for this
lattice.
The E-8 lattice is composed of the union of one
D-8 lattice with a ~econd D-8 lattice translated by
the vector 0.5 ~ ~1 1 1 1 1 1 1 1). The D-~ lattice
is composed o~ all 8-dimensional vectors having
integer components which sum to an even number~ For
example, (1 -3 0 0 0 0 0 0) i~ a point in D-~ lattice,
while (1 1 1 0 0 0 0 o) i3 not. Since the transla~inq
vector alqo ~ums to an even numb~r, th~ componenes o~
all point-~ in E-8 lattice sum to an even numb~r. No~e
also that th~ E-8 latti~e include~ the all 2ero
vector.
All point~ in the E-8 lattice can be grouped by
~he value of their L-2 ~orms (~um of ~uares) into
shell~, wher~ the shell number increases wi~h the L-2
norm. The all zero vector is the only lattice point
having zero nor~ and it comprises Shell 0. There are
240 points having an L-2 norm o~ 2.0, and these points
comprise Shell 1. There are 2160 points in Shell 2,
each having an L-2 norm of 4.0, and ~o on. For

WO9~ 7~ PCr/US9l/032B6
-31-
typical pyramid video coding application~, such as
that described herein, the probability density
function is very "peaky" so that over 90~ of the
source vectors fall onto Shell 0. Of the remaining
10~, roughly 80~ to 90~ or more fall into Shells 1 and
2, making it worthwhile to code Shells 1 and 2
efficiently, but permitting simpler methods to be used
for Shells 3 and up.
The entropy coded lattice threshold quantization
process i3 performed in five steps: quantizing the
source vector to the nearest lattice poin~, encoding
the lattice point to a unique lattice index, entropy
coding the index, decoding the index, and
reconstructing the vector using a scaled version o~
the centroid associated with the lattice po~nt. In
accordance with the preferred embodiment, the method
implementation per~orms each step for the E-8 lattice.
Ou~ntizing ~o the Nearest E-8 Lattice Point
~ he quantizer can identify the closest E-8
lattice point using one of several algorithms. In the
preferred embodiment, the quantizer firs~ rounds the
source vector ~o its nearest integer coordinates and
then sums the coordinates to check whe~her the point
is on the D-8 lattice. I ~he su~ is even, this point
is on the inteqer ~no~ tran~lated) D-8 lat~ice. If
th~ 9U~ ig odd, then the close~t lat~ice point ~in the
~-2 sen~e) i~ ~ound by rounding the co~ponent having
th~ largest roundo~ error ~o ~he next nearese
integer. Nex~, the quantizer find~ a closest lattic~
point on the translated D-8 lat~ice. Each component
of the source vector is truncated and increased by 0.5
and ~he re~ul~ing vector's components ar~ summed~ If
~he 3um i~ even~ thi poin~ is on the E-8's o~fset or
tran~lat~d D-8 lattice. ~ the Cum i5 odd, then the

,~ A..19~
W0~ 71 P~T/~S~1/03286
-32-
closest lattice point (in the L-2 sense) is found by
rounding the component having the largest error to the
next nearest integer-plus-one-half.
Both of these D-8 lattice points will lie on the
E-~ lattice. To identify the one closest to the
source vector, we can compute the ~-2 distance from
each. ~owever, a numerically simpler method i5 to
note the following. For the present, assume that
rounding and truncating both result in points on the
lattice and let Rn and Tn be the errors introduced by
roundinq (untranslated D-8 lattice) or by truncatinq
and addinq one-half (translated D-8 lattice) to the
nth coordinate, n = l, 2, ..., 8. Then the difference
between the squares of the distances tO each candidate
lattice point is
sum over n ~Rn ~ Rn] - sum over n [Tn i Tn]
= sum over n [Rn ~ Rn - Tn ~ Tn]
= sum over n [(Rn-Tn)(Rn+Tn)~
N~te that when R~ > 0, then (Rn-Tn) = 0.5 and Tn ~
0.5-Rn. Also, when Rn ~ 0, then (Rn-Tn) = -0.5 and ~n
= 0.5 ~ Rn. Substituting, we find this reduces to
= sum over n [¦Rn¦ - 0.253
= ~um over n ¦Rn¦ - 2
I~ an implementation, the subtraction can also be
obviated by comoaring the sum to two.
~ n accordance with the invention, a imple
technique enable~ quick determination of th2 oddne~s
or evenne~ of the sum of the coordinates. Since
there are an even number of coordinates, subtracting
l/2 from each does not affect the oddness or evenness
of thelr sum. So for the truncate and add one-halF
ca~e, adding the "half" can be omitted, Also, if ~he
rounding erro~ Rn is non-negative (~= 0), the same
value is obSained whe~her a coordinate is rounded or
truncated. If Rn is negative, then the truncated

~091/l~77 PCT/~S91/~3~
-33- ~ ~ ~2~ ~ ~
value is one le3q th~n the rounded ~alue. Combining
these facts, the oddness or evenness of the sum of
rounded and truncated coordinate-Q is the same if and
only if Rl R2 ^ ... R8 >= O, where t^) designates
the exclusive-or operation. If the exclusive-or
operation returns a negative value, the evenness/
oddness o the rounded coordinates will be the
opposite of that of the truncated coordinates. The
oddness o~ evenne~s o~ both round and truncate plus
one-half cases can thus be checked by computing only
the sum Oe the truncated values and Rl ^ R2 ^ ...
R8.
~ inally, the largest magnitude error obtained by
trunc~ting and adding one-half occurs for the
coordinate having the smallest magnitude error
obtained by rounding. The sign o~ the error indicates
wheeher "on~" ~ust be added or subtracted to move tO
the coordinate of the closest lattice point. If a
coordinate i-Q changed, its corresponding error term
(Rn or Tn) must also be updated as follows: If a
rounded off coordinate changed,
Rn(new) = 2 * tRn(old~] - l,
while if a ~runcate-and-add-one-half coordina~e
changed, its corresponding influence on Rn i5
Rn(new) = 2 * lRntold)].
Encod~ eq and Minimizinq Centroid ~
~ avinq fsund the E-8 lattice point closest to
the sou~ce vector, the point must be unambiguously
identified to ~he receive~. We do so in pair~ of 8-
dimensional blocks, where each pair forms a 4x4 block
of pixels~ ~e irst gen~ra~e a sequence of bits, one
bit fo~ each 4x4 block, where the bit is zero if both
lattice psints corresponding to the block belong to
Shell 0, and the bit is one if one or both la~tice
~ , .

WO91/1B477 PCT/~'~91/032~b
-34~
point~ belong to a shell other than Shell 0. Isolated
oneq at higheqt resolution level, that i3, the 256x240
level, in this sequence are set to zero to limi~
"popping" effects. Arithmetic coding is used for
coding this information.
~ or every remaining "one" in the sequence, ~e
jointly Huffman code the shell numbers for both
lattice points in the corresponding block. Since most
source vectors code to shells l or 2, we consider only
~our cases: Shell 0, Shell l, Shell 2, and Shell 3 or
higher. Since we already know that at least one
lattice point is outside Shell 0, there are 15
possibilities, so we use a table of l; Huffman codes.
We ne~t describe how the receiver finds the
particular point on the indicated shell. Since there
is only one poin~ in Shell 0, no lattice index need be
sent. If the point is on Shell 1 o~ Shell 2, the
procedurP described below is used to identify the
particular point to the receiver, exploiting
symmetries to reduc~ the sto~age needed ~or the
associated centroid and probability distribution.
Only 920 centroids are needed for the 2400 points in
Shells l and 2. The remainde~ are permutations.
Also, lattice points having pe~mutations of the same
centroid conveniently have the same source
probability~ ~o only 920 Huffman code3 (the preferred
entropy encoding method~ are stored, one centroid and
one ~uffman code pe~ "pa~ent" centroid. Since the
receiver will know each lattic0 point's shell numb~r,
the preferret implementation actually u3es two sets of
Huffman code tables, one table containing 120 codes
for Shell l, the other 800 codes for Shell 2. 10ne
table from each set i5 selected by computing the ratio
of the source vecto~'s energy to ~he step si2e beinq
ussd. This enables U5 to adaptively code the lattice

WO91/1~7 P~T/~9i/~3286
-35 ~ s~
point~.) Finally, i the point is on Shell 3 or
higher, we explicitly transmit the point's
coordinates, usually by jointly Huffman encoding pairs
Oe data.
In accordance with the illustrated embodiment,
the lattice point's ene~gy can be computed and its
shell identified. Each eight dimensional lattice
point corresponds to a two column by four row block o~
pixels. For points on Shell 3 or higher, there is
sufficient correlation between the columns to justify
jointly coding coordinate pairs. Denoting the spatial
relationships o~ the lattice points as:
xO x4
xl xS
x2 x6
x3 x7
point pairs (xO, x4), (xl, xS), (x2, x6) and (x3, x7)
are jointly coded. (If the pairs exceed some pre-
determined dynamic range, they are individually coded.
Also, it is necessary to hard-limit the coordinate's
range to limit the size of the required Huffman
tables.) At the receiver, the lattice point is
reconstructed by taking the product o~ the lattice
coordinate, step size and an attenuation factor,
discus~ed later.
Shell~ l and 2 are coded using ~uffman codes
ba~ed upon co~puted probabilities. A straightforward
implementation would store 2400 8-di~ensional
centroids and 2400 Ruffman codes for the 2400 poin~s
in the two shells. ~owever, considerable stora~e can
~e saved by notinq symmetries between the points. In
particular, we note that half the points are the
negatives of the other half.
We denote quch pair a having opposite polarity (that
i , they exhibit polarity symmetry)O Of ~hose that

W091/1~7 PCT/~S~/03~8b
-36 2~c~
remain, many are permutations of one another, ~ormed
either by witching the columns, row~ or both columns
and rows as Eollows:
xO x4 x4 xOx3 x? x7 x3
xl xS or x5 xlor x2 x6 or x6 x2
x2 x6 x6 x2xl x5 x5 x!
x3 x7 x7 x3xO x4 x4 xO
(original) (column flip) ~row flip) (both fl~p)
This is the same symmetry used to reduce codebook size
in the Mean-RefleCted Residual Vector Quantizer
proposed by Baker and Budge.
We identify these symmetries by ~irst computing
the lattice point's 8-bit group number (GN), a number
found by multiplying the lattice point by E-8's
inverse generator matrix and taking modulo 2 of each
componen~. The matrix is:
0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 2.5
O 1 0 0 0 0 0 -1.0
0 1 0 0 0 0 -1.0
O O 0 1 0 0 0 -1.0
O O O 0 1 0 0 -1.0
o o O O 0 1 0 -1.0
Q ~ 0 0 0 0 1 -1.0
Q o 0 0 0 ~ 0 2.0
Of the 256 possible group numbers (GN's), one qroup
number (havinq value zero) corresponds to the all zero
Shell 0 lattice point, 129 each correspond to two of
Shell 1'~ 240 lattice points, and the remaining 13~
correspond to Shell 2's 2160 points. It is clear that
a lat~ice ~oint and its nega~iYe yield ~he same group
number. Thus each of the 120 g~oup numbers for Shell
1 corresponds to one lattice point and its negativ~.
Closer inspection reveals that each of the 135 group
numbers for Shell 2 corresponds to exactly 16 lattlce
points, eight of which are negatives of th~ othe~

~O~1/1~77 P~ S~/032~6
-37~
eight. Each group of eight are also mutually
orthogonal, and therefore, the following method can be
used for indexing them. Form an 8x8 ~atrix usin~ each
point a(i3T, i = 0, 1, ..., 7, a~ a row in the matrix,
where T denotes transpose and the vector a(i) th~ ith
eight dimensional lattice point. Since the points are
orthogonal and have the same norm, multiplying any one
of the~ by the matrix yields a vector having only one
non-zero component with value equal to the squared
norm. The ~ollowing quadratic form yields the index
between 0 and 7:
¦ a~O)T ¦
¦ a(l)T ¦
¦ a(2)T ¦
¦ a(3)T ¦
c * t 1 2 3 4 5 6 7 8 ] ¦ a~4)T ¦ a(i)
¦ a(5)T ¦
I a(6)~ 1
¦ a~71T ¦
where c is the reciprocal of the squared norm for any
point a~i). This computation can be ~implified by
pre-multiplying c, the 1 1 2 3 4 5 6 7 8 ] row vector
and the 8x8 rnatrix, yielding a single 8-dimensional
row vector. The result of the inner product i5 a
unique index between -8 a~d 8 excluding 0. Thu~ only
135 row vectors need be stored for the lattice points
of Sholl 2.
Of the 135 GNs in Shell 2, lattic~ points in 6i
of the~ share only polarity symmetry, while in those
in the other 70 GNs are actually composed of only four
points which share exactly one type of flip symmetry
in addition to polarity symmetry with the othe~ four
(eith~r row o~ colu~n or row/colu~n ~lips). gnowin~
the qroup numbe~, th~ rec~iver will know which (i

WO91/1~77 PCT/~'S91/03286
-38~ 2 ~ ~ ~
any) o~ these symmetries exist. Finally, since
lattice pairs having opposite polarity and pairs
related by one type o~ symmetry have the same
probability, we need only store one centroid and one
probability for each polarity and symmetry. '~e need
120 for Shell l and (65~8 1 70~4) - 800 ~or Shell 2,
for a total of 920 centroids and probabilities.
Entropv Codlnq and Decodlnq the Index
W~ transmit each lattice point in Shells l and 2
by entropy coding (using Huf~man coding as is well
known in the art) its group number and sending one bit
(having an entropy of one) for the point's polarity.
If the point belongs to Shell 2 (known immediately tO
the receiver from the group number), three more bits
identify which o~ the eight points ~of that polarity)
it is. These three bits can be computed at the
transmitter from the inner product operation described
above. If the point ha~ flip symmetry, one of these
three bits will have full entropy. ~he decoder knows
from the GN how many additional bits are needed, so
the code is comma free, that is, the decoder can parse
the bit stream without further information, and the
code is efficient.
Reconst ~ the Vector
At the receiver, the vector is reconstructed by
taking the product o~ the lattice coordina~e or
asso~iated centroid, the step size and an attenuation
factoE .
Attenuation ~actor
The entropy coded lattic~ thseshold quantization
is required to operate over a variety of sou~ce
dist~ibutions and bit rates. In this application,

W09~/1~7~ PC~US91/03~K
2~26~
ent~opy coded lattice threshold quantization is used
for hierarchical pyramid video coding and the
centroids in Shells l or 2 are used only at the three
highest resolution levelq o~ the pyramid. Yet the
statistics of the source at each level differ. Also,
at any one level, source probability density functions
tend to change in variance as the step size changes.
For example, if the input energy is large relative to
the step size, the probability distribution over the
lattice points is much broader than when the ratio is
small. To combat these variation~, several classes
are defined and are determined by thr2sholding the
ratio o~ the level's input energy to the square of ehe
level's step size. In the preerred implementation,
the apparatus stores a set of entropy code tables for
each class. Also, the attenuation applied to the
reconstruction value of a vector varies with the
class, the level of the pyramid at which encoding is
taking place, and the quantizer' 9 shell. The
reconstruction is thus given by:
(reconstruction) = (centroidl * ( tep size)
* a~tenuation factorl ~tep cize )
* attenuation table( shell, level, class ) ,
where the attenuation fac~or varies with step size, as
shown in Fig. ll. An additional attenua~ion, deno~ed
by attenuation table, takes a value according to the
partlcula~ shell the lattice point belonq~ to, the
lev~l in the pyramid beinq coded, and th~ class
currently .in force. For lattice point5 in Shell 3 and
up, the centroid used in this formula is the lattice
point i~self. These attenuation values are adjusted
to limit "popping" artifacts in the recon~tructed
signal. ~n general, the cumulative attenuation is not
grea~ enough to move the reconsg~uc~ion value outside

`~
WO91/1~7, PCr/~'59~/032~6
-90~
the lattice point's Voronoi region. The particular
functional values established by the relationships
de~ined here must be determined by teaming regression
analyses with some heuristic constraints and
perceptual experiment~. ~hey are implemented here as
tables and piecewise linear functions, but cou~d be
generalized to continuous functional relationships.
WHITENING RU~E
Since only one set of centroids is used with
entropy coded lattice threshold quantization, the
source data must ~e normalized by the step size to
collapse its probability density function to a morP
stationary set of values. In accordance with the
illustrated embodiment, the method employed to
estimate proper step size, the "Whitening Rule," is a
bit allocation method. This method can also be used
for the cloqed loop 5-level ~ierarchical Pyramid
vector quantization described in earlier patent
Ericsson, U.S. 4,849,810. The method deter~ines the
number o~ bits to expend at each of the three higher
resolution levels o~ the pyramid by specifying the
step size used for that level. One level (3) is not
coded and the last level (0) needs its own estimate,
as de~cribed i~ the next sub ection.
Since the motion compensated image being coded
generally has an ener~y spectrum which decreases with
increa~lng spatial frequency, and since the levels are
coded in the order o~ increasing spatial frequen~y,
step sizes are selected such that the current level is
coded with enough "bi~ " to match the prediction error
energy o~ the ne~t level. The hierarchical coding
structure is closed loop, so that all error energy
remaining at the curxent level carries forward to the

WO ~ 7 P{~tl,'S~1/0328~;
-41~
next level, The next level then attempts to code what
is left to the level o~ prediction error in the
succeeding level, etc.
In more detail, the encoder builds a resolution
pyramid of the current ~rams and creates predictions
within the pyramid using motion compensation and
background references. ln the preferred
implementation, there are the five levels previously
noted:
Level Resolution
===== ==========
0 256 x 240
1 128 x 120
2 64 x 60
3 32 x 30
4 1~ x 15
Level 3 is never coded, since we can save some bits
without any perceived 103s in quality. We scalar code
level 4, then code levels 2, 1 and 0. During the
process of building the resol~tion pyramid, the
average error enerqy E(n) at level n = ~0, 1, 3} is
estimated assuminq level n+l were already coded. Then
the step size S(n~ o be u~ed at level n~l is
S(n+l) ~ min{ a~n) * sqrt[ E(n) + b~n) ~ ;
c(n) ) -
The Whitenlng Rule uses three table , one each fora(n), b(n), and c(n), where n - {0, 1, 3~, to compute
the step sizes ~or level~ and 4. Note that level
3 is not coded and that the step size at level 0 ~ust
be selected Ruch ~hat ~he entropy coded lat~ice
threshold quanti2er accurately expends whatever bies
re~ain ~or the Prame.

. W091/1~7, P~r/~S~1/032g6
-42- 2~2~
A method for computing the step size for level 0
can be estimated, in accordance with the invention, at
level 0 in two steps. First an open loop estimate is
made, followed by a closed loop estimate.
1. Open loop estimate: The estimated level zero
step size SO(0) is computed from the energy in each
level and the number o bits available:
SO~0) = gqrt(energy) ~
expt a * bits~*3 + b ~ bits**2 ~ c *
bits*~3 + d }
2. Closed loop estimate: ~he open loop estimate
SO(0) is then modified by a multiplicative parameter
c determined by a recursive formula:
Sc(0)(n3 = c (n~ ~ S0(0)(n)
c (n) = c (n~ exp{ h * diff bits(n-l) },
where n is the frame index, dlff bits(n-l) i5 the
number of bits available less the number of bits used
in frame (n-l), and h i5~ a constant. In the preferred
implementation, the diff bits process is low pass
~iltered.
~ ive con~tant~ must be determined by
experi~entation, a, b, c, d, and h. Two of the
equations can be combined into one to reduce th~
number of exponentiation~:
S~(o)(n) = c (n~ sqr~(energy(n)j ~ exp~ a
bits(n)**3
b * bits(n)**2 + c ~ bits(n) l d
h * diff ~its~n-l) }
In summary, in accordance with the preferred
embodiment o~ the inventionJ a~ the top level, the
system uses scalar quantization. At the next level,

. "i ,,
WO91/1~7~ PCT~'591/~3286
-43~
the 32x30 level for the Y component and the 15xlS
level or the chrominance components, the warped
interpolation error is selectively added to the
interpolated output images from the top level and
therefore, at this level, blur information is
transmitted. ~owever, in the preferred embodiment of
the invention, no encoding of the image information,
that is, of the output image error, is performed in
the illustrated embodiment. The prediction images
themselves are the outputs from this level. In the
three bottom levels, all of which are treated
identically, adaptive blurring is used when generating
the prediction images, and the p~ediction error is
encsded using lattice threshold quantization. This is
indicated diagrammatically in Figure 6.
In the illustrated embodiment, there can occur
some circumstances, for example a scene chanqe, when
it may not be possible to encode even the next to
lowest level. In that instance, the coding will stop
at the next level up (60x64 for luminance and 30x32
for chrominance). If the coding at this level
generates more bits than the desired number of bits
per frame, the coding i~ still performed and the frame
rate will te~porarily slow due to the large number of
bits being generated. For the two botto~ levels, the
prediction is generated aq u3ual and the blur
in~ormation ls transmitted: however, lattice threshold
qu~nt~zation is not performed.
Reerring to ~igure 3, at the receiver, the
tran~mitted and coded data i~ decoded and the new
frame generated~ In particular, the data representin~
the resolution pyramids is decoded by reconstruction
circultry 7S l~vel by level ~rom the top of.the
pyramad down to the bottom. At th~ ~op level, the
quan~ized di~erenc~ ge ic decoded and added to the

, I..i,~
P~T/~S91/03286
- 4 4 ~
warped i~age at that level (in selective adder 100,
which includes decimation circuitry). Thereby, the
output image at the top level i~ reconstructed. The
lower levels are then reconstructed (by adder 100) by
first ~orming the prediction using the transmitted
blur information available over lines 84 from
reconstruction circuitry ~6 and then decoding the
dieeerence image and selectively adding it to the
prediction image to form a new reconstructed image for
that level. The process continues until the bottom
level is reached and the bottom level image is
transferred to the display frame buffer 98.
The arithmetic decoder which is used in a number
of processe~, that is, decoding the motion transform
coefficients, the blur information, and ~he image
in~or~ation ~rom lossy compressor 46, operate~ as is
well known in the art. Since the non-zero locations
were encoded using different probabilities depending
upon which state the coder was in, the arithmetic
decoder regenerates the state for each position in the
location map as the decoding proceeds. The state, in
combination with the encoded data, then determines
whether or not a zero is used for each ~ap location.
Once the map indicating the loca~ion of non-zero
values has been decoded, the a bit values are decoded
and incremented by one and placed in the appropriate
positions in the map.
Looking in more detail at the generation of the
resolution pyramids and decoding thereof, the process
follows the inverse of the method used in lossy
compressor 46. Thus, since the receiver decoding
proce~ follows the encoding process of the
transmi~er, the prediction at the ~op level i5
generated ~ro~ th~ warp~d top level image. The
quantizer indice~ are decoded using the arithmetic
.'' ' '
. , .
'
. .

. WO91/1~77 PC~/~S~1/03~
-45- 2~
decoder and the quantized dLference image is
reconstructed from the quantizer indices and is then
added to the prediction image to give the top level
output image (corresponding tO adder 522a and the
output over lines 522 of the ~ransmitter), At the
lower levels, the prediction is formed by selectively
adding the warped interpolation ercor to the
interpolated output image from the next higher level
~corresponding to adder 530 in the tran~mitter). That
output image, and the warped image at the next output
lev~l, are interpolated to give images at the current
level. The blur information is decoded using the
arithmetic decoder and then, in each 8x8 block of the
interpolated higher level output for which the blur
code is zero, the difference between the current level
warped image and the interpolated higher level warped
image is added (corresponding to adder 538 of the
transmitter).
At the lower level~, according to the preferred
embodi~ent of the invention, the lattice thre~hold
quantized information is decoded. Each 4x4 block
having a non-zero map bit is decoded by fir t
determining the shell number~ of the two 4x2 latt~ce
coded vector~ inside. ~he centroid is then found for
each lattice not lying in Shell 0 and ~caled by the
attenuation value. The output image is thereafter
or~ed by addlng this reconstructed difference to the
prediction. ~This corresponds ~o the operation o the
adder 540 o~ the transmitter.) The output imag~ from
the bottom le~el in the final recon~eructed image is
then transferred to the display frame buffer as the
final output image.
~ he illu~trated lossy co~pressor q6 can be
implemented in hardware, i~ so~tware, or in a
combinatlon th~reof.

W091/1~7 PC~/US91/~32~b
-46- 2~
The hierarchical entropy coded lattice th~eshold
quantization encoding method and apparatus described
hereinabove can also be advantageously employed in
connection with the transmission o~ a sequence o~
images in which motion compensation is not provided.
In this instance, the estimated receiver image over,
for example, lines 51 in Figure 2 will be the
reconstructed receiver image designated as the frame
output of the frame buf~er 44 over lines 64. Further,
the input i~age over lines 22 and the estimated
receiver image over lines 51 need not be individually
decimated by lossy compressor 46 but can be, referring
to ~igure 1, input to a difference circuitry 720, the
outpu~ of which, representing the errsr signal at the
receiver, can then be input to a lossy compressor for
decimation and hierarchical lattice threshold
quantization encoding in accordance with the invention
described hereinabove. At the receiver, the motion
compensator 99 and its related circuitry would
similarly be eliminated if motion compensation were
not employet. Similarly, the reconstruction circuitry
7~ would be modified in accordance with the
transmitter operation to reconstruct the error image
rep~esentation over lines 84 when the circuitry of
Fiqure 7 is employed. These change~ would be apparent
to those practiced in the art.
The entropy coded, lattice thre hold quantization
can al~o be employed in coding other se~uences of
multidimensional data vectors having a unifor~ vector
quantization. In particular, the simplifications
detailed above can be employed with such data, and
would be modified to take accoun~ o~ ~he probability
den~ity functions for the vector data.

. W091/1~7 P~T~91/~
-47- ~8~
Additions, subtractions, deletions, and other
modifications of the preferred particular embodiments
o~ the invention will be apparent to those skilled i^.
the art and are within the scope of the following
claims.

Representative Drawing

A single figure which represents the drawing illustrating the invention.

Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refers to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer , as well as the definitions for Patent , Event History , Maintenance Fee and Payment History should be consulted.

Event History

Description	Date
Inactive: IPC expired	2014-01-01
Inactive: IPC expired	2014-01-01
Inactive: IPC expired	2014-01-01
Inactive: IPC from MCD	2006-03-11
Inactive: IPC from MCD	2006-03-11
Inactive: IPC from MCD	2006-03-11
Inactive: IPC from MCD	2006-03-11
Application Not Reinstated by Deadline	1999-05-10
Time Limit for Reversal Expired	1999-05-10
Inactive: Abandon-RFE+Late fee unpaid-Correspondence sent	1998-05-11
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice	1998-05-11
Application Published (Open to Public Inspection)	1991-11-12

Abandonment History

Abandonment Date	Reason	Reinstatement Date
1998-05-11

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
PICTURETEL CORPORATION

Past Owners on Record
BERND GIROD
EDMUND THOMPSON
JEFFREY BERNSTEIN
RICHARD L. BAKER
XIANCHENG YUAN

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

To view selected files, please enter reCAPTCHA code :

To view images, click a link in the Document Description column (Temporarily unavailable). To download the documents, select one or more checkboxes in the first column and then click the "Download Selected in PDF format (Zip Archive)" or the "Download Selected as Single PDF" button.

List of published and non-published patent-specific documents on the CPD .

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.

Filter

Download Selected in PDF format (Zip Archive)

Download Selected as Single PDF

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Drawings	1991-11-11	7	163
Claims	1991-11-11	18	524
Cover Page	1991-11-11	1	19
Abstract	1991-11-11	1	26
Descriptions	1991-11-11	47	1,670
Representative drawing	1999-08-17	1	11
Reminder - Request for Examination	1998-01-20	1	118
Courtesy - Abandonment Letter (Maintenance Fee)	1998-06-07	1	186
Courtesy - Abandonment Letter (Request for Examination)	1998-06-21	1	171
Fees	1992-11-09	1	49
Fees	1995-04-12	1	47
Fees	1997-04-21	1	44
Fees	1996-04-17	1	37
Fees	1994-04-20	1	48
International preliminary examination report	1992-11-09	83	1,860
Courtesy - Office Letter	1993-05-20	1	34

Language selection

Menus

English Abstract

Event History

Abandonment History

Your request is in progress.

Requested information will be available
in a moment.

Thank you for waiting.

Patent 2082641 Summary

English Abstract

Event History

Abandonment History

Your request is in progress.Requested information will be availablein a moment.Thank you for waiting.

Your request is in progress.

Requested information will be available
in a moment.

Thank you for waiting.