Patent 2703048 Summary

(12) Patent: (11) CA 2703048
(54) English Title: SYSTEM AND METHOD FOR QUALITY-AWARE SELECTION OF PARAMETERS IN TRANSCODING OF DIGITAL IMAGES
(54) French Title: SYSTEME ET PROCEDE DE SELECTION COMPATIBLE AVEC LA QUALITE DE PARAMETRES UTILISES DANS LE TRANSCODAGE D'IMAGES NUMERIQUES
Status: Deemed expired
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06T 9/00 (2006.01)
  • H04N 19/40 (2014.01)
  • H04N 19/61 (2014.01)
  • G06T 3/40 (2006.01)
(72) Inventors :
  • COULOMBE, STEPHANE (Canada)
  • FRANCHE, JEAN-FRANCOIS (Canada)
  • PIGEON, STEVEN (Canada)
(73) Owners :
  • ECOLE DE TECHNOLOGIE SUPERIEURE (Canada)
(71) Applicants :
  • ECOLE DE TECHNOLOGIE SUPERIEURE (Canada)
(74) Agent: DONNELLY, VICTORIA
(74) Associate agent:
(45) Issued: 2015-06-30
(86) PCT Filing Date: 2008-07-16
(87) Open to Public Inspection: 2009-05-07
Examination requested: 2013-03-04
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/CA2008/001305
(87) International Publication Number: WO2009/055899
(85) National Entry: 2010-04-19

(30) Application Priority Data:
Application No.      Country/Territory           Date
PCT/CA2007/001974    Canada                      2007-11-02
60/991,956           United States of America    2007-12-03
12/164,873           United States of America    2008-06-30
12/164,836           United States of America    2008-06-30

Abstracts

English Abstract




Several quality-aware transcoding systems and methods are described, in which the impact of both quality factor (QF) and scaling parameter choices on the quality of transcoded images is considered in combination. A basic transcoding system is enhanced by the addition of a quality prediction look-up table, and a method of generating such a table is also shown.
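
The table-generation idea mentioned in the abstract can be illustrated with a minimal sketch. It is an assumption-laden illustration rather than the patented procedure: the function names, the flat pixel-list image representation, and the averaging of the metric over training images are all hypothetical, and `toy_transcode` is a stand-in quantizer that ignores the scaling factor. Only the indexing by input quality factor, output quality factor and scaling factor, and the PSNR and Maximum Difference measures, come from the claims.

```python
import math
from collections import defaultdict


def psnr(reference, degraded, peak=255):
    """Peak Signal to Noise Ratio (dB) between two equal-length pixel lists."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def max_difference(reference, degraded):
    """Maximum Difference (MD): largest absolute per-pixel error."""
    return max(abs(r - d) for r, d in zip(reference, degraded))


def build_quality_table(training_images, parameter_grid, transcode, metric=psnr):
    """Average a quality metric per (qf_in, qf_out, z) over a training set.

    training_images: list of (qf_in, pixels) pairs (hypothetical representation)
    parameter_grid:  list of (qf_out, z) pairs to try
    transcode:       callable (pixels, qf_out, z) -> pixels; assumed to return
                     an image comparable pixel-by-pixel with the input
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for qf_in, pixels in training_images:
        for qf_out, z in parameter_grid:
            output = transcode(pixels, qf_out, z)
            key = (qf_in, qf_out, z)
            sums[key] += metric(pixels, output)
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def toy_transcode(pixels, qf_out, z):
    """Stand-in for a real transcoder: coarser quantization at lower qf_out.

    The scaling factor z is accepted but ignored in this toy model.
    """
    step = 1 + (100 - qf_out) // 10
    return [min(255, round(p / step) * step) for p in pixels]


training = [(80, list(range(0, 256, 5))), (80, list(range(3, 256, 7)))]
grid = [(50, 1.0), (90, 1.0)]
quality_table = build_quality_table(training, grid, toy_transcode)
```

With a real transcoder in place of `toy_transcode`, the resulting dictionary would play the role of the quality prediction look-up table; a relative file size table could be built the same way by recording output size divided by input size.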




French Abstract

L'invention concerne des systèmes et des procédés de transcodage compatibles avec la qualité, selon lesquels l'impact des choix du facteur de qualité (QF) et des paramètres de mise à l'échelle sur la qualité des images transcodées sont associés. Selon l'invention, le système de transcodage de base est amélioré grâce à l'ajout d'une table de recherche de prédiction de la qualité, et un procédé permettant la production de ladite table est également mis en oeuvre.

Claims

Note: Claims are shown in the official language in which they were submitted.


WHAT IS CLAIMED IS:
1. A method for selecting transcoding parameters and transcoding an input
image into an output image
for a terminal having terminal constraints including a maximum device file
size, the method
comprising:
(a) determining a file size, and an encoding quality factor QF(I) of the input
image;
(b) providing an array of relative file size prediction values and predicted
quality metric values
characterizing a measure of distortion of an image introduced by transcoding,
the array being indexed
by encoding quality factor QF(I), output encoding quality factor QFT and a
resolution scaling factor
zT;
(c) computing a maximum relative file size as the maximum device file size
divided by the file
size of the input image;
(d) selecting, from the array, a set of feasible transcoding parameters,
yielding those predicted
relative file size values not exceeding the maximum relative file size, which
correspond to the file size
and the encoding quality factor QF(I) of the input image;
(e) obtaining, from the array, predicted quality metric values corresponding
to the set of feasible
transcoding parameters; and
(f) selecting those transcoding parameters from the set of feasible
transcoding parameters,
which yield the highest predicted quality metric value.
2. The method of claim 1, further comprising transcoding the input image into
the output image using
the selected transcoding parameters corresponding to the highest predicted
quality metric.
3. The method of claim 1, wherein the array has been obtained from transcoding
of a set of training
images.
4. The method of claim 1, wherein the set of feasible transcoding parameters
comprises the resolution
scaling factor zT not exceeding a maximum scaling factor z_max determined
using predetermined
maximum image dimensions of the output image and image dimensions of the input
image.
5. The method of claim 1, wherein the step (d) comprises interpolating between
entries of the data
structure for at least one of the indices.
6. The method of claim 1, the step (d) further comprising limiting the set of
feasible transcoding
parameters to those combinations of the output encoding quality factor QFT and
the resolution scaling
factor zT yielding predicted relative file size values not exceeding the
maximum relative file size.
7. The method of claim 1, wherein the array comprises a multi-dimensional
relative file size prediction
table indexed by the QF(I), QFT and zT.
8. The method of claim 1, wherein the array comprises a multi-dimensional
quality prediction table
indexed by the QF(I), QFT and zT.
9. The method of claim 1, wherein the input image and the output image are
JPEG images.
10. The method of claim 1, wherein the quality metric values are based on the
Maximum Difference
(MD) measure of respective transcoded training images compared with
corresponding training images
in the set of training images.
11. The method of claim 1, the step (e) further comprising:
sorting the set of feasible transcoding parameters according to the
corresponding predicted
quality metric values; and
truncating the set of feasible transcoding parameters by reducing it to a
promising subset
corresponding to high predicted quality metric values.
12. The method of claim 3, wherein the set of training images comprises a
first set of training images
and a second set of training images, the method further comprising:
obtaining the predicted relative file size values from transcoding of the first set of training images; and
obtaining the predicted quality metric values from transcoding of the second
set of training
images.
13. The method of claim 12, wherein the quality metric values are determined
based on the Peak Signal
to Noise Ratio (PSNR) measure of respective transcoded training images
compared with corresponding
training images in the set of training images.
14. The method of claim 6, the step (d) further comprising limiting the set of
feasible transcoding
parameters to those combinations of zT and QFT wherein the resolution scaling
factor zT does not
exceed a maximum scaling factor z_max determined from predetermined maximum
image dimensions
of the output image and image dimensions of the input image.
15. The method of claim 14, wherein the maximum scaling factor z_max is
further defined according to
a viewing scaling factor zV chosen based on anticipated viewing conditions of
the output image.
16. A system for selecting transcoding parameters and transcoding an input
image into an output image
for a terminal having terminal constraints including a maximum device file
size, the system
comprising:
a processor; and
a computer readable storage medium having computer readable instructions
stored thereon for
execution for the processor, forming:
an image feature extraction module, determining a file size, and an encoding
quality factor
QF(I) of the input image;
an array of relative file size prediction values and predicted quality metric
values characterizing
a measure of distortion of an image introduced by transcoding, the array being
indexed by encoding
quality factor QF(I), output encoding quality factor QFT and a resolution
scaling factor zT;
a quality and file size prediction module being configured to:
compute a maximum relative file size as the maximum device file size divided
by
the file size of the input image;
select from the array, a set of feasible transcoding parameters, yielding
those predicted
relative file size values not exceeding the maximum relative file size, which
correspond to the file size
and the encoding quality factor QF(I) of the input image; and
obtain, from the array, predicted quality metric values corresponding to the
set of
feasible transcoding parameters;
and
a quality-aware parameter selection module, selecting those transcoding
parameters from
the set of feasible transcoding parameters, which yield the highest predicted
quality metric
value.
17. The system of claim 16, further comprising a transcoding module for
transcoding the input image
into the output image using the selected transcoding parameters corresponding
to the highest predicted
quality metric.
18. The system of claim 16, wherein the array has been obtained from
transcoding of a set of training
images.
19. The system of claim 16, wherein the set of feasible transcoding parameters
comprises the
resolution scaling factor zT not exceeding a maximum scaling factor z_max
determined from
predetermined maximum image dimensions of the output image and image
dimensions of the input
image.
20. The system of claim 16, wherein the quality and file size prediction
module is further configured to
interpolate between entries of the array for at least one of the indices.
21. The system of claim 16, wherein the quality and file size prediction
module is further configured to
limit the set of feasible transcoding parameters to those combinations of the
output encoding quality
factor QFT and the resolution scaling factor zT yielding predicted relative
file size values not
exceeding the maximum relative file size.
22. The system of claim 16, wherein the array comprises a multi-dimensional
relative file size
prediction table indexed by the QF(I), QFT and zT.
23. The system of claim 16, wherein the array comprises a multi-dimensional
quality prediction table
indexed by the QF(I), QFT and zT.
24. The system of claim 16, wherein the input image and the output image are
JPEG images.
25. The system of claim 18, wherein the quality metric values are determined
based on the Peak Signal
to Noise Ratio (PSNR) measure of respective transcoded training images
compared with corresponding
training images in the set of training images.
26. The system of claim 18, wherein:
the set of training images comprises a first set of training images and a
second set of training
images;
the predicted relative file size values are obtained from transcoding of the
first set of training
images; and
the predicted quality metric values are obtained from transcoding of the
second set of training
images.
27. The system of claim 18, wherein the quality metric values are based on the
Maximum Difference
(MD) measure of respective transcoded training images compared with
corresponding training images
in the set of training images.
28. The system of claim 16, wherein the quality-aware parameter selection
module is further
configured to sort the set of feasible transcoding parameters according to the
corresponding predicted
quality metric values, and truncate the set of feasible transcoding parameters
by reducing the set to a
promising subset corresponding to high predicted quality metric values.
29. The system of claim 21, the quality and file size prediction module is
further configured to limit the
set of feasible transcoding parameters to those combinations of zT and QFT
wherein the resolution
scaling factor zT does not exceed a maximum scaling factor z_max determined
from predetermined
maximum image dimensions of the output image and image dimensions of the input
image.
30. The system of claim 29, wherein the maximum scaling factor z_max is
further defined according to
a viewing scaling factor zV chosen based on anticipated viewing conditions of
the output image.
31. A method for quality-aware transcoding of an input image into an output
image for display on a
terminal having device file size and image size constraints, the method
comprising:
(a) extracting features of the input image including dimensions and a file
size of the input
image;
(b) predicting, from file size prediction information obtained by transcoding
a set of training
images, a file size of the output image taking into account the constraints of
the terminal and the
extracted features, comprising selecting a set of feasible transcoding
parameters so that a corresponding
predicted file size of the output image meets the device file size constraint
of the terminal;
(c) determining, from quality metric prediction information obtained by
transcoding the set of
training images, predicted quality metric (QM) values of the output image, the
predicted QM values
characterizing a predicted measure of distortion of the input image introduced
by the transcoding,
corresponding to the feasible transcoding parameters in the set of feasible
transcoding parameters;
the quality metric prediction information being determined explicitly by a
comparison between
input images from the set of training images and corresponding output images
resulting from the
transcoding of the set of training images; and
(d) selecting those transcoding parameters from the set of feasible
transcoding parameters,
which yield the highest predicted QM value, corresponding to the highest
predicted
visual quality for the output image for the set of feasible transcoding
parameters.
32. The method of claim 31, further comprising indexing the file size
prediction information by
transcoding parameters including a transcoder scaling factor zT and an output
encoding quality factor
QFT of the output image.
33. The method of claim 32, further comprising indexing the quality metric
prediction information by
the transcoder scaling factor zT and the output encoding quality factor QFT.
34. The method of claim 32, wherein the step (b) further comprises obtaining
the file size prediction
information from an array of predicted relative output file sizes for image
transcoding, obtained from
the transcoding of the set of training images.
35. The method of claim 33, further comprising indexing the file size
prediction information and the
quality metric prediction information by an encoding quality factor QF(I) of
the input image.
36. The method of claim 33, wherein the step (c) further comprises obtaining
the quality metric
prediction information from an array of predicted quality metric values for
image transcoding, obtained
from the transcoding of the set of training images.
37. The method of claim 31, wherein the step (c) comprises determining the
quality metric values
based on a Peak Signal to Noise Ratio (PSNR) measure or a Maximum Difference
(MD) measure.
38. The method of claim 33, further comprising indexing the quality metric
prediction information by a
viewing scaling factor, characterizing a viewing condition of the output
image.
39. A system for quality-aware transcoding of an input image into an output
image for display on a
terminal having device file size and image size constraints, the system
comprising:
a processor;
a non-transitory computer readable storage medium having computer executable
instructions
stored thereon for execution by the processor, causing the processor to:
(a) extract features of the input image including dimensions and a file size
of the input image;
(b) predict, from file size prediction information obtained by transcoding a
set of training
images, a file size of the output image taking into account the constraints of
the terminal and the
extracted features, comprising selecting a set of feasible transcoding
parameters so that a corresponding
predicted file size of the output image meets the device file size constraint
of the terminal;
(c) determine, from quality metric prediction information obtained by
transcoding the set of
training images, predicted quality metric (QM) values of the output image,
characterizing a predicted
measure of distortion of the input image introduced by the transcoding,
corresponding to the feasible
transcoding parameters in the set of feasible transcoding parameters;
the quality metric prediction information being determined explicitly by a
comparison between
input images from the set of training images and corresponding output images
resulting from the
transcoding of the set of training images; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which
yield the highest predicted QM value, corresponding to the highest predicted
visual quality for the output image for the set of feasible transcoding
parameters.
40. The system of claim 39, wherein the file size prediction information is
indexed by transcoding
parameters including a transcoder scaling factor zT and an output encoding
quality factor QFT of the
output image.
41. The system of claim 40, wherein the quality metric prediction information
is indexed by the
transcoder scaling factor zT and the output encoding quality factor QFT.
42. The system of claim 41, wherein the file size prediction information
comprises an array of
predicted relative output file sizes, obtained from the transcoding of the set
of training images.
43. The system of claim 41, wherein the file size prediction information and
the quality metric
prediction information are further indexed by an encoding quality factor QF(I)
of the input image.
44. The system of claim 41, wherein the quality metric prediction information
comprises an array of
predicted quality metric values for image transcoding, obtained from the
transcoding of the set of
training images.
45. The system of claim 39, wherein the computer readable instructions further
cause the processor to
determine the quality metric values based on a Peak Signal to Noise Ratio
(PSNR) measure or a
Maximum Difference (MD) measure.
46. The system of claim 41, wherein the quality metric prediction information
is further indexed by a
viewing scaling factor, characterizing a viewing condition of the output
image.
47. A system for quality-aware transcoding of an input image into an output
image for display on a
terminal having device file size and image size constraints, the system
comprising:
a processor; and
a non-transitory computer readable storage medium having computer executable
instructions
stored thereon for execution by the processor, forming:
(a) a means for extracting features of the input image including dimensions
and a file size of the
input image;
(b) a means for predicting, from file size prediction information obtained by
transcoding of a set
of training images, a file size of the output image taking into account the
constraints of the terminal and
the extracted features, comprising selecting a set of feasible transcoding
parameters so that a
corresponding predicted file size of the output image meets the device file
size constraint of the
terminal;
(c) a means for determining, from quality metric prediction information
obtained by transcoding
of the set of training images, predicted quality metric (QM) values of the
output image, the predicted
QM values characterizing a predicted measure of distortion of the input image
introduced by the
transcoding, corresponding to the feasible transcoding parameters in the set
of feasible transcoding
parameters;
the quality metric prediction information being determined explicitly by a
comparison between
input images from the set of training images and corresponding output images
resulting from the
transcoding of the set of training images; and
(d) a means for selecting those transcoding parameters from the set of
feasible transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest predicted visual
quality for the output image for the set of feasible transcoding parameters.
48. The system of claim 47, wherein the means (c) are configured to determine
the quality metric
values based on a Peak Signal to Noise Ratio (PSNR) measure or a Maximum
Difference (MD)
measure.
49. The system of claim 47, wherein the file size prediction information is
indexed by transcoding
parameters including a transcoder scaling factor zT and an output encoding
quality factor QFT of the
output image.
50. The system of claim 49, wherein the quality metric prediction information
is indexed by the
transcoder scaling factor zT and the output encoding quality factor QFT.
51. The system of claim 50, wherein the file size prediction information and
the quality metric
prediction information respectively comprise an array of predicted relative
output file sizes and an
array of predicted quality metric values, obtained from the transcoding of the
set of training images.
52. The system of claim 50, wherein the file size prediction information and
the quality metric
prediction information are further indexed by one of the following:
an encoding quality factor QF(I) of the input image; or
a viewing scaling factor, characterizing a viewing condition of the output
image.
53. A method for transcoding an input image into an output image for display
on a terminal having
device file size and image size constraints, the method comprising:
(a) extracting features of the input image including dimensions and a file
size of the input
image;
(b) predicting, from transcoding a set of training images, a file size of the
output image taking
into account the constraints of the terminal and the extracted features,
comprising selecting a set of
feasible transcoding parameters so that a corresponding predicted file size of
the output image meets
the device file size constraint of the terminal;
(c) determining predicted quality metric (QM) values of the output image, the
predicted QM
values characterizing a predicted measure of distortion of the input image
introduced by transcodings,
corresponding to various feasible transcoding parameters in the set of
feasible transcoding parameters;
the predicted QM values being determined by a comparison between the input
image and
corresponding output images resulting from the transcodings;
the predicted QM values being further determined based on viewing conditions,
comprising
respective resolutions at which the input image and the corresponding output
images have been scaled
for determining the predicted QM values; and
(d) selecting those transcoding parameters from the set of feasible
transcoding parameters,
which yield the highest predicted QM value, corresponding to the highest
predicted
visual quality for the output image for the set of feasible transcoding
parameters.
54. The method of claim 53, wherein the step (b) comprises selecting the set
of feasible transcoding
parameters based on a transcoder scaling factor zT and an output encoding
quality factor QFT of the
output image.
55. The method of claim 54, wherein the step (b) further comprises selecting
the set of feasible
transcoding parameters based on an encoding quality factor QF(I) of the input
image.
56. The method of claim 53, wherein the step (c) comprises determining the
predicted QM values of
the output image based on transcoding parameters including a transcoder
scaling factor zT and an
output encoding quality factor QFT of the output image.
57. The method of claim 54, wherein the step (c) further comprises determining
the predicted QM
values based on an encoding quality factor QF(I) of the input image.
58. The method of claim 53, wherein the step (c) further comprises determining
the predicted QM
values based on an encoding quality factor QF(I) of the input image.
59. The method of claim 53, wherein the step (b) comprises predicting a
relative file size of the output
image.
60. The method of claim 53, wherein the step (c) comprises iteratively
transcoding the input image
with different feasible transcoding parameters from the set of feasible
transcoding parameters to
determine the predicted QM values.
61. The method of claim 53, wherein the step (c) comprises determining the QM
values based on Peak
Signal to Noise Ratio (PSNR) measure.
62. The method of claim 53, wherein the step (c) comprises determining the QM
values based on
Maximum Difference (MD) measure.
63. A system for transcoding an input image into an output image for display
on a terminal having
device file size and image size constraints, the system comprising:
a processor, and a memory device, having computer readable instructions stored
thereon for execution
by a processor, the processor being configured to:
(a) extract features of the input image including dimensions and a file size
of the input image;
(b) predict, from transcoding a set of training images, a file size of the
output image taking into
account the constraints of the terminal and the extracted features, comprising
selecting a set of feasible
transcoding parameters so that a corresponding predicted file size of the
output image meets the device
file size constraint of the terminal;
(c) determine predicted quality metric (QM) values of the output image, the
predicted QM
values characterizing a predicted measure of distortion of the input image
introduced by transcodings,
corresponding to various feasible transcoding parameters in the set of
feasible transcoding parameters;
the predicted QM values being determined by a comparison between the input
image and
corresponding output images resulting from the transcodings;
the predicted QM values being further determined based on viewing conditions,
comprising
respective resolutions at which the input image and the corresponding output
images have been scaled
for determining the predicted QM values; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which
yield the highest predicted QM value, corresponding to the highest predicted
visual quality for the output image for the set of feasible transcoding
parameters.
64. The system of claim 63, wherein the processor is further configured to
select the set of feasible
transcoding parameters based on a transcoder scaling factor zT and an output
encoding quality factor
QFT of the output image.
65. The system of claim 64, wherein the processor is further configured to
select the set of feasible
transcoding parameters based on an encoding quality factor QF(I) of the input
image.
66. The system of claim 63, wherein the processor is further configured to
determine the predicted QM
values of the output image based on transcoding parameters including a
transcoder scaling factor zT
and an output encoding quality factor QFT of the output image.
67. The system of claim 64, wherein the processor is further configured to
determine the predicted QM
values based on an encoding quality factor QF(I) of the input image.
68. The system of claim 63, wherein the processor is further configured to
determine the predicted QM
values based on an encoding quality factor QF(I) of the input image.
69. The system of claim 63, wherein the processor is further configured to
predict a relative file size of
the output image.
70. The system of claim 63, wherein the processor is further configured to
iteratively transcode the
input image with different feasible transcoding parameters from the set of
feasible transcoding
parameters to determine the predicted QM values.
71. The system of claim 63, wherein the processor is further configured to
determine the QM values
based on Peak Signal to Noise Ratio (PSNR) measure.
72. The system of claim 63, wherein the processor is further configured to
determine the QM values
based on Maximum Difference (MD) measure.
73. A method for transcoding of an input image into an output image for
display on a terminal having
device file size and image size constraints, the method comprising:
(a) extracting features of the input image including dimensions and a file
size of the input
image;
(b) predicting a file size of the output image taking into account the
constraints of the terminal
and the extracted features, comprising selecting a set of feasible transcoding
parameters so that a
corresponding predicted file size of the output image meets the device file
size constraint of the
terminal;
(c) determining, from transcoding a set of training images, predicted quality
metric (QM) values
of the output image, the predicted QM values characterizing a predicted
measure of distortion of the
input image introduced by transcodings, corresponding to various feasible
transcoding parameters in
the set of feasible transcoding parameters;
the predicted QM values being determined by comparison between input images
from the set of
training images and corresponding output images resulting from the
transcodings of the set of training
images; and
(d) selecting those transcoding parameters from the set of feasible
transcoding parameters,
which yield the highest predicted QM value, corresponding to the highest
predicted
visual quality for the output image for the set of feasible transcoding
parameters.
74. The method of claim 73, wherein the step (b) comprises selecting the set
of feasible transcoding
parameters based on a transcoder scaling factor zT and an output encoding
quality factor QFT of the
output image.
75. The method of claim 74, wherein the step (b) further comprises selecting
the set of feasible
transcoding parameters based on an encoding quality factor QF(I) of the input
image.
76. The method of claim 73, wherein the step (c) comprises determining the
predicted QM values of
the output image based on transcoding parameters including a transcoder
scaling factor zT and an
output encoding quality factor QFT of the output image.
77. The method of claim 74, wherein the step (c) further comprises determining
the predicted QM
values based on an encoding quality factor QF(I) of the input image.
78. The method of claim 73, wherein the step (c) further comprises determining
the predicted QM
values based on an encoding quality factor QF(I) of the input image.
79. The method of claim 73, wherein the step (b) comprises predicting a
relative file size of the output
image.
80. The method of claim 73, wherein the step (b) comprises iteratively
transcoding the input image
with different transcoding parameters to select the set of feasible
transcoding parameters.
81. The method of claim 73, wherein the step (c) comprises determining the QM
values based on Peak
Signal to Noise Ratio (PSNR) measure or Maximum Difference (MD) measure.
82. A system for transcoding an input image into an output image for display
on a terminal having
device file size and image size constraints, the system comprising:
a processor, and a memory device, having computer readable instructions stored
thereon for
execution by a processor, the processor being configured to:
(a) extract features of the input image including dimensions and a file size
of the input image;
(b) predict a file size of the output image taking into account the
constraints of the terminal and
the extracted features, comprising selecting a set of feasible transcoding
parameters so that a
corresponding predicted file size of the output image meets the device file
size constraint of the
terminal;
(c) determine, from transcoding a set of training images, predicted quality
metric (QM) values
of the output image, the predicted QM values characterizing a predicted
measure of distortion of the
input image introduced by transcodings, corresponding to various feasible
transcoding parameters in
the set of feasible transcoding parameters;
the predicted QM values being determined by comparison between input images
from the set of
training images and corresponding output images resulting from the
transcodings of the set of training
images; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which
yield the highest predicted QM value, corresponding to the highest predicted
visual quality for the output image for the set of feasible transcoding
parameters.
83. The system of claim 82, wherein the processor is further configured to
select the set of feasible
transcoding parameters based on a transcoder scaling factor zT and an output
encoding quality factor
QFT of the output image.
84. The system of claim 83, wherein the processor is further configured to
select the set of feasible
transcoding parameters based on an encoding quality factor QF(I) of the input
image.
85. The system of claim 82, wherein the processor is further configured to
determine the predicted QM
values of the output image based on transcoding parameters including a
transcoder scaling factor zT
and an output encoding quality factor QFT of the output image.
86. The system of claim 83, wherein the processor is further configured to
determine the predicted QM
values based on an encoding quality factor QF(I) of the input image.
87. The system of claim 82, wherein the processor is further configured to
determine the predicted QM
values based on an encoding quality factor QF(I) of the input image.
88. The system of claim 82, wherein the processor is further configured to
predict a relative file size of
the output image.
89. The system of claim 82, wherein the processor is further configured to
iteratively transcode the
input image with different feasible transcoding parameters from the set of
feasible transcoding
parameters to determine the predicted QM values.
90. The system of claim 82, wherein the processor is further configured to
determine the QM values
based on Peak Signal to Noise Ratio (PSNR) measure or Maximum Difference (MD)
measure.

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 02703048 2010-04-19
WO 2009/055899
PCT/CA2008/001305
SYSTEM AND METHOD FOR QUALITY-AWARE SELECTION OF
PARAMETERS IN TRANSCODING OF DIGITAL IMAGES
RELATED APPLICATIONS
The present application claims benefit from the US provisional application to
Stephane Coulombe et al., serial number 60/991,956, filed on December 03, 2007,
entitled "Quality-Aware Selection of Quality Factor and Scaling Parameters in
JPEG Image Transcoding"; the US patent application to Stephane Coulombe et al.,
serial number 12/164,836, filed on June 30, 2008, entitled "Method and System
for Quality-Aware Selection of Parameters in Transcoding of Digital Images";
and the PCT patent application to Steven Pigeon entitled "System and Method for
Predicting the File Size of Images Subject to Transformation by Scaling and
Change of Quality-Controlling Parameters", serial number PCT/CA2007/001974,
filed Nov 02, 2007.
FIELD OF THE INVENTION
The present invention relates generally to image transcoding and more
specifically to
the transcoding of images contained in a multimedia messaging service (MMS)
message.
BACKGROUND OF THE INVENTION
The multimedia messaging service (MMS) as described, e.g., in the OMA
Multimedia
Messaging Service specification, Approved Version 1.2 May 2005, Open Mobile
Alliance,
OMA-ERP-MMS-V1 2-200504295-A.zip, which is available at the following URL
http://www.openmobilealliance.org/Technical/release_program/mms_v1_2.aspx,
provides
methods for the peer-to-peer and server-to-client transmission of various
types of data
including text, audio, still images, and moving images, primarily over
wireless networks.
While the MMS provides standard methods for encapsulating such data, the data
itself may be coded in any of a large number of standard formats such as plain
text, 3GP video and audio/speech, SP-MIDI for synthetic audio, and JPEG still
images (for details on any of these formats, refer to Multimedia Messaging
Service, Media formats and codecs, 3GPP TS 26.140, V7.1.0 (2007-06), available
at the following URL http://www.3gpp.org/ftp/Specs/html-info/26140.htm). Still
images are frequently coded in the JPEG format, for which a software library
has been written by the Independent JPEG Group and published at
ftp.uu.net/graphics/jpeg/jpegsrc.v6b.tar.gz.
Figure 1 illustrates one example of a MMS system architecture 100, including
an
Originating Node 102, a Service Delivery Platform 104, a Destination Node 106,
and an
Adaptation Engine 108. The Originating Node 102 is able to communicate with
the Service
Delivery Platform 104 over a Network "A" 110. Similarly the Destination Node
106 is able
to communicate with the Service Delivery Platform 104 over a Network "B" 112.
The
Networks "A" and "B" are merely examples, shown to illustrate a possible set
of
connectivities, and many other configurations are also possible. For example,
the
Originating and Destination Nodes (102 and 106) may be able to communicate
with the
Service Delivery Platform 104 over a single network; the Originating Node 102
may be
directly connected to the Service Delivery Platform 104 without an intervening
network, etc.
The Adaptation Engine 108 may be directly connected with the Service Delivery
Platform
104 over a link 114 as shown in Fig. 1, or alternatively may be connected to
it through a
network, or may be embedded in the Service Delivery Platform 104.
In a trivial case, the Originating Node 102 may send a (multimedia) message
that is destined
for the Destination Node 106. The message is forwarded through the Network "A"
110 to
the Service Delivery Platform 104 from which the message is sent to the
Destination Node
106 via the Network "B" 112. The Originating and Destination Nodes (102 and
106) may
for instance be wireless devices, the Networks "A" and "B" (110 and 112) may
in this case
be wireless networks, and the Service Delivery Platform 104 may provide the
multimedia
message forwarding service.
In another instance, the Originating Node 102 may be a server of a content
provider,
connected to the Service Delivery Platform 104 through a data network, i.e.
the Network
"A" 110 may be the internet, while the Network "B" 112 may be a wireless
network serving
the Destination Node 106 which may be a wireless device.
An overview of server-side adaptation for the Multimedia Messaging Service
(MMS) is
given in a paper "Multimedia Adaptation for the Multimedia Messaging Service"
by
Stephane Coulombe and Guido Grassel, IEEE Communications Magazine, vol. 42,
no.7,
pp. 120-126, July 2004.
In the case of images in particular, the message sent by the Originating Node
102 may
include an image, specifically a JPEG encoded image. The capabilities of the
Destination
Node 106 may not include the ability to display the image in its original
form, for example
because the height or width of the image in terms of the number of pixels,
that is the
resolution of the image, exceeds the size or resolution of the display device
in the
Destination Node 106. In order for the Destination Node 106 to receive and
display it, the
image may be modified in an Image Transcoder 116 in the Adaptation Engine 108
before
being delivered to the Destination Node 106. The modification of the image by
the Image Transcoder 116 typically includes scaling, i.e. changing the image
resolution, and compression.
Image compression is commonly done to reduce the file size of the image for
reasons of
storage or transmission economy, or to meet file size limits or bit rate
limits imposed by
network requirements. The receiving device in MMS also has a memory limitation
leading
to a file size limit. The JPEG standard provides a commonly used method for
image
compression. As is well known, JPEG compression is "lossy", that is a
compressed image
may not contain 100% of the digital information contained in the original
image. The loss
of information can be controlled by setting a "Quality Factor" QF during the
compression.
A lower QF is equivalent to higher compression and generally leads to a
smaller file size.
Conversely, a higher QF leads to a larger file size, and generally higher
perceived "quality"
of the image.
Changing an image's resolution, or scaling, to meet a terminal's capabilities
is a problem
with well-known solutions. However, optimizing image quality against file size
constraints
remains a challenge, as there are no well-established relationships between
the quality factor
QF, perceived quality, and the compressed file size. Using scaling as an
additional means of
achieving file size reduction, rather than merely resolution adaptation, makes
the problem
all the more challenging.
The problem of file size reduction for visual content has been studied
extensively. In
"Accurate bit allocation and rate control for DCT domain video transcoding" by
Zhijun Lei
and N.D. Georganas, in IEEE CCECE 2002. Canadian Conference on Electrical and
Computer Engineering, 2002, vol. 2, pp. 968-973, it is shown that bit rate
reduction can be
achieved through adaptation of quantization parameters, rather than through
scaling. This
makes sense in the context of low bit rate video, where resolution is often
limited to a
number of predefined formats. In "Efficient transform-domain size and
resolution reduction
of images" by Justin Ridge, in Signal Processing: Image Communication, vol.
18, no. 8, pp.
621-639, Sept. 2003, a technique is described for scaling and then reducing
the file size of
JPEG images. But this technique does not consider estimating scaling and
quality reduction
in combination. A method of reducing the size of an existing JPEG file is
described in the
US patent 6,233,359 entitled "File size bounded JPEG transcoder" May 2001, by
Viresh
Ratnakar and Victor Ivashin. However, while reducing the quality and bit rate
of an image,
this method does not include scaling of the image.
Methods to estimate the compressed file size of a JPEG image that is subject
to
simultaneous changes in scaling and in QF have been reported in a brief note
by Steven
Pigeon and Stephane Coulombe, entitled "Very Low Cost Algorithms for
Predicting the File
Size of JPEG Images Subject to Changes of Quality Factor and Scaling", Data
Compression
Conference (DCC 2008), p. 538, 2008, and fully described in "Computationally
efficient
algorithms for predicting the file size of JPEG images subject to changes of
quality factor
and scaling" in Proceedings of the 24th Queen's Biennial Symposium on
Communications,
Queen's University, Kingston, Canada, 2008 (the "Kingston" paper), and in the
PCT patent
application to Steven Pigeon entitled "System and Method for Predicting the
File Size of
Images Subject to Transformation by Scaling and Change of Quality-Controlling
Parameters" serial number PCT/CA2007/001974 filed Nov 02, 2007.
In spite of recent advancement in the area of image transcoding, there remains
a requirement
for developing an improved transcoding method that takes scaling, compressed
file size
limitations, as well as image quality into account.
CA 02703048 2014-04-22
SUMMARY OF THE INVENTION
It is therefore an object of the invention to provide a method and system for
scaling an
image, which would avoid or mitigate the shortcomings of the prior art.
According to one aspect of the invention, there is provided a method for
selecting
transcoding parameters and transcoding an input image into an output image for
a
terminal having terminal constraints including a maximum device file size, the
method comprising:
(a) determining a file size, and an encoding quality factor QF(I) of the input
image;
(b) providing an array of relative file size prediction values and predicted
quality
metric values characterizing a measure of distortion of an image introduced by
transcoding,
the array being indexed by encoding quality factor QF(I), output encoding
quality factor
QFT and a resolution scaling factor zT;
(c) computing a maximum relative file size as the maximum device file size
divided
by the file size of the input image;
(d) selecting, from the array, a set of feasible transcoding parameters,
yielding those
predicted relative file size values not exceeding the maximum relative file
size, which
correspond to the file size and the encoding quality factor QF(I) of the input
image;
(e) obtaining, from the array, predicted quality metric values corresponding
to the
set of feasible transcoding parameters; and
(f) selecting those transcoding parameters from the set of feasible
transcoding parameters, which yield the highest predicted quality metric
value.
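Steps (c) to (f) above can be illustrated with a minimal Python sketch. The table layout (a dictionary keyed by QF(I), QFT and zT), the function name select_parameters and the toy numbers are assumptions made for illustration only; the text does not prescribe a data layout, and the sketch assumes that a larger quality metric value means better quality.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: (QF(I), QFT, zT) -> (predicted relative file size,
# predicted quality metric). The patent text does not fix this layout.
Key = Tuple[int, int, float]
Table = Dict[Key, Tuple[float, float]]


def select_parameters(table: Table, qf_in: int, input_size: int,
                      max_device_size: int) -> Optional[Tuple[int, float]]:
    """Return the feasible (QFT, zT) pair with the highest predicted quality."""
    # Step (c): maximum relative file size.
    max_relative = max_device_size / input_size
    best = None
    best_quality = float("-inf")
    for (qf_i, qf_t, z_t), (rel_size, quality) in table.items():
        if qf_i != qf_in:
            continue  # keep only entries matching the input image's QF(I)
        if rel_size > max_relative:
            continue  # step (d): predicted output would exceed the limit
        if quality > best_quality:  # steps (e) and (f)
            best, best_quality = (qf_t, z_t), quality
    return best  # None when no entry is feasible


# Toy values, invented for illustration only.
toy_table = {
    (80, 50, 0.5): (0.10, 30.0),
    (80, 70, 0.5): (0.15, 32.0),
    (80, 50, 1.0): (0.40, 34.0),
    (80, 70, 1.0): (0.60, 37.0),
}
print(select_parameters(toy_table, qf_in=80, input_size=100_000,
                        max_device_size=50_000))  # prints (50, 1.0)
```

In this toy case the maximum relative file size is 0.5, so the entry with relative size 0.60 is rejected and the best remaining entry is QFT = 50 at zT = 1.0.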
Additionally, the method further comprises transcoding the input image into
the output
image using the selected transcoding parameters corresponding to the highest
predicted
quality metric.
In the method described above, the array has been obtained from transcoding of
a set of
training images.
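One possible way to obtain such an array from training transcodings is sketched below. Averaging the observations that share the same indices is an assumption of this sketch, as are the function name build_prediction_table and the tuple layout; the passage above states only that the array is obtained from transcoding a set of training images.

```python
from collections import defaultdict
from statistics import mean


def build_prediction_table(observations):
    """Group training observations by (QF(I), QFT, zT) and average them.

    observations: iterable of tuples
        (qf_in, qf_out, z, relative_file_size, quality_metric),
    one per transcoding of a training image (hypothetical layout).
    """
    buckets = defaultdict(list)
    for qf_in, qf_out, z, rel_size, quality in observations:
        buckets[(qf_in, qf_out, z)].append((rel_size, quality))
    # Assumption: the predicted value for a cell is the mean over all
    # training transcodings that fall into that cell.
    return {
        key: (mean(r for r, _ in vals), mean(q for _, q in vals))
        for key, vals in buckets.items()
    }


# Toy observations, invented for illustration only.
table = build_prediction_table([
    (80, 50, 0.5, 0.10, 30.0),
    (80, 50, 0.5, 0.20, 32.0),
    (80, 70, 1.0, 0.60, 37.0),
])
print(sorted(table))
```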
In the method described above, the set of feasible transcoding parameters
comprises
the resolution scaling factor zT not exceeding a maximum scaling factor z_max
determined using predetermined maximum image dimensions of the output image
and image dimensions of the input image.
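A minimal sketch of how a maximum scaling factor might be derived from the two sets of dimensions follows. The formula is an assumption: it preserves the aspect ratio and never upscales (the factor is capped at 1.0); the passage above does not give the exact expression, and the function name is hypothetical.

```python
def max_scaling_factor(in_width: int, in_height: int,
                       max_width: int, max_height: int) -> float:
    """Largest uniform scaling factor that fits the output size limits.

    Assumptions of this sketch: the aspect ratio is preserved and the image
    is never enlarged, hence the cap at 1.0.
    """
    return min(max_width / in_width, max_height / in_height, 1.0)


# A 1600x1200 input shown on a terminal limited to 640x480 pixels.
print(max_scaling_factor(1600, 1200, 640, 480))  # prints 0.4
```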

The step (d) comprises interpolating between entries of the data structure for
at least one of
the indices.
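Interpolation along a single index can be sketched as follows, here along the scaling factor zT with QF(I) and QFT held fixed. Linear interpolation, clamping outside the tabulated range, and the function name are assumptions of this sketch; the text states only that interpolation between entries is performed for at least one index.

```python
from bisect import bisect_left
from typing import Sequence


def interpolate_along_z(z_grid: Sequence[float], values: Sequence[float],
                        z: float) -> float:
    """Linearly interpolate values tabulated at ascending z_grid points."""
    if z <= z_grid[0]:
        return values[0]   # clamp below the tabulated range (assumption)
    if z >= z_grid[-1]:
        return values[-1]  # clamp above the tabulated range (assumption)
    hi = bisect_left(z_grid, z)  # first grid point that is >= z
    lo = hi - 1
    t = (z - z_grid[lo]) / (z_grid[hi] - z_grid[lo])
    return values[lo] + t * (values[hi] - values[lo])


# Predicted relative file sizes tabulated at zT = 0.5 and zT = 1.0 (toy values).
print(interpolate_along_z([0.5, 1.0], [0.2, 0.6], 0.75))
```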
The step (d) further comprises limiting the set of feasible transcoding
parameters to
those combinations of the output encoding quality factor QFT and the
resolution
scaling factor zT yielding predicted relative file size values not exceeding
the
maximum relative file size.
Also, the array comprises a multi-dimensional relative file size prediction
table indexed by
the QF(I), QFT and zT.
Additionally, the array comprises a multi-dimensional quality prediction table
indexed by
the QF(I), QFT and zT.
Beneficially, the input image and the output image are JPEG images.
Conveniently, the quality metric values are based on the Maximum Difference
(MD)
measure of respective transcoded training images compared with corresponding
training
images in the set of training images.
In the method described above, the step (e) includes:
sorting the set of feasible transcoding parameters according to the
corresponding
predicted quality metric values; and
truncating the set of feasible transcoding parameters by reducing it to a
promising
subset corresponding to high predicted quality metric values.
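The sorting and truncating steps can be sketched as below. The candidate layout (a list of parameter pairs with their predicted quality) and the fixed subset size keep are assumptions of this sketch; the text does not say how the size of the promising subset is chosen.

```python
from typing import List, Tuple

# Hypothetical layout: ((QFT, zT), predicted quality metric value).
Candidate = Tuple[Tuple[int, float], float]


def promising_subset(candidates: List[Candidate], keep: int) -> List[Candidate]:
    """Sort feasible parameters by predicted quality and keep the best few."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return ranked[:keep]


# Toy feasible set, invented for illustration only.
feasible = [((50, 0.5), 30.0), ((70, 0.5), 32.0), ((50, 1.0), 34.0)]
print(promising_subset(feasible, keep=2))
# prints [((50, 1.0), 34.0), ((70, 0.5), 32.0)]
```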
Additionally, the set of training images comprises a first set of training
images and a second
set of training images, the method further comprises:
obtaining the predicted relative file size values from transcoding of the
first set of
training images; and
obtaining the predicted quality metric values from transcoding of the second
set of
training images.
The quality metric values are determined based on the Peak Signal to Noise
Ratio (PSNR)
measure of respective transcoded training images compared with corresponding
training
images in the set of training images.
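The two measures named in this summary, PSNR and Maximum Difference (MD), can be sketched using their commonly used definitions for 8-bit samples. The sketch assumes that both images have already been brought to the same resolution and flattened into equal-length sequences of pixel values; how the images are rescaled for comparison is not covered here, and the function names are hypothetical.

```python
import math
from typing import Sequence


def _check(original: Sequence[int], transcoded: Sequence[int]) -> None:
    if len(original) == 0 or len(original) != len(transcoded):
        raise ValueError("images must be non-empty and of equal size")


def max_difference(original: Sequence[int], transcoded: Sequence[int]) -> int:
    """Largest absolute per-pixel difference; larger means more distortion."""
    _check(original, transcoded)
    return max(abs(a - b) for a, b in zip(original, transcoded))


def psnr(original: Sequence[int], transcoded: Sequence[int],
         peak: int = 255) -> float:
    """Peak Signal to Noise Ratio in dB; larger means less distortion."""
    _check(original, transcoded)
    mse = sum((a - b) ** 2 for a, b in zip(original, transcoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Toy 2x2 image and a distorted copy, flattened row by row.
reference = [100, 100, 100, 100]
distorted = [110, 100, 100, 100]
print(max_difference(reference, distorted))  # prints 10
print(round(psnr(reference, distorted), 2))
```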
The step (d) further comprises limiting the set of feasible transcoding
parameters to those
combinations of zT and QFT wherein the resolution scaling factor zT does not
exceed a
maximum scaling factor z_max determined from predetermined maximum image
dimensions of the output image and image dimensions of the input image.
Additionally, the maximum scaling factor z_max is further defined according to
a viewing
scaling factor zV chosen based on anticipated viewing conditions of the output
image.
According to another aspect of the invention, there is provided a system for
selecting
transcoding parameters and transcoding an input image into an output image for
a terminal
having terminal constraints including a maximum device file size, the system
comprising:
a processor; and
a computer readable storage medium having computer readable instructions
stored thereon
for execution by the processor, forming:
an image feature extraction module, determining a file size, and an encoding
quality
factor QF(I) of the input image;
an array of relative file size prediction values and predicted quality metric
values
characterizing a measure of distortion of an image introduced by transcoding,
the array
being indexed by encoding quality factor QF(I), output encoding quality factor
QFT and a
resolution scaling factor zT;
a quality and file size prediction module being configured to:
compute a maximum relative file size as the maximum device file size
divided by the file size of the input image;
select from the array, a set of feasible transcoding parameters, yielding
those
predicted relative file size values not exceeding the maximum relative file
size, which
correspond to the file size and the encoding quality factor QF(I) of the input
image; and
obtain, from the array, predicted quality metric values corresponding to the
set of feasible transcoding parameters;
and
a quality-aware parameter selection module, selecting those transcoding
parameters
from the set of feasible transcoding parameters, which yield the highest
predicted quality
metric value.
Beneficially, the system further comprises a transcoding module for
transcoding the input
image into the output image using the selected transcoding parameters
corresponding to the
highest predicted quality metric.
Also, the array has been obtained from transcoding of a set of training
images.
Additionally, the set of feasible transcoding parameters comprises the
resolution scaling
factor zT not exceeding a maximum scaling factor z_max determined from
predetermined
maximum image dimensions of the output image and image dimensions of the input
image.
The quality and file size prediction module is further configured to
interpolate between
entries of the array for at least one of the indices.
Beneficially, the quality and file size prediction module is further configured
to limit the set
of feasible transcoding parameters to those combinations of the output
encoding quality
factor QFT and the resolution scaling factor zT yielding predicted relative
file size values
not exceeding the maximum relative file size.
The array comprises a multi-dimensional relative file size prediction table
indexed by the
QF(I), QFT and zT.
The array further comprises a multi-dimensional quality prediction table
indexed by the
QF(I), QFT and zT.
Beneficially, the input image and the output image are JPEG images.
The quality metric values are determined based on the Peak Signal to Noise
Ratio (PSNR)
measure of respective transcoded training images compared with corresponding
training
images in the set of training images.
Additionally, in the system described above:
the set of training images comprises a first set of training images and a
second set of
training images;
the predicted relative file size values are obtained from transcoding of the
first set of
training images; and
the predicted quality metric values are obtained from transcoding of the
second set
of training images.
Also, the quality metric values are based on the Maximum Difference (MD)
measure of
respective transcoded training images compared with corresponding training
images in the
set of training images.
Further, the quality-aware parameter selection module is further configured to
sort the set
of feasible transcoding parameters according to the corresponding predicted
quality metric
values, and truncate the set of feasible transcoding parameters by reducing
the set to a
promising subset corresponding to high predicted quality metric values.
The quality and file size prediction module is further configured to limit the
set of feasible
transcoding parameters to those combinations of zT and QFT wherein the
resolution scaling
factor zT does not exceed a maximum scaling factor z_max determined from
predetermined
maximum image dimensions of the output image and image dimensions of the input
image.
The maximum scaling factor z_max is further defined according to a viewing
scaling factor
zV chosen based on anticipated viewing conditions of the output image.
According to yet another aspect of the invention, there is provided a method
for quality-
aware transcoding of an input image into an output image for display on a
terminal having
device file size and image size constraints, the method comprising:
(a) extracting features of the input image including dimensions and a file
size of the
input image;
(b) predicting, from file size prediction information obtained by transcoding
a set of
training images, a file size of the output image taking into account the
constraints of the
terminal and the extracted features, comprising selecting a set of feasible
transcoding
parameters so that a corresponding predicted file size of the output image
meets the device
file size constraint of the terminal;
(c) determining, from quality metric prediction information obtained by
transcoding
the set of training images, predicted quality metric (QM) values of the output
image, the
predicted QM values characterizing a predicted measure of distortion of the
input image
introduced by the transcoding, corresponding to the feasible transcoding
parameters in the
set of feasible transcoding parameters;
the quality metric prediction information being determined explicitly by a
comparison between input images from the set of training images and
corresponding output
images resulting from the transcoding of the set of training images; and
(d) selecting those transcoding parameters from the set of feasible
transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
Additionally, the method further comprises indexing the file size prediction
information by
transcoding parameters including a transcoder scaling factor zT and an output
encoding
quality factor QFT of the output image.
Also, the method further comprises indexing the quality metric prediction
information by
the transcoder scaling factor zT and the output encoding quality factor QFT.
The step (b) further comprises obtaining the file size prediction information
from an array
of predicted relative output file sizes for image transcoding, obtained from
the transcoding
of the set of training images.
Additionally, the method further comprises indexing the file size prediction
information and
the quality metric prediction information by an encoding quality factor QF(I)
of the input
image.
The step (c) further comprises obtaining the quality metric prediction
information from an
array of predicted quality metric values for image transcoding, obtained from
the
transcoding of the set of training images.
Also, the step (c) comprises determining the quality metric values based on a
Peak Signal
to Noise Ratio (PSNR) measure or a Maximum Difference (MD) measure.
The method further comprises indexing the quality metric prediction
information by a
viewing scaling factor, characterizing a viewing condition of the output
image.
In yet another aspect of the invention there is provided a system for quality-
aware
transcoding of an input image into an output image for display on a terminal
having device
file size and image size constraints, the system comprising:
a processor;
a non-transitory computer readable storage medium having computer executable
instructions stored thereon for execution by the processor, causing the
processor to:
(a) extract features of the input image including dimensions and a file size
of the
input image;
(b) predict, from file size prediction information obtained by transcoding a
set of
training images, a file size of the output image taking into account the
constraints of the
terminal and the extracted features, comprising selecting a set of feasible
transcoding
parameters so that a corresponding predicted file size of the output image
meets the device
file size constraint of the terminal;
(c) determine, from quality metric prediction information obtained by
transcoding
the set of training images, predicted quality metric (QM) values of the output
image,
characterizing a predicted measure of distortion of the input image introduced
by the
transcoding, corresponding to the feasible transcoding parameters in the set
of feasible
transcoding parameters;
the quality metric prediction information being determined explicitly by a
comparison between input images from the set of training images and
corresponding output
images resulting from the transcoding of the set of training images; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
Additionally, the file size prediction information is indexed by transcoding
parameters
including a transcoder scaling factor zT and an output encoding quality factor
QFT of the
output image.
Beneficially, the quality metric prediction information is indexed by the
transcoder scaling
factor zT and the output encoding quality factor QFT.
Also, the file size prediction information comprises an array of predicted
relative output file
sizes, obtained from the transcoding of the set of training images.
The file size prediction information and the quality metric prediction
information are also
further indexed by an encoding quality factor QF(I) of the input image.
Also, the quality metric prediction information comprises an array of
predicted quality
metric values for image transcoding, obtained from the transcoding of the set
of training
images.
Additionally, the computer readable instructions further cause the processor
to determine
the quality metric values based on a Peak Signal to Noise Ratio (PSNR) measure
or a
Maximum Difference (MD) measure.
Also, the quality metric prediction information is further indexed by a
viewing scaling
factor, characterizing a viewing condition of the output image.
According to another aspect of the invention there is provided a system for
quality-aware
transcoding of an input image into an output image for display on a terminal
having device
file size and image size constraints, the system comprising:
a processor; and
a non-transitory computer readable storage medium having computer executable
instructions stored thereon for execution by the processor, forming:
(a) a means for extracting features of the input image including dimensions
and a
file size of the input image;
(b) a means for predicting, from file size prediction information obtained by
transcoding of a set of training images, a file size of the output image
taking into account
the constraints of the terminal and the extracted features, comprising
selecting a set of
feasible transcoding parameters so that a corresponding predicted file size of
the output
image meets the device file size constraint of the terminal;
(c) a means for determining, from quality metric prediction information
obtained by
transcoding of the set of training images, predicted quality metric (QM)
values of the output
image, the predicted QM values characterizing a predicted measure of
distortion of the
input image introduced by the transcoding, corresponding to the feasible
transcoding
parameters in the set of feasible transcoding parameters;
the quality metric prediction information being determined explicitly by a
comparison between input images from the set of training images and
corresponding output
images resulting from the transcoding of the set of training images; and
(d) a means for selecting those transcoding parameters from the set of
feasible
transcoding parameters, which yield the highest predicted QM value,
corresponding to the
highest predicted visual quality for the output image for the set of feasible
transcoding
parameters.
Preferably, the means (c) are configured to determine the quality metric
values based on a
Peak Signal to Noise Ratio (PSNR) measure or a Maximum Difference (MD)
measure.
Beneficially, the file size prediction information is indexed by transcoding
parameters
including a transcoder scaling factor zT and an output encoding quality factor
QFT of the
output image.
Also, the quality metric prediction information is indexed by the transcoder
scaling factor
zT and the output encoding quality factor QFT.
The file size prediction information and the quality metric prediction
information
respectively comprise an array of predicted relative output file sizes and an
array of
predicted quality metric values, obtained from the transcoding of the set of
training images.
The file size prediction information and the quality metric prediction
information are
further indexed by one of the following:
an encoding quality factor QF(I) of the input image; or
a viewing scaling factor, characterizing a viewing condition of the output
image.
A further aspect of the invention provides a method for transcoding an input
image into an
output image for display on a terminal having device file size and image size
constraints,
the method comprising:
(a) extracting features of the input image including dimensions and a file
size of the
input image;
(b) predicting, from transcoding a set of training images, a file size of the
output
image taking into account the constraints of the terminal and the extracted
features,
comprising selecting a set of feasible transcoding parameters so that a
corresponding
predicted file size of the output image meets the device file size constraint
of the terminal;
(c) determining predicted quality metric (QM) values of the output image, the
predicted QM values characterizing a predicted measure of distortion of the
input image
introduced by transcodings, corresponding to various feasible transcoding
parameters in the
set of feasible transcoding parameters;
the predicted QM values being determined by a comparison between the input
image and corresponding output images resulting from the transcodings;
the predicted QM values being further determined based on viewing conditions,
comprising respective resolutions at which the input image and the
corresponding output
images have been scaled for determining the predicted QM values; and
(d) selecting those transcoding parameters from the set of feasible
transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
The step (b) comprises selecting the set of feasible transcoding parameters
based on a
transcoder scaling factor zT and an output encoding quality factor QFT of the
output image.
The step (b) further comprises selecting the set of feasible transcoding
parameters based on
an encoding quality factor QF(I) of the input image.
Also, the step (c) comprises determining the predicted QM values of the output
image
based on transcoding parameters including a transcoder scaling factor zT and
an output
encoding quality factor QFT of the output image.
The step (c) further comprises determining the predicted QM values based on an
encoding
quality factor QF(I) of the input image.
Additionally, the step (b) comprises predicting a relative file size of the
output image.
The step (c) comprises iteratively transcoding the input image with different
feasible
transcoding parameters from the set of feasible transcoding parameters to
determine the
predicted QM values.
The step (c) comprises determining the QM values based on Peak Signal to Noise
Ratio
(PSNR) measure.
Further, the step (c) comprises determining the QM values based on Maximum
Difference
(MD) measure.
In yet another aspect of the invention, there is provided a system for
transcoding an input
image into an output image for display on a terminal having device file size
and image size
constraints, the system comprising:
a processor, and a memory device, having computer readable instructions stored
thereon for
execution by a processor, the processor being configured to:
(a) extract features of the input image including dimensions and a file size
of the
input image;
(b) predict, from transcoding a set of training images, a file size of the
output image
taking into account the constraints of the terminal and the extracted
features, comprising
selecting a set of feasible transcoding parameters so that a corresponding
predicted file size
of the output image meets the device file size constraint of the terminal;
(c) determine predicted quality metric (QM) values of the output image, the
predicted QM values characterizing a predicted measure of distortion of the
input image
introduced by transcodings, corresponding to various feasible transcoding
parameters in the
set of feasible transcoding parameters;
the predicted QM values being determined by a comparison between the input
image and corresponding output images resulting from the transcodings;
the predicted QM values being further determined based on viewing conditions,
comprising respective resolutions at which the input image and the
corresponding output
images have been scaled for determining the predicted QM values; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
Beneficially, the processor is further configured to select the set of feasible
transcoding
parameters based on a transcoder scaling factor zT and an output encoding
quality factor
QFT of the output image.
The processor is further configured to select the set of feasible transcoding
parameters
based on an encoding quality factor QF(I) of the input image.
Also, the processor is further configured to determine the predicted QM values
of the
output image based on transcoding parameters including a transcoder scaling
factor zT and
an output encoding quality factor QFT of the output image.
The processor is further configured to determine the predicted QM values based
on an
encoding quality factor QF(I) of the input image.
The processor is further configured to predict a relative file size of the
output image.
The processor is further configured to iteratively transcode the input image
with different
feasible transcoding parameters from the set of feasible transcoding
parameters to
determine the predicted QM values.
The processor is further configured to determine the QM values based on Peak
Signal to
Noise Ratio (PSNR) measure.
The processor is further configured to determine the QM values based on
Maximum
Difference (MD) measure.
According to yet another aspect of the invention, there is provided a method
for transcoding
of an input image into an output image for display on a terminal having device
file size and
image size constraints, the method comprising:
(a) extracting features of the input image including dimensions and a file
size of the
input image;
(b) predicting a file size of the output image taking into account the
constraints of
the terminal and the extracted features, comprising selecting a set of
feasible transcoding
parameters so that a corresponding predicted file size of the output image
meets the device
file size constraint of the terminal;
(c) determining, from transcoding a set of training images, predicted quality
metric
(QM) values of the output image, the predicted QM values characterizing a
predicted
measure of distortion of the input image introduced by transcodings,
corresponding to
various feasible transcoding parameters in the set of feasible transcoding
parameters;
the predicted QM values being determined by comparison between input images
from the set of training images and corresponding output images resulting from
the
transcodings of the set of training images; and
(d) selecting those transcoding parameters from the set of feasible
transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
Beneficially, the step (b) comprises selecting the set of feasible transcoding
parameters
based on a transcoder scaling factor zT and an output encoding quality factor
QFT of the
output image.
Also, the step (b) further comprises selecting the set of feasible transcoding
parameters
based on an encoding quality factor QF(I) of the input image.
The step (c) comprises determining the predicted QM values of the output image
based on
transcoding parameters including a transcoder scaling factor zT and an output
encoding
quality factor QFT of the output image.
The step (c) further comprises determining the predicted QM values based on an
encoding
quality factor QF(I) of the input image.
The step (b) comprises predicting a relative file size of the output image.
The step (b) comprises iteratively transcoding the input image with different
transcoding
parameters to select the set of feasible transcoding parameters.
The step (c) comprises determining the QM values based on Peak Signal to Noise
Ratio
(PSNR) measure or Maximum Difference (MD) measure.
A further aspect of the invention provides a system for transcoding an input
image into an
output image for display on a terminal having device file size and image size
constraints,
the system comprising:
a processor, and a memory device, having computer readable instructions stored
thereon for execution by a processor, the processor being configured to:
(a) extract features of the input image including dimensions and a file size
of the
input image;
(b) predict a file size of the output image taking into account the
constraints of the
terminal and the extracted features, comprising selecting a set of feasible
transcoding
parameters so that a corresponding predicted file size of the output image
meets the device
file size constraint of the terminal;
(c) determine, from transcoding a set of training images, predicted quality
metric
(QM) values of the output image, the predicted QM values characterizing a
predicted
measure of distortion of the input image introduced by transcodings,
corresponding to
various feasible transcoding parameters in the set of feasible transcoding
parameters;
the predicted QM values being determined by comparison between input images
from the set of training images and corresponding output images resulting from
the
transcodings of the set of training images; and
(d) select those transcoding parameters from the set of feasible transcoding
parameters, which yield the highest predicted QM value, corresponding to the
highest
predicted visual quality for the output image for the set of feasible
transcoding parameters.
Beneficially, the processor is further configured to select the set of
feasible transcoding
parameters based on a transcoder scaling factor zT and an output encoding
quality factor
QFT of the output image.
Also, the processor is further configured to select the set of feasible
transcoding parameters
based on an encoding quality factor QF(I) of the input image.
The processor is further configured to determine the predicted QM values of
the output
image based on transcoding parameters including a transcoder scaling factor zT
and an
output encoding quality factor QFT of the output image.
The processor is further configured to determine the predicted QM values based
on an
encoding quality factor QF(I) of the input image.
The processor is further configured to predict a relative file size of the
output image.
The processor is further configured to iteratively transcode the input image
with different
feasible transcoding parameters from the set of feasible transcoding
parameters to
determine the predicted QM values.
The processor is further configured to determine the QM values based on Peak
Signal to
Noise Ratio (PSNR) measure or Maximum Difference (MD) measure.
Thus, an improved system and method for transcoding a digital image have been
provided.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention will now be described, by way of example, with
reference to
the accompanying drawings, in which:
Figure 1 illustrates an example of an MMS system architecture 100 of the prior
art;
Figure 2 illustrates a basic quality-aware image transcoding system 200 (Basic
System);
Figure 3 shows details of the Quality Assessment module 210 of the Basic
System 200;
Figure 4 is a flow chart of a basic quality-aware parameter selection method
(Basic
Method) 400 for the selection of parameters in JPEG image transcoding,
corresponding to
the Basic System 200;
Figure 5 is a flow chart showing an expansion of the step 412 "Run Quality-aware
Parameter Selection and Transcoding Loop" of the Basic Method 400;
Figure 6 shows a quality prediction table generation system 500;
Figure 7 shows a simple quality-aware image transcoding system (Simple System)
600;
Figure 8 is a flow chart of a predictive method 700 for quality-aware
selection of
parameters in JPEG image transcoding which is applicable to the Simple System
600;
Figure 9 is a flow chart showing an expansion of the step 702 "Run Predictive
Quality-aware Parameter Selection Loop" of the Predictive Method 700;
Figure 10 shows a block diagram of an improved quality-aware transcoding
system
(Improved System) 800;
Figure 11 is a flow chart of an improved method 900 for quality-aware
selection of
parameters in JPEG image transcoding which is applicable to the Improved
System 800;
Figure 12 is a flow chart showing an expansion of the step 902 "Create Set F" of the
improved method 900;
Figure 13 is a flow chart showing an expansion of the step 904 "Run Improved
Q-aware Parameter Selection and Transcoding" of the improved method 900;
Figures 14A and 14B show an example of sorted PSNR values for zV = 0.7 and
s_max = 1.0, and an example of sorted PSNR values for zV = 0.7 and s_max = 0.7,
respectively; and
Figure 15 is a flow chart of a quality prediction table generation method
1000, illustrating
the functionality of the quality prediction table generation system 500 of
Fig. 6.
DETAILED DESCRIPTION OF THE EMBODIMENTS OF THE INVENTION
It is an object of the embodiments of the invention to provide a method and a
quality-aware
image transcoder for scaling an image to meet the constraints of a display
device in terms of
resolution or image size, and file size while at the same time maximizing the
user
experience, or objective quality of the transcoded image.
In a first embodiment, a transcoder system is described which makes use of a
predictive
table (Table 1 below) that is based on results of transcoding a large number
of images.
Further details of the predictive table, and methods by which such a table may be
generated,
can be found in the above mentioned paper by Steven Pigeon and Stephane
Coulombe,
entitled "Computationally efficient algorithms for predicting the file size of
JPEG images
subject to changes of quality factor and scaling".
The predictive table may serve as a three-dimensional look-up table for
estimating with a
certain amount of statistical confidence, the file size of a transcoded image
as a function of
three quantized variables: the input Quality Factor of the image before
transcoding (QF_in);
the scaling factor ("z"); and the output Quality Factor to be used in
compressing the scaled
image (QF_out).
For convenience of the reader, an example of a two-dimensional slice of the
predictive table
is reproduced here from the above mentioned paper.
                                 scaling
QF_out   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
    10  0.03  0.04  0.05  0.07  0.08  0.10  0.12  0.15  0.17  0.20
    20  0.03  0.05  0.07  0.09  0.12  0.15  0.19  0.22  0.26  0.32
    30  0.04  0.05  0.08  0.11  0.15  0.19  0.24  0.29  0.34  0.41
    40  0.04  0.06  0.09  0.13  0.17  0.22  0.28  0.34  0.40  0.50
    50  0.04  0.06  0.10  0.14  0.19  0.25  0.32  0.39  0.46  0.54
    60  0.04  0.07  0.11  0.16  0.22  0.28  0.36  0.44  0.53  0.71
    70  0.04  0.08  0.13  0.18  0.25  0.33  0.42  0.52  0.63  0.85
    80  0.05  0.09  0.15  0.22  0.31  0.41  0.52  0.65  0.78  0.95
    90  0.06  0.12  0.21  0.31  0.44  0.59  0.75  0.93  1.12  1.12
   100  0.10  0.24  0.47  0.75  1.05  1.46  1.89  2.34  2.86  2.22
Table 1: Relative File Size Prediction
Table 1 shows a two-dimensional slice of relative file size predictions for
transcoding
images of an input Quality Factor QF_in = 80%, as a function of the scaling
factor "z", and
of the output Quality Factor QF_out. The table shows relative file size
predictions,
quantized into a matrix of 10 by 10 relative size factors. Each entry in the
matrix is an
example of an average relative file size prediction of a scaled JPEG image, as
a function of a
selected output Quality Factor QF_out and a quantized scaling factor "z". The
output
Quality Factor is quantized into ten values ranging from 10 to 100 indexing
the rows of the
matrix. The quantized scaling factor "z", ranging from 10% to 100% indexes the
columns
of the sub-array. Each entry in the table represents a relative size factor,
that is the factor by
which transcoding of an image (de-compressing, scaling, and re-compressing)
with the
selected parameters would be expected to change the file size of the image.
As an example, an input image of a file size of 100 KB, transcoded with a
scaling factor of
70% and an output Quality Factor QF_out of 90, would be expected to yield an
output
image of a file size of 100 KB * 0.75 = 75 KB. It should be noted that this
result is a
prediction based on the average from a large set of pre-computed transcodings,
of a large
number of different images - transcoding a particular image may result in a
different file
size.
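The worked example above can be sketched as a simple table lookup. This is only an illustration: the names SCALINGS, QF_OUTS, RELATIVE_SIZE and both functions are hypothetical, and only a 3 x 3 excerpt of Table 1 is used, with row labels assuming QF_out steps of 10 as stated in the text.

```python
# Excerpt of Table 1 (QF_in = 80%). Rows are QF_out, columns are scaling in percent.
# The names below are illustrative assumptions, not part of the described system.
SCALINGS = [60, 70, 80]
QF_OUTS = [70, 80, 90]
RELATIVE_SIZE = [
    [0.33, 0.42, 0.52],  # QF_out = 70
    [0.41, 0.52, 0.65],  # QF_out = 80
    [0.59, 0.75, 0.93],  # QF_out = 90
]


def predicted_relative_size(z_percent, qf_out):
    """Look up the predicted relative file size for one (z, QF_out) pair."""
    row = QF_OUTS.index(qf_out)
    col = SCALINGS.index(z_percent)
    return RELATIVE_SIZE[row][col]


def predicted_file_size(input_size, z_percent, qf_out):
    """Predicted output size, in the same unit as input_size."""
    return input_size * predicted_relative_size(z_percent, qf_out)


# 100 KB input, 70% scaling, QF_out = 90
print(predicted_file_size(100, 70, 90))  # 75.0
```

As the text notes, the value returned is an average-based prediction; the actual size after transcoding a particular image may differ.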
As described in detail in the above mentioned paper, the table may be
generated and
optimized from a Training Set comprised of a large number of images.
The input Quality Factor QF_in of 80% was selected as representative of the
majority of
images found on the world-wide web. The predictive table may contain
additional two-
dimensional slices, representing file size predictions for transcoding images
of a different
input Quality Factor. Furthermore, the Table 1 was chosen as a matrix of
dimension 10 x
10, for illustrative purposes. A matrix of a different dimension could also be
used. In
addition, although in the following description the parameters such as QF_in
and z are
quantized, it is also possible to alternatively interpolate values from the
table. For instance,
in Table 1, if the relative file size prediction is desired for a scaling
factor of 65% and an
output Quality Factor QF_out of 75, linear interpolation could be used to
obtain a relative
file size of (0.33+0.42+0.41+0.52) / 4 = 0.42.
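A minimal sketch of such an interpolation is given below, assuming bilinear interpolation between the four surrounding table entries (which reduces to the four-value average above when the query lies midway between grid points). The function and variable names are hypothetical, and only an excerpt of Table 1 is included.

```python
# Excerpt of Table 1 (QF_in = 80%); names are illustrative assumptions.
SCALINGS = [60, 70, 80]   # scaling factor z in percent (columns)
QF_OUTS = [70, 80, 90]    # output quality factor (rows)
RELATIVE_SIZE = [
    [0.33, 0.42, 0.52],
    [0.41, 0.52, 0.65],
    [0.59, 0.75, 0.93],
]


def _bracket(axis, value):
    """Return (index, fraction) locating value between axis[index] and axis[index + 1]."""
    for i in range(len(axis) - 1):
        if axis[i] <= value <= axis[i + 1]:
            return i, (value - axis[i]) / (axis[i + 1] - axis[i])
    raise ValueError("value outside the table range")


def interpolated_relative_size(z_percent, qf_out):
    """Bilinear interpolation of the relative file size prediction."""
    r, tr = _bracket(QF_OUTS, qf_out)
    c, tc = _bracket(SCALINGS, z_percent)
    upper = RELATIVE_SIZE[r][c] * (1 - tc) + RELATIVE_SIZE[r][c + 1] * tc
    lower = RELATIVE_SIZE[r + 1][c] * (1 - tc) + RELATIVE_SIZE[r + 1][c + 1] * tc
    return upper * (1 - tr) + lower * tr


print(round(interpolated_relative_size(65, 75), 2))  # 0.42
```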
For the remainder of the description of the embodiments of the invention, an
input Quality
Factor QF_in of 80% is assumed, and the 10 x 10 size Table 1 will be used.
It is evident by inspection of the Table 1 that several combinations of QF_out
and scaling
factor "z" may lead to the same approximate predicted file size, which raises
the question of
which combination would maximize subjective user experience, or objective
quality.
Objective quality may be calculated in a number of different ways. In the
first embodiment
of the invention, a quality metric is proposed in which the input (before
transcoding) and
output (after transcoding) images are compared. The so-called peak signal-to-
noise ratio
(PSNR) is commonly used as a measure of quality of reconstruction in image
compression.
Other metrics, such as "maximum difference" (MD) could also be used without
loss of
generality.
Figure 2 illustrates a basic quality-aware image transcoding system 200 (Basic
System),
including a computer, having a processor and computer readable storage medium
having
computer executable instructions stored thereon, which when executed by the
processor,
provide the following modules: an Image Feature Extraction module 202; a
Quality and File
Size Prediction module 204; a Quality-aware Parameter Selection module 206; a
Transcoding module 208; and a Basic Quality Determination Block 209 which
includes a
Quality Assessment module 210. The Transcoding module 208 includes modules for
Decompression 212; Scaling 214; and Compression 216. The Basic System 200
further
includes means (e.g. data storage) for storing: an input image (Input Image
"I") 218; an
output image (Output Image "J") 220; a predictive Table "M" 222; and a set of
terminal
constraints (Constraints) 224. The set of terminal constraints 224 includes a
maximum
device file size S(D), and maximum permissible image dimensions of the device,
that is a
maximum permissible image width W(D), and maximum permissible image height H(D).
The table "M" 222 may be obtained as shown in the "Kingston" paper referenced
above, and
from which Table 1 has been reproduced as an example of a sub-array of the
Table "M" 222.
The input image "I" 218 is coupled to an image input 226 of the Transcoding
module 208, to
be transformed and output at an image output 228 of the Transcoding module
208, and
coupled into the output image "J" 220.
The input image "I" 218 is further coupled to an input of the Image Feature
Extraction
module 202, and to a first image input 230 of the Quality Assessment module
210.
The image output 228 of the Transcoding module 208 that outputs the output
image "J" 220
is further coupled to a second image input 232 of the Quality Assessment
module 210. The
Quality Assessment module 210 outputs a Quality Metric "QM" which is sent to a
QM-input 234 of the Quality-aware Parameter Selection module 206.
The output of the Image Feature Extraction module 202 is a set of input image
parameters
"IIP" that is coupled to an IIP-input 236 of the Quality and File Size
Prediction module 204
as well as to an image parameter input 238 of the Quality-aware Parameter
Selection
module 206. The set of input image parameters "IIP" includes the file size
S(I), the
encoding quality factor QF(I), and the width and height dimensions W(I) and
H(I) of the
Input Image "I" 218.
The output of the Quality and File Size Prediction module 204 is a sub-array
M(I) of the
Table "M" 222, i.e. the slice of the Table "M" 222 indexed by QF_in = QF(I)
that
corresponds to the quantized encoding quality factor of the Input Image "I"
218. The
sub-array M(I) is input to a file size prediction input 240 of the Quality-aware
Parameter
Selection module 206.
The output of the Quality-aware Parameter Selection module 206 is a set of
transcoding
parameters including a transcoder scaling factor "zT" and a transcoder
Quality Factor
"QFT", to be also referred to as an output encoding quality factor QFT. These
transcoding
parameters are coupled to a transcoding parameter input 242 of the Transcoding
Module
208.
In the preferred embodiment, the Basic System 200 may be conveniently
implemented in a
software program, in which the modules 202 to 216 may be software modules or
subroutine
functions, and the inputs and outputs of the modules are function calling
parameters and
function return values respectively. Data such as the Input Image "I" 218, the
Output Image "J"
220, and the Table "M" 222, may be stored as global data, accessible by all
functions. The
set of terminal constraints 224 may be obtained from a data base of device
characteristics.
Transcoding of the input image "I" 218 is accomplished in the Transcoding
Module 208 by
decompressing it in the Decompression module 212, scaling it in the Scaling
module 214
with the transcoder scaling factor "zT", and compressing the scaled image in
the
Compression module 216 with the transcoder Quality Factor "QFT".
The transcoding parameters zT and QFT thus control the transcoding operation,
where the
values of these transcoding parameters are determined by the Quality-aware
Parameter
Selection module 206. The purpose of the Quality Assessment module 210 is to
compare
the Input Image "I" 218 with the Output Image "J" 220 and compute the Quality
Metric
"QM", which should be a measure of the distortion introduced by the
transcoding process.
In the preferred embodiment of the invention, the Quality Metric "QM" is
computed
explicitly as the PSNR of the image pair (Images "J" and "I"), and measured in
dB, a high
dB value indicating less distortion, i.e. higher quality.
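For illustration, the PSNR of an image pair may be computed as sketched below, using the standard definition for 8-bit samples (peak value 255). The sketch also includes a maximum difference (MD) measure, taken here as the largest absolute sample difference. It assumes both images have already been brought to the same resolution and are given as flat sequences of sample values; the function names are hypothetical.

```python
import math


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(peak ** 2 / mse)


def max_difference(reference, distorted):
    """Maximum absolute sample difference (one common form of the MD measure)."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    return max(abs(a - b) for a, b in zip(reference, distorted))


original = [10, 20, 30, 40]
slightly_off = [11, 20, 29, 40]
badly_off = [30, 0, 60, 10]
# A higher PSNR indicates less distortion.
print(psnr(original, slightly_off) > psnr(original, badly_off))  # True
```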
The Quality and File Size Prediction module 204 uses the encoding quality
factor QF(I) of
the set of input image parameters "IIP", to select the sub-array M(I) of the
Table "M" 222,
the sub-array M(I) representing the predicted relative output file size for
transcoding any
image that was originally encoded with the quality factor QF(I), e.g. the
Input Image "I" 218.
The quality factor QF(I) is the quantized nearest equivalent of the actual
input Quality
Factor QF_in.
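As a small sketch of this quantization, assuming the table slices are indexed by quality factors 10, 20, ..., 100 (the list QF_LEVELS, the function name, and the tie-breaking rule are assumptions made for illustration only):

```python
# Quantization levels assumed to index the slices of the predictive table.
QF_LEVELS = list(range(10, 101, 10))


def quantize_quality_factor(qf_in):
    """Return the table index nearest to the actual input quality factor.

    Ties resolve to the lower level, which is an arbitrary choice here.
    """
    return min(QF_LEVELS, key=lambda level: abs(level - qf_in))


print(quantize_quality_factor(83))  # 80
```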
The Quality-aware Parameter Selection module 206 includes computational means
for
selecting feasible value pairs (zT,QFT) of the transcoding parameters zT and
QFT, where
feasible is defined as follows:
from the full range of transcoding parameters, a distinct value pair (zT,QFT)
is
selected from the index ranges ("z", and QF_out) of the Table "M" 222;
the value pair (zT,QFT) is accepted if the transcoder scaling factor zT does
not
exceed a maximum scaling factor "z_max", where the maximum scaling factor
"z_max" is
determined from the set of terminal constraints 224 such that neither the
maximum
permissible image width W(D) nor height H(D) is exceeded, otherwise another
distinct
value pair (zT,QFT) is selected;
the value pair (zT,QFT) is then used to index the sub-array M(I) to determine
a
corresponding predicted relative output file size sT; and
the value pair (zT,QFT) is deemed feasible if the predicted relative output
file size
sT does not exceed a maximum relative file size s_max, where s_max is the
lesser of unity
(1) or the ratio calculated by dividing the maximum device file size S(D) from
the
Constraints 224 by the actual file size S(I) of the input image "I" 218,
otherwise another
distinct value pair (zT,QFT) is selected.
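The feasibility test defined above can be sketched as a filter over the sub-array M(I). The sketch below uses a 3 x 3 excerpt of Table 1 and hypothetical names; scaling factors are expressed as fractions rather than percentages.

```python
# Excerpt of the sub-array M(I) for QF_in = 80% (illustrative names).
SCALINGS = [0.6, 0.7, 0.8]   # candidate transcoder scaling factors zT
QF_OUTS = [70, 80, 90]       # candidate output quality factors QFT
RELATIVE_SIZE = [
    [0.33, 0.42, 0.52],
    [0.41, 0.52, 0.65],
    [0.59, 0.75, 0.93],
]


def feasible_pairs(sub_array, scalings, qf_outs, z_max, s_max):
    """Return (zT, QFT, sT) triples that satisfy both terminal constraints."""
    pairs = []
    for row, qf_t in enumerate(qf_outs):
        for col, z_t in enumerate(scalings):
            if z_t > z_max:
                continue  # would exceed W(D) or H(D)
            s_t = sub_array[row][col]  # predicted relative output file size
            if s_t <= s_max:
                pairs.append((z_t, qf_t, s_t))
    return pairs


# Example: the device allows at most 70% scaling and half the input file size.
print(feasible_pairs(RELATIVE_SIZE, SCALINGS, QF_OUTS, z_max=0.7, s_max=0.5))
```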
Computational means for iteratively seeking a distinct value pair (zT,QFT)
until the Quality
Metric QM is optimal include a loop for each feasible combination of zT and
QFT:
a transcoding operation (Input Image "I" 218 to Output Image "J" 220) is
performed
by the Transcoding module 208;
the resulting Output Image "J" 220 has an actual file size S(J), and the
transcoding
may still be rejected if a resulting relative file size, obtained by dividing
the actual file size
S(J) of the Output Image "J" 220 by the actual file size S(I) of the input
image "I" 218,
exceeds the maximum relative file size s_max;
the quality of the transcoding is assessed in the Quality Assessment module
210 (see
below for more details) by generating the Quality Metric QM for the specific
transcoding;
and
the Output Image "J" 220 with the highest associated Quality Metric QM is
retained
as a best image.
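The loop described above may be sketched as follows. The callables transcode and quality are hypothetical stand-ins for the Transcoding module 208 and the Quality Assessment module 210; their signatures are assumptions made for this illustration.

```python
def select_best_transcoding(input_image, input_size, pairs, s_max, transcode, quality):
    """Transcode with every feasible (zT, QFT) pair and keep the best result.

    transcode(image, z_t, qf_t) is assumed to return (output_image, output_size).
    quality(input_image, output_image) is assumed to return a QM such as PSNR in dB.
    Returns (QM, zT, QFT, output_image), or None if every transcoding is rejected.
    """
    best = None
    for z_t, qf_t in pairs:
        output_image, output_size = transcode(input_image, z_t, qf_t)
        # The prediction is only an average, so re-check the actual file size.
        if output_size / input_size > s_max:
            continue
        qm = quality(input_image, output_image)
        if best is None or qm > best[0]:
            best = (qm, z_t, qf_t, output_image)
    return best
```

Re-checking the actual relative file size inside the loop reflects the point made above that a transcoding may still be rejected even though its predicted size was feasible.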
Comparison of the Input Image "I" 218 with the Output Image "J" 220 in the
Quality
Assessment module 210 is complicated by the fact that at least one additional
scaling
operation is required in order that two images with equal image resolution can
be compared.
Figure 3 shows details of the Quality Assessment module 210 of the Basic
System 200.
The Quality Assessment module 210 comprises a Decompression(R) module 302; a
Scaling(zR) module 304; a Decompression(V) module 306; a Scaling(zV) module
308; and
a Quality Computation module 310. The input image "I" coupled to the first
image input
230 of the Quality Assessment module 210 is decompressed with the
Decompression(V)
module 306, scaled with the Scaling(zV) module 308, and coupled to a first
input of the
Quality Computation module 310. Similarly, the output image "J" coupled to the
second
image input 232 is decompressed with the Decompression(R) module 302, scaled
with the
Scaling(zR) module 304, and coupled to a second input of the Quality
Computation module
310. The Quality Computation module 310 generates the Quality Metric QM.
Two re-scaling parameters are defined, a re-scaling factor zR used in the
Scaling(zR)
module 304, and a viewing scaling factor zV used in the Scaling(zV) module
308.
For the image resolutions to be equal, we must have zV = zT * zR, where zT is
the transcoder scaling factor described above. The viewing scaling factor zV
must be less than or equal to 1, since we never want to increase the original
image's resolution when comparing quality. The transcoder scaling factor zT is
always less than or equal to 1, and is chosen to satisfy the device constraints.
The viewing scaling factor zV is dependent on the viewing conditions for which
the output
image "J" is scaled, and should be chosen to maximize (optimize) the viewer
experience, i.e.
the anticipated subjective image quality.
Three cases are of interest:
Viewing case 1: zV = 1. The images are compared at the resolution of the input
image "I".
This corresponds to zR = 1 / zT, that is the output image "J" needs to be
scaled up.
Viewing case 2: zV = zT. The images are compared at the resolution of the
output image "J"
therefore zR = 1.
Viewing case 3: zT < zV < 1. The images are compared at a resolution between
the original
("I") and the transcoded ("J") image resolutions, thus zR = zV / zT. This will
result in zR >
1, that is the output image "J" may need to be scaled up.
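The relation zV = zT * zR covers all three viewing cases, as the following sketch illustrates. The function name and the range checks are assumptions based on the three cases listed above.

```python
def rescaling_factor(z_t, z_v):
    """Return zR such that zV = zT * zR.

    Assumes 0 < zT <= zV <= 1, which covers the three viewing cases above.
    """
    if not 0 < z_t <= 1:
        raise ValueError("zT must be in (0, 1]")
    if not z_t <= z_v <= 1:
        raise ValueError("zV must satisfy zT <= zV <= 1")
    return z_v / z_t


print(rescaling_factor(0.5, 1.0))   # viewing case 1: 2.0 (output image scaled up)
print(rescaling_factor(0.5, 0.5))   # viewing case 2: 1.0 (no re-scaling)
print(rescaling_factor(0.5, 0.75))  # viewing case 3: 1.5
```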
The expected viewing conditions, corresponding to the choice of the viewing
scaling factor
zV, play a major role in the user's appreciation of the transcoded results. If
the output
image "J" will only be viewed on the terminal, the viewing case 2 could be a
good choice.
However, if the output image "J" might be transferred to another, more capable
device later
(e.g. a personal computer) where it may be scaled up again, the resolution of
the original
image (the input image "I") must be considered, leading to the viewing case 1.
The viewing case 3 could be used when the output image "J" is viewed at a
resolution
between the transcoded resolution and the resolution of the original image
(the input image
"I"), for example at the maximum resolution supported by the device where the
user can pan
and zoom on the device, limited only by its resolution.
The viewing case 3 is the most general case in which both the input and the
output images
are scaled by the scaling factors zV and zR respectively. In the special cases
(viewing case
1 and viewing case 2) some processing efficiencies may be obtained in the
Quality
Computation module 310, as may be readily understood.
For example, in the viewing case 1 (zV = 1), no actual re-scaling of the input
image "I" is
required for the comparison. Consequently, the decompressed input image "I" is
already available at the output of the Decompression module 212 of the
Transcoding
module 208, and may be used directly in the Quality Computation module 310.
Similarly in the viewing case 2, no actual re-scaling of the output image "J"
is required for
the comparison. Consequently, the output image "J" needs to be only
decompressed in the
Decompression(R) module 302, and the re-scaling operation in the Scaling(zR)
module 304
may be skipped.
Due to the quantization inherent in scaling and compression operations in
general, there will
be distortion in the transcoded image (the output image "J"), compared to the
original image
(the input image "I"). Similarly, the re-scaling of one or both of these
images in the Quality
Assessment module 210 introduces additional distortions. As a consequence, the
viewing
conditions corresponding to the three cases described above may yield different results in
the quality computation, and the best quality image may be obtained with
different
parameter settings of the transcoding parameters in the value pair (zT,QFT),
depending on
the choice of the viewing scaling factor zV and the resultant re-scaling
factor zR. The
viewing scaling factor zV (and implicitly zR) may be chosen and set in the
Quality and File
Size Prediction module 204 according to the intended application of the Basic
System 200.
In the simplest case, the viewing scaling factor zV is set equal to the
transcoder scaling
factor zT (the viewing case 2). If the image is to be optimized for viewing on
the terminal
only, it is proposed that the viewing conditions be set to correspond to the
maximum
resolution supported by the device.
Figure 4 is a flow chart of a basic quality-aware parameter selection method
(Basic method)
400 for the selection of parameters in JPEG image transcoding, corresponding
to the Basic
System 200. The Basic method 400 includes the following sequential steps:
step 402 "Get Device Constraints";
step 404 "Get Input Image I";
step 406 "Extract Image Features";
step 408 "Predict Quality and File Size";
step 410 "Initialize Parameters";
step 412 "Run Quality-aware Parameter Selection and Transcoding Loop";
step 414 "Validate Result"; and
step 416 "Return Image J".
In the step 402 "Get Device Constraints" the set of terminal constraints (cf.
Constraints 224,
Fig. 2) including the maximum device file size S(D), the maximum permissible
image width
W(D), and the maximum permissible image height H(D) of the display device (cf.
Destination Node 106, Fig. 1) are obtained, either from a database or directly
from the
display device through a network.
In the step 404 "Get Input Image I" the image to be transcoded (the input
Image "I") is
received from an originating terminal or server (cf. Originating Node 102,
Fig. 1).
In the step 406 "Extract Image Features" (cf. Image Feature Extraction module
202, Fig. 2) a
set of input image parameters including the file size S(I), the image width
W(I), the image
height H(I) and the encoding quality factor QF(I) are obtained from the input
image "I". In
JPEG encoded images, the file size S(I), the image width W(I), and the image
height H(I)
are readily available from the image file. The quality factor QF(I) used in
the encoding of
the image may not be explicitly encoded in the image file, but may be
estimated fairly
reliably following a method described in "JPEG compression metric as a quality
aware
transcoding" by Surendar Chandra and Carla Schlatter Ellis, Unix Symposium on
Internet
Technologies and Systems, 1999. Alternatively, the quality factor QF(I) of the
input image
"I" may simply be assumed to be a typical quality factor of the application,
e.g. 80%.
In the step 408 "Predict Quality and File Size" (cf. Quality and File Size
Prediction module
204, Fig. 2) the viewing conditions are established, i.e. a suitable value for
the viewing
scaling factor zV is chosen:
zV = min( W(D)/W(I), H(D)/H(I), 1 ),
that is zV is the smallest of the ratio of the maximum permissible image width
W(D) to the
input image width W(I), the ratio of the maximum permissible image height
H(D) to the
input image height H(I), and one (1). It is assumed that the aspect ratio of
the image is
normally to be preserved in the transcoding. The upper limit of one (1) is to
ensure that zV
does not exceed 1 even if the display device is capable of displaying a larger
image than the
original input image "I". In a modification it is possible to apply different
scaling factors in
the transcoding horizontally and vertically where this is deemed desirable.
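As an illustration only, the choice of the viewing scaling factor zV can be sketched in Python. The function name and argument names are assumptions made for this sketch and are not part of the described system; a single scaling factor is used for both directions, so the aspect ratio is preserved.

```python
def viewing_scaling_factor(w_dev, h_dev, w_img, h_img):
    """Return zV = min(W(D)/W(I), H(D)/H(I), 1).

    w_dev, h_dev: maximum permissible width and height of the device.
    w_img, h_img: width and height of the input image "I".
    """
    # The cap at 1.0 keeps the image from being enlarged beyond its
    # original size even when the display could show a larger image.
    return min(w_dev / w_img, h_dev / h_img, 1.0)
```

For a 640x480 input image and a 320x240 display, this sketch yields zV = 0.5; for a display larger than the image it yields 1.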
Quantizing the encoding quality factor QF(I) to the index QF_in, the sub-array
M(I) of the
Table "M" 222 is retrieved, either from a local file or a database. The sub-
array M(I)
includes relative file size predictions as a function of the scaling factor
"z" and the output
Quality Factor QF_out that will be used in compressing the scaled image. The
sub-array M(I) may also include columns indexed by scaling factors ("z") that
exceed zV,
and relative file size predictions that exceed the maximum relative file size
s_max of the
display device; the remaining entries in the sub-array M(I) are indexed by a
set of feasible
index value pairs ("z",QF_out).
In the step 410 "Initialize Parameters" a number of variables are initialized
to prepare for the
steps to follow. These variables are:
a best transcoder Quality Factor = 0;
a best transcoder scaling factor = 0;
a best Quality Metric QM = 0; and
a best image = NIL.
Also initialized are two limits, a maximum relative file size s_max and a
maximum scaling
factor z_max. The maximum relative file size s_max is calculated by dividing the
maximum
device file size S(D) by the actual file size S(I) of the input image "I" 218,
limited to unity
(1). The maximum scaling factor z_max is given by the viewing scaling factor
zV that was
already calculated in the previous step, that is z_max = zV.
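The initialization described above may be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the dictionary layout, and the key names are hypothetical and chosen only for illustration.

```python
def initialize_parameters(s_dev, s_img, z_v):
    """Prepare the variables and limits used by the selection loop.

    s_dev: maximum device file size S(D).
    s_img: actual file size S(I) of the input image.
    z_v:   viewing scaling factor zV from the previous step.
    """
    return {
        "best_qf": 0,        # best transcoder Quality Factor
        "best_z": 0,         # best transcoder scaling factor
        "best_qm": 0.0,      # best Quality Metric QM
        "best_image": None,  # best image (NIL)
        # S(D)/S(I), limited to unity
        "s_max": min(s_dev / s_img, 1.0),
        # the maximum scaling factor is the viewing scaling factor
        "z_max": z_v,
    }
```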
The step 412 "Run Quality-aware Parameter Selection and Transcoding Loop" is a
loop
which: takes distinct valid value pairs ("z",QF_out) from the sub-array M(I);
assigns zT and
QFT to these values; causes the input Image "I" to be transcoded into the
output Image "J"
with zT and QFT; calculates the resulting Quality Metric QM; and runs the loop
until the
best image is found, that is "best" in the sense of attaining the highest
Quality Metric QM.
At the same time, the loop may also track the transcoder Quality Factor QFT
and the
transcoder scaling factor zT that was used in the transcoding step that
yielded the best
output image (not shown in Fig. 5), but this is not strictly necessary since
ultimately only the
best image is of interest.
Figure 5 is a flow chart showing an expansion of the step 412 "Run Quality-
aware
Parameter Selection and Transcoding Loop" of the Basic Method 400, with the
following
sub steps:
step 452 "Get Next Value Pair";
step 454 "Is Value Pair Available?";
step 456 "Is Value Pair feasible?";
step 458 "Transcode I to J";
step 460 "Is Actual Size OK?";
step 462 "Decompress J and scale with zR to X";
step 464 "Decompress I and scale with zV to Y";
step 466 "Compute Metric QM = PSNR(X,Y)";
step 468 "Is QM > Best Q?";
step 470 "Set Best Q := QM, Best Image := J"; and
step 472 "Set J := Best Image".
The steps 462 to 466 together form the "Quality Assessment Step" 474, comprising the
functionality of the Quality Assessment (cf. Quality Assessment module 210,
Fig. 2).
In the step 452 "Get Next Value Pair" the next value pair ("z",QF_out)
indexing the sub-
array M(I) is taken, as long as a distinct value pair is available.
In the step 454 "Is Value Pair Available?" a test is made if a distinct value
pair is available.
If it is available (YES from the step 454) execution continues with the step
456 "Is value
pair feasible?", otherwise (NO from the step 454) the loop exits to the step
472 "Set J :=
Best Image" because all distinct value pairs have been exhausted.
In the step 456 "Is Value Pair feasible?" two tests are made. First the
scaling factor "z" from
the value pair ("z",QF_out) is compared with the maximum scaling factor z_max.
The
value pair ("z",QF_out) is not valid, hence not feasible, if the scaling
factor "z" exceeds the
maximum scaling factor z_max. If the value pair ("z",QF_out) is not valid, the
step 456 "Is
value pair feasible?" exits immediately with ("NO") and execution jumps back
to the
beginning of the loop.
Then a predicted relative file size s is read from the sub-array M(I) indexed
by the distinct
value pair ("z",QF_out), and compared with the maximum relative file size
s_max. If the
predicted relative file size s is acceptable, i.e. does not exceed the maximum
relative file
size s_max, the step 456 "Is Value Pair feasible?" exits with "YES" and
execution continues
with the step 458 "Transcode I to J", otherwise (NO from the step 456)
execution jumps
back to the beginning of the loop, that is to the step 452 "Get Next Value
Pair".
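The two tests of the step 456 can be sketched as below. The representation of the sub-array M(I) as a Python dictionary keyed by the pair (z, QF_out) is an assumption of this sketch, as are the function and parameter names.

```python
def is_feasible(pair, m_sub, s_max, z_max):
    """Return True if the value pair (z, QF_out) passes both tests of step 456.

    m_sub maps (z, qf_out) to the predicted relative file size s.
    """
    z, _qf_out = pair
    # First test: the scaling factor must not exceed z_max.
    if z > z_max:
        return False
    # Second test: the predicted relative file size must not exceed s_max.
    return m_sub[pair] <= s_max
```

The first test is made before the table is read, so a pair with an invalid scaling factor is rejected immediately, as in the description.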
In the step 458 "Transcode I to J" the input Image "I" is decompressed; scaled
with a
transcoder scaling factor zT = "z"; and the scaled image is compressed with a
transcoder
Quality Factor QFT = QF_out, resulting in the output image "J".
In the step 460 "Is Actual Size OK?" an actual relative size s_out is computed
by dividing
the file size of the output image "J" by the file size of the input image "I".
If the actual
relative size s_out does not exceed the maximum relative file size s_max (YES
from the
step 460), execution continues with the "Quality Assessment Step" 474,
otherwise (NO
from the step 460) execution jumps back to the beginning of the loop, that is
to the step 452
"Get Next Value Pair". Note that the actual relative size s_out may in fact be
larger than
the predicted relative file size "s".
In the step 462 "Decompress J and scale with zR to X" of the "Quality
Assessment Step"
474, the output image "J" is decompressed and scaled with the re-scaling
factor zR
calculated as zR = zV / zT, resulting in a first intermediate image which is a
re-scaled output
image "X". Similarly, in the step 464 "Decompress I and scale with zV to Y"
the input
image "I" is decompressed and scaled with the viewing scaling factor zV,
resulting in a
second intermediate image which is a re-scaled input image "Y". As described
above, the
viewing scaling factor zV was earlier selected to maximize the user
experience. Three
viewing cases 1 to 3 may be considered.
In the step 466 "Compute Metric QM = PSNR(X,Y)" the value of the quality
metric QM is
computed as the peak signal-to-noise ratio (PSNR) of the re-scaled output and
input images
"X" and "Y". Alternatively a different metric, for example based on "maximum
difference"
(MD) could also be used without loss of generality.
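For reference, a PSNR computation of the kind used in the step 466 can be sketched as follows. The sketch assumes that the re-scaled images "X" and "Y" have already been decoded and scaled (with zR = zV / zT and zV respectively) into equally long flat sequences of 8-bit sample values; decoding and scaling are outside the sketch, and the function name is hypothetical.

```python
import math

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

A higher value indicates that the re-scaled output image is closer to the re-scaled input image.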
In the step 468 "Is QM > Best Q?" the computed quality metric QM is compared
with the
best quality metric found in the loop so far. Note that "best Q" was
initialized to zero before
the start of the step 412 "Run Quality-aware Parameter Selection and
Transcoding Loop"
and is the best quality metric found so far. If the computed quality metric QM
is larger than
the best Quality Metric ("best Q", YES from the step 468), execution continues
with the step
470 "Set Best Q := QM, Best Image := J", otherwise (NO from the step 468)
execution jumps
back to the beginning of the loop, that is to the step 452 "Get Next Value
Pair".
In the step 470 "Set Best Q := QM, Best Image := J" the best results so far are
saved, that is
the highest Quality Metric "Best Q" is set equal to the computed quality metric
QM; the best
image is set equal to the output image "J"; and the transcoding parameters
QF_out and zT
may be saved as best transcoder Quality Factor and best transcoder scaling
factor (not
shown in Fig. 5) respectively. After the step 470, the execution jumps back to
the beginning
of the loop, that is to the step 452 "Get Next Value Pair", to possibly find a
better
transcoding of the input image "I", until all feasible parameter pairs are
exhausted. When
the loop finally exits (NO from the step 454 "Is Value Pair Available?"),
execution continues to
the step 472 "Set J := Best Image", in which the output image "J" is set
to equal the Best Image found in the execution of the loop.
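The control flow of the steps 452 to 472 may be summarized by the following sketch. It is not the described implementation: the callables `transcode` and `assess`, the dictionary form of the sub-array M(I), and all names are assumptions introduced for illustration. `transcode` stands for the step 458 and is assumed to return the encoded output image as bytes; `assess` stands for the Quality Assessment Step 474 and is assumed to return the quality metric QM.

```python
def basic_selection_loop(image, size_in, pairs, m_sub, s_max, z_max, z_v,
                         transcode, assess):
    """Return (best_image, best_q, best_pair) over all feasible value pairs."""
    best_q, best_image, best_pair = 0.0, None, None
    for z, qf_out in pairs:                      # steps 452 and 454
        # Step 456: reject invalid scaling factors and oversized predictions.
        if z > z_max or m_sub[(z, qf_out)] > s_max:
            continue
        out = transcode(image, z, qf_out)        # step 458
        # Step 460: the actual relative size may exceed the prediction.
        if len(out) / size_in > s_max:
            continue
        qm = assess(image, out, z_v, z)          # steps 462 to 466
        if qm > best_q:                          # step 468
            best_q, best_image, best_pair = qm, out, (z, qf_out)  # step 470
    return best_image, best_q, best_pair         # step 472
```

If no pair is feasible, the sketch returns None as the best image, which corresponds to the NIL result checked after the loop.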
This completes the description of the expanded step 412 "Run Quality-aware
Parameter
Selection and Transcoding Loop" after which execution continues with the step
414
"Validate Result" (Fig. 4).
In the step 414 "Validate Result" a simple check confirms that a valid Best
Image was
actually found and assigned to the output Image "J" (i.e. that "J" is not NIL).
It is possible
that during the execution of the step 412 "Run Quality-aware Parameter
Selection and
Transcoding Loop" no feasible transcoding parameters were found, and Best
Image remains
NIL and thus the output image "J" is set to NIL. This would be an abnormal or
fault
condition, and the process would return an exception error to the adaptation
engine 108.
With the final step 416 "Return Image J", the basic method 400 for quality-
aware selection
of parameters in JPEG image transcoding ends by returning the transcoded
output image "J"
to the system.
The Basic System 200 with the basic method 400 for quality-aware selection of
parameters
could thus be employed to provide a quality-aware transcoder, albeit at a high
processing
cost because many transcoding and scaling operations may need to be performed
to find the
best Output Image "J" for a given input image "I" and a set of terminal
constraints.
More efficient systems may be constructed by augmenting or replacing the
Quality-aware
Parameter Selection and Transcoding Loop with a look up table that contains
predicted
quality metric information, the table index being derived from the input image
constraints,
device constraints, and viewing conditions. The input image constraints
include the height,
width, and original quality factor of the input image; the device constraints
include the
dimensions and the maximum file size of the output image; and the viewing
conditions are
represented by the desired scaling factor for which the quality is intended to
be optimal.
Such a look up table may be generated off-line with a prediction table
generation system
such as is described in the following (Fig. 6) and a corresponding quality
prediction table
generation method (Fig. 14).
Figure 6 shows a quality prediction table generation system 500, comprising a
computer,
having a processor and computer readable storage medium having computer
executable
instructions stored thereon, which when executed by the processor, provide the
following
modules: a database containing a Training Set of Input Images 502; a
Computation of
Quality Prediction Table module 504; storage for a quality prediction Table
"N" 506, and a
Table Update module 508. The quality prediction table generation system 500
further
includes the following modules that are the same as the modules numbered with
the same
reference numerals in the Basic System 200: the Image Feature Extraction
module 202; the
Transcoding module 208; and the Quality Assessment module 210.
The Training Set of Input Images 502 contains a large number of JPEG images,
for example
the image Training Set of 70,300 files described in the "Kingston" paper by
Steven Pigeon
et al, mentioned above. Its output is a sequence of input Images "I" which are
individually
input to the Image Feature Extraction module 202, the Transcoding module 208,
and the
Quality Assessment module 210, as in the Basic System 200.
The purpose of the quality prediction table generation system 500 is to
generate the quality
prediction Table "N" 506 by transcoding each of the images contained in the
Training Set of
Input Images 502 for a range of the transcoder scaling factor zT
representative of viewing
conditions (viewing scaling factor zV), and a range of the transcoder Quality
Factor QF_out.
The quality prediction Table "N" 506 is a multi-dimensional table, e.g., a
four-dimensional
table, which contains a Quality Metric Q indexed by four index variables: an
encoding
quality factor QF_in of an input image from the Training Set of Input Images
502, a viewing
scaling factor zV, an encoding quality factor QF_out to be used in compressing
the output
image in the transcoder, and a transcoder scaling factor zT. These index
variables are
generated in the following manner.
The encoding quality factor QF_in of the input image is inherent in the input
image from the
Training Set of Input Images 502, and may be extracted from each image as
QF(I) in the
Image Feature Extraction module 202 and quantized, as described above. It may also be
more
convenient to partition the image training set into groups of images clustered
around a given
quantized encoding quality factor QF_in, for example 80%.
The viewing conditions include at least three distinct viewing cases, defined
by different
values of the viewing scaling factor zV as described above. In generating the
Table "N"
506, it is convenient to generate a range of values for zV, for example in
quantized steps of
10%.
The quality prediction table generation system 500 is thus similar to the
Basic System 200
but generates the transcoder Quality Factor QF_out and the transcoder scaling
factor zT
directly instead of calculating them to meet device constraints as in the
Basic System 200.
The Training Set of Input Images 502 sends each of its images as input image
"I" to: the
Image Feature Extraction module 202; the Transcoding module 208; and the
Quality
Assessment module 210. The Image Feature Extraction module 202 sends the set
of input
image parameters "IIP" to the Computation of Quality Prediction Table module
504; the
Quality Assessment module 210 sends its computed quality measure QM to the
Computation of Quality Prediction Table module 504; and the Computation of
Quality
Prediction Table module 504 controls the Transcoding module 208 with the
transcoding
parameter pair (zT,QFT). The Transcoding module 208 generates the output image
"J" and
sends it to the Quality Assessment module 210.
The Table "N" 506 is initially empty. For each of the input images of the
Training Set of
Input Images 502, and for each of a range of viewing conditions (represented
by the viewing
scaling factor zV) and each of a range of transcoder scaling factors zT, and
for each of a
range of encoding quality factors QF_out, the quality prediction table
generation system 500
generates a best transcoded image (the output Image "J") with the best quality
metric Q.
Each computed best quality metric Q ("Best Q"), along with the four index
values (QF_in,
zV, QF_out, and zT) of each computation are sent to update the Table "N" 506
via the Table
Update module 508.
Because many images will generate values of the best quality metric Q for the
same index
that differ slightly from one another, the raw data generated by the quality
prediction table
generation system 500 may advantageously be collected and processed in the
Table Update
module 508 in a manner similar to that described in the "Kingston" paper by
Steven Pigeon
et al, mentioned above. In this way, by grouping and quantizing the data,
optimal LMS
(least mean squares) estimators of the quality metrics for combinations of the
four index
values, may be computed and stored in the quality prediction Table "N" 506.
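One simple way to picture the table update is the sketch below, which groups the raw quality metrics by their four quantized index values and stores the arithmetic mean of each group; the mean is the constant estimator that minimizes the mean squared error within a group. The class and method names are hypothetical, and the actual processing in the Table Update module 508 may differ.

```python
from collections import defaultdict

class QualityTable:
    """Accumulates quality metrics per index (QF_in, zV, QF_out, zT)."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def update(self, qf_in, z_v, qf_out, z_t, qm):
        """Add one measured quality metric to the group of its index."""
        key = (qf_in, z_v, qf_out, z_t)
        self._sum[key] += qm
        self._count[key] += 1

    def predict(self, qf_in, z_v, qf_out, z_t):
        """Return the mean quality metric recorded for the index."""
        key = (qf_in, z_v, qf_out, z_t)
        if key not in self._count:
            raise KeyError(key)
        return self._sum[key] / self._count[key]
```

In this sketch the index values are expected to be quantized already, for example to integer percentages, so that measurements from different images fall into the same group.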
Tables 2, 3, and 4 below show two-dimensional sub-tables of an instance of the
quality
prediction Table "N" 506, as examples that have been computed with the quality
prediction
table generation system 500 according to the embodiment of the invention.
                              scaling
QF_out   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
    10  17.3  19.3  20.6  21.6  22.1  23.0  23.5  24.0  24.4  26.2
    20  17.8  20.1  21.6  22.7  23.2  24.4  25.1  25.8  26.4  28.7
    30  18.0  20.4  22.0  23.2  23.7  25.1  25.9  26.7  27.4  30.2
    40  18.1  20.6  22.2  23.5  23.9  25.5  26.3  27.3  28.1  31.9
    50  18.2  20.7  22.4  23.7  24.1  25.7  26.7  27.7  28.6  32.5
    60  18.4  20.8  22.6  23.9  24.2  26.0  27.0  28.1  29.1  33.0
    70  18.4  21.0  22.7  24.1  24.4  26.3  27.3  28.6  29.7  37.3
    80  18.4  21.1  22.9  24.4  24.6  26.6  27.8  29.3  30.6  54.9
    90  18.6  21.3  23.2  24.7  24.9  27.1  28.3  30.1  31.6  48.0
   100  18.7  21.5  23.4  25.0  25.1  27.5  28.8  30.7  32.2  51.4
Table 2
                              scaling
QF_out   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
    10  22.5  23.7  24.4  24.9  25.3  25.7  26.0  26.3  26.6  26.2
    20  24.5  25.8  26.6  27.1  27.6  28.0  28.5  28.8  29.2  28.7
    30  25.6  27.0  27.8  28.4  28.9  29.4  29.9  30.3  30.7  30.2
    40  26.4  27.8  28.6  29.3  29.8  30.4  30.9  31.4  31.7  31.9
    50  27.1  28.5  29.3  30.0  30.6  31.1  31.7  32.2  32.6  32.5
    60  27.8  29.2  30.1  30.7  31.3  31.9  32.5  33.0  33.4  33.0
    70  28.8  30.1  31.0  31.8  32.4  33.0  33.6  34.1  34.6  37.3
    80  30.2  31.6  32.5  33.3  33.9  34.6  35.2  35.8  36.4  54.9
    90  32.9  34.2  35.2  36.1  36.8  37.6  38.2  39.0  39.5  48.0
   100  39.4  41.0  42.5  44.0  45.5  46.3  47.2  48.0  48.6  51.4
Table 3
                              scaling
QF_out   10%   20%   30%   40%   50%   60%   70%   80%   90%  100%
    10  18.2  20.3  21.7  22.8  23.8  24.6  25.3  25.8  26.6  27.6
    20  18.9  21.3  22.9  24.2  25.3  26.3  27.1  27.7  29.2  30.5
    30  19.2  21.7  23.4  24.8  26.1  27.1  28.1  28.7  30.7  32.1
    40  19.5  22.0  23.8  25.2  26.5  27.6  28.6  29.3  31.7  33.8
    50  19.6  22.2  24.0  25.5  26.9  28.0  29.0  29.7  32.6  34.5
    60  19.8  22.4  24.2  25.8  27.2  28.4  29.4  30.1  33.4  35.0
    70  19.9  22.6  24.5  26.1  27.6  28.8  29.9  30.5  34.6  39.1
    80  20.1  22.9  24.8  26.5  28.0  29.3  30.4  31.1  36.4  55.9
    90  20.4  23.2  25.2  27.0  28.6  29.9  31.1  31.7  39.5  49.0
   100  20.5  23.5  25.6  27.4  29.2  30.6  31.8  32.4  48.6  52.3
Table 4
The Tables 2 and 3 show the distribution of the average PSNR values for QF_in
= 80,
computed for the viewing cases 1 and 2 respectively over the large Training
Set of input
images 502 mentioned before. The Table 4 shows the average PSNR values for the
viewing
case 3, where the viewing conditions correspond to a maximum zoom of 90% of
the size of
the original picture.
The Tables 2, 3, and 4 can be used as the quality estimator in the improved
transcoding
systems described in the following.
In the viewing case 1 (Table 2), the scaled-up transcoded output image is
compared to the
original input image. Both the transcoder scaling factor zT and encoding
quality factor
QF_out affect the measured quality. However, differences between the original
and the
transcoded image due to blocking artifacts from a low encoding quality factor
would be
considered equivalent to the effects of scaling, if the PSNRs were equal. This
seems
paradoxical, since blocking artifacts are visually more annoying than the
smoother low-
resolution images. Therefore, the measure favors high-resolution, low-QF
images over low-
resolution high-QF images. The fact that the comparison does not account for
the loss of
perceived quality introduced by presenting a lower resolution image to the
user somewhat
compensates for this bias.
In the viewing case 2 (Table 3), the images are compared at the transcoded
image resolution.
The quality estimator is less affected by scaling than by the encoding quality
factor, because
both images are scaled down to the same resolution before the comparison, and
scaling
smooths defects. Moreover, because file size varies more with scaling than
with changes in
the encoding quality factor QF_out, smaller images with higher QF_out are
favored over
larger images with lower QF_out. This is reasonable if the transcoded image is
to be viewed
only at low resolution, otherwise the loss for the viewer is too great. The
viewing case 3
(Table 4) is tailored to the user's viewing conditions, and thus would
constitute a more
accurate estimation of quality.
The quality prediction Table "N" 506 may be used advantageously in a
quality-aware transcoding system that is simpler and more efficient than the Basic
System 200.
Figure 7 shows a simple quality-aware image transcoding system (Simple System)
600
comprising a computer, having a processor and a computer readable storage
medium having
computer executable instructions stored thereon, which when executed by the
processor,
provide the modules, which are similar to the Basic System 200, but in which
the
computationally expensive iterations to calculate the quality metric are
replaced with a
simple table look-up in the quality prediction Table "N" 506, which is stored
in the
computer readable storage medium.
The Simple System 600 comprises all the same modules of the Basic System 200
except the
Basic Quality Determination Block 209 which includes the Quality Assessment
module 210.
These modules (202 to 208) remain unchanged bearing the same reference
numerals as in
Fig. 2, and having the same functions. In addition, the Simple System 600
comprises a
Simple Quality Determination Block 602 which includes the Table N 506 from
Fig. 6.
The computed quality measure QM is not generated by a Quality Assessment
module in the
Simple System 600 but is obtained directly from the quality prediction Table
"N" 506. The
quality prediction Table "N" 506 is the same table whose construction and
generation was
described in Fig. 6, and of which partial examples were described above in the
Tables 2, 3,
and 4. The quality prediction Table "N" 506 is addressed by four parameters:
the input
Quality Factor QF_in is obtained from the Image Feature Extraction module 202;
the
viewing scaling factor zV which may be set to 1 (viewing case 1) or another
value as
appropriate for the viewing condition; the transcoder quality factor QFT; and
the transcoder
scaling factor zT. QFT and zT are chosen by the Quality-aware Parameter
Selection module
206 in a loop that seeks to maximize QM. This is described in more detail in
the method
description next.
Figure 8 is a flow chart of a predictive method 700 for quality-aware
selection of
parameters in JPEG image transcoding which is applicable to the Simple System
600. The
predictive method 700 includes many of the same sequential steps of the Basic
method 400
of Fig. 4 bearing the same reference numerals:
step 402 "Get Device Constraints";
step 404 "Get Input Image I";
step 406 "Extract Image Features";
step 408 "Predict Quality and File Size";
step 410 "Initialize Parameters";
step 414 "Validate Result"; and
step 416 "Return Image J".
In place of the step 412 "Run Quality-aware Parameter Selection and
Transcoding Loop" of
Fig. 4, the predictive method 700 includes a new step (inserted after the step
410 "Initialize Parameters" and before the step 414 "Validate Result"), step
702 "Run
Predictive Q-aware Parameter Selection Loop".
Figure 9 is a flow chart showing an expansion of the step 702 "Run Predictive
Quality-
aware Parameter Selection Loop" of the Predictive Method 700, including some
of the same
steps of the expanded step 412 "Run Quality-aware Parameter Selection and
Transcoding
Loop" of Fig. 5 bearing the same reference numerals and having the same
functionality:
step 452 "Get Next Value Pair";
step 454* "Is Value Pair available?";
step 456* "Is Value Pair feasible?"; and
step 458 "Transcode I to J".
In addition, the expansion of the step 702 "Run Predictive Quality-aware
Parameter
Selection Loop" includes three new steps:
step 706 "Get predicted Quality Metric QM from Table N";
step 708 "Is QM > best Q?"; and
step 710 "Set: Best Q := QM, zT := z, QFT := QF_out".
* Note, the step sequence is modified from Fig. 5 to Fig. 9: The exit "NO" of
the Step 454
goes to the step 458 (which is followed by the function return in which the
transcoded
output image "J" is returned). The exit "YES" of the Step 456
goes to the step
706.
In the step 706 "Get predicted Quality Metric QM from Table N" a precomputed
quality
metric value QM is retrieved from the Table "N" by indexing into the Table "N"
with four
parameters: the input Quality Factor QF(I) that was obtained in the step 406
"Extract Image
Features" (Fig. 8); the viewing scaling factor zV that was chosen in the step
408 "Predict
Quality and File Size"; the encoding quality factor QF_out; and the transcoder
scaling factor
z.
The step 706 "Get predicted Quality Metric QM from Table N" is followed by the
step 708
"Is QM > best Q?".
In the step 708 "Is QM > best Q?" the quality metric QM obtained in the
previous step is
compared to the highest quality metric "Best Q" found so far. "Best Q" was
initialized to
zero in the prior step 410 "Initialize Parameters" (Fig. 8), and is updated
each time a higher
value is found, as indicated by the result of the comparison. If the result of
the comparison
is true (YES), execution continues with the next step 710 "Set: Best Q := QM,
zT := z, QFT :=
QF_out", otherwise execution loops back to the step 452 "Get Next Value Pair".
In the step 710 "Set: Best Q := QM, zT := z, QFT := QF_out", the highest
quality metric
"Best Q" is updated to the value of QM that was found in the step 706 "Get
predicted
Quality Metric QM from the Table N". Further, the value pair ("z", QF_out) is
recorded as
a best transcoding parameter pair (zT, QFT) for the present image.
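The predictive loop of the steps 452 to 456 and 706 to 710 can be sketched as follows. The dictionary representations of the sub-array M(I) and of the Table "N", and all function and parameter names, are assumptions made for this sketch only.

```python
def predictive_selection(pairs, m_sub, table_n, qf_in, z_v, s_max, z_max):
    """Return (best_pair, best_q) using predicted quality only.

    m_sub maps (z, qf_out) to the predicted relative file size.
    table_n maps (qf_in, z_v, qf_out, z) to the predicted quality metric.
    """
    best_q, best_pair = 0.0, None
    for z, qf_out in pairs:                          # steps 452 and 454
        # Step 456: feasibility with respect to z_max and s_max.
        if z > z_max or m_sub[(z, qf_out)] > s_max:
            continue
        qm = table_n[(qf_in, z_v, qf_out, z)]        # step 706
        if qm > best_q:                              # step 708
            best_q, best_pair = qm, (z, qf_out)      # step 710
    return best_pair, best_q
```

No transcoding takes place inside the loop; in the method described, the image is transcoded once, with the selected pair, after the loop exits.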
This completes the description of the expanded step 702 "Run Predictive
Quality-aware
Parameter Selection Loop" after which execution continues with the step 414
"Validate
Result" (Fig. 8).
With the final step 416 "Return Image J" (Fig. 8), the predictive method 700 for
quality-aware
selection of parameters in JPEG image transcoding ends by returning the
transcoded output
image "J" to the system, e.g. for storage as the output Image "J" 220.
The Simple System 600 with the predictive method 700 for quality-aware
selection of
parameters may thus be employed to provide a quality-aware transcoder, at a
much lower
processing cost than the Basic System 200 but without assurance that the
actual best
transcoding parameters have been found because of the imperfect nature of the
predicted
quality metric.
An improved quality-aware transcoding system may be constructed on the basis
of the Basic
System 200, enhanced with the Table "N". In this system, the search for the
optimal quality
may be considerably shortened with the use of the Table "N": instead of
running the full
loop contained in the step 412 "Run Quality-aware Parameter Selection and
Transcoding
Loop" (Fig. 4 and 5) for all possible valid combinations of zT and QFT, one
may avoid
expensive processing steps in many iterations of the loop, by first consulting
the Table "N".
In a simple variant of the step 412 "Run Quality-aware Parameter Selection and
Transcoding
Loop", one may skip the transcoding step 458 "Transcode I to J", the step 460 "Is
Actual Size
OK?", and the "Quality Assessment Step" 474 (Fig. 5) if the predicted quality
metric from
the Table "N" indicates that a higher quality than that already found is
unlikely to be obtained
by the full analysis implied in these steps.
Figure 10 shows a block diagram of an improved quality-aware transcoding
system
(Improved System) 800, comprising a computer, having a processor and a
computer
readable storage medium having computer executable instructions stored
thereon, which
when executed by the processor, provide respective modules of the Improved
System 800.
The Improved System 800 is derived from the Basic System 200, by the addition
of the
Table "N" 506 stored in the computer readable medium, and the replacement of
the Quality-
aware Parameter Selection module 206 with an Improved Quality-aware Parameter
Selection module 802. The means for storing the Table "N" 506 and the Quality
Assessment module 210 together form an improved Quality Determination Block
804.
The output of the Table "N" 506 provides a predicted Quality Metric Qx to the
Improved
Quality-aware Parameter Selection module 802. The quality prediction Table "N"
506 is
addressed by the same four index parameters as in the Simple System 600: the
input Quality
Factor QF_in; the viewing scaling factor zV; the transcoder quality factor
QFT; and the
transcoder scaling factor zT. QFT and zT are chosen in the Improved Quality-
aware
Parameter Selection module 802 as shown in the method description in Fig. 11
which
follows.
Briefly summarized, the functionality of the Improved Quality-aware Parameter
Selection
module 802 includes collecting a feasible set "F" 806 of value pairs of
(zT,QFT) which are
feasible, i.e. satisfy the input Image "I" and the device constraints. The set
of value pairs
may then be sorted according to the predicted Quality Metric Qx from the
quality prediction
Table "N" 506 indexed by the value pair. The actual Quality Metric QM is then
computed
with the help of the Quality Assessment Module 210 (as in the Basic System 200
of Fig. 2),
but only for a promising subset of a limited number of value pairs of (zT,QFT)
from the
feasible set "F" 806 that predict a high predicted Quality Metric Qx.
Figure 11 is a flow chart of an improved method 900 for quality-aware
selection of
parameters in JPEG image transcoding which is applicable to the Improved
System 800.
The improved method 900 includes many of the same sequential steps of the
Basic method
400 of Fig. 4 bearing the same reference numerals:
step 402 "Get Device Constraints";
step 404 "Get Input Image I";
step 406 "Extract Image Features";
step 408 "Predict Quality and File Size";
step 410 "Initialize Parameters";
step 414 "Validate Result"; and
step 416 "Return Image J".
In place of the step 412 "Run Quality-aware Parameter Selection and
Transcoding Loop" of
Fig. 4, the improved method 900 includes two new steps (inserted after the step
410 "Initialize Parameters" and before the step 414 "Validate Result"):
step 902 "Create Set F"; and
step 904 "Run Improved Q-aware Parameter Selection and Transcoding".
Figure 12 is a flow chart showing an expansion of the step 902 "Create Set F"
of the
improved method 900, including three of the same steps of the expanded step
412 "Run
Quality-aware Parameter Selection and Transcoding Loop" of Fig. 5 bearing the
same
reference numerals and having the same functionality:
step 452 "Get Next Value Pair";
step 454* "Is Value Pair available?"; and
step 456* "Is Value Pair feasible?".
The expanded step 902 "Create Set F" further includes new steps:
step 906 "Create Empty Feasible Set F";
step 908 "Add value pair to Feasible Set F";
step 910 "Sort F"; and
step 912 "Truncate F".
* Note, the step sequence is modified from Fig. 5 to Fig. 12: The exit "NO" of
the Step 454
goes to the function return (in which the Feasible Set "F" is returned), and
the exit "YES" of
the Step 456 goes to the step 908.
The steps 452, 454, 456, and 908 form a loop, preceded by the initializing
step 906.
In the step 906 "Create Empty Feasible Set F" the Feasible Set "F" 806 is created
empty. The
following steps (452 to 456, 908) form a loop in which a number of distinct
value pairs are
generated (step 452), checked for availability (step 454) and feasibility
(step 456), and
added into the Feasible Set "F" 806 (step 908). If a generated value pair is
not feasible (exit
"NO" from the step 456), the loop is re-entered from the top. If no more
distinct value pairs
are available (exit "NO" from the step 454), the loop is exited and the
Feasible Set "F" 806
is sorted in the step 910 "Sort F" according to the predicted Quality Metric
Qx from the
quality prediction Table "N" 506 indexed by the distinct value pair. The
Feasible Set "F"
806 now contains all feasible value pairs in descending order according to the
predicted
quality.
In the next step, the step 912 "Truncate F", the Feasible Set "F" 806 is
truncated at the
bottom by removing value pairs which are associated with lower predicted
quality, until
only a definable number C_max of value pairs remains in the Feasible
Set "F" 806.
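The steps 906 to 912 described above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiments: the function name, the representation of a value pair as a tuple of integers, the "is_feasible" predicate standing in for the step 456, and the dictionary standing in for the quality prediction Table "N" 506 are all assumptions made for the example, and the numeric values are placeholders.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed representation: (scaling factor z in percent, quality factor QF).
ValuePair = Tuple[int, int]


def create_feasible_set(
    candidates: Iterable[ValuePair],
    is_feasible: Callable[[ValuePair], bool],
    predicted_quality: Dict[ValuePair, float],
    c_max: int,
) -> List[ValuePair]:
    """Collect feasible pairs, sort by predicted quality, keep the top c_max."""
    feasible: List[ValuePair] = []          # step 906: create empty set F
    for pair in candidates:                 # steps 452/454: next pair while available
        if is_feasible(pair):               # step 456: hypothetical feasibility test
            feasible.append(pair)           # step 908: add value pair to F
    # step 910: descending order of predicted quality
    feasible.sort(key=lambda p: predicted_quality[p], reverse=True)
    return feasible[:c_max]                 # step 912: truncate F at the bottom


# Placeholder predicted quality values (dB), for illustration only.
example_table = {(50, 80): 34.1, (70, 60): 35.7, (100, 30): 31.2, (100, 90): 40.0}
example_set = create_feasible_set(
    example_table.keys(),
    lambda p: p != (100, 90),   # pretend this pair would violate a constraint
    example_table,
    c_max=2,
)
```

In this example the pair (100, 90) has the highest predicted quality but is rejected as not feasible, so it never enters the set.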
Figure 13 is a flow chart showing an expansion of the step 904 "Run Improved Q-
aware
Parameter Selection and Transcoding" of the improved method 900, including
some of the
same steps of the expanded step 412 "Run Quality-aware Parameter Selection and
Transcoding Loop" of Fig. 5 bearing the same reference numerals and having the
same
functionality:
step 458 "Transcode I to J";
step 460 "Is Actual Size OK?";
step 462 "Decompress J and scale with zR to X";
step 464 "Decompress I and scale with zV to Y";
step 466 "Compute Metric QM = PSNR(X,Y)";
step 468 "Is QM > Best Q?";
step 470 "Set Best Q := QM, Best Image := J"; and
step 472 "Set J := Best Image".
The expanded step 904 "Run Improved Q-aware Parameter Selection and
Transcoding"
further includes new steps:
step 914 "Is F Empty?";
step 916 "Get top value pair from F"; and
step 918 "Remove top value pair from F".
The expanded step 904 "Run Improved Q-aware Parameter Selection and
Transcoding"
forms a loop analogous to the loop of the Basic System 200, for finding the
best Image, that
is the image with the best quality assessed through the Quality Assessment
step 474 (the
sequence of the steps 462 to 466). Instead of running the loop for all
feasible value pairs (as
in the Basic Method 400), the loop of the Improved Method 900 is confined to
the value
pairs in the Feasible Set "F" 806. It will be appreciated that the steps 910
"Sort F" and 912
"Truncate F" provide the mechanism by which the number of value pairs to be
transcoded
and quality assessed can be limited to those pairs which have a predicted
quality measure
that is high.
The loop is entered at the step 914 "Is F Empty?".
In the step 914 "Is F Empty?" the Feasible Set "F" 806 is inspected. If it is
empty (exit
"YES" from the step 914) the loop is exited, execution jumps to the step 472
"Set J := Best
Image", and the expanded step 904 "Run Improved Q-aware Parameter Selection
and
Transcoding" is exited (return "J").
In the step 916 "Get top value pair from F" the value pair corresponding to
the highest
predicted quality metric (the "top value pair") is copied from the Feasible
Set "F" 806 to the
transcoder value pair (zT,QFT).
In the step 918 "Remove top value pair from F", the "top value pair" is
removed from the
Feasible Set "F" 806, and execution goes to the next step 458 "Transcode I to J".
Similar to the Basic Method 400, the subsequent steps assess the quality
metric, save the
best quality metric and the best image, and jump back to the start of the loop
(at the step
914).
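The loop of Fig. 13 can be sketched as follows. This is an illustrative sketch under stated assumptions: the callables "transcode", "size_ok" and "assess_quality" are hypothetical stand-ins for the steps 458, 460 and 474, and it is assumed that a transcoded image which fails the size check is simply skipped and the loop re-entered.

```python
from typing import Callable, List, Optional, Tuple

ValuePair = Tuple[int, int]  # assumed: (zT in percent, QFT)


def run_improved_selection(
    feasible: List[ValuePair],
    transcode: Callable[[ValuePair], bytes],
    size_ok: Callable[[bytes], bool],
    assess_quality: Callable[[bytes, ValuePair], float],
) -> Optional[bytes]:
    """Evaluate the value pairs of F in order and return the best image found."""
    best_q = float("-inf")
    best_image: Optional[bytes] = None
    remaining = list(feasible)              # work on a copy of the sorted set F
    while remaining:                        # step 914: is F empty?
        pair = remaining[0]                 # step 916: get top value pair
        del remaining[0]                    # step 918: remove top value pair
        candidate = transcode(pair)         # step 458: transcode I to J
        if not size_ok(candidate):          # step 460: assumed to skip this pair
            continue
        qm = assess_quality(candidate, pair)    # step 474: quality assessment
        if qm > best_q:                     # step 468: is QM > Best Q?
            best_q, best_image = qm, candidate  # step 470: remember the best
    return best_image                       # step 472: J := Best Image
```

The function returns None when no value pair yields an acceptable image, which corresponds to an empty or fully rejected Feasible Set "F".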
The effect of sorting and truncating the Feasible Set "F" 806 in Fig. 12 can
be seen as
follows. If the Feasible Set "F" 806 is not truncated, only sorted, all value
pairs will be
evaluated (transcoded and the quality assessed), merely in the order of
predicted quality.
This would result in the same best image being found as with the Basic Method
400, without any
gain in processing cost.
Truncating the Feasible Set "F" 806 leaves C_max value pairs in the
set.
Because the set is sorted first, these C_max value pairs will be the value
pairs that are
predicted to yield the most promising quality metrics. Thus, compared with the
basic
method, fewer value pairs will be fully evaluated, saving the processing that
would have
been (in the Basic System 200) expended to evaluate value pairs that yield a
lower quality.
If C_max is set to one (1), only one value pair will be fully evaluated, but
regardless of
actual quality assessed, the resulting best image would be the same as that
found with the
Predictive Method 700 of the Simple System 600.
Thus, C_max should be set to a value higher than one, because the highest
predicted quality
is not necessarily the actual highest quality. Setting C_max to a value of
five (5) has been
found to give good results, and is very likely to include the actual best
value pair.
Alternatively, a quality threshold could be set: when the predicted quality
metric of a value pair is smaller by a given margin (e.g. 3 dB) than the best
predicted quality metric obtained so far, the evaluation may stop.
In a further modification, sorting of the set "F" may be done as follows:
1) For each feasible scaling value "z", find the value pair in the feasible
set "F" with the best predicted quality value (i.e. find the best value pair
for z=10%, then for 20%, etc.). Suppose there are P such value pairs;
2) Sort the P value pairs obtained in step 1 from the highest to the lowest
predicted quality value. These will be inserted at the beginning of the
feasible set "F";
3) Then sort the remaining value pairs from highest to lowest predicted
quality value. These will be inserted in the feasible set "F" after the
previous P value pairs.
Proceed as before with a C_max >= P.
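Both variations described above can be sketched as follows. The sketch is illustrative only: the function names and the tuple and dictionary representations are assumptions, and the margin-based truncation assumes a list already sorted in plain descending order of predicted quality, so that the first entry carries the best predicted quality metric.

```python
from itertools import takewhile
from typing import Dict, List, Tuple

ValuePair = Tuple[int, int]  # assumed: (z in percent, QF)


def sort_best_per_scaling_first(
    feasible: List[ValuePair], predicted: Dict[ValuePair, float]
) -> List[ValuePair]:
    """Put the best pair of each scaling value first, then all remaining pairs."""
    best_per_z: Dict[int, ValuePair] = {}
    for pair in feasible:                               # 1) best pair per "z"
        current = best_per_z.get(pair[0])
        if current is None or predicted[pair] > predicted[current]:
            best_per_z[pair[0]] = pair
    by_quality = lambda p: predicted[p]
    leaders = sorted(best_per_z.values(), key=by_quality, reverse=True)  # 2)
    leader_set = set(leaders)
    rest = sorted(                                      # 3) remaining pairs
        (p for p in feasible if p not in leader_set), key=by_quality, reverse=True
    )
    return leaders + rest


def truncate_by_margin(
    sorted_pairs: List[ValuePair],
    predicted: Dict[ValuePair, float],
    margin_db: float = 3.0,
) -> List[ValuePair]:
    """Stop at the first pair predicted to be margin_db or more below the best."""
    if not sorted_pairs:
        return []
    best = predicted[sorted_pairs[0]]
    return list(takewhile(lambda p: best - predicted[p] < margin_db, sorted_pairs))
```

With the per-scaling ordering, the first P entries cover every feasible scaling value once, which is why C_max is chosen to be at least P.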
The charts shown in Figures 14A and 14B show graphical representations of
quality metric
values (PSNR) recorded in the feasible set "F", after sorting. Figure 14A
shows an example
of sorted PSNR values for zV=0.7, while Figure 14B shows an example of sorted
PSNR
values with zV=0.7 with s_max = 0.7, with the same image as Figure 14A.
Figure 15 is a flow chart of a quality prediction table generation method
1000, illustrating
the functionality of the quality prediction table generation system 500 (Fig.
6). The quality
prediction table generation method 1000 includes some of the same steps of the
Basic
Method 400 of Figs. 4 and 5 bearing the same reference numerals and having the
same
functionality, namely the steps 406, 458, and 474. The quality prediction
table generation
method 1000 includes the following steps:
step 1002 "Initialize N(QF_in,zV)";
step 1004 "Are more images with QF(I)=QF_in available?";
step 1006 "Get Next Image "I"";
step 406 "Extract Image Features";
step 1008 "Set up parameters for loop over value pairs (z,QF_out)";
step 1010 "Get first value pair (z,QF_out)";
step 458 "Transcode I to J";
step 474 "Quality Assessment Step";
step 1012 "Update N(QF_in,zV)";
step 1014 "Are more value pairs (z,QF_out) available?"; and
step 1016 "Get next value pair (z,QF_out)".
As described earlier, the quality prediction Table "N" 506 (Fig. 6) is a four-dimensional
table and contains a Quality Metric Q indexed by four index variables: the encoding quality
factor QF_in of an input image from the Training Set of Input Images 502, the viewing
scaling factor zV, the encoding quality factor QF_out to be used in compressing the output
image in the transcoder, and the scaling factor "z" to be used in compressing the output
image in the transcoder. Shown in Fig. 15 is the quality prediction table generation method
1000, limited to generating a sub-table of the quality prediction table "N", namely
N(QF_in,zV), that is the sub-table for one value of the input encoding quality factor QF_in
and one value of the viewing scaling factor zV. The entire quality prediction table "N" for
additional values of QF_in and zV may be generated by repeating the steps of the quality
prediction table generation method 1000 for these additional values of QF_in and zV.
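One possible in-memory layout for such a table is sketched below. It is an assumption made for illustration, not the layout of the embodiments: the table is held as a dictionary of sub-tables keyed by (QF_in, zV), each sub-table being keyed by the value pair (z, QF_out). The numeric entries are placeholders, not values from the Tables of this description.

```python
from typing import Dict, Tuple

SubTable = Dict[Tuple[int, int], float]          # (z, QF_out) -> predicted quality
QualityTable = Dict[Tuple[int, int], SubTable]   # (QF_in, zV) -> sub-table N(QF_in,zV)


def lookup_predicted_quality(
    table: QualityTable, qf_in: int, zv: int, z: int, qf_out: int
) -> float:
    """Return the predicted quality stored for the four index variables."""
    return table[(qf_in, zv)][(z, qf_out)]


# Placeholder content: one sub-table for QF_in = 80 and zV = 70%.
example_n: QualityTable = {(80, 70): {(70, 60): 35.7, (50, 80): 34.1}}
example_q = lookup_predicted_quality(example_n, qf_in=80, zv=70, z=70, qf_out=60)
```

Keeping each sub-table as a separate object mirrors the way the method 1000 generates one sub-table N(QF_in,zV) per run.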
In the step 1002 "Initialize N(QF_in,zV)", the sub-table N(QF_in,zV) is
cleared to zero.
In the step 1004 "Are more images with QF(I)=QF_in available?" it is
determined if any
more images having an input encoding quality factor QF(I) = QF_in are
available in the
Image Training Set 502 (Fig. 6). If no more such images are available (i.e.
all such images
have already been processed), the result of the determination is "NO", and the
quality
prediction table generation method 1000 exits with the populated sub-table
N(QF_in,zV),
otherwise execution continues with the step 1006 "Get Next Image "I"".

In the step 1006 "Get Next Image "I"", the next image is obtained from the
Image Training
Set 502, to become the input Image "I".
In the step 406 "Extract Image Features" features of the input Image "I" such
as width and
height are determined, as described earlier (Fig. 4).
In the step 1008 "Set up parameters for loop over value pairs (z,QF_out)" a
per-image loop
1018 over value pairs (z,QF_out) is prepared, that is the per-image loop 1018
comprising
the steps 1010, 458, 474, 1012, 1014, and 1016. The per-image loop 1018 is run
for each
combination of the scaling factor "z" from the set {K, 2*K, 3*K, ..., 100%} and
the output
Quality Factor QF_out from the set {L, 2*L, 3*L, ..., 100}, where the increments
"K" and
"L" may be selected as K=10% and L=10, for example. The Tables 2 to 4 above
were
calculated with these values. The combination of "z" and QF_out is referred to
as a value
pair (z,QF_out).
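The enumeration of value pairs over which the per-image loop 1018 runs can be sketched as follows, using the example increments K=10% and L=10. The function name and the representation of percentages as integers are assumptions made for the example.

```python
from itertools import product
from typing import List, Tuple


def enumerate_value_pairs(k_percent: int = 10, l_step: int = 10) -> List[Tuple[int, int]]:
    """List every (z, QF_out) combination, with z in percent."""
    z_values = range(k_percent, 101, k_percent)   # K, 2*K, ..., 100%
    qf_values = range(l_step, 101, l_step)        # L, 2*L, ..., 100
    return list(product(z_values, qf_values))


example_pairs = enumerate_value_pairs()
```

With the default increments this yields 100 value pairs, the first being (10, 10), which matches the example first value pair given for the step 1010.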
In the step 1010 "Get first value pair (z,QF_out)", the first value pair
(z,QF_out) is
determined, for example (z = 10%, QF_out = 10).
In the step 458 "Transcode I to J", the input Image "I" is transcoded into the
output Image
"J" with the transcoding parameters zT = "z", and QFT = QF_out, as described
earlier (Fig.
5).
In the step 474 "Quality Assessment Step" the quality Metric QM of the
transcoding is
determined as described earlier (Fig. 5).
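The step 466 of the Quality Assessment step 474 names PSNR(X,Y) as the quality metric. A minimal sketch of the standard definition, PSNR = 10*log10(peak^2 / MSE), is given below. It assumes that the images X and Y are available as equally sized flat sequences of 8-bit samples; the decompression and scaling of the steps 462 and 464 are not shown, and the function name is hypothetical.

```python
import math
from typing import Sequence


def psnr(x: Sequence[int], y: Sequence[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
    if mse == 0:
        return math.inf          # identical images: no noise at all
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value indicates that the transcoded image is closer to the reference, which is why the loop keeps the image with the largest metric.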
In the step 1012 "Update N(QF_in,zV)" the sub-table N(QF_in,zV) is updated
with the
quality metric, at the table location indexed by the value pair (z,QF_out),
more precisely the
predicted quality metric at that table location is updated with the simple
average of the
quality metric values from all images at the same table location.
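The simple average of the step 1012 can be maintained incrementally, without storing the individual quality metric values. The sketch below shows one way of doing so for a single table location; the class name and its attributes are assumptions made for the example.

```python
class RunningMean:
    """Incrementally maintained simple average for one table location."""

    def __init__(self) -> None:
        self.count = 0      # number of images seen at this location
        self.mean = 0.0     # cleared to zero, as in the step 1002

    def update(self, quality_metric: float) -> float:
        """Fold one more quality metric value into the average and return it."""
        self.count += 1
        self.mean += (quality_metric - self.mean) / self.count
        return self.mean


cell = RunningMean()
for qm in (30.0, 32.0, 34.0):   # placeholder PSNR values from three images
    cell.update(qm)
```

After the three updates the stored value equals the arithmetic mean of the three inputs.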
In the step 1014 "Are more value pairs (z,QF_out) available?" it is determined
if any more
combinations of the scaling factor "z" and the output Quality Factor QF_out
are available.
If no more distinct value pairs (z,QF_out) are available (i.e. all
combinations have already
been processed), the result of the determination is "NO", and the per-image
loop 1018 exits
to the step 1004 "Are more images with QF(I)=QF_in available?" to find and
start
processing the next image from the Image Training Set 502, otherwise ("YES")
execution of
the per-image loop 1018 continues with the step 1016 "Get next value pair
(z,QF_out)".
In the step 1016 "Get next value pair (z,QF_out)" the next value pair
(z,QF_out) is
determined.
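Taken together, the steps 1002 to 1016 can be sketched as follows. This is an illustration under stated assumptions and not the implementation of the embodiments: "image_qf", "transcode" and "assess_quality" are hypothetical callables standing in for reading QF(I), the step 458 and the step 474, the step 406 "Extract Image Features" is omitted, and the average is formed from sums and counts at the end rather than updated in place.

```python
from collections import defaultdict
from itertools import product
from typing import Any, Callable, Dict, Iterable, Tuple


def build_sub_table(
    images: Iterable[Any],
    qf_in: int,
    zv: int,
    image_qf: Callable[[Any], int],
    transcode: Callable[[Any, int, int], Any],
    assess_quality: Callable[[Any, Any, int, int], float],
    k_percent: int = 10,
    l_step: int = 10,
) -> Dict[Tuple[int, int], float]:
    """Generate the sub-table N(QF_in, zV) as per-location averages."""
    sums: Dict[Tuple[int, int], float] = defaultdict(float)   # step 1002
    counts: Dict[Tuple[int, int], int] = defaultdict(int)
    for image in images:                                      # steps 1004/1006
        if image_qf(image) != qf_in:
            continue                                          # only QF(I) = QF_in
        grid = product(range(k_percent, 101, k_percent),      # step 1008
                       range(l_step, 101, l_step))
        for z, qf_out in grid:                                # steps 1010/1014/1016
            output = transcode(image, z, qf_out)              # step 458
            qm = assess_quality(image, output, z, zv)         # step 474
            sums[(z, qf_out)] += qm                           # step 1012
            counts[(z, qf_out)] += 1
    return {pair: sums[pair] / counts[pair] for pair in sums}
```

Repeating the call for further values of QF_in and zV would yield the remaining sub-tables of the quality prediction table "N".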
As indicated earlier, the Image Training Set 502 may include many images that
may
generate slightly different actual values of the best quality metric for the
same value pair
index. In the quality prediction table generation method 1000 described here,
the computed
quality metrics are used to update the quality prediction table "N" 506
directly in a manner
not further specified. Preferably, the raw data generated by the quality
prediction table
generation method 1000 are collected and processed in a manner similar to that
described in
the "Kingston" paper by Steven Pigeon et al, mentioned above. In this way, by
grouping
and quantizing the data, and further statistical processing, optimal LMS
(least mean squares)
estimators of the quality metrics may be computed and stored in the quality
prediction Table
"N" 506.
The systems and methods of the embodiments of the present invention provide
for
improvements in transcoding in a way that takes scaling, compressed file size
limitations, as
well as image quality into account. It is understood that while the
embodiments of the
invention are described with reference to JPEG encoded images, its principles
are also
applicable to the transcoding of digital images encoded with other formats,
for example GIF
(Graphics Interchange Format) and PNG (Portable Network Graphics) when they
are used in
a lossy compression mode. The systems of the embodiments of the invention can
include a
general purpose or specialized computer having a CPU and a computer readable
medium,
e.g., memory, or alternatively, the systems can be implemented in firmware, or
combination
of firmware and a specialized computer. In the embodiments of the invention,
the quality
prediction table is a four-dimensional table which is indexed by 4 parameters.
It is
understood that the quality prediction table can generally be a multi-dimensional table,
which is indexed by any required number of parameters, whose number is higher or lower
than four.
A computer readable medium, such as a DVD, CD-ROM, floppy disk, or memory, for
example non-volatile memory, storing computer readable instructions which, when
executed by a processor, perform the steps of the methods of the embodiments of
the invention, is also provided.
Although the embodiments of the invention have been described in detail, it will be apparent to one
skilled in the art that variations and modifications to the embodiments may be made within the scope
of the following claims.
Administrative Status

Title Date
Forecasted Issue Date 2015-06-30
(86) PCT Filing Date 2008-07-16
(87) PCT Publication Date 2009-05-07
(85) National Entry 2010-04-19
Examination Requested 2013-03-04
(45) Issued 2015-06-30
Deemed Expired 2020-08-31

Abandonment History

There is no abandonment history.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Registration of a document - section 124 $100.00 2010-04-19
Application Fee $400.00 2010-04-19
Maintenance Fee - Application - New Act 2 2010-07-16 $100.00 2010-04-19
Maintenance Fee - Application - New Act 3 2011-07-18 $100.00 2011-02-21
Maintenance Fee - Application - New Act 4 2012-07-16 $100.00 2012-01-13
Maintenance Fee - Application - New Act 5 2013-07-16 $200.00 2013-03-02
Request for Examination $200.00 2013-03-04
Maintenance Fee - Application - New Act 6 2014-07-16 $200.00 2014-02-03
Final Fee $300.00 2015-04-21
Maintenance Fee - Application - New Act 7 2015-07-16 $200.00 2015-06-11
Maintenance Fee - Patent - New Act 8 2016-07-18 $200.00 2016-02-24
Maintenance Fee - Patent - New Act 9 2017-07-17 $200.00 2017-01-09
Maintenance Fee - Patent - New Act 10 2018-07-16 $250.00 2018-06-06
Maintenance Fee - Patent - New Act 11 2019-07-16 $250.00 2019-05-02
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ECOLE DE TECHNOLOGIE SUPERIEURE
Past Owners on Record
COULOMBE, STEPHANE
FRANCHE, JEAN-FRANCOIS
PIGEON, STEVEN
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description Date (yyyy-mm-dd) Number of pages Size of Image (KB)
Abstract 2010-04-19 2 67
Claims 2010-04-19 5 228
Drawings 2010-04-19 15 651
Description 2010-04-19 43 2,093
Representative Drawing 2010-06-14 1 9
Cover Page 2010-06-14 1 42
Claims 2011-03-04 5 229
Description 2011-11-23 43 2,088
Description 2014-04-22 51 2,418
Claims 2014-04-22 17 659
Cover Page 2015-06-11 1 42
PCT 2010-04-19 3 99
Assignment 2010-04-19 6 225
Correspondence 2010-06-11 1 16
Fees 2011-02-21 1 202
Prosecution-Amendment 2011-03-04 6 261
Prosecution-Amendment 2011-11-23 6 297
Fees 2013-03-02 1 163
Prosecution-Amendment 2013-03-04 1 43
Prosecution-Amendment 2014-04-22 38 1,498
Fees 2014-02-03 1 33
Prosecution-Amendment 2014-06-16 3 122
Prosecution-Amendment 2014-12-16 11 349
Correspondence 2015-04-21 1 30