Note: Descriptions are shown in the official language in which they were submitted.
CA 02290445 1999-11-15
WO 98/52119 PCT/US97/09256
METHOD AND SYSTEM FOR IMAGE RETRIEVAL
Technical Field
This invention relates to computerized image
retrieval and, more specifically, to retrieval based on
image database querying.
Backaround of the Invention
with advances in computer hardware technology,
it has become possible to store, manipulate and transmit
large numbers of images. When represented in computer-
tractable form, the images can be included in an image
database.
Systems have been developed, typically in the
form of computer software, for image database management
and image retrieval. For example, as disclosed in
U.S. Patent 5,493,677, issued February 20, 1996 to Balogh
et al., images can be retrieved by searching text
associated with the images for a match with a query.
Systems have also been developed which use
image content descriptors, for querying by image content.
Such a system is disclosed in U.S. Patent 5,579,471,
issued November 26, 1996 to Barber et al. and in the
paper by W. Niblack et al., "The QBIC Project: Querying
Images by Content Using Color, Texture, and Shape", in
Storage and Retrieval for Image and Video Databases,
Wayne Niblack, Editor, Proc. SPIE 1908, pp. 173-187
( 1993 ) .
Summary of the Invention
We have recognized that, for greater accuracy
in retrieving images from an image database, querying
based on image content can be combined with querying
based on spatial location. Thus, in a system in which
each image is represented by a plurality of regions
CA 02290445 1999-11-15
WO 98/52119 2 PCTNS97/09256
having feature attributes and spatial location
attributes, queries can be directed to region feature
similarity and region spatial location similarity in
combination. If desired, the relative spatial
arrangement of regions can also be taken into account.
Brief Description of the Drawincc
Fig. 1 is an example of an image with regions
for inclusion in a database.
Fig. 2 is a tabular display of a representation
of the regions of Fig. 1.
Fig. 3 is an example of an image with regions
for database querying.
Fig. 4 is a tabular display of a representation
of the regions of Fig . 3 .
Fig. 5 is a flow diagram of database query
processing for discriminating based on region feature and
region absolute spatial location.
Fig. 6 is a flow diagram of database query
processing for discriminating based on the relative
location of regions.
Detailed Description of Preferred Embodiments
The following description is primarily in terms
of method steps for execution by a suitable processor
under program control. The program may originate as
software, or, for greater efficiency, it may be embodied
at least in part in dedicated firmware or hardware.
A prototype system embodying features as
described has been formulated in the JAVA language. The
system can operate on suitable hardware such as a SUN
Workstation, a Silicon Graphics Workstation, or a PC with
a Pentium processor, for example.
Conveniently, an image database to be queried
has tabular form, with each record or table entry
CA 02290445 1999-11-15
WO 98/52119 3 PCT/US97/09256
representing a region of an image. A record includes an
image identifier, a region identifier, a region attribute
and, for geometric characterization, the x- and y-
coordinates of the centroid of the region, the width and
height of the region, and the area of the region. The
table may be generated by manual keyboard entry based on
visual inspection of images. Alternatively, if a
suitable pattern recognition system is available, table
generation may be automated.
To illustrate database entries, the image 10
shown in Fig. 1 and having been given the identifier "T"
can be represented by the table entries shown in Fig. 2.
Included are:
region 100 (to, stretching across the bottom of
the image, below a broken line);
region 101 (tl, bounded by a rectangle drawn
with broken lines);
region 102 (t2, bounded by a rectangle drawn
with chain-dotted lines),
region 103 (t" bounded by a rectangle drawn
with broken lines); and
region 104 (t4, stretching across the top of
the image, above a broken line).
The x,y-coordinates, the width w, and the
height h of each region are given in percent of the
respective maximal values. The values x, y, w and h
define a "bounding rectangle" for each region, so that
the area of a region is less than or equal to w times h.
As illustrated, regions may overlap, and their union need
not cover the image.
The attribute f may simply represent color, for
example, with color being represented by known means,
e.g., by a color histogram or by color sets. Other
simple attributes which may be used include texture and
shape, and such attributes may be combined into more
CA 02290445 1999-11-15
WO 98/52119 4 PCT/US97/09256
complex attributes.
A search query is expressed correspondingly.
For example, for the search pattern shown in Fig. 3,
a query region table may be formed as shown in Fig. 4.
For a database and a query, e.g. with entries
as illustrated by Figs. 2 and 4, respectively, Fig. 5
illustrates query processing for finding database entries
based on the query. The general aim is to find images
that contain arrangements of regions similar to those in
the query.
According to Fig. 5, starting at "a", for each
region in the query, the database regions are searched
for a feature match (step 51) and a spatial match
(step 52). For spatial matching, this involves using a
suitable metric for comparing the spatial information
such as x, y, h, w and area of the query region with the
corresponding information for the database regions.
Suitable metrics include Euclidean distance and other
Minkowski distances, and quadratic metrics whose
definition involves a square matrix which expresses the
relative similarity between the components of a vector.
A metric can also include weights which may be different
for each of the geometric parameters.
Similar metrics can be used for feature
matching (f;). For example, if color histogram
information is included in terms of components "red",
"green" and "blue", a 3-component Euclidean metric can be
used. Analogously, this applies when such information is
included in terms of components "hue", "saturation" and
"intensity".
For efficiency, as shown in Fig. 5, thresholds
are applied to the computed feature and spatial
distances. Thus, if a distance exceeds the threshold,
the database region is not included for further
consideration. Instead of, or in addition to using
CA 02290445 1999-11-15
WO 98/52119 5 PCT/US97/09256
separate thresholds for spatial and feature similarity as
shown in Fig. 5, thresholding can be applied also to the
combined region distance or score, i.e. before saving a
region match in step 53. Distances may be combined by
simple addition, or by suitable weighting followed by
addition, for example.
If multiple processors are available, "k-loop"
feature similarity processing analogous to step 51 and
spatial similarity processing analogous to step 52 may be
carried out in parallel instead of pair-wise sequentially
as illustrated in Fig. 5. Parallel processing then
yields two sets of regions, namely (i) those which meet
feature similarity regardless of spatial similarity, and
(ii) those which meet spatial similarity regardless of
feature similarity. Thus, to obtain the desired set of
regions which meet both types of features, a "jain"
operation will be required. After joining, a final
thresholding operation can be performed. Advantageously,
multiple processors may also be used for parallel
processing within steps 52 and 53.
Image matches are obtained as a result of the
"join" operation in step 54, producing all those database
images which meet each one of the region requirements of
the query. A query may result in an image being saved in
step 54 more than once, namely for different combinations
of its regions which satisfy the query. Such
multiplicity may be helpful to a user of the system;
otherwise, duplicates can be deleted by a simple one-pass
search of the saved images.
If the relative spatial location or arrangement
of regions is not important to a user, the computation
may terminate at this point ((3), though preferably after
the saved images are sorted by score.
For discriminating further based on relative
spatial location of regions, a process can be used as
CA 02290445 1999-11-15
WO 98/52119 6 PCT/US97/09256
illustrated by Fig. 6, using so-called 2-D strings.
Generation of 2-D strings at this point, i.e. after
similarity processing, may be termed "query-time 2-D
string generation".
For a query image, a 2-D string includes the x-
coordinates of the centroids of the regions, arranged as
an increasing sequence, followed by the y-coordinates of
the centroids, also arranged as an increasing sequence.
For a database image, correspondingly, the coordinates of
those regions are used which were matched against the
query image regions.
The 2-D string of the query image is formed in
step 61, and, in step 62, this string is matched against
the 2-D strings from each of the saved images. In
step 63, only in case of a match, the database image is
saved, so that only those images are ultimately sorted
and produced in step 64 which have a 2-D string which
matches the 2-D string of the query image.
Instead of or in addition to 2-D strings
including x- and y-coordinates of centroids as described,
2-D strings can be produced after rotation of the
coordinate system, e.g. by 45°. Such 2-D strings are
defined analogously, using coordinates x' and y' of the
centroids in the rotated coordinate system.