Patent 2157250 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract is posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 2157250
(54) English Title: METHOD AND APPARATUS FOR PRODUCING A HYBRID DATA STRUCTURE FOR DISPLAYING A RASTER IMAGE
(54) French Title: METHODE ET DISPOSITIF DE PRODUCTION DE STRUCTURES DE DONNEES HYBRIDES POUR L'AFFICHAGE DE TRAMES
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • G09G 5/00 (2006.01)
  • G06F 3/153 (2006.01)
(72) Inventors :
  • NICHOLSON, DENNIS G. (Country Unknown)
  • KING, JAMES C. (Country Unknown)
  • EMMETT, DAVID M. (Country Unknown)
(73) Owners :
  • ADOBE SYSTEMS, INC.
(71) Applicants :
  • ADOBE SYSTEMS, INC. (United States of America)
(74) Agent: SMART & BIGGAR LP
(74) Associate agent:
(45) Issued:
(22) Filed Date: 1995-08-30
(41) Open to Public Inspection: 1996-03-01
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data:
Application No. Country/Territory Date
08/298,695 (United States of America) 1994-08-31
08/420,827 (United States of America) 1995-04-10

Abstracts

English Abstract


A system for producing a raster image derived from a data structure
including a data processing apparatus, a recognizer which performs recognition on
an input bitmap to the data processing apparatus to detect identifiable objects within
the input bitmap, a mechanism for producing a hybrid data structure including
coded data corresponding to the identifiable objects and to non-identifiable objects
and the input bitmap, and an output device capable of developing a visually
perceptible raster image derived from the input bitmap in the hybrid data structure.
The raster image is derived from the input bitmap and thus includes no
misrecognition errors. The invention includes a method for producing a hybrid
data structure for a bitmap of an image having the steps of inputting a bitmap into a
digital processing apparatus, partitioning the bitmap into a hierarchy of lexical units,
assigning labels to a label list for each lexical unit of a predetermined hierarchical
level, where labels in the label list have an associated confidence level, and storing
each lexical unit in a hybrid data structure as either an identifiable object or a non-
identifiable object. The entire input bitmap or portions thereof are also stored in the
hybrid data structure to be displayed.
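
As a rough illustration only: the hybrid structure the abstract describes pairs a
label list (coded guesses with confidence levels) with the stored input bitmap. A
minimal Python sketch, with all names (Label, LexicalUnit, HybridDataStructure)
invented for illustration rather than taken from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class Label:
    """One recognition hypothesis ("label") for a lexical unit."""
    text: str            # coded guess, e.g. "quick"
    confidence: float    # recognizer confidence, 0.0-1.0

@dataclass
class LexicalUnit:
    """A character- or word-level region of the input bitmap."""
    bbox: tuple                # (x, y, w, h): the link back into the bitmap
    labels: list = field(default_factory=list)   # label list, best first

    def is_identifiable(self, threshold: float) -> bool:
        # Identifiable if any label clears the confidence threshold.
        return any(lbl.confidence > threshold for lbl in self.labels)

@dataclass
class HybridDataStructure:
    """Coded objects plus the input bitmap they were derived from."""
    bitmap: bytes              # raw input bitmap (or a reference to it)
    units: list = field(default_factory=list)    # LexicalUnit entries
```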


Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS
1. A system for producing an image comprising:
a data processing apparatus;
means for performing recognition on an input bitmap that is stored within
said data processing apparatus to detect objects within said input bitmap and to create coded portions therefrom for identifiable and non-identifiable objects;
means for creating a data structure including coded portions corresponding
to identifiable objects and links to portions of said input bitmap that correspond to
said identifiable objects; and
an output device for developing a visually perceptible image from said input
bitmap.
2. A system as recited in claim 1 wherein said objects that said means
for performing recognition can detect are lexical units including characters and words comprising said characters.
3. A system as recited in claim 2 wherein said coded portions
corresponding to non-identifiable objects are unrecognized words which fall below
a threshold confidence level.
4. A system as recited in claim 1 further comprising means for
identifying geometrical distortion in said input bitmap before said means for
performing recognition detects objects within said input bitmap.
5. A system as recited in claim 2 wherein said data structure includes
coded portions corresponding to said non-identifiable objects and links to portions
of said input bitmap that correspond to said non-identifiable objects.

6. A system as recited in claim 5 further comprising means for
searching said data structure for a coded portion that at least approximately matches
an inputted lexical unit.
7. A system as recited in claim 5 further comprising means for
displaying an editing window and said coded portions of said data structure,
wherein said data structure can be changed when displayed in said editing window.
8. A system as recited in claim 7 wherein said means for creating a
data structure includes means for determining font attributes for said coded portions
and assigning a typeface to each of said coded portions.
9. A method for producing a data structure from an input bitmap of an
image comprising the steps of:
inputting a signal comprising an input bitmap of an image into a digital
processing apparatus;
partitioning on said digital processing apparatus said input bitmap into
lexical units;
assigning on said digital processing apparatus at least one coded object to
each lexical unit; and
storing on said digital processing apparatus in a data structure a coded object
and link data which links said coded object to its corresponding lexical unit.
10. A method for producing a data structure as recited in claim 9
wherein said bitmap is partitioned into a hierarchy of lexical units, and wherein
coded objects are assigned to lexical units having a predetermined hierarchical level.

11. A method for producing a data structure as recited in claim 10
wherein said predetermined hierarchical level includes a word hierarchical level and
a character hierarchical level.
12. A method for producing a data structure as recited in claim 9
wherein one coded object is stored in said data structure for each partitioned lexical
unit.
13. A method for producing a data structure as recited in claim 12
wherein said step of assigning at least one coded object to each lexical unit includes
performing recognition on said lexical unit and producing a list of coded objects,
each having an associated confidence level.
14. A method for producing a data structure as recited in claim 12
wherein said link data includes coordinates within said bitmap of said lexical unit
that corresponds to said coded object stored in said data structure.
15. A method for producing a data structure as recited in claim 9 further
comprising the step of determining on said digital processing apparatus a geometric
correction of said input bitmap.
16. A method for producing a data structure as recited in claim 15
wherein said determining said geometric correction includes creating a distortion
map of said input bitmap and creating a layout correction transform from said
distortion map and said input bitmap.
17. A system for producing and manipulating a data structure including
coded and non-coded portions, the system comprising:
a data processing apparatus including a digital memory;

a recognizer implemented on said data processing apparatus which operates
on an input bitmap to detect lexical units within said input bitmap and to create
coded objects corresponding to said lexical units;
an analyzer implemented on said data processing apparatus to create and
store in said digital memory a data structure including coded identifiable objects and
coded non-identifiable objects corresponding to lexical units detected within said
input bitmap;
a display device for developing a visually perceptible image on a screen of
at least a portion of said data structure by displaying said input bitmap; and
a display manager implemented on said data processing apparatus to
manipulate the hybrid data structure displayed on said screen.
18. A system for producing and manipulating a data structure as recited
in claim 17 wherein said display manager includes an editor implemented on said
digital processor which implements editing of said data structure.
19. A system for producing and manipulating a data structure as recited
in claim 18 wherein said editor displays coded images derived from said coded
identifiable and non-identifiable objects, wherein said editor permits said non-identifiable objects to be changed into identifiable objects.
20. A system for producing and manipulating a data structure as recited
in claim 17 wherein said display manager includes a finder implemented on said
digital processor which implements the searching of said data structure for a
desired search object.
21. A method for producing an image on a data processing apparatus
comprising the steps of:
performing recognition on an input bitmap that has been entered into said
data processing apparatus to detect objects within said input bitmap;

creating a hybrid data structure including a coded portion corresponding to
each of said objects and a non-coded portion corresponding to each of said objects;
and
developing a visually perceptible image from said non-coded portions of
said data structure.
22. A method as recited in claim 21 wherein said objects that said
means for performing recognition can detect are characters and words comprising
said characters.
23. A method as recited in claim 22 wherein said non-coded portions
are individual word bitmaps.
24. A method as recited in claim 23 wherein said objects include
identifiable and non-identifiable objects, and wherein each of said identifiable and
non-identifiable objects includes an associated confidence level.
25. A method as recited in claim 22 further comprising means for
searching said coded portions of said data structure for an inputted word or group
of words.
26. A method as recited in claim 21 further comprising means for
displaying an editing window in which said coded portions of said data structure can be changed.
27. A system for producing a raster image derived from a hybrid data
structure including coded and non-coded portions from an input bitmap, the system
comprising:
a data processing apparatus;

means for performing recognition on an input bitmap that has been entered
into said data processing apparatus to detect identifiable objects within said input
bitmap;
means for creating a hybrid data structure including coded portions
corresponding to said identifiable objects and non-coded portions derived from
portions of said input bitmap which do not correspond to said identifiable objects;
and
an output device for developing a visually perceptible raster image from
said hybrid data structure that includes coded images of said identifiable objects and
non-coded images of said portions of said input bitmap that do not correspond to said identifiable objects.
28. A system as recited in claim 27 wherein said identifiable objects that
said means for performing recognition can detect are characters and words
comprising said characters.
29. A system as recited in claim 27 wherein said hybrid data structure
includes recognized lexical units.
30. A system as recited in claim 27 wherein said hybrid data structure
includes coded portions for said identifiable objects, coded portions for said portions
of said input bitmap which do not correspond to said identifiable objects, and non-
coded bitmaps for said portions of said input bitmap which do not correspond to
said identifiable objects.
31. A system as recited in claim 30 wherein said portions of said input
bitmap which do not correspond to said identifiable objects correspond to
unrecognized words which fall below a threshold confidence level.

32. A system as recited in claim 27 further comprising means for
identifying geometrical distortion in said input bitmap before said means for
performing recognition detects identifiable objects within said bitmap.
33. A system as recited in claim 27 wherein said means for performing
recognition includes means for comparing each of said identifiable objects with a
portion of said input bitmap corresponding to said identifiable object, and means
for adjusting the size of said identifiable object if said identifiable object is within a
threshold size of said corresponding input bitmap portion.
34. A system as recited in claim 30 wherein said means for creating a
hybrid data structure stores coordinates of said unrecognized words within said
hybrid data structure such that said output device receives said coordinates to
develop said non-coded images.
35. A system as recited in claim 29 wherein said means for creating a
hybrid data structure includes means for measuring font attributes of said lexical
units and assigning typeface to each of said lexical units.
36. A system as recited in claim 29 further comprising means for
searching said hybrid data structure for an inputted lexical unit.
37. A system as recited in claim 27 further comprising means for
displaying an editing window for a rendered portion of said hybrid data structure in
which said hybrid data structure can be changed.
38. A method for producing a hybrid data structure from a bitmap of an
image including identifiable objects and non-identifiable objects comprising the steps of:

inputting a signal comprising a bitmap of an image into a digital processing
apparatus;
partitioning on said digital processing apparatus said bitmap into lexical
units;
assigning on said digital processing apparatus at least one label and an
associated confidence level to each lexical unit; and
storing on said digital processing apparatus each lexical unit in a hybrid data
structure as an identifiable object if a label for said lexical unit has a confidence level
greater than a threshold confidence level, and as a non-identifiable object if no label
for said lexical unit has a confidence level greater than said threshold confidence
level.
39. A method for producing a hybrid data structure as recited in claim
38 further comprising the step of determining on said digital processing apparatus a
geometric correction of said bitmap.
40. A method for producing a hybrid data structure as recited in claim
39 wherein said determining said geometric correction includes creating a distortion
map of said bitmap and creating a layout correction transform from said distortion
map and said bitmap.
41. A method for producing a hybrid data structure as recited in claim
38 wherein said step of assigning at least one label includes assigning at least one
label for each lexical unit of a first predetermined hierarchical level, and assigning at
least one label for each lexical unit of a second predetermined hierarchical level
based upon said label for each lexical unit of said first hierarchical level.
42. A method for producing a hybrid data structure as recited in claim
38 wherein said bitmap is partitioned into a hierarchy of lexical units, and wherein
labels are assigned to lexical units having a predetermined hierarchical level.

43. A method for producing a hybrid data structure as recited in claim
42 wherein said predetermined hierarchical level includes a character hierarchical
level and a word hierarchical level.
44. A method for producing a hybrid image comprising:
inputting a bitmap representing an image into a digital processing apparatus;
recognizing on said digital processing apparatus identifiable objects in said
bitmap; and
creating with an output device coupled to said digital processing apparatus a
visually perceptible image comprising rendered images of said identifiable objects
and bitmap images of objects that were not recognized.
45. A method as recited in claim 44 further comprising a step of
segmenting said bitmap into lexical units.
46. A method as recited in claim 45 wherein said bitmap is segmented
into a hierarchy of lexical units.
47. A method as recited in claim 46 wherein said step of recognizing
identifiable objects in said bitmap includes producing at least one label in a label list
for each lexical unit and producing a confidence level for each of said labels.
48. A method as recited in claim 45 further comprising a step of
assigning a typeface to each of said lexical units of said bitmap.

49. A method as recited in claim 48 further comprising a step of
performing a size adjustment to said identifiable objects before said visually
perceptible image is created.
50. A method as recited in claim 44 further comprising a step of
performing a geometric correction to said bitmap before said step of recognizing.

Description

Note: Descriptions are shown in the official language in which they were submitted.


METHOD AND APPARATUS FOR
PRODUCING A HYBRID DATA STRUCTURE
FOR DISPLAYING A RASTER IMAGE

DESCRIPTION

Technical Field

The present invention relates generally to the display of digitally stored
and/or processed images, and more particularly to a method and apparatus for
displaying images on raster display devices such as laser printers and computer
monitors.
Background Art

Digital images can be efficiently stored, edited, printed, reproduced, and
otherwise manipulated. It is therefore often desirable to convert an image, such as
on a piece of paper, into a digital representation of the image by a process known as
digitization. Digital representations of an image can be primitive and non-coded
(e.g., an array of picture elements or "pixels") or may contain higher level
descriptive coded information (e.g., ASCII character codes) from which a primitive
representation may be generated. Generally, high level coded digital representations
are more compact than primitive non-coded ones.

Optical character recognition (OCR) encompasses digitization and a method
for transforming text in bitmap representation to a high level coded representation,
such as ASCII character codes. In OCR digitization, text characters on a printed
surface such as a sheet of paper are typically scanned by an optical scanner, which
creates a bitmap of the pixels of the image. A pixel is a fundamental picture
element of an image, and a bitmap is a data structure including information
concerning each pixel of the image. Bitmaps, if they contain more than on/off
information, are often referred to as "pixel maps."

Other types of processes can also digitize real-world images. Devices such
as digital cameras can be used to directly create bitmaps corresponding to a
captured image. A computer system can recreate the image from the bitmap and
display it on a computer display or send the bitmap to a printer to be printed.
Bitmap generators can be used to convert other types of image-related inputs into
bitmaps which can be manipulated and displayed. Incoming facsimile (fax) data
includes low-resolution bitmaps that can be manipulated, displayed, printed, etc.

Once a bitmap is input to a computer, the computer can perform recognition
on the bitmap so that each portion or object of the input bitmap, such as a character
or other lexical unit of text, is recognized and converted into a code in a desired
format. The recognized characters or other objects can then be displayed, edited, or
otherwise manipulated from the coded data using an application software program
running on the computer.

There are several ways to display a recognized, coded object. A raster
output device, such as a laser printer or computer monitor, typically requires a
bitmap of the coded object which can be inserted into a pixel map for display on a
printer or display screen. A raster output device creates an image by displaying an
array of pixels arranged in rows and columns from the pixel map. A bitmap of a
coded object can be provided by retrieving an output bitmap stored in memory for
the code, where each possible code has an associated stored bitmap. For example,
for codes that represent characters in fonts, a bitmap can be associated with each
character in the font and for each size of the font that might be needed. The
character codes and font size are used to access the bitmaps. Another, more
efficient, method is to use a "character outline" associated with each character code
and to render a bitmap of a character from the character outline and other character
information, such as size. A commonly-used language to render bitmaps from
character outlines is the PostScript® language by Adobe Systems, Inc. of Mountain
View, California. Character outlines can be described in standard formats, such as
the Type 1® format by Adobe Systems, Inc.

OCR processes are limited by, among other things, the accuracy of the
digitized image provided to the computer system. The digitizing device (such as a
scanner) may distort or add noise to the bitmap that it creates. In addition, OCR
processes do not perfectly recognize bitmap images, particularly if they are of low
resolution or are otherwise of low quality. For example, a recognizer might
misread ambiguous characters, characters that are spaced too closely together, or
characters of a font for which it had no information.

Imperfect recognition can present problems both at the time of editing a
recognized image and when printing or displaying the image. Misrecognized
images may be displayed incorrectly, and images that are not recognized at all may
not be displayed at all, or may be displayed as some arbitrary error image. This
reduces the value of the OCR process, since the recognized document may require
substantial editing.

Disclosure of the Invention

The present invention provides a method and apparatus for creating a data
structure describing coded objects and non-coded objects. The invention is
applicable to recognizing text or other objects from a bitmap provided by an optical
scanner or other bitmap generator. Objects that are recognized and not recognized
by the recognizer are stored in the data structure. An apparently perfectly
recognized document is provided by displaying the original bitmap, which is
associated with the coded objects in the data structure.

The apparatus of the present invention includes a system for producing an
image which includes a data processing apparatus and a recognizer for performing
recognition on an input bitmap to detect objects within the bitmap. The recognizer
creates coded portions from the objects for identifiable and non-identifiable objects.
The system creates a data structure including coded portions corresponding to the
identifiable objects and links to portions of the input bitmap that correspond to the
identifiable objects. Coded portions of non-identifiable objects, and links to
corresponding bitmap portions, are also preferably included in the data structure.
An output device, such as a printer, a plotter, or a computer display, develops a
visually perceptible image derived from the input bitmap. The image portrays the
identifiable objects and the non-identifiable objects in their original bitmap form, so
that no inaccurate images caused by misrecognition are displayed. An input device,
such as an optical scanner, a digital camera, and a bitmap generator, can be included
to provide the input bitmap to the data processing apparatus.

The objects of the bitmap that the recognizer can detect preferably include
lexical units such as characters and words. The non-identifiable objects preferably
correspond to unrecognized words which fall below a recognition threshold
confidence level. The system preferably performs geometric correction to the input
bitmap, which includes creating a distortion map of the bitmap and creating a layout
correction transform from the distortion map and the bitmap.

The present invention further includes a method for producing a data
structure from a bitmap of an image. The method, implemented on a digital
processor, inputs a signal including a bitmap of an image and partitions the bitmap
into a hierarchical structure of lexical units. At least one coded object is assigned to
each lexical unit of a predetermined hierarchical level, where each coded object has
an associated confidence level. Finally, a coded object is stored in the data structure
along with link data that links the coded object to its corresponding lexical unit. If a
coded object has a confidence level greater than a threshold confidence level, then
that coded object is considered identifiable. If no coded object for a lexical unit has
a confidence level greater than the threshold confidence level, then the lexical unit is
considered non-identifiable and is stored as the coded object for that lexical unit
having the highest confidence level. The predetermined hierarchical levels
preferably include a character hierarchical level and a word hierarchical level.
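
The storage rule above amounts to one comparison per lexical unit: keep the best
label either way, and flag the unit by whether that label clears the threshold. A
minimal Python sketch under that reading (the function and data layout are
illustrative, not from the patent):

```python
def classify_unit(labels, threshold):
    """labels: list of (coded_text, confidence) pairs for one lexical unit.

    Returns the label to store and whether the unit is identifiable.
    A non-identifiable unit is still stored as coded data: its
    highest-confidence label is kept, merely below the threshold.
    """
    best_text, best_conf = max(labels, key=lambda pair: pair[1])
    return best_text, best_conf > threshold

# Example with made-up confidence values:
print(classify_unit([("quick", 0.97)], 0.90))                # ('quick', True)
print(classify_unit([("dog", 0.55), ("clog", 0.52)], 0.90))  # ('dog', False)
```
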
In yet another aspect of the present invention, a system for producing and
manipulating a data structure includes a recognizer operating in a data processing
apparatus that detects lexical units within the input bitmap. An analyzer creates and
stores a data structure in memory of the data processing apparatus. The data
structure includes coded identifiable objects and coded non-identifiable objects
corresponding to lexical units within the input bitmap. A display device develops
and displays an image of at least a portion of the data structure on a display device,
such as a screen, by displaying the input bitmap. A display manager implemented
on the data processing apparatus manipulates the image on the screen. The display
manager includes an editor which permits the data structure and, thus, the image to
be edited. The editor displays the coded data as rendered images and can be used to
change a non-identifiable object into an identifiable object. The display manager
also preferably includes a finder which searches the coded objects of the data
structure to find an exact or approximate match to a search word or phrase. Lexical
units which correspond to matched coded objects are preferably highlighted if they
are currently being displayed.

In still another aspect of the present invention, a method for producing an
image on a data processing apparatus includes performing recognition on an input
bitmap to detect objects within the bitmap. A data structure is created to include
coded portions corresponding to each of the objects and a non-coded portion, such
as a word bitmap, corresponding to each of the coded portions. A visually
perceptible image is then developed from the data structure. The image is derived
from the non-coded portions of the data structure. Each of the objects preferably
includes an associated confidence level, and non-identifiable objects correspond to
unrecognized words which have a confidence level below a threshold confidence
level. Objects having a confidence level below the threshold confidence level are
displayed as said non-coded portions. Preferably, during said image developing
step, the threshold confidence level is raised such that the confidence levels of all of
the objects fall below the threshold confidence level, resulting in only the non-coded
portions for all objects being displayed. Steps for searching the data structure for an
inputted word or phrase and editing the coded portions of the data structure are also
preferably included.
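
The display behavior described here is a per-object choice between a rendered
coded image and the original word bitmap; raising the threshold above every
confidence level degenerates to showing only the bitmaps. A hedged sketch, with
the object layout assumed for illustration:

```python
def image_for(obj, display_threshold):
    """obj is assumed to hold a rendered coded image, the original
    word bitmap, and the confidence of its best label."""
    if obj["confidence"] > display_threshold:
        return obj["coded_image"]   # render from the character codes
    return obj["word_bitmap"]       # fall back to the original scan

# Setting display_threshold above 1.0 (on a 0-1 confidence scale) forces
# every object to display as its non-coded portion, as the text describes.
```
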
An advantage of the present invention is that objects within a digitized
image are displayed in their original bitmap form instead of as recognized images.
There are thus no possible displayed errors from misrecognized images. A user
displays an image which is identical to the source image.

Another advantage of this invention is that the data structure includes coded
data that can be searched, edited, and otherwise manipulated by a user.

These and other advantages of the present invention will become apparent to
those skilled in the art upon a reading of the following descriptions of the invention
and a study of the several figures of the drawing.

Brief Description of the Drawings

Figure 1 is a block diagram of a computer system for creating a hybrid data
structure and displaying an image in accordance with the present invention;

Figure 2 is a block diagram of the digital computer of Figure 1;

Figure 3 is an example of a displayed image produced by the present
invention;

Figure 4 is a flow diagram illustrating the method of providing a hybrid
data structure and raster image of the present invention;

Figure 5 is a flow diagram illustrating the step of converting a bitmap to a
hybrid data structure of Figure 4;

Figure 5a is a table showing the hierarchy of lexical units used in the present
invention;

Figure 5b is an illustration showing the lexical units of the hierarchy of
Figure 5a;

Figure 5c is a diagrammatic illustration showing lower levels of the
hierarchy of Figure 5a;

Figure 6 is a flow diagram illustrating the geometric correction step of
Figure 5;

Figure 7 is a flow diagram illustrating the step of Figure 5 for assigning
label lists and confidence levels to lexical units;

Figure 7a is a table illustrating the character labels and confidence levels
assigned to characters;

Figure 7b illustrates a word label list;

Figure 8 is a flow diagram illustrating the font attribute recognition step of
Figure 7;

Figure 8a illustrates the font feature measurements taken on an identified
character;

Figure 8b illustrates the partitioning of different types of characters
according to measured font attributes;

Figure 9 is a flow diagram illustrating the step of Figure 7 for computing a
size adjustment of identified words;

Figure 9a illustrates the dimensions of a bounding box for a coded word
and a bitmap bounding box derived from non-coded bitmap data;

Figure 9b illustrates the thresholds used in determining if an identified word
should be adjusted;

Figure 10 is a flow diagram illustrating the display hybrid data structure
step of Figure 4;

Figure 11 is a flow diagram illustrating the search hybrid data structure step
of Figure 4;

Figure 12 is a flow diagram illustrating the edit hybrid data structure step of
Figure 4; and

Figure 12a is a diagrammatic illustration of a portion of a display screen and
editor showing the editing of a hybrid data structure.

Best Modes for Carrying Out the Invention

The present invention is well suited for displaying pages of scanned text that
include several different types of fonts, letter sizes, formatting variations, and hard-
to-recognize characters. However, the present invention is also suited to other types
of image display, such as graphical diagrams, maps, technical illustrations, etc.

A number of terms are used herein to describe images and related
structures. "Pixel" refers to a single picture element of an image. Taken
collectively, the pixels form the image. "Bitmap" refers to bits stored in digital
memory in a data structure that represents the pixels. As used herein, "bitmap" can
refer to both a data structure for outputting black and white pixels, where each pixel
either is on or off, as well as a "pixel map" having more information for each pixel,
such as for color or gray scale pixels. "Resolution" refers to the size, shape, and
separation of pixels of a displayed or printed image. For example, a displayed
bitmap of very small pixels, closely spaced, has a greater resolution, i.e. greater
detail, than a displayed bitmap having large pixels widely spaced. "Render" refers
to the creation of a bitmap from an image description such as a character outline.
"Raster" refers to the arrangement of pixels on an output device that creates an
image by displaying an array of pixels arranged in rows and columns. Raster
output devices include laser printers, computer displays, video displays, LCD
displays, etc. "Coded" data or portions are represented by a "code" that is designed
to be more concise and to be more readily manipulated in a computing device than
raw data in, for example, bitmap form. "Non-coded" data or portions are data that
is not represented by a code, such as data of a bitmap. For example, the lowercase
letter "a" can be represented as coded data, e.g., the number 97 in ASCII encoding,
or as non-coded graphical or image data that could be used to create the appearance
of "a" on an output device such as a display screen or printer. Fonts usually have
one or more associated "encodings" that associate coded data with non-coded data.
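
The coded/non-coded distinction in the "a" example can be made concrete; the
5x5 bitmap below is a toy stand-in, not actual font data:

```python
coded_a = ord("a")   # coded: the single number 97 under ASCII encoding

# Non-coded: pixel data that merely looks like an "a" when displayed.
noncoded_a = [
    0b01110,
    0b00001,
    0b01111,
    0b10001,
    0b01111,
]

print(coded_a, chr(coded_a))   # 97 a
for row in noncoded_a:         # rough visual of the bitmap form
    print(format(row, "05b").replace("1", "#").replace("0", "."))
```
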
In Figure 1, a computer system 10 for producing a raster image includes
input devices 12, a digital computer 14, a display screen 16, a printer 18, a
keyboard 20, a floppy disk drive 22 and a hard disk drive 24. Input devices 12 are
used for inputting a bitmapped image to digital computer 14. In a described
embodiment, input devices 12 include an optical scanner 26, a digital camera 28,
and/or a bitmap generator 30. Optical scanner 26 is a device which scans an image
and generates a bitmap from the scanned image. Such scanners are typically used
to digitize images formed on sheets of paper, such as sheet 32, to a bitmap form
that can be input into digital computer 14. The generated bitmap typically includes
textual objects such as characters and words from the scanned sheet of paper. An
optical scanner suitable for use with the present invention is the ScanJet IIcx
manufactured by Hewlett-Packard Co. of Palo Alto, California. Digital camera 28
creates a bitmap of an image captured by the camera. For example, if a user takes a
"snapshot" of a scene with camera 28, the camera digitizes the scene and outputs
the digital data as a bitmap to digital computer 14. Digitizing cameras are well-
known to those skilled in the art. Bitmap generator 30 can be any device which
generates a bitmap and outputs that bitmap to digital computer 14. For example, a
different computer system can provide a bitmap to digital computer 14 over
network data lines or telephone lines using a modem (not shown), or a bitmap can
be received by a facsimile (fax) card of the digital computer. Furthermore, a user
can generate a bitmap on a computer and can transport the bitmap by floppy disk
22 or other storage medium to the system 10. The bitmaps generated by digital
camera 28 and bitmap generator 30 can include text objects or other objects, similar
to the bitmaps generated by optical scanner 26. These objects may be identifiable
or not identifiable by recognizers used in the present invention (described below).

Digital computer 14 receives an input bitmap from one or more input
devices 12 and can display, transform, and/or manipulate the input bitmap. In the
described embodiment, computer 14 can also implement a recognizer to recognize
text characters or other types of objects within the input bitmap. Once recognized,
the characters or other identifiable objects can be stored as codes (coded data) in a
standard format such as ASCII. The coded objects can then be displayed and
manipulated by application programs which accept the format of the codes. A user
can then view the formatted objects on display screen 16 and edit them, if desired.
Digital computer 14 can be a personal computer (such as an IBM-PC AT-
compatible personal computer), a workstation (such as a SUN or Hewlett-Packard
workstation), etc.

To display images on an output device, the computer can implement one or
more types of procedures. For example, computer 14 can transfer input bitmap
data directly to display screen 16 or printer 18 (or provide the bitmap data in a
memory cache) to display an image of the bitmap data. The computer can also
transform a coded object into an image description. For example, the code for a
recognized text character can be associated with an image description which takes
up less memory space than several copies of the bitmap of the recognized character.
A well known image description language is the PostScript® language by Adobe
Systems, Inc. of Mountain View, California. For example, the image description
can reference stored character outlines which describe the shape of the character and
include other rendering information. A well-known character outline format is the
Type 1® format, by Adobe Systems, Inc. Using character outlines, computer 14
can render a bitmap for each character and send the bitmap to a storage area that is
accessible to an output device for display. In other embodiments, output devices
such as printers can include microprocessors or similar controllers which can
render a bitmap from character outlines.
Digital computer 14 can also be used to modify an input bitmap or an
image description of an input bitmap. If a user wishes to change certain portions of
the bitmap, digital computer 14 performs those changes and provides the changed
image to one of the output devices.

Display screen 16 displays an image of the input bitmap and/or the images
derived from the input bitmap (i.e. rendered images). In the described
embodiment, display screen 16 is a raster device which displays images on a screen
corresponding to bits of a bitmap in rows and columns of pixels. That is, a bitmap
can be input to the display screen 16 and the bits of the bitmap can be displayed as
pixels. The input bitmap is directly displayed on the display screen in a first
embodiment. In alternate embodiments, or when editing coded data, computer 14
can first render image descriptions into bitmaps and send those bitmaps to be
displayed on display screen 16. Raster display screens such as CRT's, LCD
displays, etc. are suitable for the present invention.

Printer device 18 provides an image of the input bitmap and/or the images
derived from the input bitmap on a sheet of paper or a similar surface. Printer 18
can be a laser printer, which, like display screen 16, is a raster device that displays
pixels derived from bitmaps. Printer device 18 can print images derived from
coded and non-coded data. Other devices can be used as printer device 18, such as
a plotter, typesetter, etc.

Keyboard 20 is used by a user to input commands and other instructions to
digital computer 14. Images displayed on display screen 16 or accessible to digital
computer 14 can be edited, searched, or otherwise manipulated by the user by
inputting instructions on keyboard 20.

Floppy disk drive 22 and hard disk drive 24 can be used to store input
bitmaps, image descriptions, character outlines, and rendered bitmaps. Floppy disk
drive 22 facilitates transporting such data to other computer systems 10, and hard
disk drive 24 permits fast access to large amounts of stored data such as bitmaps,
which tend to require large amounts of storage space. Other types of storage
devices can also be used.

Figure 2 is a block diagram of digital computer 14 and associated input and
output devices as shown in Figure 1. Digital computer 14 preferably includes a
microprocessor 36, a memory bus 38, random access memory (RAM) 40, read
only memory (ROM) 42, a peripheral bus 44, and a keyboard controller 46.
Microprocessor 36 is a general purpose digital processor which controls the
operation of digital computer 14. Using instructions retrieved from memory,
microprocessor 36 controls the reception of the input bitmap data from input
devices 12, the recognition and conversion of any input bitmaps to image
descriptions, the rendering of any character outlines to output bitmaps for display,
the transfer of output bitmaps and/or image descriptions to output devices such as
display screen 16 and printer 18, and the control of those output devices. For
example, microprocessor 36 can receive input bitmaps from an input device 12.
These input bitmaps can, for example, represent characters on a sheet of paper 32.
The input bitmaps can be divided into portions and recognized as characters by a
recognizer, at which point they can be stored as character codes or other lexical
units, in formats such as ASCII or PostScript. Objects of the input bitmap which
cannot be recognized ("non-identifiable objects") are also stored as corresponding
coded data. This is explained in greater detail below. In an alternative embodiment,
non-identifiable objects can be stored as coded data and an associated non-coded
bitmap in the same data structure in which codes for identifiable objects are stored.
This process is described in greater detail with reference to Figure 4.

Memory bus 38 is used by microprocessor 36 to access RAM 40 and
ROM 42. RAM 40 is used by microprocessor 36 as a general storage area and as
scratch-pad memory, and can also be used to store input bitmaps and rendered
bitmaps. ROM 42 can be used to store instructions followed by microprocessor 36
as well as image descriptions and character outlines used to display images of
bitmaps in a specific format. For example, portions of the input bitmap
representing characters can be recognized and described as ASCII character codes
or an image description. The characters' associated character outlines can be
retrieved from ROM 42 when bitmaps of the characters are rendered to be
displayed as rendered images by an output device. Alternatively, ROM 42 can be
included in an output device, such as printer 18, instead of being included in
computer 14.

Peripheral bus 44 is used to access the input, output, and storage devices
used by digital computer 14. In the described embodiment, these devices include
floppy disk drive 22, hard disk drive 24, optical scanner 26, camera 28, bitmap
generator 30, display screen 16, and printer device 18. Keyboard controller 46 is
used to receive input from keyboard 20 and send decoded symbols for each
pressed key to microprocessor 36 over bus 47.

Figure 3 is a diagrammatic illustration showing an example of a displayed
raster image 50 of the present invention generated from a hybrid data structure. As
described in greater detail with respect to Figures 5 and 7, objects or "lexical units"
of an input bitmap are partitioned and analyzed by a recognizer. The recognizer
assigns a confidence level to each of one or more guesses or hypotheses ("labels")
of the object's meaning or identity. Herein, an object is considered "identifiable" or
"recognized" if at least one label for that object has a confidence level greater than a
recognition confidence threshold. Conversely, an object is considered "non-
identifiable" or "unrecognized" if it has no labels greater than the recognition
confidence threshold. "Non-identifiable" thus does not necessarily mean that the
system does not have a hypothesis as to the meaning of the object, but, rather, that
any such hypothesis has a confidence level less than the confidence threshold level.

In a first embodiment, the raster image 50 is displayed as a non-coded (i.e.,
bitmap) image, regardless of the confidence level of the objects recognized in the
input bitmap. The original raw, unprocessed bitmap, or portions thereof, can be
displayed directly on an output device; or, each individual word bitmap can be
displayed in place of its corresponding recognized word. This allows an image to
have an exact appearance with all the resolution and fidelity of the original image
that was obtained using, for example, an OCR device. The displayed image 50
does not rely on recognition, which can be erroneous, in that misrecognized
objects or other inaccuracies due to recognition have no chance of being displayed.
The words and/or other objects of the bitmap image are still analyzed and
corresponding coded data is created in the background, however, to allow searching
and editing of the objects, as described below.
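
In other words, this first embodiment keeps two layers: the scan is what the user
sees, and the coded labels only serve search and editing. A minimal sketch of that
split (the unit fields and the screen.blit display call are assumptions for
illustration):

```python
def display_page(units, screen):
    # Visible layer: the original word bitmaps, shown pixel-for-pixel.
    for unit in units:
        screen.blit(unit["word_bitmap"], unit["bbox"])

def find_word(units, query):
    # Hidden layer: coded labels support search without being displayed.
    return [u["bbox"] for u in units
            if u["best_label"].lower() == query.lower()]
```
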
In a second embodiment, as shown in Figure 3, the recognized objects of
raster image 50 can be rendered and displayed from coded data. Characters 52 and
words 54 are raster images rendered from character codes. These characters and
words are identifiable coded objects that have been stored in a specific format, such
as ASCII or PostScript, having an associated size and typeface which can be stored
and manipulated more easily than the original input bitmap form. When printed on
a sheet of paper on printer 18, as shown in Figure 3, character outlines associated
with each identifiable character are rendered into bitmaps which are displayed as
coded raster images by printer 18.

Non-coded raster images 56 are different from characters 52 and words 54.
Non-coded images 56 are derived from portions of the original input bitmap which
were not recognized by a recognizer implemented on microprocessor 36. In the
described embodiment, the confidence level of recognition for the objects of images
56 was not high enough to allow the objects to be classified as identifiable objects;
thus, they are non-identifiable objects. Since non-coded images 56 are not
recognized and derived from stored (e.g., ASCII) character codes and character
outlines, they are derived from non-coded data. The images 56 are displayed on
printer 18 as images derived directly from the input bitmap which was received by
digital computer 14 from an input device 12. Non-coded images 56 are thus as
accurate as the original input bitmap image and can be displayed without having
been recognized. For example, lines 57 cannot be recognized as characters or
words, since they are graphical images. The display of images from non-coded
data of the original bitmap portion that describes these lines allows the lines to be
portrayed accurately. Preferably, the non-identifiable objects still exist as (low-
confidence) coded data in the hybrid data structure; they are just not displayed.
Herein, "coded images" are derived, rendered and displayed from coded data, and
"non-coded images" are derived and displayed using non-coded data.

Figure 4 is a flow diagram 60 illustrating the method of the present
invention of producing a hybrid data structure and raster images derived from
coded and non-coded data from an input bitmap. The essence of the process of the
present invention is to produce and store the hybrid data structure, and the
processes of displaying, editing, and searching the hybrid data structure can be
included or provided from other compatible processes.

The process begins at step 62, and, in step 64, a raw input bitmap is
retrieved from one or more input devices 12. As described above, the raw bitmap
contains one or more objects, such as text characters and words or other shapes.
This raw bitmap is stored and available to the microprocessor 36 in appropriate
steps of the present invention. In next step 66, the microprocessor 36 converts the
raw bitmap into a hybrid data structure of identifiable objects and non-identifiable
objects. The identifiable objects, such as words and characters, are derived from
the portions of the bitmap which are able to be recognized by a recognizer
implemented on the microprocessor, as described below. Non-identifiable objects
are objects derived from portions of the input bitmap which are not able to be
recognized by the recognizer. In the first embodiment, the hybrid data structure
also includes "links" which associate objects in the data structure to portions or
locations of the raw bitmap. Such links can include coordinates, pointers, etc., as
explained below. The data structure is a "hybrid" data structure in the sense that
both coded and non-coded portions are referenced by the data structure. The
process can optionally end after step 66 is complete; otherwise, the process
continues to step 68.

In an alternate embodiment, the hybrid data structure might only include
coded portions for identifiable objects. For example, a recognizer can output coded
data that is a "null" symbol (or some other indication of non-recognition) for
every non-identifiable object of an input bitmap. The microprocessor then only
includes coded data for identifiable objects in the hybrid data structure and does not
include coded data that is the null symbol (or coded data that has a null symbol
associated with it) in the hybrid data structure.

In next step 68, the microprocessor determines if there is a post-process. A
post-process occurs if the user wishes to display or manipulate the hybrid data
structure created in step 66. The post-process may be performed much later and/or
on another computer system (i.e., the hybrid data structure may be created on one
computer and displayed or manipulated on a different computer). If there is no
post-process, then the process is complete as indicated in step 70, i.e., the process is
completed with the creation of the hybrid data structure. If there is a post-process,
the microprocessor determines if the user wishes to display the hybrid data
structure, search the hybrid data structure, or edit the hybrid data structure.

If the user wishes to see a display, step 72 is implemented, in which a
display manager implemented on the microprocessor controls the display of the
hybrid data structure. The hybrid data structure is displayed, for example, on
display screen 16 or on a sheet of paper by printer 18 as the original bitmap, i.e. a
non-coded raster image. Alternatively, the displayed hybrid data structure includes
newly-rendered raster images from the codes of the identifiable objects and original
non-coded raster images of the non-identifiable objects. A suitable display
manager 72 is the Acrobat® software available from Adobe Systems, Inc. If the
second embodiment is used, as shown in Figure 3, the images of the non-
identifiable objects are positioned on the display so that they are aligned with the
displayed raster images of the identifiable objects and create a substantially uniform
overall image. The process of displaying the hybrid data structure is described in
greater detail with respect to Figure 10. When the hybrid data structure has been
displayed, the process returns to step 68.

If the user wishes to search the hybrid data structure, step 74 is
implemented. In step 74, the microprocessor displays the hybrid data structure as
detailed with respect to step 72 and allows a user to specify particular search
criteria, such as a word or phrase. The microprocessor then searches the hybrid
data structure for the specified criteria. The method of searching of the present
invention is described in greater detail with respect to Figure 11. The data structure
can also be displayed or edited during the search process. When the user has
finished searching, the process returns to step 68.

If the user wishes to edit the hybrid data structure, step 76 is implemented.
The hybrid data structure is preferably displayed on a display screen using rendered
coded images, not the raw bitmap as in step 72. In addition, the display manager
presents an editing interface for the user which provides a number of options,
including character or word replacement and editing of unrecognized words and
characters. The editing process is described in greater detail with respect to Figure
12. When the user is finished editing, the process returns to step 68.

Figure 5 is a flow diagram 66 illustrating the conversion of the raw input
bitmap to a hybrid data structure as shown in Figure 4. The process begins at 80.
In step 82, the microprocessor segments the input bitmap by partitioning the
bitmap into a hierarchy of lexical units. "Lexical units" refer to portions of a bitmap
or image which correspond to such units or objects as characters, words, text lines,
text blocks, etc. The described process is directly applicable to an input bitmap
which includes text words and characters, such as a bitmap produced by an optical
scanner which has scanned a text page. However, the term "lexical units" can also
refer to graphical objects, such as regular shapes, icons, symbols, lines, etc.

Figure 5a is a diagrammatic illustration of the hierarchy 90 used by the
presently described embodiment to segment the input bitmap. The hierarchy is
organized into seven levels 92 in the described embodiment, where the first level
includes lexical units generally having the smallest area, and the seventh level
includes lexical units generally having the largest area. As shown in Figure 5a, the
levels of the hierarchy, from first to seventh, are a "blob", a character, a word, a text
line, a text (or graphics) block, a page, and a document. A "blob" refers to a
contiguous mark (originating as ink or other displayed image constituents such as
toner, light pixels, etc.) which is a part of a single character, such as the dot of an "i"
character. A graphics block can be any portion of graphical images on a page that
form a distinct unit from other graphical portions on the page. For example, an
icon or shape surrounded by blank space may be considered a graphics block.
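
The seven levels of Figure 5a map naturally onto an enumeration, with each
segmented unit pointing at its parent; a minimal sketch (names invented for
illustration):

```python
from enum import IntEnum

class Level(IntEnum):
    # Levels I-VII of Figure 5a, smallest area to largest.
    BLOB = 1
    CHARACTER = 2
    WORD = 3
    TEXT_LINE = 4
    BLOCK = 5        # text or graphics block
    PAGE = 6
    DOCUMENT = 7

# Each segmented unit can record its level, bounding box, and parent, so
# the hierarchy can be walked from DOCUMENT down to individual BLOBs.
word = {"level": Level.WORD, "bbox": (40, 120, 64, 18), "parent": None}
```
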
Figure 5b is an illustration of the lexical units of the hierarchy of Figure 5a.
Characters 94 are level II in the described hierarchy. A word 96 is level III, and a
text line 98 is level IV. Text blocks 100 are level V, and page 102 is level VI.
Finally, document 104, which includes two pages in the shown example, is level
VII.

Figure 5c is a diagrammatic illustration showing the "blob" and character
levels of the hierarchy of Figure 5. A character 94 can include one or more "blobs"
106. For example, the character "u" includes only one blob, which is the entire
character. The character "i", however, includes two blobs 106: the dot of the "i"
and the lower portion of the "i."

Referring back to Figure 5, each lexical unit of each level of the hierarchy
shown in Figure 5a is segmented and identified by the microprocessor. Preferably,
the coordinates of the segmented lexical units and the bounding boxes of the lexical
units (explained with reference to Figure 9) are stored at this time. Once the bitmap
has been partitioned into a hierarchy of lexical units in step 82, step 84 is preferably
implemented, in which the geometric correction of each page of the input bitmap is
determined. In this step, a correcting transform is created if any lexical units of the
input bitmap are misaligned with reference to a bitmap coordinate reference
system. Step 84 is described in greater detail with reference to Figure 6.

In next step 86, a label list having one or more labels is assigned by a
recognizer to each lexical unit which has been organized in one or more
predetermined hierarchy levels. A confidence level is assigned to each label in the
list which provides an indication of how "confident" the recognizer is that the label
correctly represents the lexical unit. In an alternate embodiment, sufficient
information is also assigned for each lexical unit in predetermined hierarchy levels
to retrieve the original bitmap portion of the lexical unit if the confidence levels for
that lexical unit are below a recognition threshold. These processes are described in
greater detail with reference to Figure 7. The lexical units having one or more
labels with a confidence greater than the threshold level are therefore "identifiable
objects," and the lexical units having no labels with a confidence level greater than
the threshold value are "non-identifiable objects." Both identifiable objects and
non-identifiable objects are placed in a hybrid data structure in step 86. After step
86 has been completed, the process is complete as indicated at 88.

Figure 6 is a flow diagram illustrating the step 84 of determining the
geometric correction of the page as shown in Figure 5. The process 84 begins at
110, and in step 112, the layout characteristics of the page are measured. In this
step, only general measurements are taken to see if correction is required. For
example, the rotation of the input bitmap with reference to an output bitmap
coordinate system can be measured. The output bitmap coordinate system can
represent the reference orientation for output bitmaps which are sent to an output
device such as printer 18. The angle of the text lines with respect to the angle of the
bitmap coordinate lines can be measured. Also, the curvature of text lines can be
measured with respect to the bitmap coordinate system. The distance between a
text line and a coordinate grid line can be measured along the length of the text line
to see if the distance varies. Both the rotation of the entire bitmap and the curvature
of the text lines are checked in this step because they are common alignment
problems which occur when a page of text or other images is scanned by an optical
scanner 26.

In step 114, the microprocessor determines if the bitmap layout has any distortion. The measurements taken in step 112 are examined and compared to threshold values to detect general distortion. For example, the rotation of the bitmap can be detected by comparing the angle measurements of the text baselines with respect to the bitmap coordinate system. If the angles are under a threshold value, for example one degree, then no distortion is present. The curvature of text lines can be detected by examining the distance between each text line and a bitmap coordinate system grid line. If the distance does not vary outside of a threshold range, for example 1/16 inch, then no distortion is present. If the input bitmap is determined to have no distortion, then the process is complete as indicated at 116.
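The tests of steps 112 and 114 amount to comparing a few global measurements against fixed thresholds. A minimal sketch, using the example thresholds quoted above (one degree of rotation, 1/16 inch of baseline deviation); the measurement routines themselves are assumed to exist elsewhere:

    ROTATION_THRESHOLD_DEG = 1.0       # example threshold from the text
    CURVATURE_THRESHOLD_IN = 1.0 / 16  # example threshold from the text

    def has_distortion(baseline_angles, baseline_deviations):
        """baseline_angles: text-baseline angles (degrees) relative to the
        bitmap coordinate system; baseline_deviations: per-baseline variation
        (inches) of the distance to the nearest coordinate grid line."""
        rotated = any(abs(angle) >= ROTATION_THRESHOLD_DEG for angle in baseline_angles)
        curved = any(dev >= CURVATURE_THRESHOLD_IN for dev in baseline_deviations)
        return rotated or curved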
If the input bitmap is determined to have distortion, then step 118 is executed, in which a distortion map is created from the bitmap. A distortion map is created by measuring the deviation of rectilinear objects with respect to the bitmap coordinate system. Rectilinear objects include such objects as text baselines (i.e., a line with which the bottom ends of non-descending characters in a line of text are aligned) and near-horizontal or near-vertical graphic lines. The distortion map is represented by a list of x and y displacements at selected rectilinear object coordinates (e.g., the endpoints of lines).
In step 120, a layout correction transform is created. This transform specifies how the microprocessor is to adjust the bitmap so that the measured distortion is reduced or eliminated. The correcting transform can be implemented as a polynomial approximation of the distortion map. Methods for computing correction transforms are well known to those skilled in the art. For example, Chapter 14 of Numerical Recipes in C - The Art of Scientific Computing, Press, William et al., Cambridge University Press, 1988, describes one such method known as Least Squares Approximation. The correction transform is used before displaying an output bitmap as an image as detailed with respect to Figure 10. The process is then complete at 116.
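A least-squares polynomial approximation of the distortion map can be computed directly. The sketch below is one plausible reading, not the patented method itself: it assumes the distortion map is a list of (x, y, dx, dy) samples and fits low-order polynomial displacement fields with numpy:

    import numpy as np

    def fit_correction_transform(samples, order=2):
        """Least-squares fit of polynomial displacement fields dx(x, y) and
        dy(x, y) to a distortion map of (x, y, dx, dy) samples."""
        x, y, dx, dy = (np.asarray(c, dtype=float) for c in zip(*samples))
        # Design matrix of monomials x**i * y**j with i + j <= order.
        terms = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]
        A = np.column_stack([x**i * y**j for i, j in terms])
        cx, *_ = np.linalg.lstsq(A, dx, rcond=None)
        cy, *_ = np.linalg.lstsq(A, dy, rcond=None)

        def correct(px, py):
            basis = np.array([px**i * py**j for i, j in terms])
            # Subtract the modeled displacement to undo the distortion.
            return px - basis @ cx, py - basis @ cy

        return correct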
Figure 7 is a flow diagram illustrating the step 86 of assigning a label list to each lexical unit in predetermined hierarchy levels and, in an alternate embodiment, assigning sufficient information to retrieve the original bitmap portion of the lexical unit if the confidence levels for that lexical unit are below a threshold. A "label list," as described herein, includes one or more coded labels and a confidence level for each label. Thus, if only one label is produced by a recognizer, the label can still be considered to be in a "list." In the described embodiment of Figure 7, the predetermined hierarchy levels which are assigned label lists are the "character" hierarchy level (level II in Figure 5a) and the "word" hierarchy level (level III in Figure 5a). The described embodiment is thus most applicable to an input bitmap which describes a page of text. In alternate embodiments, different hierarchy levels can be used. Also, a different number of hierarchy levels can be used; for example, only one level, characters, can be recognized. However, when recognizing characters, another hierarchy level including connected characters can also be recognized to decipher ambiguous character images, such as two overlapping characters.
The process begins at 124. In step 126, the character counter variable "C" is initialized to 1 and C is compared to NCHAR, which is the number of characters which have been segmented in the input bitmap in step 82 of Figure 5. If "C" is less than NCHAR, step 128 is implemented, in which recognition is performed on CHAR(C) to produce a component character list (i.e., character label list) having a confidence level for each component character in the list. At this step, all the segmented characters in the raw bitmap are assigned a component character list with confidence levels.
The recognition of characters from the segmented bitmap is preferably performed by recognition software implemented by microprocessor 36 (or another connected microprocessor) which can analyze a bitmap of one of many different resolutions. Such recognizers are well known to those skilled in the art. A suitable recognizer for use with the present invention is Recore™, sold by Ocron, Inc. of Santa Clara, California. A recognizer typically outputs a number of different
hypotheses or interpretations, each of which could represent the bitmap character. The recognizer assigns a confidence level to each of these interpretations (or "labels") which represents how close the recognizer believes the label is to the identity of the character. In the presently described embodiment, a character label is actually a "shape code". A shape code is not the actual identity of a character, but represents the general shape of the character. For example, the shape code "O" can represent a capital "O", a lowercase "o", or a zero ("0"). The recognizer recognizes the segmented bitmap character as one or more shape code labels, each of which has an associated confidence level. The confidence levels of the described embodiment are separate numeric values; however, the confidence levels can be implemented as other indicators. For example, if only one label is produced by the recognizer, the confidence level can be the label itself, or, if no label is produced, the confidence level can be a null symbol.
Figure 7a is a table 146 which shows examples of shape codes and associated confidence levels for a recognized bitmap character. For example, the bitmapped character "O" was segmented and sent to the recognizer as CHAR(C) in step 128 of Figure 7. The recognizer analyzes the bitmapped character and outputs a label list such as the one shown in Figure 7a. The shape codes are character labels 148 which represent the shape of the recognized character. For each shape code a confidence level 150 is associated which indicates how close the bitmapped character is to that label in the recognizer's analysis. In the example of Figure 7a, the character label "O" has the greatest confidence level at 95%. Character labels "C" and "Q" have much lower confidence levels. The implied characters 152 are the possible characters represented by character labels 148. Character label "C" can represent two possible characters, "C" and "c." Character label "Q" represents only one possible character, "Q."
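The label list of Figure 7a pairs each shape-code label with a confidence level and the characters it implies. A sketch of that structure, using the values shown in the table (the confidence levels for "C" and "Q" are illustrative, since the text says only that they are much lower):

    from dataclasses import dataclass

    @dataclass
    class ShapeLabel:
        shape_code: str     # general shape, not a specific character
        confidence: float   # recognizer confidence, in percent
        implied: tuple      # possible characters the shape code represents

    # Label list for the bitmapped character "O", per the example of Figure 7a.
    label_list = [
        ShapeLabel("O", 95.0, ("O", "o", "0")),
        ShapeLabel("C", 10.0, ("C", "c")),   # illustrative confidence
        ShapeLabel("Q", 5.0, ("Q",)),        # illustrative confidence
    ]
    top = max(label_list, key=lambda lab: lab.confidence)  # the "top" label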
Referring back to Figure 7, steps 126 and 128 are implemented for each segmented character until all characters in the raw bitmap have been analyzed by the recognizer. The process then continues to step 130, in which a word counter variable "W" is set to one and W is compared to NWORDS, which is the number of words which have been segmented in the raw bitmap in step 82 of Figure 5. If "W" is less than NWORDS, step 132 is implemented, in which a word recognizer performs recognition on WORD(W) (i.e. a segmented word bitmap) using the component character list and other procedures to produce a word list (label list) having a confidence level for each coded label. All of the segmented words of the input bitmap are assigned a label list with confidence levels regardless of the values of the confidence levels.
Word recognition typically involves generating possible character sequences (i.e., coded word labels) determined by the component character labels and assigning a confidence level to each such sequence. One way of determining word label confidence levels is by a three-step process. First, the confidence levels of the component characters are adjusted according to the character's conformance with local page geometry. Local page geometry includes, for example, the character's position relative to a baseline, an x-height line based on the top end of mid-height characters of the surrounding text, and a cap-height line based on the top end of higher characters of the surrounding text. Second, the adjusted confidence levels of the component characters are combined (e.g., via multiplication if confidence levels are given as probabilities) to yield a preliminary word label confidence level. Finally, the preliminary word label confidence level is adjusted according to the degree to which the word label conforms with various predefined lexical constructs. Predefined lexical constructs include lexicons (word lists), common character patterns (e.g., phone numbers or dates), and character sub-sequence probabilities (e.g., bigrams, trigrams, and n-grams, i.e., combinations of 2, 3, or n characters). Word recognizers which can produce word labels and word label confidence levels by this and other methods are well-known to those skilled in the art. For example, the aforementioned recognizer Recore sold by Ocron, Inc., is suitable for both the character and word recognition of the present invention.
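The three-step confidence computation described above can be sketched as follows. This is an illustration only: the geometry and lexical adjustment factors are assumed inputs, and confidences are treated as probabilities so that the second step is a product:

    def word_label_confidence(char_confidences, geometry_factors, lexical_factor):
        """(1) Adjust each component character's confidence for conformance
        with local page geometry; (2) combine the adjusted confidences by
        multiplication (confidences taken as probabilities) to obtain a
        preliminary word confidence; (3) adjust for predefined lexical
        constructs (lexicons, common patterns, n-gram statistics)."""
        preliminary = 1.0
        for conf, geom in zip(char_confidences, geometry_factors):
            preliminary *= conf * geom       # steps 1 and 2
        return preliminary * lexical_factor  # step 3

    # E.g., four characters each at 0.95 with neutral geometry, for a word
    # found in a lexicon (boost factor 1.05 is illustrative):
    conf = word_label_confidence([0.95] * 4, [1.0] * 4, 1.05)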
Figure 7b shows an example of a word label list 170 including word labels 166 and associated confidence levels 168 for the word bitmap "Open." The complete list (not shown) includes a coded word label entry for each combination of characters implied by the shape code labels of the four component characters ("O," "p," "e," and "n"). Figure 7a shows three recognized character labels and implied characters for the first character ("O") for a total of six possible characters (including all implied characters). If there were six possible characters for each of the four characters in the word "Open," there would be 6^4, or 1296, character sequences in the label list 170. The seventh entry in list 170 ("Open") has the highest confidence level of the labels shown.
Referring to Figure 7, in next step 134, a number of label entries for WORD(W) and their associated confidence levels are saved in a hybrid data structure. In the presently described embodiment, all the word labels having a confidence level above a predetermined "storage threshold" are stored in the hybrid data structure. Thus, in the example of Figure 7b, if the storage threshold level were 50 or greater, then four labels from the shown label list 170 would be stored in the hybrid data structure. In other embodiments, different amounts of word labels for WORD(W) can be stored in the hybrid data structure. For example, if the hybrid data structure were being stored in a format which only allowed one label to be stored (described with reference to Figure 11), then the word label having the greatest confidence level would be saved in the hybrid data structure. Alternatively, only labels having confidence levels greater than the recognition threshold can be stored in the hybrid data structure. Depending on the highest confidence level of the stored labels and the recognition threshold, some words in the hybrid data structure are considered identifiable objects, and some are considered non-identifiable objects. The hybrid data structure can be stored in memory, saved as a file on disk, etc. The word labels are preferably stored as coded data in a standard format such as ASCII, PostScript, etc.
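Step 134 thus amounts to filtering the label list against the storage threshold before writing it to the hybrid data structure. A sketch, with a hypothetical fragment of list 170 in which four labels survive the example threshold of 50:

    STORAGE_THRESHOLD = 50  # the example threshold discussed in the text

    def labels_to_store(label_list, threshold=STORAGE_THRESHOLD):
        """Keep only labels whose confidence exceeds the storage threshold,
        ordered so that the top label comes first."""
        kept = [(word, conf) for word, conf in label_list if conf > threshold]
        return sorted(kept, key=lambda entry: entry[1], reverse=True)

    # Hypothetical entries for the word bitmap "Open" (values are assumptions):
    list_170 = [("Open", 92), ("0pen", 71), ("Qpen", 64), ("open", 55), ("Cpen", 31)]
    stored = labels_to_store(list_170)  # four labels survive a threshold of 50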
In step 136, the coordinates of the lexical unit corresponding to WORD(W) are saved. These coordinates refer to the lexical unit (word bitmap) of the raw bitmap that corresponds to WORD(W) with reference to the layout of the page. For example, horizontal and vertical coordinates can be stored for opposite points (or all four points) of the bounding box surrounding the word bitmap. Bounding boxes are described in greater detail with respect to Figure 9. These coordinates can be saved in the hybrid data structure with the associated label entries of WORD(W). The microprocessor can use the coordinates to find the lexical unit of the raw bitmap that corresponds to WORD(W) and know the size of this lexical unit. These coordinates can be considered "links", since they link a coded word to its corresponding portion of the raw bitmap. Other types of links can also be used, such as pointers. The links can be useful to, for example, highlight a word bitmap that has been matched to a search word, as described with reference to Figure 11. The coordinates can also be used to display the word bitmap corresponding to a word that is being edited in the editing window of Figure 12a. These coordinates are also useful in the embodiment of the present invention where all individual raw word bitmaps are displayed by raising the recognition threshold confidence level above 100%, as described with reference to Figure 10. In another embodiment, where only unrecognized (non-identifiable) words have corresponding raw bitmaps displayed, the coordinates of WORD(W) can be used to display a word's non-coded bitmap at its correct location on a page if the word is considered unrecognized.
Next, optional step 138 is implemented. In the first embodiment, the entire raw bitmap is displayed regardless of whether words are recognized or unrecognized (the raw bitmap was saved with respect to Figure 4). Thus, individual word bitmaps do not have to be saved, and step 138 is not needed. However, in the alternate embodiments, word bitmaps can be saved here. For example, in the embodiment in which all words have an associated word bitmap displayed, the microprocessor can optionally save the bitmap for WORD(W) in step 138. Likewise, in the embodiment where word bitmaps for only unrecognized words are displayed, the microprocessor can save the individual non-coded bitmap of WORD(W) in step 138 if the confidence level of the top word label entry for WORD(W) is less than the recognition threshold confidence level (this is the step 138 shown in Figure 7). The "top" word label entry is the label having the greatest confidence level in the label list. Thus, in the example of Figure 7b, the label "Open" would be the top label entry. In the presently described embodiment, the recognition threshold confidence level is user-selectable, and the default threshold value is 90. The confidence level of the top word label "Open" is greater than this recognition threshold, so WORD(W) is considered to have been recognized as the word "Open" and is an identifiable object. The input word bitmaps of identifiable objects are not saved in such an embodiment. Alternatively, the identifiable object input bitmaps can be saved for a later process; in the first embodiment, the original raw bitmap and portions thereof are available, for example, for editing purposes (e.g., the entire raw bitmap is stored in memory or on a storage device). If the top label's confidence level were below the threshold value, then WORD(W) would be considered "unrecognized" (a non-identifiable object), and the non-coded word bitmap of WORD(W) would be saved in step 138.
The non-coded data (word bitmap) can be saved directly in the hybrid data structure of identifiable and non-identifiable objects. Alternatively, the non-coded data can be stored in a separate file or other storage area, and the storage location of the non-coded data can be stored in the hybrid data structure. This allows non-coded data to be easily accessed whenever the hybrid data structure is displayed or printed.

Once step 138 is implemented, the process returns to step 130 to process another segmented WORD(W). When all the segmented words of the input raw bitmap have been recognized or saved as bitmaps in steps 130-138, the process preferably implements step 140, in which the font attributes of the entire raw bitmap are recognized. Each recognized (identified) word is assigned a typeface which determines how the characters of the recognized word appear when displayed. This step is described in greater detail with reference to Figure 8. In the first embodiment, all the coded words are displayed in non-coded bitmap form regardless of their confidence level, and thus do not require font attributes. However, the font attributes are still recognized in step 140 so that a coded word can be rendered and displayed in an editing window, as described with reference to Figure 12.
Step 141 is implemented to compute size adjusting transforms for the identified words after font attributes have been assigned to each identified word. Again, in the first embodiment, this is accomplished so the coded words can be accurately rendered and viewed in an editing window as in Figure 12a. In step 141, each identified word is rendered in memory using the appropriate typeface and size assigned in step 140. The size of the rendered word is compared to the size of the original bitmap of the word. If the size difference is not within a predetermined tolerance, a scale adjustment is computed and stored with the identified word in the hybrid data structure. This process is described in greater detail with reference to Figure 9. The process is then complete as indicated at 142.
Figure 8 is a flow diagram illustrating step 140 of Figure 7, wherein font attributes of the raw input bitmap are recognized. The process begins at 210, and, in step 212, a page counter variable "P" is initialized to one and P is compared to NPAGES, which is the total number of pages in the raw bitmap (known from step 82 of Figure 5). If P is less than or equal to NPAGES, then step 214 is implemented, in which a character counter variable "C" is set to 1 and C is compared to TOTCHAR, the total number of recognized characters on the currently-examined page of the raw bitmap. "Recognized" characters, as described herein, are those character labels having a confidence level above a character recognition threshold. The character recognition threshold is preferably set at a high level, since only accurately-recognized characters should be used for font measurement. (If not enough characters have a confidence level above the threshold, the threshold can be lowered until enough characters qualify.) If C is less than or equal to TOTCHAR, then step 216 is implemented, in which the microprocessor checks if CHAR(C), a recognized character label, is the selected character type. The selected character type is a certain character, such as "a," "g," etc., which is to be measured for font features. The order of characters which are selected can be determined from a predetermined ordered list of characters. The ordered list can be designed so that characters which are easily measured for font features are positioned at the top of the list. If enough characters having the same font are measured, then the font can be determined and characters near the bottom of the ordered list do not have to be measured. If CHAR(C) is not the selected character type, then the process increments C in step 214 and the new character label is checked if it is the selected type in step 216. If CHAR(C) is the selected type, then step 218 is implemented. In step 218, a number of font features are measured for the portion of the raw bitmap corresponding to CHAR(C).
Figure 8a is a diagrammatic illustration showing a character 236 of the raw input bitmap. Font-related features such as stem width 238 can be measured in various places to determine which font type the character belongs to. Other features can also be measured, such as character height and width 240, x-height, optical density, italic angle, serif type, etc. Some of the measurements can be specific to the type of character. For example, a "t" may need measurements of certain stem areas, while an "a" character may need different measurements along the enclosed portion of the character. Often, irregularities 242 are present due to scanning errors or other errors propagated in the process of creating an input bitmap. However, if several characters of the same type are measured, these irregularities are averaged out.
Referring back to Figure 8, after step 218, the microprocessor returns to step 214 to increment C and measure the font features for the next character in the raw bitmap. Once all the bitmap portions corresponding to recognized characters have been measured, the process continues to step 220, in which the characters on page P are partitioned into font "clusters." Figure 8b shows a graph 244 of groups or clusters of font attributes which have been measured for one type of character (for example, a "t"). Axes 245 are "feature axes" in that they represent common features (dimensions, thicknesses, etc.) that are measured for all characters of a character type. For example, the two-dimensional graph 244 shows one axis representing the height dimension of the character and the other axis representing the width dimension of the character. These are only two of many possible features that can be measured and compared; other features (stem width, italic angle, etc.) can also be compared. Data points 246 represent characters having specific measurements on graph 244. Characters which have similar measurements are grouped in clusters 247. For example, cluster C1 designates a number of measured "t" characters which are very close to each other in height and width (and other features not shown). Therefore, they most likely have the same font type and are clustered together. Likewise, cluster C2 represents "t" characters that have similar measurements, but different measurements from cluster C1, which are grouped to represent a single font type.
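The clustering of step 220 can be approximated by any simple distance-based grouping of the feature measurements. The greedy scheme below is an assumption, not the specification's algorithm: a character joins an existing cluster when its features lie within a tolerance of the cluster mean, and otherwise starts a new cluster:

    def cluster_features(measurements, tolerance=2.0):
        """Greedily group per-character feature vectors (e.g. height, width).
        Returns a list of clusters, each a list of feature points."""
        clusters = []  # each entry: [component_sums, count, points]
        for point in measurements:
            for cluster in clusters:
                mean = [s / cluster[1] for s in cluster[0]]
                if all(abs(p - m) <= tolerance for p, m in zip(point, mean)):
                    cluster[0] = [s + p for s, p in zip(cluster[0], point)]
                    cluster[1] += 1
                    cluster[2].append(point)
                    break
            else:
                clusters.append([list(point), 1, [point]])
        return [c[2] for c in clusters]

    # E.g., (height, width) of measured "t" characters (illustrative values):
    points = [(20.1, 9.8), (20.3, 10.0), (26.0, 12.1), (25.8, 12.0)]
    c1, c2 = cluster_features(points)  # two font clusters, as in Figure 8b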
Referring back to Figure 8, once the clusters have been organized in step 220, step 226 is implemented, in which characters on the current page are examined for new font types which have not been found previously. Herein, a "font type" refers to a particular typeface (presently unassigned) to which a word belongs, and a font type number references a particular font type. New font types are found by measuring characters as described above in steps 214-220 and comparing the measurements to the measurements taken for font types already found. If the new measurements are equivalent to measurements already taken, then the font type is not new, and the process continues to step 230. If the new measurements are different from measurements already taken, then a new font type has been found, and the process continues to step 228. In step 228, a font type number is created for each cluster of measurements organized in step 220 or in step 226. The process then continues to step 230.
In step 230, a font type is assigned to each recognized word label on the page. In alternate embodiments, font types can be assigned to character labels. Each font type corresponds to a font type number derived from the measurements of the characters. A font type is assigned to a word label by examining the characters of the word and determining which cluster includes one or more of those characters. Only the high-confidence (recognized) characters of the word are examined. Thus, in the example of Figure 7b, the recognized word "Open" is assigned a font type number by examining one or more of the recognized characters of the word, such as "O," and determining in which cluster that character is included. The font type number for that cluster is then assigned to the word. If no high-confidence characters are present in a word, then the font type of an adjacent word can be assigned to the word.
In step 232, each font type number created is assigned to a typeface from a library of typefaces. In the described embodiment, several typefaces are stored in memory or on disk. These typefaces are standard typefaces, for example, Times®, Helvetica®, and other typefaces available from Adobe Systems, Inc. Each font type number is assigned a typeface by comparing the measurements of the font type (the cluster) with known measurements and characteristics which describe the standard typefaces. A font type number is assigned a standard typeface having the closest measurements to the averaged measurements of the font type. A typeface is assigned to each font type number in a similar fashion until all the recognized words on the present page of the hybrid data structure can be associated with a standard available typeface. The process then returns to step 212, where the page variable "P" is incremented and fonts are similarly measured and assigned on the following page. When all the pages have been examined, the process is complete as indicated in step 234.
When storing the typeface and other font attributes for each identified word, different methods can be used. If the identified word is stored as ASCII character codes in the hybrid data structure, then a typeface and font description can be stored as a "font tag" with the ASCII characters. Storing character codes with a font tag is well known to those skilled in the art. If the identified word is stored in a more font-oriented code language such as PostScript, the typeface and other font attributes for the word can be specified and stored within the rules of the language, as is well-known to those skilled in the art.
Figure 9 is a flow diagram illustrating step 141 of Figure 7, in which size adjusting transforms are computed for identified words. The process begins at 174. In step 175, the height (h1) and width (w1) of the bounding box of the original, non-coded bitmap for the word are determined. A bitmap's bounding box is the smallest rectangle (aligned with the baseline) that completely surrounds the bitmap. The coordinates of the bounding boxes for all the segmented lexical units are available from the storage area where they were stored when the input bitmap was originally segmented (step 82 of Figure 5). In step 176, the word's top confidence label is used to render a bitmap in memory in the word's assigned typeface and size; then the height (h2) and width (w2) of the rendered bitmap's bounding box are determined.
Figure 9a is a diagrammatic illustration showing the original, non-coded word bitmap 192 and the rendered bitmap 194 derived from the top label in label list 170. The bounding box 196 of the original bitmap and the bounding box 198 of the rendered bitmap are also shown. These bounding boxes are compared as shown in diagram 200, where the height h1 and width w1 are the dimensions for the bounding box of the original bitmap, and the height h2 and width w2 are the dimensions for the bounding box of the rendered bitmap.
Referring back to Figure 9, in step 178, the relative error between the width w1 of the bounding box of the original bitmap and the width w2 of the bounding box of the rendered bitmap is calculated as "Ew." Similarly, the relative error between the heights h1 and h2 of the bounding boxes of the original and rendered bitmaps is calculated as "Eh."
Steps 180 and 182 check if the dimensions of the rendered bitmap's bounding box fall within an acceptable range. If the relative errors are too great, the original bitmap 192 of the word is used. These steps are illustrated in graph 202 of Figure 9b. If the absolute value of Ew is less than a first threshold value for the width (Tw1), and if the absolute value of Eh is less than a first threshold value for the height (Th1), then the relative error is adequately small and no further processing is required (Tw1 and Th1 are shown as T1, and Ew and Eh are shown as E, in Figure 9b). In the described embodiment, the values used for Tw1 and Th1 are 0.05 and 0.05, respectively. The process is then complete as indicated in step 186. If either or both of |Eh| and |Ew| are greater than their corresponding T1 values in step 180, then the process continues to step 182. In step 182, if the absolute value of Ew is greater than a second threshold value for the width (Tw2), or if the absolute value of Eh is greater than a second threshold value for the height (Th2), then the dimensions of the rendered bitmap are considered to be too different from the original bitmap to be adjusted, and the process continues to step 184 (Tw2 and Th2 are shown as T2 in Figure 9b). In the described embodiment, the values used for Tw2 and Th2 are 0.20 and 0.20, respectively. In step 184, the confidence level for the topmost label of WORD(W) is set to a level less than the threshold confidence level and the non-coded bitmap of WORD(W) is saved in the hybrid data structure. Thus, WORD(W) becomes a non-identifiable object instead of an identifiable object, i.e., the non-coded bitmap for WORD(W) should be displayed instead of displaying the top label in the associated label list, since WORD(W) is outside the acceptable range of sizes. The process is then complete as indicated at 186. Step 184 is not implemented in the first embodiment, in which the entire raw bitmap is displayed.
If both of |Eh| and |Ew| are less than their corresponding T2 values in step 182, then the process continues to step 188. In step 188, if Ew and Eh are both less than zero, then the rendered bitmap is slightly smaller than the original bitmap and no size adjustment is required. If either Ew or Eh is positive, step 190 is implemented, in which horizontal and vertical scale factors are computed and stored for WORD(W). Whenever WORD(W) is to be rendered and displayed (such as in an editing window), the scale factors adjust the rendered word label to the corresponding size of its original bitmap image. Storing scale factors requires much less space than storing a size-adjusted bitmap. The process is then complete as indicated in step 186.
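The decision logic of steps 178-190 reduces to comparing the two relative errors against T1 and T2. A sketch using the thresholds stated in the text (0.05 and 0.20); the definition of the relative errors as (rendered - original) / original is an assumption consistent with the sign convention of step 188:

    T1, T2 = 0.05, 0.20  # thresholds of the described embodiment

    def size_adjustment(w1, h1, w2, h2):
        """Compare original (w1, h1) and rendered (w2, h2) bounding boxes.
        Returns 'ok', 'reject' (treat the word as non-identifiable), or
        ('scale', sx, sy) carrying the scale factors of step 190."""
        ew = (w2 - w1) / w1  # relative width error (assumed definition)
        eh = (h2 - h1) / h1  # relative height error (assumed definition)
        if abs(ew) < T1 and abs(eh) < T1:
            return "ok"                     # step 180: error adequately small
        if abs(ew) > T2 or abs(eh) > T2:
            return "reject"                 # step 184: too different to adjust
        if ew < 0 and eh < 0:
            return "ok"                     # step 188: slightly smaller, no adjustment
        return ("scale", w1 / w2, h1 / h2)  # step 190: scale rendered word to bitmap size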
In an alternate embodiment, other lexical units (text line, text block, page, etc.) in the hierarchy can be rendered and compared to the corresponding portion of the original bitmap as described above. Scale factors can be computed and stored at this time for those lexical units.
Figure 10 is a flow diagram illustrating step 72 of Figure 4, in which the hybrid data structure of identifiable and non-identifiable objects is displayed. The display process of Figure 10 is only for displaying the hybrid data structure; if the user wishes to edit the data structure, then the editing process of Figure 12 is preferably used. The display process starts at 250. In step 251, the current "mode" is determined. For example, in the first embodiment, the entire raw bitmap (or a portion thereof) can be displayed for all the displayed words on a document, regardless of confidence level. This embodiment can also be implemented as a mode, selectable by the user. The user could thus also select the alternative mode in which recognized words are rendered and displayed and unrecognized words are displayed as their corresponding respective portions of the raw bitmap.
If the complete raw bitmap display mode or embodiment has been selected or is being used, then step 253 is implemented, in which the layout correction transform of Figure 5 is applied to the stored raw bitmap to produce a transformed raw bitmap. The raw bitmap has thus been corrected for geometric distortion. In next step 255, a selected page of the transformed raw bitmap is retrieved and displayed on an output device such as a display screen or printer device. The original and transformed raw bitmaps are preferably stored in memory or by a different storage device such that pages of the bitmap can be retrieved and displayed. It is assumed in step 255 that an entire page is being displayed on the output device; if only part of a page, or more than one page, is to be displayed on the output display device, then the appropriate portions of or number of pages of the original bitmap are displayed. Also, one or more bitmap pages can be retrieved at one time while only a portion of the retrieved pages is displayed. The "selected" page is the page that the user has selected to be displayed. The selected page can also be a default page, such as the first page of a document when the document is first loaded from a storage device. The process is then complete as indicated in step 264. If the user selects a different page of the raw bitmap, then step 255 is again implemented for the selected page.
Alternatively, the complete page display of steps 253 and 255 can be equivalently implemented using the steps 252 and 254-262, described below. In such an embodiment, the threshold confidence level of step 254 can be temporarily set (i.e., set only for the display process) to a value higher than 100% (e.g. 101%), so that none of the words in the hybrid data structure have confidence levels greater than the threshold confidence level. This allows step 262, in which the bitmap for a word is displayed, to be performed for every word in the hybrid data structure, thus providing an all-bitmap display of the words.
If the complete bitmap display has not been selected at step 251, then the user desires the alternate embodiment of rendering and displaying recognized coded data and displaying individual word bitmaps. In step 252, word counter variable "W" is initialized to 1 and W is compared to NWORDS, which is the number of words (both identifiable and non-identifiable) in the hybrid data structure (if the entire hybrid data structure is to be displayed). If only a portion of the hybrid data structure is displayed, then NWORDS can be the number of words in the displayed portion. If W is less than NWORDS, then step 254 is implemented, in which the CPU checks if the confidence level for WORD(W) is greater than the threshold confidence value. If so, then WORD(W) is an identifiable word, and step 256 is implemented. In step 256, the coded top label of WORD(W) is rendered and displayed at the location specified by the coordinates of WORD(W) according to the assigned typeface and at the appropriate scale factor for WORD(W). In the described embodiment, the microprocessor, or a related processor such as a renderer chip, renders character outlines corresponding to characters of WORD(W) into a bitmap for display according to the known characteristics of the word and the typeface assigned to the word. Once WORD(W) has been rendered and displayed, the process returns to step 252, where W is incremented and the next word is processed.
If the confidence level for WORD(W) is not greater than the threshold confidence value in step 254, then WORD(W) is not an identifiable (recognized) object; it is a non-identifiable (unrecognized) object. Step 260 is then implemented, in which the layout correction transformation is applied to the stored non-coded bitmap which is associated with the unrecognized word (or other object) to produce a transformed bitmap. The transformed bitmap has thus been corrected for geometric distortion.
In step 262, the microprocessor displays the transformed bitmap as a non-coded raster image at the coordinates of WORD(W). The non-coded image can be displayed on a display screen 16, a printer 18, or other suitable output device; the bitmap can also be scaled appropriately for the given output device resolution. The process then returns to step 252 to increment counter W and display the next word in the hybrid data structure. Once all objects in the hybrid data structure (or in a designated portion of the hybrid data structure) have been displayed either as coded or non-coded images, the process is complete as indicated in step 264.
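The per-word loop of steps 252-262 is essentially a threshold test that chooses between rendered coded text and the corrected bitmap fragment. A sketch in which the rendering, drawing, and layout-correction routines are assumed callbacks supplied elsewhere:

    def display_page(words, threshold, render_word, draw_bitmap, correct_layout):
        """For each word record of the hybrid data structure: render the coded
        top label if the word is recognized (step 256); otherwise display the
        layout-corrected non-coded bitmap at the word's coordinates (steps
        260-262). The word record fields are assumptions."""
        for word in words:
            top = word["labels"][0]  # top label entry (highest confidence)
            if top["confidence"] > threshold:           # step 254
                render_word(top["text"], word["coords"],
                            word["typeface"], word.get("scale", (1.0, 1.0)))
            else:
                fixed = correct_layout(word["bitmap"])  # step 260
                draw_bitmap(fixed, word["coords"])      # step 262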
Figure 11 is a flow diagram illustrating step 74 of Figure 4, in which the hybrid data structure is searched. The process begins at 270, and, in step 272, a word (or a number of words, i.e., a phrase) is obtained which is to be used as the search criteria. This search word is typically entered by the user from keyboard 20, or it may be loaded from a file, another computer, an input device, etc. Other objects can also be used as search criteria in other embodiments. In step 274, the variable HIT is initialized to zero. Variable HIT indicates how many instances of the search word have been found in the hybrid data structure. In step 276, a word counter variable "W" is initialized to 1 and is compared to the number of words NWORD in the hybrid data structure. If W is less than NWORD, step 278 is implemented, in which the microprocessor checks if the search word is approximately equal to the top label entry (label with highest confidence level) for WORD(W). Thus, in the described embodiment, even unrecognized words (or other non-identifiable objects) are compared to the search word even though the unrecognized words do not have confidence levels above the recognition threshold level described in Figures 7 and 10. In other embodiments, other labels in label list 170 can be searched to find a match to the search word. The term "approximately equal" refers to the search word differing in minor ways from WORD(W), such as when an unrecognized word has one or two letters, for example, different from
the search word, yet still being equivalent for search purposes. Other examples of words being approximately equal include words having uppercase or lowercase letters, words with different suffixes such as "-ing", plural and non-plural forms of a word, etc.
If the search word is not approximately equal, then the process returns to step 276 to increment W and examine the next WORD(W). If the search word matches (is approximately equal to) WORD(W), then step 280 is implemented, in which the variable HIT is incremented. In step 282, the variable HITARRAY(HIT) is set equal to the value of word counter variable W and the coordinates of WORD(W) so that the found word can be displayed or highlighted if necessary. For example, the data structure is preferably displayed as a complete raw bitmap, as shown in step 255 of Figure 10. The data structure can then be searched for a search word as described above. If a coded word (WORD(W)) is matched to the search word, the bitmap word corresponding to the coded word can be highlighted on a screen if it is currently being displayed. The coordinates of WORD(W) are used to highlight the word bitmap that is being displayed. A matched coded word can also be displayed/highlighted in the editor described with reference to Figure 12. The process then returns to step 276 to increment W and examine the next WORD(W). When all words in the document have been searched, the process is complete as indicated in step 284.
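The "approximately equal" comparison can be sketched with case folding, simple suffix handling, and a small edit-distance tolerance. The one- or two-letter tolerance comes from the text's example; everything else below is an assumption, since the patent does not fix an algorithm:

    def edit_distance(a, b):
        """Standard dynamic-programming Levenshtein distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                               prev[j - 1] + (ca != cb)))
            prev = cur
        return prev[-1]

    def approximately_equal(search_word, label, max_errors=2):
        """True when the label differs from the search word only in minor
        ways: letter case, a suffix such as "-ing" or a plural "s", or at
        most one or two letters (per the examples in the text)."""
        a, b = search_word.lower(), label.lower()
        for suffix in ("ing", "s"):
            if a.endswith(suffix):
                a = a[: -len(suffix)]
            if b.endswith(suffix):
                b = b[: -len(suffix)]
        return edit_distance(a, b) <= max_errors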
In alternate embodiments, the search process as described above can also be implemented in separate application programs which have their own search functions and are well-known to those skilled in the art. For example, Acrobat® by Adobe Systems, Inc., allows a user to search an electronic document having a variety of formats and object types.
In alternate embodiments, the stored hybrid data structure can be adapted to conform with a pre-existing "standard" format for document storage. For example, Acrobat includes a Portable Document Format (PDF). The hybrid data structure can be stored in this format by storing recognized word labels as coded text (e.g., in PostScript) and unrecognized word labels as both non-coded bitmaps and as "invisible" text to enable searching. That is, the unrecognized word objects (i.e., the word label in a list having the highest confidence) can be displayed, but have the same color as the background to appear "invisible" and allow the non-coded bitmap to be displayed over the unrecognized words. The invisible word objects can still be compared to the search word and located by an error-tolerant search mechanism, and/or edited if desired (described below).
Figure 12 is a flow diagram illustrating step 76 of Figure 4, in which the hybrid data structure is edited to reduce the number of non-identifiable objects in the structure. The method can be implemented with an editor text window (shown in Figure 12a) which always displays coded word labels, regardless of the label's confidence level, and does not display non-coded bitmaps. Displayed words are highlighted according to the relation of each word label's confidence level to a user-controlled display threshold confidence level. The display threshold level allows the user to preview the results when the hybrid data structure's confidence threshold is changed. It also allows the user to optimize the number of words that need examination for possible error correction.
The process starts at 288. In step 290, a display threshold value is set by the user. In step 292, the selected page is displayed, preferably on a display screen. The coded data of the hybrid data structure is rendered and displayed in this process; the raw bitmap is not displayed. The user selects which page, or portion of a page, he wishes to view. The user can specify this in an interface, such as the interface described below with respect to Figure 12a. In step 294, the microprocessor highlights the displayed words which have a top label entry that has a confidence level below the display threshold level. Highlighting can mean displaying a word in inverse video (e.g. white letters on a colored or shaded
background) or displaying the word in a distinct or different color, font, etc. The user can thus identify at a glance which words have a confidence level below the display threshold level. At an optimal display threshold value, most of the highlighted words contain recognition errors and all of the unhighlighted words are correctly recognized. The user can preferably change the display threshold level to a desired level.
In step 296, the user selects a word which he wishes to edit. This can be accomplished in several ways, such as using a keyboard, a mouse, stylus, etc. In step 298, the processor displays the original bitmap portion corresponding to the word and a portion of the surrounding area of the original bitmap, preferably in a separate window. In addition, a "pop up menu" of all or some of the label entries associated with the selected word which were stored in the hybrid data structure are displayed in the separate window (or a different window). The user thus is permitted to view the stored guesses made by the recognizer for a word. In step 300, the user edits the top label entry of the selected word to correct a misrecognized word. The user can simply type in the correct word, which can often be surmised by looking at the displayed original bitmap image of the word. The user can also select one of the other label entries in the displayed list of label entries, which will substitute the top label entry with the selected label entry. After the user has changed a word, the top label entry for that word is automatically assigned a confidence level of 100%. Once the word has been edited, the microprocessor checks if the user is finished editing in step 302. If not, the process returns back to step 292 to display a selected portion of the hybrid data structure. If the user is finished, the process is complete as indicated in step 304.
Figure 12a is a diagrammatic illustration of a screen display showing an editing interface 310 of the described embodiment. This interface is displayed by the display manager which controls the display, edit, and search functions. Editing window 312 is used to display pages or portions of pages of coded data of the hybrid data structure to the user to view. Text 314 includes all words of the hybrid data structure in their word label (coded) form. Words 316 have confidence levels below the display threshold level and are highlighted to indicate that they may contain errors. Word 318 is both highlighted as a low-confidence word and is also highlighted as a word currently selected by the user (words 316 and 318 can be displayed as different colors, patterns, etc.). The associated original image 320 from the vicinity of word 318 in the input raw bitmap is displayed in window 322. In the described embodiment, a displayed label list 324 including all the stored label entries for selected word 318 is shown in window 326. In an alternate embodiment, the user can select how many of the stored labels are displayed in list 324 (if more than one label is stored in the hybrid data structure). A confidence level for each label entry in list 324 can also be displayed if desired by the user.
While this invention has been described in terms of several preferred embodiments, it is contemplated that alterations, modifications and permutations thereof will become apparent to those skilled in the art upon a reading of the specification and study of the drawings. Furthermore, certain terminology has been used for the purposes of descriptive clarity, and not to limit the present invention. It is therefore intended that the following appended claims include all such alterations, modifications and permutations as fall within the true spirit and scope of the present invention.

Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refer to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Event History, Maintenance Fee and Payment History, should be consulted.

Event History

Description Date
Time Limit for Reversal Expired 2002-08-30
Application Not Reinstated by Deadline 2002-08-30
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2001-08-30
Application Published (Open to Public Inspection) 1996-03-01

Abandonment History

Abandonment Date Reason Reinstatement Date
2001-08-30

Maintenance Fee

The last payment was received on 2000-08-04

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type Anniversary Year Due Date Paid Date
MF (application, 2nd anniv.) - standard 02 1997-09-02 1997-07-30
MF (application, 3rd anniv.) - standard 03 1998-08-31 1998-08-18
MF (application, 4th anniv.) - standard 04 1999-08-30 1999-08-05
MF (application, 5th anniv.) - standard 05 2000-08-30 2000-08-04
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ADOBE SYSTEMS, INC.
Past Owners on Record
DAVID M. EMMETT
DENNIS G. NICHOLSON
JAMES C. KING
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



List of published and non-published patent-specific documents on the CPD.



Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Representative drawing 1998-02-12 1 12
Description 1996-02-29 41 1,952
Cover Page 1996-04-16 1 17
Drawings 1996-02-29 16 354
Abstract 1996-02-29 1 36
Claims 1996-02-29 10 348
Courtesy - Abandonment Letter (Maintenance Fee) 2001-09-26 1 185
Reminder - Request for Examination 2002-04-30 1 118
Courtesy - Office Letter 1995-10-17 1 14
Courtesy - Office Letter 1996-03-10 1 23
PCT Correspondence 1996-08-29 1 15