Language selection

Search

Patent 1256214 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 1256214
(21) Application Number: 1256214
(54) English Title: METHOD AND APPARATUS FOR VECTORIZING DOCUMENTS AND FOR SYMBOL RECOGNITION
(54) French Title: METHODE ET APPAREIL DE VECTORISATION DE DOCUMENTS ET DE RECONNAISSANCE DE SYMBOLES
Status: Term Expired - Post Grant
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04N 1/41 (2006.01)
(72) Inventors :
  • GROVER, DAVID N. (United States of America)
  • KLECA, EUGENE A. (United States of America)
  • LIPKIE, CURTIS A. (United States of America)
(73) Owners :
  • ANA TECH CORPORATION
(71) Applicants :
  • ANA TECH CORPORATION
(74) Agent: NORTON ROSE FULBRIGHT CANADA LLP/S.E.N.C.R.L., S.R.L.
(74) Associate agent:
(45) Issued: 1989-06-20
(22) Filed Date: 1984-04-12
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data: None

Abstracts

English Abstract


Abstract
An apparatus codes a scanned document locally representing
each graphic element within a prescribed measure of accuracy by
a trapezoidal approximation. The invention also includes a
method for similarly coding a scanned document. The apparatus
determines whether each scanned run is indicative of a .gamma.- or
.lambda. -junction, the termination of an old, or the commencement of
a new graphic element, and whether a new linear approximation
is necessary. The invention in a preferred embodiment recognizes
symbols by determining the center of mass and maximum extremity of
a symbol candidate, and comparing it to a reference library
after normalizing with respect to scale, orientation and center of
mass. In a preferred embodiment, an adaptive threshold parameter
governs coding so as to reject noise and optimize a pair of linear
predicters in a small number of scans. In a further preferred
embodiment the accuracy of the linear predicters is refined so
that the error is exponentially bounded.


Claims

Note: Claims are shown in the official language in which they were submitted.


The embodiments of the invention in which an exclusive
property or privilege is claimed are defined as follows:
1. An apparatus for coding a graphic image from
raster digital signals defining light intensities of the
graphic image along sequential scan lines, such apparatus
comprising:
data compression means for extracting coordinates
of a run of like-valued signals along a scan line;
correlating means for correlating coordinates for
a given run from the data compression means with a new or
previously scanned vector;
vector memory means for storing data pertaining to
each vector with respect to which the correlating means has
correlated run data;
processing means, operative on coordinates of the
given run and on data read from the vector memory means
related to a correlated vector, for determining whether the
given run's coordinates define a continuation of the
correlated vector to within a prescribed degree of accuracy;
and
means for clearing the data related to the
correlated vector from the vector memory means when the
processing means determines that the given run's coordinates
do not represent a continuation of the vector to within the
prescribed degree of accuracy and for causing data corres-
ponding to the correlated vector to be given as an output.
2. An apparatus according to claim 1, wherein each of
the vectors is a trapezoid, and wherein (i) each of the two
parallel sides of each trapezoid in the approximation is
coincident with a scan line, (ii) the other two sides of
each trapezoid are not necessarily parallel to one another,
and (iii) the number of scan line lines between the two
parallel sides of each trapezoid is determined directly from
processing the coordinate data and is not necessarily an
integral multiple of a fixed integer greater than one.
28

3. An apparatus according to claim 2 further including
means operative on the output data for recognizing
symbol candidates among scanned graphic images; and
means for determining the center of mass of each
symbol candidate.
4. An apparatus according to claim 3 further
including means for determining the point of maximum extent
of a given symbol candidate and its distance from the center
of mass.
5. An apparatus according to claim 4 further
including
a reference library of symbols; and
means for comparing a given symbol candidate with
the symbols of the library and determining their degree of
similarity.
6. An apparatus according to claim 5, wherein the
reference library includes symbols normalized by size and
orientation, and wherein the means for comparing includes
means for normalizing the size and orientation of the given
symbol candidate and means for measuring the degree of
overlap with symbols of the library.
7. An apparatus according to claim 6 wherein the
means for normalizing includes means for transposing a
representation of the symbol candidate so that its center of
mass lies at the origin of a coordinate system, and also
includes means for scaling so that the extremity of the
transposed symbol lies at a fixed point.
8. An apparatus according to cliam 7 wherein the
symbol candidate is identified as a given symbol from the
library if its measured overlap with such symbol is greater
than its overlap with other symbols of the library and is
larger than a preselected quantity k.
29

9. An apparatus according to claim 2 further including
means for determining the prescribed degree of accuracy as a
variable function of the width of the graphic element being
scanned by the apparatus.
10. An apparatus for coding binary data representing
sequential scan line signals indicative of portions of an
image field, such apparatus comprising:
data compression means for extracting from such
data coordinates descriptive of each run of consecutive
like-valued signals in a scan line, such coordinates being
indicative of the width and location on such scan line of
the runs; and
processing means operative on the coordinates for
representing the scanned image field invariably and entirely
as a piecewise trapezoidal approximation, wherein (i) each
of the two parallel sides of each trapezoid in the approxima-
tion is coincident with a scan line; (ii) the other two
sides of each trapezoid are not necessarily parallel to one
another; and (iii) the number of scan line lines between the
two parallel sides of each trapezoid is determined directly
from processing the coordinate data and is not necessarily
an integral multiple of a fixed integer greater than one.
11. An apparatus according to claim 10 wherein the
piecewise trapezoidal approximation comprises a plurality of
blocks of coded information signals, each such block being
representative within a prescribed measure of accuracy of
values of the coordinates of the runs related to a corres-
ponding graphic element over a preceding number of scan lines.
12. An apparatus according to claim 11 wherein the
prescribed measure of accuracy lies between two preselected
values.

13. An apparatus according to claim 12 wherein the
prescribed measure of accuracy is a variable function, and
wherein the processing means further includes means for
determining the value of the measure of accuracy, at a given
scan line for a given graphic element, as a function of the
width of the graphic element.
14. An apparatus for coding data representing sequen-
tial scan line signals defining light intensities of a
graphic image, such apparatus comprising:
data compression means for extracting coordinates
of a run of consecutive like-valued signals along a scan
line, such coordinates being indicative of the width and
location on such scan line of the run;
memory means for storing at a given time blocks of
processed data, each block related to a single vector which
has been previously scanned, and such data including for
each vector, a coordinate pair descriptive of the width and
location of the vector at an initial scan line on which the
vector first occurs and a coordinate pair descriptive of the
width and location of the vector at the scan line on which
data pertaining to the vector was last obtained;
slope calculation means, for calculating, on a
recurring basis, the average rate of change per scan line of
each of the coordinate pairs over the length of the vector
prediction means, using data from the slope calculation
means, for predicting the coordinates descriptive of the
width and location of a run of like-valued signals applic-
able to a vector on a current scan line; and
connectivity means for determining whether the
coordinates descriptive of a given run are predicted, within
a prescribed measure of accuracy, by the prediction means
using data from the slope calculation means applicable to a
given vector in the memory means, and, in the event of a
prediction within the prescribed measure of accuracy, for
updating the data for the given vector in the memory means.
31

15. An apparatus according to claim 14, wherein the
connectivity means comprises means, for determining whether
the run has one of the following properties:
(i) it is the first scan of a vector after a Y-
junction with a previously scanned vector;
(ii) it is the first scan of a vector after a .lambda.-
junction with a previously scanned vector;
(iii) it is a simple continuation of a previously
scanned vector;
(iv) it is a new vector with no connectivity to a
previously scanned vector.
16. An apparatus according to claim 15, wherein the
connectivity means includes a plurality of parallel arithmetic
logic units configured so that the determinations made by
the connectivity means are made substantially simultaneously.
17. An apparatus according to claim 14, including means
for causing recurrent calculation by the slope calculation
means whenever the number of scan lines, over which coordi-
nate data of the given vector have been obtained, is an
integral power of 2.
18. An apparatus, operative on input signal samples
representative of light intensities of a two-dimensional
image along a series of successive scan lines, for encoding
graphic information therein by generation of a series of
output records representative of significant graphic data,
the apparatus comprising:
(a) data compression means for extracting, from
such input signal samples, two coordinates descriptive of
each run of consecutive like-valued signals in a scan line,
such coordinates being indicative of the width and location
on such scan line of the run;
32

(b) graphic element memory means, for storing a
plurality or records of graphic elements, each graphic
element constituting a run of substantially consecutive like-
valued signals in each of a plurality of substantially
consecutive scan lines, the run satisfying a continuity
requirement that the coordinates indicative of the width and
location of substantially each run, occurring in any scan
after the first in which one of such runs occurs, be within
a prescribed measure of accuracy of the coordinates predicted
therefor on the basis of a specified algorithm utilizing
coordinate data of a run from at least one preceding scan
line;
(c) predicting means, for predicting, in accord-
ance with the algorithm, the coordinates indicative of the
width and location of a run on each of a number M, M ? 1, of
consecutive scan lines attributable to a first graphic
element, based on data for such graphic element stored in the
graphic element memory means;
(d) processing means, in communication with the
predicting means and the data compression means, for deter-
mining whether a run has a fact occurred within M scan lines
having coordinates that have been predicted by the prediction
means within the prescribed measure of accuracy, and for
updating the graphic element memory means in the event of a
prediction of a run within the prescribed measure of accuracy,
and in the event that there has not been a prediction within
the prescribed measure of accuracy, for providing an output
from the graphic element memory means of data pertaining to
the first graphic element.
19. An apparatus according to claim 18, wherein the
processing means further includes means for purging from the
graphic element memory means data pertaining to the first
graphic element in the event that such data has been provided
as an output, so as to make available in the graphic element
memory means the space formerly occupied by such data.
33

20. An apparatus according to claim 18, wherein (i)
the graphic elements are all vectors, (ii) the predicting
means includes means for determining the rate of change per
scan line of each of two coordinates of the first graphic
element, and (iii) each of the coordinates predicted by the
predicting means is a function of the product of the rate of
change of per scan line of such coordinate times the number
of scan lines over which the prediction is being made.
21. An apparatus according to claim 20, wherein the
prescribed measure of accuracy is a variable function having
values between two preselected numbers, and wherein the
processing means includes means for determining the
prescribed measure of accuracy as a function of the width of
the first graphic element at a preceding scan line.
22. An apparatus according to claim 18, wherein the
prescribed measure of accuracy is a variable function having
values between two preselected numbers, and wherein the
processing means includes means for determining the
prescribed measure of accuracy as a function of the width of
the first graphic element at a preceding scan line.
34

Description

Note: Descriptions are shown in the official language in which they were submitted.


~5~;2~
(1)
METHOD AND APPARATUS FOR VE~T~RIZING DOCUMENTS
- __ _ AND SYMBOL RECt)GNITION
S ..
D criPtion
This invention relates to device3 and methods for
compactly converting planar drawing~ to coded dlgitized
informatlon for subse~uent storage, printing and editing, f
and to such de~ices used for symbol learning and recogni
tion.
sackqround Art
Digital coding of graphic lnforma~ion is commonly
15 called for in a wld~ variety of ~ontext~ from facsl- '
mile da~a tran~mi~ion to compute~ized photograph analy- f
si~ and pattern recognition, to computer-aided design
applications. The fir~t step in ~uch digitizing is to
scan the document in a controlled fashion, measuring the 'l~
20 graphic value of the image at each point. Currently i-
available scanning devices are capable of substantially ~.
simultaneou~ly delive~ing a binary output signal for each
of n lines of resolution cell~ ~ each cell being
approx~mately r 01 mm square. Thus a one metee long ~can
llne of an engineering drawing for example would contain
105 such resolution cells; a single square centimeter
would contain 10~ re301ution cells.
In pr~ctlce there i~ a high degree of repetition of
graphical values in any lmage, and accordingly with the
ex~reme volume of digitized graphical value3 produced by
~canning even the 8imple8t images ~ it is necessary to
employ coding teehnl~ues, or, more colloquially, pattern
recognizing techniques, to reduce the required volume of
stored or tran~mltted dat~. The ~imple~t such technique
is a one-dimen~ional lnformation compre~on, such a~
run-length ~ncoding, in which ~or each ~can line a ~tring
length and starting coordina~e are coded only when the
value of a string of con~ecutive resolutlon cell~
,.................................................................... ~
. . . , . ..: . ..
- - : ~ , .:, .
'
- " . ' ~ ,
:..

~ 2)
change~. Where, ~ indicated above~ the digitiæed infor-
mation is ln the ~orm of rAster output data ~rom .Olmm
resolution cell~, a typlcal 80 character alphabetic line
might then be coded a~ ~pproximately 200 information
slgnals for each 20~m long scan line, a reduction of 99
percent compared to the 2x104 bits of raw raster output
data. When it is considered that a sheet of A4 paper
contains 6X108 such resolution cellsf it can be seen that
such a coding i~ still very cumbersome, requiring over a
million informatlon signal~ to code a single page of bi-
tonal writing, ~can line by scan line. This inefficiency
i~ addres~ed in the prlor art by a number of techniques
which look for broader patterns by correlating the run
length compres~ed dat~ across a second dimension, typi-
cally by comparing contiguous adjacent scan line data andcoding the difference.
Among such te~hniques are those shown in U.S.
Patents Nos. 3,937,871; 4~213,154; and 4,lB9,711.
Another technique of two dimensional encoding involve~
de~igning circuitry specifically to efficiently
recognize particular kinds of patterns encountered in a :
fixed use. U.S. Patent No. 4,307,377; for example, shows
a device which code~ narrow straight lines, and whi~h
approximates narrow curved lines by a segmental approxi~
mation. That patent claims a 97 percent reduction in the
amount of data required to be stored, although it i~ not
clear what the ba~e technique for such a comparison i~.
Each of the above techniques, while offering a
significant reduction in the amount of data required to
be stored as compared to the raw raster output, ha~ its
drawbacks. Typically the coding technlques which corre-
late succes~ive scan line intercepts require the codinq
of some data for each ~can llne intercept, and do not
produce an output indexed to provide simple acces for
editing or for addressing portions of the stored image.
: The method shown in U.S. Patent No. 4,307,377 avoid~
the6e problem~ ~or certain graphic elements, but does not -

offer as signlficant a ~ata reduction in coding of image~
other than thin line~. As an example of the limitAtions
of prioe art, when applied to a c~ocument contain~ng only
a vertically oriented black ~sosceles triang].e, cen~er~d
on the page, none of the foregoing coding technique
would give an output as compact a~ the intuitive mathema-
tical desc~iption compri~ing two linear equations and the
top and bottom y-coordinates; nor would the coding output
of ~uch prior art devices indicate that such a ~imple
image was being scanned. What i~ needed is a single
device which qulckly recogni2e~ patterns and compactly
codes the information contained in general two dimen-
sional drawing~ containing both l~ne drawings and ~haded
image portions, and which develops an output u~eful for
the extraction of higher level information such as writ-
ten character~.
Brief ~e~cr~ption of the Invention
The presen~ invention overcomes the foregolny limi~-
ation~ by real-time processing of ra~ter scan information
using data maintained in a ~mall memory describing e~ch
signif~cant graphic elemen~, or vector, which ha~ been
encountered in the preceeding M scan lines. In a pre-
ferred embodlment, the ~vecto~ized" description in
memory includes function~ of two coordinates~ su~h as
oenter line and wldth, which locally represent the
graphic element as a trapezoid. As the document is
scanned, these vector descriptions in memory are con-
tlnuously updated and the descriptive parameters
refined~ When a vector ends, encounters a branch point,
or when a new vectorization i~ required to represent the
graphic element scanned, a compactly coded vector
description is ~ent as an output, together with a code
~ignal ~ndicating the presence o~ a branch point with
another vector if applicable, and the corresponding
address in memory i~ freed up to accomodate another vec-
tor. At each polnt during th~ real time proceQ~ing of
raster data a number of parallel arithmetic logic units
. -
.
:. , - - ' ' :
: . '
- '-
- ,

(4)
compare a ~can line intercept or run with a vector in
memory, and no additional coding is performed unless new
information 1~ encountered. The entire coding process is
done a~ a single pa~s operation, at a ~peed comparable to
S the scan speed, yet putting out immensely compact data.
In the above cited éxample of a ~olid triangular image
centered on the page, for exalmple, the device would, upon
pas~ing M scan lines below the bottom of the triangle
without encountering any continuing vector, cause to be
produced as an output a code indicating the vector had
terminated, the addres~ in memory of the vector parame-
ters, and its la~t encountered left and right edge coor-
dinates. The data in memory would lnclude the
coordinates of the top of the trlangle, and two linear
functions de~crlbing its center line and width. (The
ending scan line number would automatically be registered
on a common synchronizing bus for various components of
the system.~ Thu~, fewer than ten integers would fully
describe a figure which in the case ~iay of a three cen-
timeter tall ~riangle wlth an area of 10 s~uare cen-
`timeter~ would include 107 resolution cells spanning
3x103 scan lines~ For a more complica~ed figure, e.g., a
region bounded by a simple closed curve, the output would
contain similarly compres~ed data representing the flgure
~5 by numerical parameters giving a linear geometricapproximation by included trapezoids. ~n one preferred
embodiment of the device, the apparatus includes a
keyboard-s~lectable means of varying the integer para-
meter M which determines the number of scan lines ~hich a
nmissed" vector will be held in memory before it is
recognized as texmina~ed and caused to be output. The
preferred embodiment also includes a keyable thresholding
means, in which the u~er select~ an upper and lower
threshold limlt (MINTR0 and MAXTR0) for the dimensions of
images the machine will recognlze as noise -e.g. line
edge irregularlti~, ink ~pot~ or voids. The
- . . : . . .
-,
. - ~ . ' .' ' ' .
. ~ .

;L~5~21~L
(5)
apparatus then varies an intermediate threshold testing
parameter between the two limits as a function of the
dimensional traits of the vector under consideration near a
particular scan line, so as to optimize the parameterization
of the vector within a small nurnber of scan lines, while
rejecting as noise any information contained in runs smaller
than the threshold level. In another embodiment, the
apparatus sorts the vectorized images by dimensional and
topological criteria to identify symbol candidates, deter-
mines the center of mass and maximum extremity of the symbol
candidate, and transposes the candidate to a standard orien-
tation and scale. The normalized candidate is then recog-
nized by comparison to a preselected or learned library of
symbols.
According to a further broad aspect of the present
invention there is provided an apparatus for coding a graphic
image from raster digital signals defining light intensities
of the graphic image along sequential scan lines. This
apparatus comprises data compression means for extracting
coordinates of a run of like-valued signals along a scan
line. Correlating means is provided to correlate coordinates
for a given run from the data compression means with a new
or previously scanned vector. Vector memory means is
provided for storing data pertaining to each vector with
respect to which the correlating means has correlated run
data. Processing means, operative on coordinates of the
given run and on data read from the vector memory means
related to a correlated vector,is provided for determining
whether the given run's coordinates define a continuation of
the correlated vector to within a prescribed degree of
accuracy. Means is also provided for clearing the data
related to the correlated vector from the vector memory means
when the processing means determines that the given run's
coordinates do not represent a continuation of the vector to
within the prescribed degree of accuracy and for causing data
corresponding to the correlated vector to be given as an
output.
. , . - , .
': ;
-
'

- 5a -
According to a still further broad aspect of
the present invention, there is provided an apparatus
for coding binary data representing sequential scan
line signals indicative of portions of an image field.
This appara-tus comprises data compression means for
extracting from such data coordinates descriptive
of each run of consecutive like-valued signals in
a scan line, such coordinates being indicative of
the width and location on such scan line of the runs.
Processing means is operative on the coordinates for
representing the scanned :image field invariably and
entirely as a piecewise trapezoidal approximation,
wherein (i) each of the two parallel sides of each
trapezoid in the approximation is coincident with
a scan line; (ii) the other two sides of each trape-
zoid are not necessarily parallel to one another;
and (iii) the number of scan line lines between the
two parallel sides of each trapezoid is determined
directly from processing the coordinate data and is
not necessarily an integral multiple of a fixed integer
greater than one.
According to a still further broad aspect of
the present invention, there is provided a method
of creating a normalized representation of a symbol
candidate appearing in a -two-dimensional image field.
The method comprises scanning the candidate in an
ordered manner and assigning to the candidate a rep-
resentation by coordinates along axes of a coordinate
system. The coordinates correspond to the line and
position values relating to the scan signals generated
by the candidate. The center of mass of the repre-
sentation of the candidate is then determined. A
point of the candidate at a maximal distance from
its center of mass is also determined. The candidate
is scaled with respect to its center of mass by a
., ~ . . . : .
.
.: : . ..

- 5b ~ J~
factor proportional to the reciprocal of the distance
from the center of mass -to the point of maximum ex-tre-
mity. The candidate is -then rotated around its center
of mass so the segment determined by its cen-ter of
mass and the point of maximum extremi-ty lies at a
prescribed angular orien-tation with respect to the
axes of -the cooxdinate system. The scaled rotated
candidate is then compared to a library of stored
representation of symbols and the one with which it
has the greatest degree of overlap is determined.
According to a still further broad aspect of
the present invention, there is provided a system
for identifying, from a cluster of vector data obtained
from scanning graphic information along a direction
of scan, a symbol in a library of candidates for the
symbol. The system comprises means for determining
the center of mass of the cluster. Means is also
provided for determining a point of the cluster at
a maximal distance from its center of mass. Means
is provided for scaling the cluster with respect to
the center of mass by a factor proportional to the
reciprocal of the distance from the center of mass
to the point of maximum extremity, so as to normalize
the cluster. Means is also provided for computing
a representation of the cluster equivalent to rotating
the cluster around the center of mass so that the
segment determined by the center of mass and the point
of maximum extremity lies at a prescribed angular
orientation with respect to the direction of scan.
Means is also provided for comparing the scaled rotated
candidate to a library of stored representation of
symbols and determining the one with which it has
the greatest degree of overlap.
:~
,
- , . .
,: - . . : -
,' '
'~ '
. . . .
. .

- 5c -
Brief Description of the Drawings
These and other features of the invention will
be more clearly understood by reference to -the drawings.
Figure 1 is a block diagram of the operation
of the present device.
Figure 2 shows the steps involved in the initial-
ization processing.
Figure 3 is a block diagram of the processing
apparatus showing major functional units.
Figure ~ is a flow diagram of the real time
processing steps involved in coding a document.
Figures 5-12 are logic charts and flow diagrams
indicating the precise method of coding as well as
the method of thresholding, or hole filling, and pro-
visional coding or missing vector processing, tech-
niques employed in the present device.
Figures 13 and 1~ are tables showing the data
- included in output records at the various processing
and coding stages.
Figure 15 shows the steps of termination pro-
cessing according to the present invention.
Figure 16 shows a center line output record
of a vectorized image.
.
-. . .
: , ' ~ . : ' - '' ' : - ~ . ,.
.:
: .
.

~6)
Figure 17 show~ symbol candidates i~entiied from
the output record of Figure 16.
Figure 18 shows a symbol having center o~ mas~ and
extremity in the standard orilentation.
Figure 19 show~ the same image in another oriLentation.
Figure 20 shows a symbol having ~wo extremities and
illustrates the two library t:emplate~ against which it
would be compared ~or recogniition.
Figure 21 shows the information maintained in the I,
3 and T~ registers in a preferred embodiment of the vectorizer.
These and other features of the present device will
now be more fully explained.
Detailed De~ription of the Invention
Referring now to Figure 1 there are indicated the
three stages involved ln processing or "vectorizing~ a
document according to the present invention. Generally,
the present invention takes raster output data and concurrent-
ly converts it into compactly coded ~ata representative oPraster scanned graphical elements depicted on engineering
drawings or the liLke. The coding is performed by ~epre-
senting each gxaphical element in an image as an approxi-
mation including one or more trap~zoids, and storing the
parameters of trapezoids so de~cribiny currently inter-
cepted portions of the iLmage in a memory. Each trapezoid
in the approximation of the image is referred to in this
description and the ollowing claims as a "vector~
should be noted that, contrary to normal usage, the
Nvectors~ refer~ed to herein have "width" as well a~
longitudinal orientation.) Input raster data related to a
given small region 1~ then compared to memory data for
nearby region~ and relevant information relating thereto
is encodefl. The memory data, for each vector, iLnclude~
35 an ordered block of information derive~ from processlng
prlor scan iLntercept~ of the related graphic element.
~h1B block 0f information is al50 sometimes called below
.. :. . , , ~
: ~ ~ ' . . , - ' -

~7)
a ~vector~. The first ~tage in processing, labeled
INITIALIZING PROCESSING ln Figure ly requires defining
several pointer vectors, which will depend on the size of
document scanned, a~ well as ordering the memory, and
undertaking the initial hand-~haking proce-dures with the
output proce~sor and the data compresslon unit~ The
second major ~tage depicted in Figure 1 is the real time
processing stage. During this ~tage, binary raster out-
put data is fed, via a data compression interface unit9
to the processor. The proces~or then sequentially com-
pares the incomlng segments of raster data with vec-
torized information in its memory, developing an output
when necessary~ When the processor determines that a
given vector re3ulting from raster data is terminated at
a given scan line, an ou~put is made; however a graphic
element may contlnue as a single vector to the end of the
document being scanned. Accordingly in a final st~ge,
denoted TE~MINATION PROCESSING in Figure 1 the device
forces output~ for tho~e vectors which have not already
terminated, and ~ives appropriate termination signals to
other units of the apparatus.
In Figure 2 ~re shown the step~ involved in the INI-
TIALIZING PROC~SSING according to the present device. I
2 Generally, the use of the symbol l=] in the drawing~ i
means "is replaced by~ or "is set equai ton, as is com-
monly understood in computer system~ design. As used in
the figures, V stands for the vector under consideration.
NV is a variable representing the next vector in the
memory. S* denotes the current scan line number. The
symbols R, ~ denote the right and left edge coordinates
of a run. Where used with an I (or I+l) subscript (e,g.
RI) the edge coordinateA refer to the corresponding right
or left edge data in the I ~or I+l) register of current
input segment~. (The~e register~ are described ~elow.)
A s~bscrlpt R2 refer~ to the edge data from the vector in
the J-register, which will normally be the edge data from
....
' ' , ` ' :.
:

2~.~
(8~
a previous scan of the ~raphic element related to the I
or I~l register run. Finally the subscript R0 when used
with the quantities C, E, or S denotes the value of the
center, extent or scan lin~ number re~pectively of a
memory vector at the scan line where that vector record
commences; when urqed wlth the threshold function T,
TRo denotes the la~t-computed value of the threshold
function T, which ls updated each time the center and
edge predic~or ~lopes are reiEined, as will be ~urther
explained wlth reference to Figure 6. Re~urning now to
Figure 2, as indic~ted ln the uppermost left hand box of
Figure 2. Vector number 0 i~ defined as a zero width
vector with right hand edge has coordinate ~ his is
a ~entlnel vector with fictitious or impos~ible coor-
dinates, used to designate the left hand side of the
document. The next vector, vector number 1, is al~o a
fictitious sentinel vector, and is deined in the third
box down, a~ one having its left and right edges at
2 scanner resolution element number 65,000. Thls is a vec-
tor of zero width located at the right hand edge of the
document, the number 6~000 corresponding to the case of
a document having a 65,000 resolution cell scan line
width. The remainder of the steps shown in ~he left hand
25 column of Figure 2 amount to establishing a numerieal
order of the addresses in the random access memoryO In
one embodiment of the device currently operating, th~s
memory accomodates 2 10 words, each word being lB8 bit~
long. It has been found that a memor~ of thi~ size i~
generally more then adequate for the processing of even
very den~e engineering drawings. In the right hand
column of Figure 2 are indicated the remaining steps of
the lnitiali.zing processing. Namely, after the two ~en-
tinel vectors V30 and V31 have been defined and the ran-
dom access memory fully orflered, the device is set for~can line 2ero ~t the left hand side, and signals are
output to prepare the output processor for receiving data
and to initi.ate input of data from the data compression
,
',
.

*;~
box. ( ~
Turning now to Flgure 3 there i~ shown a block
diagram of the vectorlzer according to the present invenW
tion, in which scanner lnput ls fed to a data compression
circuit 31 which extracts, fl.om each run of like-valued
bit~, ~he coordinate~ of the center point, C, the scalar
extent E or h~lf-w~dth of the run, and the coordinates of
the left edge L and rlght edge R of the run. Logically
this lnformat~on i8 redundant:, inasmuch as the first two
values may be ~omputed from t:he ~econd two, but as a
practical matter the creation of an ordered 4~tuple of
data values for each input run or ~egment can be done in
real time at ~he input ~tage, and doing so avoids greater
compl~xity in the later proce~sing. The orflered
~C,E,L,R) lnformation i5 fed to an INPUT BUFFER 32 which
contains two register~ denoted I, I~l, which hold
compressed (C,E,L,R~ data related to two consecutive
input ~egments. The device processes the information in
the input buffer by sequentially comparing coordinate~ of
2~ input segment~ to ~ corre~ponding nvector record" held in
a vector memory 33 and temporarily loaded in~o a single
register 34 called the J register.
A c~mmon mul~ibu3 structure 36 connects the variou~
registers to a plurality of arithmetic logic uni~s shown
as 3~a 35e so a~ to deliver corresponding center or edge
coordinates of ~egment~ being processed to the ALUs
35a-35e. The use of multiple ALUs and a multibus ~truc-
ture togethe~ allow numerous arithmetic comparisons on
different set~ of coordinate~ to be carr~ed out in
3~ parallel, so that relevant dimensional and topolog~cal
data may be quickly a~certa~ned. ~he J register 34 a~
noted 1~ loaded at a given ti~e with a vector record from
the memory 33. Each record ls an ordered 6tring of words
derived fro~ the proces~ing of previous ~nput ~egments,
and contain~ in addition to the edge coordinates inter-
cepted by the mo~t re~ent scan lin~ a variable thre~hold
.
:
.

~ 2
(103
number T~o, which governs processing parameterizatlon
steps, the original ~can llne number SRo with which the
record commences, and lineae predictor functions DE and
DC from which the center po.int and half-width or extent
S of the input may be predicted, The precise content~ of
the vector record and the use of each coordinate or other
parameter will be clearer w:Lth reference to Figures 7-12
and 21.
Referring now to Figure 4 there are shown the ~tep6
involved in REAL TIME PROCESSING. Initially the scan
line is incremented by one, and a 16-bit binary word is
output to the output processor indicating this increment.
The next input to the I register, which generally
receive~ the center, extent and edge coordinates of the
next scan line intercept from the data compression
module, is loaded, and inasmuch a~ the I registers will
initially contain ~rrelevant material such as the right
hand sentinel vector or the left hand sentinel vector for
the new scan line, the right edge is arbitrarily set to -
~
. This flushes the I registers and prepares ~he pro-
cessor for NEXT STAGE processing, shown in Figure 6.
The NEXT STAGE processlng 18 a sorting step, in
which the informatlon in the I register is processed by
: the parallel arlthmeti~ logic circuitry in an acceptably
short time. A~ noted before, the coding process occur3
ln real time and must be accompli~hed at substantially
the same rate a~ ~canner data is fed into ~he system. In
NEXT STAGE processing, ~ gi~en vector intercept i~ deter-
mined to be one of the following: a branch of a lamda
junction, a trunk of a Y junction, a new segment, or a
~: simple overlap talso called a continuation) of an
existing vector. In order for the coding to proceed
without inordinate demands upon qtorage or memory, and
without requiring delays or halts of the scanning opera-
tion, the cleterlllin~tlon as to which one of the above four
topological and geometric properties applies is made by

r3
(11)
performlng simple arithmetlcal tests upon the left and
right edges of a vector from memory ~held in the J
regi~ter~ and an input record from the data c~mpre~or,
held in the I or I~l register. These arithmetic com-
parisons are done in p~rallel by four of the arithmetic
logic units 35~a)-~e) shown as ALUl-ALU5 in Figure 3; the
timing ~ignals from a t~min~ control unit causin~ the
appropriate edge coordlnates to be automatically deli-
vered to the ALU along bu~e~ for such processin~.
Turning now to Figure 5 which indicates the pro-
cessing denoted N~XT I, it can be seen that the center,
extent, left and right edge coordinates ln the I+l
register are tran~ferred to the I regi~ter, and if there
i~ an input available from the data compression unit,
then the coordinates of the center, extent, and left and
rlght edges of the next available input are loaded into
the I+l regl~ter. In this manner there are always
aYailable the C, E, L and R data for the next two inter~
cepts alon~ a scanned line for proces~in~ by the pro~
cessor. In additlon to these two records being
processed, the processo~ has available in the J regi~ter
a complete vector record, comprising 188 bit~ more or
less of informaton as will be more ~pecifically ouSlined
2S below. Thi~ J regiBter record shown in Figure 21 18 the
mo~t recently updated record for a particular vector in
the main memory. Thls main memory record ~ontains pre-
dictlve parameters for the vector under consideration as
well as the scan line number where it first appeared and
~t~ last measured center, extent and left and right
edges, all of which will be loaded into the J regi~ter at
a given time for processing o~ input data from intercepts
below it in She next subRequent scan line. Thi~ vector
from main memory i~ referred to generally as a
~superad~acent vector" in relation to graphic intercep~
directly or contiguously below it and for which it may
constltute a connected portion of a continuation or a Y
~ . ~
.

t`~ ~
(l2)
or lambda member. The actlve vector ~egments in the
memory are continuou~ly re-ordered in a forward link-list
during processing ~o that the proces~or a~resses them in
the order they are encountered, from left to right, ~o
that there will be ~t mo~t two Wsuperadjacent~ vector~
for a glven intercepted segmlent.
Also shown in Figure 5 is the N~XT V proce~æing. In
the3e proce~sing ~teps the preceding vector is stored~ V
ls set equal to the next vecltor in memory, which i8
loaded into the J-regi~ter. The quantity RSAVE, a tem-
porary processing number equal to plu~ the right handcoordinate o~ the vector is 3aved, and the next vector NV
i8 loaded into the K register. (The quantity RSAVE 1
next used in a s~mple coordinate comparison to detec~
whe~her a hranch point h~s been scanne~.)
Turning now to F~gure 6 there i~ shown the stage of
processing de3ign~ted NEXT STAGE proce~sing, in which
edge data for the vector, or vector intercept~ in the J,
or I, an~ I+l register~ re~pectively are selectively com-
2~ pared by four parallel arithmetic units and, according asthe respective arithmetlc inequali~ies are true or false,
is fed into a sub~equent processing ~tage as indicated ~n
the six boxe~ on the right hand side of Figure 6.
Inspection of the log~c di2gram of Eigure 6 reveals that
for a given geometrical configuration, the arithmetic
~ompari~on te~ts outlined in the flow diagrams of the
left hand column will lead to one and only one of the
process boxe3 in the righ~ hand column of Figure 60 In
partlcular the top left h~nd logic te~t indicate~ that
only if the left ~dge of the I+l iQpUt iS not greater
than the right edge coordinate RSAVE is the intercept
repre~ented in the I~l reg~ter proce~sed and coded a~ an
L-junction. The quantity RSAVE is defined to be ini-
tially - ~ , and generally i~ e~ual to the right hand
35 edge of the vector in the J register plus ~ . Similarly
going to the ~econd logic branch box, only if the left
, - . . .. . . .
,., - - :
- , , - -
.
.
-

hand edge of the next vector to be loaded in the J
reg~ster is not gre~ter than the rlght han~ edge o~ the I
1ntercept vector plu~ ~ ; and it has passed the pre-
ceeding test does the I register intercept get proce~ed
as the trunk of a Y junction. In the preferred embodi~
ment of the invention, ~inc~e only the current left and
right edge coordinate~ of the next vector are needed for
branching te~ts~ these coordinates are loaded into
~mall register, the ~ register, to be available for thi~
proces~ing. The contents of the R register are shown in
Figure 21.
~ ontinuing down the logic diagram of Figure 6, only
when having pa~ed the preceeding two tests, if the left
edge o the ~1 intercept is greater than the right edge
of the K regi~ter vector plus ~ ,does the apparatu~
recogni2e that an expected intercept below the v+l vector
was not encountered at all and accordingly the ~MISSING
: VECTOR~ proce~sing comes lnto play to provisionally main-
tain the memory addressing of vectors and predictive ~ata
while ascertaining whether the J vector has in fact ter-
minated or whether an unintended void in the drawing was
~canned. Continuing down ~he left hand side of the logic
diagram of Figure 6, lf a scan line intercept held in the
I~l register has pas~ed the first two tests with loqic 1
and the third w1th logic O and al~o satisfies the ine-
quality that` the left edge o the R register vector i~
greater than the right hand edge of the intercept ln the
I+l register plu5 ~ , the I~l input is determined to be
a NEW S~GMENT and is proce~sed accordingly. ~inally, if
the next vector equal~ 1, that i~ the sentinel vector of
width O centered on the right hand edge of the page as
deflned in the initializing processing, then the scan
line has ended and signal~ are generated to ini~iate the
next scan. Otherwi~e having pas~ed through the logi~al
diagram at all five 3tep~, the input under consideration
i9 a SIMPLE OVERLAP (or continuat10n) of ~he vector
.
:
'. . ' ' - :

(14)
corresponding to it on preceeding scan lines, and will be
processed a~ a SIMPLE OVERLAP or continuation vector~
The proce~slng ~ppropriate for each one of these six
determination~ hown in Figures 7 to 12 only one of
whlch processing stages will be activated for a g~ven
input.
Turning now to ~igure 7 there are shown the pro-
ces3ing step~ for the scan line inter¢epts from regl~ter
I+l when the ~rithmetic comparison tests of Figure 6 have
determined the input to be a lambda-JUNCTION. In
~his event, the extent of the input i first checked ~o
be greater than a pre~et minimum extent.
The term ~e~en~n, repre~ente~ by E with an iden-
tifying ~ubscript, is used throughout this disclosure to
15 refer to one-half of the width of a run. A roundlng-off ~1
convention ih arbitrarily e~tablished to assure integral
coordinates for the center polnt and the extent of a run. 1.
The scalar quantity MINEXT of Figure 7 i~ a preselected
numbex which act~ ~8 a threshold for screening out cer-
taln kinds of noi~e that would otherwise be proces~ed as
b~anch point indicatlng data. For example certa~ n
graph~c reproduction proces~e~ create microscopic
~treamer line~ of dark density along a dark edge. Such
nol~e may be ellminated from the further level processing
steps. The choice of MINEXT may also be dictated by the
knowledge that a drawing contains vnly, e~., llnes of a
certain minimal thicknessO A suitable choice on MINEX~
~creens out all les~er-dimensional features, so that pro
cessing tlme will not be used to process such noise.
If it does not ~atisfy this threshold requirement it is
dlsregarded, the next input is loaded and next ~tage
processing i~ again commenced. On the other hand, if the
extent of the 1~1 intercept i8 larger than MINEXT then
the device determines wh~ther the present s~an line i~
the first ~can line on which thls vect~r appeared. If
not, an out]put contlnu~tion vector record is produced,
: - : .. .
... .
'' : ~ ' ' ' ' '' ~
. : :
. .

(15)
whereby the de~cription of the vector as far down a~ the
branch point 1~ senk as an output ~typical output records
are shown in detail in Figures 14-15), and the new vector
data for the left leg of the lamda replaces the former
vector data in the memory. The ~uantities C, E and S in
the J register are temporariLly saved as they will form
part of the L-junctlon OUtpllt record and the vector in
the J register is glven new C, E and S data descriptlve
of the branch of ~he L-junct:ion being processed. The
proce~sor then takes the next input, gets a free vector
address, and uses lt to reorder the vectors in memory, in
such way as to ~tart a new vector compri~ing the right
leg of the lamda junction a~ the next sequentially
addressed vector after the left leg, which has been coded
a~ a con~inuatlon of ~ts top part~ An L junctlon recor~,
Figure 14 r is generated identifying the top ~ector and
it~ relation to the right leg of the junction. During
both the entering in memory of the left leg an~ of the
right leg, the parametee ~MISSED~ is set to zero and the
parameter T~o iS ~et to MIN~RO, as i5 done initially when
commencing any new vector parametri~ation. The signifi-
cance of this latter step will be discussed later in con-
nection with Figures 11 and 12.
Turning now to Figure 8, there is shown a block
diagram of the processing undertaken in the event that
the arithmetic comparison testing of Figuee 6 indicate~ a
given intercept to be a Y-JUNCTION. Initially there is
tested whether the present scan line is in fact the fir~t
scan line for the vector. If it is not, a continuation
vector record is sent a~ an output, as indicated in
Figure 13 and the C, E~ S, TRO and MISSED data are put
into the J regi~ter a~ a continuation of the upper left
branch of the Y, thereby acqulring its memory address.
Thi~ data is put back In main memory, an output Y- -
junction record is produced and the vector~ are
reshuffled so that the former right branch of the Y-
.
, .:
,
, .
,
, ' , ~,, ':~
, .
~, ~

.
~ L~2~(16)
~unctlon is no longer in memory and exi~ting vector
records are again consecutively ~ddressed. The proces~or
then returns to NEXT STAGE processing of Figure 6.
Turning now to Flgure 9 there is ~hown in the alter-
native the processing which occurs in the event that the
arithmetlc testing of Figurle ~ indicates the given inter-
cept to be a SIMPLE OVERLAP ~or continuation) of a pre-
viously encountered superadjacent vector. As used
throughout thi~ disclosure the term "superadjacent"
refers to an unterminated vector encountered in a pre-
viou~ scan llne and which i'3 located so that its width to
some extent overlaps or is next to the given intercept in
x coordinates. A~ indicated ln Figure 9 SIMPLE OVERLAP
processing commence~ by loading the next input, getting
the next vector from memory, and undertaking the hole-
filling proces~ing (explainefl below in relation to Figure
121. Whereas the lambda- and Y-junction tests each
involve extracting information related to the topological
connectivity of different segments, the SIMPLE OVER~AP
processing i~ the major informat~on compressing step of
vectorization and involves creating and reflning a p1ec~-
wi~e l~near approximation to the image represented by the
con~ecutive ~can line intercept~. The parame~er DELS 18
set to be the number of ~can line~ betweem the present
scan line and the s~an line where the vector was ini-
t~ally encountered. The numbers DC and DE are the slopes
respectively of linear function~ representing the center
~n~ the extent. A~ noted previou~ly a para~eter TRo~ ~he
~hreshold parameter, is ~et which indicates the degree of
error which ~ill be permi~sible in the vectorlzed
approx~mation of the lmage scanned. In the SIMPLE
OVERLAP processing step, the proce~sor initlally deter-
mines whether the extent o the intercept stored:in ~he I
register dlffers from the extent predicted from memory by
more than TRo~2, and if not whether the center of the I
register intercept dlffer3 from the predicted center by
.
-
- ~ . . . .
:: - . . .. ~ . .
'`~.'. ".. ` ' . ,- . ~ . - , . ` . '
- , ' ' :,
' ` ' ' .
. :
.

(17)
more then TRo. In either case, if ~7reater than two scan
lines have elapsed ~lnce the vector was f irst encoun-
tered, the vector continuatlon record will be sent a~ an
output and a new vector approximation will be commenced
5 usi ng the currerlt data, with the new center, extent ~ and
scan line a~ the relevant vector data, with the parame-
ter~ TRo set to the minimum ~nd MISSED set to 0 a~ will
be explained in regard to Figure~ 11 and 12.
Still in reference to Figure 9, in the event that
both the cent~r and edge pr~ediction3 are within the
thre~hold amounts requlred, the device performs an addi-
tional arithmetic check to determine whether the current
scan line i8 2n ~n an integer) scan lines down from where
the vector wa8 initially encountered. I not~ the
current left and right e~ges are simply entered in the J
register, the p~rameter MISSED i~ set to 0 and the
apparatu~ proceed~ to the NEXT STAGE processing. On the
other hand, lf th~ number of el~p~ed scan line~ since
commencement of the vector is a power of 2, then the
lower left hand branch of the processing ~iagram is
implemented and th~ parameters DC and DE are updated
using present scan line data, to give a new estimate of
the slope of the center line and the slope o~ the extent
function. It wlll be appreciated that since th~Y refine-
2S ment of parameters occur~ after a power of 2 number of
scan line~, and when the edge and center predicter~ havealready been determined to be within the respective
threshold parameters TRo in absolute value, the linear
approximation to the vectorized shape of the underlying
image is refined ~o that when a vector continues ~o
appear for an increa~ing numbçr of ~can llnes an~ the
previou~ly vectorlzed approximation still hold~ good, the
linear preclictors are exponentially bounded and give a
smaller andl smaller error from the actual image. ~lnally
35 note that in additlon to updating the center and extent
predictors, the par~meter TRo i~ updated in accordance
. - -
.
~ " ' ' '; ' ' '
,
'' ` ~ :-
.

`~ ~
(18)
with the formula ~hown in the third box o~ the lower left
hand branch of the proce~sing block diagram.
Specifically, when the half width of the I register
intercept is less then MAXTRO but greatex than MINTROt
then the refinement of parameters will set TRO equal to
this half width. Thus, the threshold parameter i~ adap-
tive, 50 that when the graphic features being processed
are less then twice the width of MAXTRO, a closer ~egree
of accuracy will be requlred of the estimationsO Thi~
allows the proc~s~or to very quickly code information
encountered in thick images and yet to more delicately
catch the detail of fine graphic images, and to more
quickly establi~h its predictive parameters when com-
mencing to code a new vectorO Finally we note, again
referring to Figure 9, that in any case, whether the
~lope parameters are updated at a power of two scan line,
whether the existing data feeds s~raight through at a non
power of two ~can line, or whether the edge or center
predicter has failed the threshola accuracy te~t and a
con~inuation VeCtQr iB sent as an output, the left and
right edge data in the 3 register have substituted in
them ~he coordinates of the current left and right edges,
and the value of the parameter MISSED is set to O before
processing control shi~ts to the next stage and the J
: 25 register data i~ re-entered in memory.
Turning now to Figure lO, there are shown the pro-
cessing step~ undertaken in the event the arithmetic
sorting proce~ of Flgure 6 indicates the presence of a
: NEW SEGMENT. In that event, a~ indicated in Figure lO,
- 30 if the width of the in~ut in the I~l register is les~
than a specified minimum extent, the input is disre-
garded~ the next input loaded~ and the processing shown
in Figure 6 reco~menced. Otherwise, the next input is
loaded, ancl memory control is signaled to provide a free
memory addres~ for a new vector. If no memory addre~s i9
available, as ln the other processing steps wherein some
. . . , ~ .
.
~ . . ~ . . . -- -: .
,

(19)
shuffling of data in memory must be accomplished, the
device halts. Otherwlse the address pointers of the free
v~tor and the next vector are rearranged, and the
center, edge, threshold, current scan l~ne number, and
right and left edge~ of the new ~egment are placed in the
register wlth ~he parame~er MISSED set to 0. An output
record for new ~egment is produced as in Figure 13 and
the device proceed~ to the nlext stage. It will be
recalled that ~nitial steps of the NEXT STAGE proce~sing,
Figure 6, wlll retuxn the J regl~ter contents to memory
and proceed to ~ort out the next 1nputs according to the
basic topological and geometric primitives of Figure~
7-l0
Turning now to Figure ll there are shown the opera-
tive steps of MISSING VECTOR processing. A given vector
is retained in main memory when a continuation input
fails to appear for less then M sca~ lines. The para-
meter MISSED, lt will be recalled, comprises one of the
12 words of data constituting the vector description
carried in main memory ~s discussed below in eonnection
with Figure 21~ ~his parameter i s updated hy proces~ing
de~cribed presently. Provided that MISSE~ is less than a
preselected number MAXMIS, the main memory vector~ed
description 1~ reta~ned and processing continues a~
usual, with the J-regi~ter edge and center predicters
updated before returning to the memory as shown in the
left branch of ~he processing diagram, Figure ll.
However, when the number of missed lines is incremented
by one and become~ greater than the preset parameter
MAXMIS ~thereby indica~lng that the given vector segment
has lndeed terminated), an output record designating the
end segment i~ generated as shown in Figure 13.
Thereupon the addres3 of thi~ vector is returned to the
top of the free v~ctor 11st in the memory controller, and
~5 the address pointer~ arran~ed accordingly, whereupon the
processing proceeds to the NEXT STAGE. In this manner,
': ' '
-
.

(20)
when a predicted scan llne intercept of a vector fails to
appear, no new information i8 coded beyond the augmen-
tation of the parameter MISSED by one unit, and no infor-
mati.on is lost in reliance on such faulty data. In the
event that after failing to appear for MAXMIS scan lines
the vector is determined to have terminate~, all o~ the
provisional data entered in the preceeding MAXMIS scan
- lines ls deleted and the output record goes back to the
last actual scan llne reading, as noted in the output
recorA de~cription, Figure 13.
Turning now ~.o Figure 12, there is shown the pr~-
cessing operation o HOLE FILLING. This step i8 only
required during the proces~ing of a SIMPL~. OVERLAP, and
operates on new inputs, I, I~l, to delete runs of length
less then TR~ occuring in places where a run of another
: image value i~ predicted. Specifically, if the left edge
of the I+l input i~ less than the right edge in the J
register of the corresponding vector, and if the left
edge of the I+l input minus the right edge in the I input
20 ~s less than the threshold TRo, then the left edge coor-
dinate in the I register i~ saved, the C, E and R con-
tents of the I+l register 1~ d into the I regiYter, and
the next C, ~, L and R coordinates from the data
compre~sion module are loaded into the I~l register. The
'~ center and ex~ent of the I regi~ter are redefined in
accordance with L~ and the new ed~e data RI . In th 13
manner hole3 of dimension less than TRo are automatically
filledO A feed~ack loop then returns the device to the
sturt of the hole illing operation, and it again
undergoes the fir~t ~wo tests. Thi~ will result in
filling of additional hole~, in the event that a number
of ink spot~ or void~ occur ih close proximity wi~hin a
801e vector.
It will be ~een that the foregoing processing steps
result in the direct and str~ightforward coding of
geometric ,and dlmen~ional data for graphic images in real
:. ~ ' ............. ,
. . .
: ~ ,
~. . .

- f~ ~
(21)
time for a ~canner, with output data generated at each
point where signiflcant topological or dimensional infor-
mation comprising new data is encountered. In Figures 13
and 14 are shown output records generated at these times.
In particular as appears in Figure 14, when a branch
point such a~ a Y-(or lambda-) junction is encountered
and the output record ~or the terminated right branch of
the Y ~or new right leg of l:he lambda) ls generated, the f
record includes the addres~ VMI of the top left (or ~op)
branch as well as the addre~;s V of the portion scanned.
The output record ln the ca~ie of a Y- junction includes l.
the left and right edge coordinates of the upper portion
a~ well as the left and right edge coordin~te~ from the
I-register of th~ lower right portion. For a right leg
of a lambda-~unction, the record includes the scan jump,
center and exent data for the upper portion. Thi~ ~ata,
d~noted CJ, EJ and DELS for simpl~city, comprises the
quantities SAVCRO, SAVERO and DELS deined in Figure 7.
The particular choice of output records contains suf- l
20 fic~ent data regarding the topological and dimensional l;
changes encountered ~o that by ff~uitably programming an r
output proce~sing devlce, it ls possible to quickly print
or access and di~play images generated by, e.g., a con-
nected sub-unit deplcted within a g~fven drawing, or, for
25 instance, the set of all electrical circuit units con- I
necting to a 91fven component in an electrical schematic f
when analyzed according to this method.
A further point which has not been addressed in the
foregoing di~cussion ls the operation of the devire in
3D termination processing a~ indicated in Figure 1. Figure
15 show~ the ~tepcff involved in such proces~lnq. It will
- be appreciated that because the graphic elements encoun-
tered in a glven scan line are vectorized and stored in
memory, and there ia no output oE such data until a
35 slgn~ficant change hss occurred requ~ring a new vec-
torization of ~ub~equent continuations or branches, it i5
1.
' ' ' '
, .,' ' . ' '
,
. . ~' : . :
'. '~, , ~ ' ' ' '' . ,
- ' ' ,
.. . .
.. , . , ~

i6;~
t22)
possible for a document to reach its last scan without
having generated an out.put record indicative of a graphic
element which, for instance, has continued a~ predicted,
or which qulte simply has not terminated by the last scan
line. In thi~ event a termination processing mode i3
provided, which recogn~zes the last scan line by an
externally triggered signal related to the document size,
and proceed~ to cau~e an output of an end-of-segm,ent
record for each of the vectors then left in memory. ~hen
l,~a the scan line i8 ended, tha,t is when NV=l, the device
then outputs appropriate signal~ to indicate the end of
the coding operation and to terminate communication with
~he output processor and halt ~he apparatus.
It will be appre,ciated that the op,eration of the
foregoing coding devlce utllizes a relat.ively small
memory for 188 bit words which can be addres~ed and fed
~nto the J reg~ster by appropriate signals for proces~ing
in relation to the ~can data being fed into the I and I~l
register~. The variou~ edge and center tests necess~ry
for character~zing the current input and updating ve~tor
memory data are conducted in a straightforward way ~y
parallel arithmetic logic un~ts. ~hus, although ~he u~e
of parallel proceRsin~ in thls context constitutes part
of the within invention, the de~ign of arithmetic pro-
cessing units i~ known to those versed in the art.
It will now be seen that the coding operations per-
formed by the present invention produce a piecewise
trapezoidal approxim~tion of each two-dimensional graphic
element by extracting the graphic information ~n sequen-
tial scans and parameterizing small trapezoids ~vectors)which locally approximate the graphic element.
For each graphic element, the memory is managed in
such a way as to contain a block of dat~ descrip~ive of a
small trapezoi,d, which approximates the currently scanned
region of the graphic element. Coding is performedr ~nd
processed dla~a i8 updated or cau~ed to be output, only
, ': ' ' ' .
, , . : .
: ;

(~3)
when one o~ flve inform~tion containing event~ (Y-or
lambda-branch, new vector, continuation requirlng a fresh
trapezoid, or end of vector~ i9 encountered. These
event~, which are the 1nformation-theoretic primltlve~ of
the present invention, ~re mutually dl~tinct and are
complete in the ~en~e that any run on a scan 11ne (which
: i~ not noise, or which is not already accurately pre-
dict~d by the vector data for the region above it~ must
constitute one of these events. Moreover the flr~t four
event~ may be determined by simple arithmetic com-
parisons, and the fifth by default, so that it is a
straightforward matter for a person versed in the art to
design four arithmetic logic units which collectively and
in p~rallel wlll determlne which of the ~events~ charac-
te~ize~ a given run.
~ In one embodiment the apparatus also perform~ ~ymbol
: recognition on scanned documents in a manner explainedbelow, using the vectorized repre3entation to analyze the
scanned symbol c~nd~dates, as will now be described.
2~ After ~ document has been fully coded the coded data
may be used to reproduce a ' f~csimile document, which will
be identical, except possibly for runs of up to MAXT~0
resolution cell~ which have been filled or s~oothed out. I
Alternatively an output printer may be activated using
25 the center 1~ ne parameters (rather than the full twodimen~ional set of parameters~, to print out a pseudo-
center image of the document. Such a pseudocenter image
includes the center line (or piecewise linear curve) of
the graphic element. In Figure 16 is shown an image of
the center llne~ 161 of a vectorized image such as a
~haded topographical m~p.
In thi~ form, whether printed or as raw output, it
i8 possible to identify ma~or clustersr which are
possible symbols, for subsequent machine analysi~. Such
clusters mcly, for instance, be identified in terms of
. . ,
: :

~lze and connectlvity. For example, any i sola ted graph ic
element whlch together with all vectors connected thereto
i~ between 1 cm. ~nd 3 cm~ in diameter would be a good
candidate for recognition as an alphanumeric symbol.
S~milar criteria may be establi~he~ in so~tware to
quickly identify ~mall libraries of symbols for a given
use- e.g. electronic components in circuit diagram; beam
~ections and flttings in structural engineerlng drawings.
Figure 17 shows the symhol candidates 171 a-d 50
extracted fro~ the center line im~ge of Figure 16.
Once a major cluster or a symbol candidate C ha3
been i~olated, the app~ratu~ according to the present
~nvention calculate~ the center of mass of C from the
vectorized representation of C. Specificall~ a local
coordinate system i8 e~tabli~hed and for each tr~pezoid
Ti included in the vector representation of C there are
calculated a mass Mi e~ual to the area of the trapezoid,
and coordinates ~Xi, Yi) equal to the center of mas~ of
Ti computed as though Ti wer~ compo~ed o~ a ~heet of
20 material of un~orm den~ity. The mass M of symhol C is
set equal to ~ Mi and the center of mass CM is ~hen com-
puted from the first moment~ o~ the sy~tem of trapezoid~
around the X and Y-axe~.
Ju~t as in the mathema~ical description of a phy~
25 cal system in classi~al mechanics, the first moment
around the X axi~ is defined a~ M~ ~ MiYi and the
corresponding fir~t moment around the Y-axis is My
MiXi. The centee of ma~s CM of C then has coordinates
(X~ Y) =~
This center of ma~s point (X, Y), ha~ the property
that the sumbol C may be scaled with re3pect to (X,Y) ~nd
the re~ultant ~ymbol will havé the same center of mass
(X,Y). Also the area (or mass M) of the symbol C i~
invariant with re~pect to rotation of C around the point
~X,Y). [This latt~r property would of cour~e hold for
any point (K,Y)~l Figure~ 18, 19 show a symhol in two
different angular orientation~ about its center of ma~s.
i
: , , . . :
:: - - . : .
. ~ , .
- . .

2~
~25)
~ aving thu~ established a fixed point (X,Y) for the
candld~te C under con~lder~tion, the apparat~ next
determine~ the polnt of C which i8 at a maximum dlstance
from IX,Y). There may be a unique such extrem~t~, as for
the symbol "h~, ~everal ~uch points~ as for the ~ymbol
~X~, or even a large number of such points, as for a cir-
cular "on. ~enerally, however, there will be a small
finite number~ in any case the machine simply choo~es one
extreme point. It will be ,appreclated that, a~ a con-
sequence of elementary geomletric considerations, the
point of extremity of ~ plecewise linear symbol candidatemust be one of the vertices of the candidate, and a~cor-
dingly for a vectorized image the point of extremity may
be a~certained by slmply examining the existing output
data, computing the dlstance of each vertex from the
center of mass, and taking the maximum one.
At this point it is nece~ary to dlstinguish between
learning a symbol and recogniz~ng a symbol. The machine
~learns" a symbol in a fixed orientation as follow~, For
~- 20 a given ~ymbol C o~ mass M, center of mass CM - (X,Y) and
with a principal e~tremity (Xe ~ Ye ) the machine ~ores a
vectorized lmage or optical ma~k of the ~ymbol rotated
and shifted so that the line segment determlned by the
points tX,Y~, (X~ , Ye ) lies in a standard position~ along
25 the x-axi~ with (X,Y) shlfted to the or~gin.
In Figure 1~ i8 ~hown a symbol 180 aligned with its
center of mass 182 and lt~ extremity 183 along the X~axis
- 181. In Figure 19 i8 ~hown a symbol 190 isomorphic to
that of Figure lB in ~ different orientation. In Figure
30 19 the symbol 190 i8 ~hown h~ving center of mass 192 and
extremity 193. The center of mass 192 is placed at the
orlgin, and th~ line segment 195 defined by endpoints 192
and 193 lies at an angle 194 ~ith respect to the X-axl~
191 .
In a preferred embodiment, for a ~ymbol having
... .
: , - - .
.- ,
.
. . - . . :
~ '
-:

(26)
multiple extremlties, the machine al~o nlearn~" the image
of the symbol as determlned by each o the other extreml-
ties. Each of these learned ~ymbols is normalized, e.g.
by sc~ling 80 that ~
Figures 20 A~C ahow a ~ymbol 200 having center of
mas.~ 202 and two extremities 203, 203'. In this
situation the 3tored library of symbols would include two
templates 210 and ~20 representative of the symbol when
it is shifted lnto the normal position with it3 center-
extremity ~egmen~ allgned with the x-axis.
Having thu~ est~blished a library of learned sym-
bol~, in order to ~recognize" a newly scanned symbol C',
the machine take~ the ~ymbol can~idate, Ci and determine~
its center of ma~s CM' and extremity ~', rotating it
degrees into ~tandard posltion. The candidate C' ~ then
normalized and 18 compared to the library of learned sym-
bols to determine the degree of overlap. This cvmparison
which in the preferred embodiment is performed numeri-
cally (as i~ the rotation into standard form) amount~ to
measuring the degree of coincidence of the ima~es of C
and C' where C i~ a learned ~ymbol fro~ the library lCj~
of ~tored symbol imAges. Thi~ re~ults in a sequence of
measure~ ~ X~1: (X~ ~ 1) of the degree of colncidence
between the candidate C' anfl e~h learned symbol Cj in
the library of stored ima~es. The candidate C9 iS
~recognized" a~ a particular ~ymbol Cm . when Xm ~ Xj
and X~ is sufficlently laLge, ~ay Xm ~ T, a recogn~ion
threshold. The number ~ may be approximately .85-.9 for
recognition of mechanically formed symbols having a defi- 1.
nite type fontO In thi~ case it has also proven useful to
requir2 as ~ condition of recognition that the number
min ~Xm - Xj~ k, where k is a ~mall number, e.g. k~.2,
chosen to as~ure that the coincidence between C' ~nd
Cm is ~ignific~ntly better than between C' and
35 each other C ~ ln the llbrary. In certaln
.
' .
: . ',: . , -
:'

(27)
in~tances, as where different ~ymbols are distingui~hedonly by their or1entation (such as "p" and "dn, or "6~
and "9~ it may also be nece~sary in a library containing
two such symbols to al80 store and match the angle 194 a~
a criterion oP the recognition process.
1,
.
... , . : ~ : : : -
.
-
.
.

Representative Drawing

Sorry, the representative drawing for patent document number 1256214 was not found.

Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refers to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer , as well as the definitions for Patent , Event History , Maintenance Fee  and Payment History  should be consulted.

Event History

Description Date
Inactive: IPC expired 2022-01-01
Inactive: Expired (old Act Patent) latest possible expiry date 2006-06-20
Grant by Issuance 1989-06-20

Abandonment History

There is no abandonment history.

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ANA TECH CORPORATION
Past Owners on Record
CURTIS A. LIPKIE
DAVID N. GROVER
EUGENE A. KLECA
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

To view selected files, please enter reCAPTCHA code :



To view images, click a link in the Document Description column. To download the documents, select one or more checkboxes in the first column and then click the "Download Selected in PDF format (Zip Archive)" or the "Download Selected as Single PDF" button.

List of published and non-published patent-specific documents on the CPD .

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document
Description 
Date
(yyyy-mm-dd) 
Number of pages   Size of Image (KB) 
Drawings 1993-10-07 20 456
Claims 1993-10-07 7 246
Cover Page 1993-10-07 1 19
Abstract 1993-10-07 1 25
Descriptions 1993-10-07 30 1,422