Patent Summary 2245769

Availability of the Abstract and Claims

Differences between the text and the image of the Claims and the Abstract depend on when the document was published. The texts of the Claims and the Abstract are displayed:

  • when the application is open to public inspection;
  • when the patent is issued (grant).
(12) Patent Application: (11) CA 2245769
(54) French Title: DISPOSITIF DE LECTURE A SORTIE VOCALE ET GUIDAGE TACTILE
(54) English Title: TACTILEY-GUIDED, VOICE-OUTPUT READING APPARATUS
Status: Deemed abandoned and beyond the period for reinstatement - awaiting response to the notice of a rejected communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06F 03/00 (2006.01)
  • G09B 21/00 (2006.01)
(72) Inventors:
  • SEARS, JAMES T. (United States of America)
(73) Owners:
  • JAMES T. SEARS
(71) Applicants:
  • JAMES T. SEARS (United States of America)
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate Agent:
(45) Issued:
(86) PCT Filing Date: 1997-02-11
(87) Open to Public Inspection: 1997-08-21
Request for Examination: 1998-10-13
Licence Available: N/A
Dedicated to the Public: N/A
(25) Language of Filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Application Number: PCT/US1997/002079
(87) PCT International Publication Number: US1997002079
(85) National Entry: 1998-08-07

(30) Application Priority Data:
Application No. Country/Territory Date
60/011,561 (United States of America) 1996-02-13

Abstract



An optical-input print reading device (1) for people with impaired or no
vision in which the user passes a mouse assembly (2) containing a camera (25)
over the text to be read. Images of this text are input to a computer (3),
which decodes the images into their symbolic meanings through optical
character recognition (31). By assembling words together from overlapping
groups of text elements, whole words are created. Using also the recognized
character position output of the optical character recognition (31), the
computer (3) directs a linear tactile display (53) contained within the mouse
(2) assembly to indicate to the user the presence and vertical alignment of
recognized alphanumeric text under the unit. This feedback allows the user to
locate and properly align and guide the mouse (2) assembly across the text to
be read. Furthermore, through a speech synthesizer (61), the words are spoken
audibly as they are passed over. As the device requires only one hand to
operate, its uses include reading food packaging, pill bottles, currency, and
price tags, as well as printed sheets. In a portable form, this device
provides a means for low-vision and blind people to function more fully in
their everyday lives.

Claims

Note: The claims are shown in the official language in which they were submitted.


The invention claimed is:
1. A method for converting visible symbology into another humanly perceptible version of the
symbology, the method employing a hand-held imaging device operated by a user, the method comprising
the steps of:
imaging the symbology to convert the symbology from an optical image to an electronic signal
representative of the optical image;
performing symbology recognition on the electronic signal to convert the electronic signal into
recognized symbology and to provide the location of the recognized symbology in the image;
providing feedback to the user based on the location of the recognized symbology in the image,
the feedback being representative of the position of the hand-held imaging device relative to the
symbology; and
converting the recognized symbology into a humanly perceptible version of the symbology.
2. A method as defined in claim 1, wherein the humanly perceptible version of the symbology into
which the recognized symbology is converted is an audible version.
3. A method as defined in claim 1, wherein the humanly perceptible version of the symbology into
which the recognized symbology is converted is a Braille version.
4. A method as defined in claim 1, wherein the feedback provided to the user is tactile feedback.
5. A method as defined in claim 4, wherein the tactile feedback is provided by a moving element on
the hand-held imaging device.
6. A method as defined in claim 5, wherein the moving element includes a moving pin.
7. A method as defined in claim 6, wherein the moving pin is part of an armature in an
electromagnetic device, the electromagnetic device also including a solenoid coil.
8. A method as defined in claim 5, wherein the moving element is moved at a frequency
representative of the position of the hand-held imaging device relative to the symbology.
9. A method as defined in claim 8, wherein the frequency can vary within a range from
approximately six to sixty-five Hertz.
10. A method as defined in claim 8, wherein the frequency is inversely related to the relative distance
between the hand-held imaging device and the symbology.
11. A method as defined in claim 5, including a plurality of moving elements arranged in a linear
array oriented parallel to a longitudinal axis of the hand-held imaging device to provide an indication of
the position, relative to the longitudinal axis, of the hand-held imaging device relative to the symbology.

12. A method as defined in claim 11, including a plurality of said parallel linear arrays to also provide
an indication of the position, relative to an axis normal to the longitudinal axis, of the hand-held imaging
device relative to the symbology.
13. A method as defined in claim 12, including two arrays of three elements each.
14. A method as defined in claim 1, wherein the feedback provided to the user is visible feedback.
15. A method as defined in claim 14, wherein the visible feedback is provided by a source of light
on the hand-held imaging device.
16. A method as defined in claim 15, wherein the source of light is a light emitting diode.
17. A method as defined in claim 15, including a plurality of light sources arranged in a linear array
oriented parallel to a longitudinal axis of the hand-held imaging device to provide an indication of the
position, relative to the longitudinal axis, of the hand-held imaging device relative to the symbology.
18. A method as defined in claim 17, including a plurality of parallel linear arrays to also provide an
indication of the position, relative to an axis normal to the longitudinal axis, of the hand-held imaging
device relative to the symbology.
19. A method as defined in claim 18, including two arrays of three light sources each.
20. A method as defined in claim 17, wherein the plurality of light sources includes light sources
which produce light of a different color than others of the plurality of light sources.
21. A method as defined in claim 1, wherein the feedback provided to the user is audible feedback.
22. A method as defined in claim 21, wherein the audible feedback is provided when the hand-held
imaging device is skewed in orientation relative to the image by more than a predetermined amount.
23. A method as defined in claim 22, wherein the predetermined amount of skew is approximately five
degrees.
24. A method as defined in claim 1, wherein the method includes two modes, a search mode in which
the relative position of any symbology recognized will be provided as feedback to the user and a track
mode in which the relative position of a line of recognized symbology which is being tracked is provided
as feedback to the user.
25. A method as defined in claim 24, wherein the user can select the search mode or the track mode.
26. A method as defined in claim 25, wherein the recognized symbology is converted into a humanly
perceptible version of the recognized symbology only in the track mode.
27. A method as defined in claim 1, wherein the symbology recognition includes optical character
recognition.
28. A method as defined in claim 1, wherein the symbology recognition includes bar code recognition.
29. A method as defined in claim 1, wherein the visible symbology to be converted is displayed on
a video display.

30. A method as defined in claim 29, wherein the hand-held imaging device includes a camera for
imaging the visible symbology on the video display and converting the image to an electronic signal.
31. A method as defined in claim 30, wherein the hand-held imaging device includes a sensor to sense
that the visible symbology to be converted is being displayed on a video display.
32. A method as defined in claim 2, wherein the audible version of the recognized symbology is
provided in the form of separate symbols, including letters of the alphabet.
33. A method as defined in claim 2, wherein the audible version of the recognized symbology is
provided in the form of groups of symbols, including words.
34. A method as defined in claim 1, wherein the visible symbology is also displayed in a magnified
visible version.
35. An apparatus for converting visible symbology into another humanly perceptible version of the
symbology for a user, comprising:
an imaging device to convert the symbology from an optical image to an electronic signal
representative of the optical image,
a symbology recognizer receptive of the electronic signal to recognize symbology in the image and
to determine the location of the recognized symbology in the image;
a feedback device to provide feedback for the user based on the location of the recognized
symbology in the image, the feedback being representative of the position of the hand-held imaging device
relative to the symbology, and
a transducer to convert the recognized symbology into a humanly perceptible version of the
recognized symbology.
36. An apparatus as defined in claim 35, wherein the imaging device, the feedback device, and the
transducer are all located in a hand-held positioning device.
37. An apparatus as defined in claim 36, wherein the hand-held positioning device is tapered to a
narrower width at a distal end than at a proximal end.
38. An apparatus as defined in claim 35, wherein a bottom side of the hand-held positioning device is
adapted for placing against a surface with symbology to be read therefrom, and wherein the bottom side
includes a recessed channel in the center thereof to allow the bottom side to be placed against curved
surfaces to read symbology therefrom.
39. An apparatus as defined in claim 35, wherein the imaging device includes an illuminator and a
camera having a lens incorporated therein.
40. An apparatus as defined in claim 39, wherein the lens and the camera are adjustable in position
relative to each other to adjust the focal length thereof so as to be able to focus the imaging device on
symbology at various distances therefrom.

41. A method for allowing a user with impaired vision to operate a host computer with a graphical
user interface including information displayed on a computer display, the method comprising the steps of:
providing a surface onto which coded symbology has been provided;
imaging with a hand-held positioning device containing an imaging system to image and convert
symbology to an electronic signal, the device being adapted for placing in proximity to and moving across
the surface;
recognizing the coded symbology in the image and determining the location of the coded
symbology in the image, and based thereon, determining the position of the positioning device relative to
the surface;
obtaining the information in the vicinity of the position on the computer display corresponding to
the position of the positioning device relative to the surface, and providing an indication of the location
of the information relative to the positioning device's position;
providing feedback for the user based on the locational indication, the feedback being
representative of the location of the information relative to the position on the computer display
corresponding to the position of the positioning device relative to the surface; and
transducing the information into a humanly perceptible version of the information.

Description

Note: The descriptions are shown in the official language in which they were submitted.


TACTILEY-GUIDED, VOICE-OUTPUT READING APPARATUS
Cross-Reference to Related Patent Applications
This application is related to and claims priority from Provisional Patent Application No.
60/011,561, filed Feb. 13, 1996, titled "Hand-held Reading Device for the Location, Capture and
Spoken Interpretation of Printed Text on a Surface," the contents of which are incorporated herein by
reference.
Technical Field
The present invention relates to a method and apparatus for allowing persons with little or no
vision to read text and other symbology, and more particularly, to a method and apparatus including a
hand-held independent reading aid which provides feedback to the user of its relative position.
Background Art
People who are blind or suffer from vision impairments wish for functional independence in their
lives. The blind include those who are blind congenitally, those who suffer adventitious accidents, and
a large and growing number of elderly individuals whose sight is deteriorating due to age-related
diseases such as macular degeneration and diabetes. In the modern world, where reading is a crucial
component of everyday life, the blind and those with low vision are substantially challenged in their
ability to read the written word. In recognition of this, numerous devices have been produced in an
attempt to allow vision-impaired people to read independently.
For individuals with low vision, particularly the elderly, the most frequently used device to allow
them to read is a simple optical magnifier. Unfortunately, as the disease progresses and vision
deteriorates, the amount of magnification required exceeds that practical with a magnifying lens. For
these users, one option is greater magnification provided by closed-circuit television apparatus.
However, even this palliative device is insufficient to allow reading in situations requiring easy
portability or for users in advanced stages of vision loss. For example, in advanced macular
degeneration, the central portion of the visual field of the eye's retina is rendered useless due to optical
distortion, leaving only peripheral vision intact. In this case, extreme magnification on a television
screen at close distance can at best provide only slow character-by-character reading by these users.
An alternative approach practiced by a number of manufacturers combines a flat-bed optical
scanner coupled with a computer containing optical character recognition (OCR) software and a speech
synthesizer. The user places an entire page of a document to be read on the scanner, which transmits
the image to the computer. The OCR software decodes the image into text, and then the speech
synthesizer voices the information. This system has been used very successfully to read books and
typewritten letters. This approach, however, does not work well in those cases where the spatial
format of the document is significant in understanding or navigating through the information content to
obtain the desired information, such as in a utility bill containing hundreds of numbers where the total
amount to pay is the only desired information. Furthermore, the flat-bed scanners do not work well
with text appearing on non-flat surfaces, such as is often found on cans of food or medicine bottles.

Finally, these systems are non-portable, both because of the weight and size of components designed to
scan entire pages of documents, as well as the general need for sufficient line power to drive the
mechanical scanners.
Previous attempts to make a portable, hand-held reading device have proved unsuccessful because
users were unable to identify and accurately track text across a page with the hand-held device. Such
systems have included a hand-held device with a camera mounted therein. The identifying and
tracking problem is especially acute with long lines of text, lines of text closely spaced, or text which
is highly formatted. For example, an early system known as the Optacon was developed by
Telesensory Systems, Inc. of Mountain View, California to allow blind users to read text directly
through the tactile sense in their distal finger pad. This was accomplished using a small hand-held
camera which magnified a typically one-quarter inch by one-quarter inch field of view on the paper
being read onto a three-quarter inch by one and one-quarter inch array of 144 vibrating tactile
stimulators. The user would guide the camera with one hand, while stroking an opposite hand finger
pad over the vibrating array. The drawbacks of this system were that the user would typically require
a six-week full-time training course to master the tracking of lines and the interpretation of individual
characters. Furthermore, upon reaching a high level of proficiency, users would only be able to read
twenty or thirty words per minute. Generally, only young users were able to master the tactile
discrimination required to read using the device.
A second hand-held reading device was developed by Raymond Kurzweil in the 1980s. The
hand-held scanner transferred page image swaths to a desktop computer system containing optical
character recognition software and a speech synthesizer. This system has been long discontinued
because it did not provide a useful means for the user to locate and track over individual lines of text.
Audible feedback to the user indicated when tracks of images with text-like characteristics were being
passed over; however, this feedback was not specific enough to allow users to track text reliably. The
audible feedback included a chirping sound indicative of the scanner seeing, but not necessarily
recognizing, symbols which could include symbols, designs, or drawings, as well as text. Furthermore,
the synthesized voice output of the text was not provided until after an entire swath of text image had
been scanned.
In addition, several IBM patents (U.S. Patent Nos. 5,186,629, 5,223,828, 5,287,102, and
5,374,924) disclose a mouse-driven interface for personal computers for use by blind persons.
Unfortunately, such systems generally require the host computer to be able to report, in ASCII, the
symbology that is proximate to the cursor. Since it may not always be possible to obtain such ASCII
information, it is desirable to have an interface which could interpret information presented on a video
screen through image recognition.

It was our intention to create a device that overcomes the disadvantages of the existing systems, in
order to improve the quality of life for vision-impaired people, by allowing them independent access to
printed and visually displayed textual information. It is against this background and the desire to solve
the problems of the prior art that the present invention has been developed.
Disclosure of Invention
Accordingly, it is an object of the present invention to provide an improved hand-held independent
reading aid.
It is another object of the present invention to provide a portable independent reading aid.
It is also an object of the present invention to provide an independent reading aid useable to read
symbols from a variety of surface contours, including both flat and non-flat surfaces.
It is still another object of the present invention to provide an improved reading aid which helps
the user to properly track lines of text.
It is yet another object of the present invention to provide an improved reading aid capable of
reading video displays.
It is still further an object of the present invention to provide a hand-held independent reading aid
useable with a single hand.
It is still further an object of the present invention to provide an independent reading aid which is
relatively inexpensive.
Additional objects, advantages and novel features of this invention shall be set forth in part in the
description that follows, and in part will become apparent to those skilled in the art upon examination
of the following specification or may be learned by the practice of the invention. The objects and
advantages of the invention may be realized and attained by means of the instrumentalities,
combinations, and methods particularly pointed out in the appended claims.
To achieve the foregoing and other objects and in accordance with the purposes of the present
invention, as embodied and broadly described therein, the present invention is directed to a method for
converting visible symbology into another humanly perceptible version of the symbology, the method
employing a hand-held imaging device operated by a user. The method includes the steps of imaging
the symbology to convert the symbology from an optical image to an electronic signal representative
of the optical image; performing symbology recognition on the electronic signal to convert the
electronic signal into recognized symbology and to provide the location of the recognized symbology
in the image; providing feedback to the user based on the location of the recognized symbology in the
image, the feedback being representative of the position of the hand-held imaging device relative to the
symbology; and converting the recognized symbology into a humanly perceptible version of the
symbology.

The humanly perceptible version of the symbology into which the recognized symbology is
converted may be an audible version. The humanly perceptible version of the symbology into which
the recognized symbology is converted may be a Braille version. The feedback provided to the user
may be tactile feedback. The tactile feedback may be provided by a moving element on the hand-
held imaging device. The moving element may include a moving pin. The moving pin may be part
of an armature in an electromagnetic device, the electromagnetic device also including a solenoid coil.
The moving element may be moved at a frequency representative of the position of the hand-held
imaging device relative to the symbology. The frequency can vary within a range from approximately
six to sixty-five Hertz. The frequency may be inversely related to the relative distance between the
hand-held imaging device and the symbology. The method may include a plurality of moving
elements arranged in a linear array oriented parallel to a longitudinal axis of the hand-held imaging
device to provide an indication of the position, relative to the longitudinal axis, of the hand-held
imaging device relative to the symbology. The method may include a plurality of said parallel linear
arrays to also provide an indication of the position, relative to an axis normal to the longitudinal axis,
of the hand-held imaging device relative to the symbology. The method may include two arrays of
three elements each.
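To make the inverse relationship concrete, a driver might map the offset between the device and a recognized text line to a pulse rate as in the following minimal Python sketch. Only the approximate six to sixty-five Hertz range comes from the text above; the function name, millimeter units, and maximum offset are illustrative assumptions.

```python
def stimulator_frequency_hz(offset_mm: float,
                            max_offset_mm: float = 10.0,
                            min_hz: float = 6.0,
                            max_hz: float = 65.0) -> float:
    """Map device-to-text distance to a pulse frequency (hypothetical mapping).

    Closer alignment produces faster pulsing; the range of roughly
    6 to 65 Hz is the one named in the specification.
    """
    # Clamp the measured offset into the usable range.
    offset = min(abs(offset_mm), max_offset_mm)
    # Inverse-linear mapping: zero offset -> max_hz, full offset -> min_hz.
    return max_hz - (max_hz - min_hz) * (offset / max_offset_mm)
```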
The feedback provided to the user may be visible feedback. The visible feedback may be
provided by a source of light on the hand-held imaging device. The source of light may be a light
emitting diode. The method may include a plurality of light sources arranged in a linear array oriented
parallel to a longitudinal axis of the hand-held imaging device to provide an indication of the position,
relative to the longitudinal axis, of the hand-held imaging device relative to the symbology. The
method may include a plurality of parallel linear arrays to also provide an indication of the position,
relative to an axis normal to the longitudinal axis, of the hand-held imaging device relative to the
symbology. The method may include two arrays of three light sources each. The plurality of light
sources may include light sources which produce light of a different color than others of the plurality
of light sources.
The feedback provided to the user may be audible feedback. The audible feedback may be
provided when the hand-held imaging device is skewed in orientation relative to the image by more
than a predetermined amount. The predetermined amount of skew may be approximately five degrees.
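The skew condition can be pictured with a short sketch: given the endpoints of a recognized text baseline in image coordinates, compute its angle relative to the device's scanning axis and flag when it exceeds the roughly five-degree threshold named above. The function and its argument format are hypothetical, not taken from the patent.

```python
import math

SKEW_LIMIT_DEG = 5.0  # approximate threshold from the specification


def skew_warning(baseline_start: tuple[float, float],
                 baseline_end: tuple[float, float]) -> bool:
    """Return True when the tracked baseline is skewed past the limit."""
    dx = baseline_end[0] - baseline_start[0]
    dy = baseline_end[1] - baseline_start[1]
    angle_deg = math.degrees(math.atan2(dy, dx))
    return abs(angle_deg) > SKEW_LIMIT_DEG
```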
The method may include two modes, a search mode in which the relative position of any
symbology recognized will be provided as feedback to the user and a track mode in which the relative
position of a line of recognized symbology which is being tracked is provided as feedback to the user.
The user can select the search mode or the track mode. The recognized symbology may be converted
into a humanly perceptible version of the recognized symbology only in the track mode.
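The two modes behave like a small state machine: search mode reports any recognized symbology, track mode reports only the line being tracked, and conversion to speech is gated to track mode. A sketch under those assumptions follows; the class and method names are illustrative.

```python
from enum import Enum, auto


class Mode(Enum):
    SEARCH = auto()  # feedback for any recognized symbology in view
    TRACK = auto()   # feedback only for the line currently being tracked


class ModeController:
    def __init__(self) -> None:
        self.mode = Mode.SEARCH     # default until the user selects otherwise
        self.tracked_line = None    # id of the line locked in track mode

    def select(self, mode: Mode, line_id=None) -> None:
        """The user may select either mode; track mode locks onto a line."""
        self.mode = mode
        self.tracked_line = line_id if mode is Mode.TRACK else None

    def should_give_feedback(self, line_id) -> bool:
        """Search mode reports any symbology; track mode only the tracked line."""
        return self.mode is Mode.SEARCH or line_id == self.tracked_line

    def should_speak(self, line_id) -> bool:
        """Speech conversion happens only in track mode, for the tracked line."""
        return self.mode is Mode.TRACK and line_id == self.tracked_line
```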

The symbology recognition may include optical character recognition. The symbology recognition
may include bar code recognition.
The visible symbology to be converted may be displayed on a video display.
The hand-held imaging device may include a camera for imaging the visible symbology on the
video display and converting the image to an electronic signal. The hand-held imaging device may
include a sensor to sense that the visible symbology to be converted is being displayed on a video
display.
The audible version of the recognized symbology may be provided in the form of separate
symbols, including letters of the alphabet. The audible version of the recognized symbology may be
provided in the form of groups of symbols, including words.
The visible symbology may also be displayed in a magnified visible version.
The present invention also relates to an apparatus for converting visible symbology into another
humanly perceptible version of the symbology for a user. The apparatus includes an imaging device to
convert the symbology from an optical image to an electronic signal representative of the optical image;
a symbology recognizer receptive of the electronic signal to recognize symbology in the image and to
determine the location of the recognized symbology in the image; a feedback device to provide
feedback for the user based on the location of the recognized symbology in the image, the feedback
being representative of the position of the hand-held imaging device relative to the symbology; and a
transducer to convert the recognized symbology into a humanly perceptible version of the recognized
symbology.
The imaging device, the feedback device, and the transducer may all be located in a hand-held
positioning device. The hand-held positioning device may be tapered to a narrower width at a distal
end than at a proximal end. A bottom side of the hand-held positioning device may be adapted for
placing against a surface with symbology to be read therefrom, and wherein the bottom side includes a
recessed channel in the center thereof to allow the bottom side to be placed against curved surfaces to
read symbology therefrom. The imaging device may include an illuminator and a camera having a
lens incorporated therein. The lens and the camera may be adjustable in position relative to each other
to adjust the focal length thereof so as to be able to focus the imaging device on symbology at various
distances therefrom.
The present invention is also related to a method for allowing a user with impaired vision to
operate a host computer with a graphical user interface including information displayed on a computer
display. The method includes the steps of providing a surface onto which coded symbology has been
provided; imaging with a hand-held positioning device containing an imaging system to image and
convert symbology to an electronic signal, the device being adapted for placing in proximity to and
moving across the surface; recognizing the coded symbology in the image and determining the location
of the coded symbology in the image, and based thereon, determining the position of the positioning
device relative to the surface; obtaining the information in the vicinity of the position on the computer
display corresponding to the position of the positioning device relative to the surface, and providing an
indication of the location of the information relative to the positioning device's position; providing
feedback for the user based on the locational indication, the feedback being representative of the
location of the information relative to the position on the computer display corresponding to the
position of the positioning device relative to the surface; and transducing the information into a
humanly perceptible version of the information.
Brief Description of the Drawings
Fig. 1 is a perspective view of the tactilely-guided, voice-output reading apparatus of the present
invention.
Fig. 2 is a side view of a mouse of the apparatus shown in Fig. 1, showing the mouse internal
components.
Fig. 3 is a top-level functional and software block diagram of the apparatus of Fig. 1.
Figs. 4a through 4g are diagrammatical representations of the preferred response of a tracking
display of the apparatus of Fig. 1 to text alignments within the field of view when the apparatus is in a
search mode.
Figs. 5a through 5e are diagrammatical representations of the preferred response of a tracking
display of the apparatus of Fig. 1 to text alignments within the field of view when the apparatus is in a
track mode.
Fig. 6 is a perspective view of the mouse of Fig. 2.
Fig. 7 is a flow-diagram of the control and functionality of the alternative modes of operation of
the apparatus of Fig. 1.
Fig. 8 is a perspective view similar to Fig. 1, showing an optional flat panel display mounted on
the lateral aspect of a computer of the apparatus, and showing the mouse exploded away from a page
of text for ease of illustration.
Fig. 9 is a schematic diagram of the apparatus of Fig. 1, showing the capture of luminous text
from a CRT screen using a photo diode to detect the illumination period and synchronize the camera
image capture with that period.
Fig. 10 is a timing diagram graphically depicting timing relationships between the detected signal
of the photo diode and the camera timing signals output by the camera timing driver of Fig. 9.
Fig. 11 is a perspective view similar to Fig. 1, depicting the apparatus functioning as a peripheral
input/output device through a host computer serial interface.

Fig. 12 is an electronic block diagram of the apparatus of Fig. 1.
Fig. 13 is an end view of the mouse of Fig. 2.
Fig. 14 is a cross-sectional schematic of a cord of the apparatus of Fig. 1.
Fig. 15 is an illustration of the method of establishing end physical and electrical connection for
the cord of Fig. 14.
Fig. 16 is a schematic block diagram showing an optional feature of the apparatus of Fig. 1,
showing the use of switched single-color illumination to enhance the reading of colored text.
Fig. 17 is a schematic block diagram of many of the components of the mouse of Fig. 1.
Fig. 18 is a cross-sectional view of the tracking display on the mouse of Fig. 2.
Fig. 19 is a perspective view of the tracking display on the mouse of Fig. 2.
Fig. 20 is a bottom view of the mouse of Fig. 2.
Fig. 21 is a rear partially-cut-away view of the mouse of Fig. 2, showing a lever for adjusting the
focus of the camera.
Best Mode for Carrying Out the Invention
Functional System-Level Overview
Fig. 1 is a system depiction of the preferred embodiment. The Independent Reading Aid (IRA) system
1 is a text reading device which includes a mouse 2, a computer 3, and a cord 5 which connects the mouse
2 and the computer 3. In operation, the mouse 2 is passed over a surface containing text 7. Inside
the mouse 2 is an image capture system (shown in Fig. 2), containing both an illumination system as well
as a camera, which captures and transmits a video image through the cord 5 to the computer 3. In the
computer 3, the image is processed to enhance contrast, and then the image is analyzed by a symbology
recognition program incorporated in software in the computer 3 to provide the location and identity of
alphanumeric or other specific symbology within the field of view. As used herein, symbology recognition
may include optical character recognition (OCR), bar code recognition, or any other process by which
symbology content or meaning is recognized by recognition of the spatial characteristics of the elements
of each symbol.
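The per-frame loop just described (capture, contrast enhancement, recognition, locational feedback, and speech on demand) might be organized as below. The helper functions stand in for the camera, recognizer, tracking display, and synthesizer; all of the names are assumptions rather than identifiers from the patent.

```python
def process_frame(grab_frame, enhance_contrast, recognize_symbology,
                  update_tracking_display, speak_button_pressed, speak):
    """One pass of the IRA loop: image in, feedback and (optionally) speech out."""
    image = grab_frame()                      # camera 25 via the frame-grabber
    image = enhance_contrast(image)           # pre-processing in computer 3
    results = recognize_symbology(image)      # OCR / bar code: (text, location)
    for text, location in results:
        update_tracking_display(location)     # tactile/LED feedback on mouse 2
        if speak_button_pressed():
            speak(text)                       # voice synthesizer + speakers 13
```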
In order for the user to direct the mouse 2 over desired text elements, it is desirable for the user to
have feedback information about the location of the text elements. This locational feedback is most
intuitively provided by tactile and visual indicators located on the mouse 2. A plurality of locational
feedback indicators 9 are located on the upper, forward aspect of the mouse 2, and include both tactile,
vibrating pins, as well as corresponding illuminated points that are useful for individuals with some
residual vision. When the user locates text that he desires to read, a button 11 is depressed to command
the computer 3 to vocalize the text content through a voice synthesizer located within the computer 3, and
project same through a pair of speakers 13.

An aspect of the invention which provides many of its practical benefits is the rapid feedback of
textual location information to the user. The locational feedback indicators 9 allow the user to rapidly
locate, through tactile feel and residual vision, the individual text elements in fields of other text and
graphics, and further allow the user to track along a line of text. This allows the image-capture system
to process small images from the surface to be scanned, permitting near real-time text-to-speech conversion
so that the user's natural sense of the location of the text can be brought into play in selecting the next
text to be read. This feedback is very similar to the process of reading in a sighted person. The capacity
of the system to utilize small images further endows the system with the potential for great miniaturization.
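The assembly of whole words from overlapping groups of text elements, mentioned in the abstract, can be sketched as deduplicating characters observed in successive overlapping frames by their surface position. This is a hypothetical approach, assuming the recognizer reports each character with an approximate horizontal coordinate:

```python
def merge_overlapping_frames(frames, tolerance: float = 1.5):
    """Stitch characters from overlapping frames into one ordered stream.

    `frames` is an iterable of lists of (char, x_position) pairs, with x in
    surface coordinates; a character closer than `tolerance` to one already
    seen is treated as a re-observation of the same glyph.
    """
    seen: list[tuple[str, float]] = []
    for frame in frames:
        for char, x in frame:
            if not any(abs(x - sx) < tolerance and char == sc
                       for sc, sx in seen):
                seen.append((char, x))
    seen.sort(key=lambda cs: cs[1])           # left-to-right reading order
    return "".join(c for c, _ in seen)
```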
Detailed Description of the System
Physical Description of IRA System 1 Components
Mouse Components
The internal components of the mouse 2 are depicted diagrammatically in Fig. 2. The mouse 2 is
enclosed by a plastic housing 14 made of acrylonitrile butadiene styrene (ABS) plastic, although other materials
would suffice. The mouse 2 is connected to the computer 3 through the cord 5, which is anchored to the
mouse 2 with a cord grommet 205. A characteristic of this grommet mounting is to break away from the
mouse 2 at a nominal tension, both to prevent entanglement with environmental objects, which could result
in a potential safety hazard to the user, as well as mechanical damage to the IRA system 1. The mouse
2 is constructed in a splash-resistant manner with seams that are designed so that casual spills which may
occur during usage will not permanently damage the internal electrical components (e.g., spills occurring
while using the IRA system 1 for reading in a restaurant).
The text to be read is illuminated by a plurality of illuminators 15, which in the preferred embodiment
are LEDs, although strobe, fluorescent or incandescent lamps may be alternatively used. The illuminators
15 are placed so that they can both directly, and indirectly through mirror-bounce and inside-housing
bounce, illuminate the text to be read. A transmissive diffuser 17 is placed in close proximity to the
illuminators 15, so as to smooth the distribution of the illumination on the text to be read. A multiplicity
of illumination sources helps to increase the consistency of the illumination. In the preferred embodiment,
the illuminators 15 are placed on both the left and right aspects of the mouse 2 for this purpose.
The transmissive diffuser 17 is best a controlled twenty degree dispersion holographic diffuser, such
as is available from Physical Optics Corporation of Torrance, California under Part No. LSD~'OPC10, in
order to allow the greatest amount of light to pass, and so as to reduce unwanted scatter in unneeded
directions. However, other diffusing systems, including scattering diffusers, may work in both the
reflective and the transmissive configurations.
A window 21 is provided for reading the text appearing therethrough. The window 21 is constructed
from anti-reflection-coated glass, although certain window materials including sapphire and plastics may
substitute successfully. The purpose of the window 21 is to allow the mouse 2 to be completely sealed
against contamination, and to protect the sensitive components inside the mouse 2.
Scattered light from the illuminated text transmits back through the window 21 and bounces off of a
mirror 19 toward a camera assembly 20. The use of the mirror 19 allows a longer focal length for the
imaging system while rem~ining in a compact configuration. This longer focal length reduces image
distortion and increases the depth of field.
The camera assembly 20 includes a lens 23 and a camera 25, such as is available as Part No. A53,308
from Edmund Scientific of Barrington, N.J. The lens 23 is conveniently an 8 mm focal length f/16 aperture
lens and is adjusted to focus on images located anywhere between the surface of the window 21 and the
lower surface of the mouse 2, typically located three millimeters (mm) below the bottom of the window
21 surface. This provides a field of view 54 thirty mm wide and twenty-two mm high, although other
dimensions can be used. The camera 25 is a black-and-white/grey-scale CCD camera. Alternatively,
CMOS or other types of cameras may be used.
In the preferred embodiment, the black-and-white camera 25 outputs grey-scale images. Alternatively,
a color camera can be used outputting full-color images. The advantage, however, of the black-and-white
camera is an enhanced sensitivity to light, approximately ten-fold that of color cameras, thus allowing a
smaller optical aperture, such as 0.50 mm, with a resultant greater depth of field of approximately 6 mm.
In addition, black-and-white images require lower information bandwidth for image transfer to the
computer 3.
A CRT emission sensor 27 is located in the mouse 2 so that it views the scene under the window 21.
The purpose of the sensor 27 is to determine whether the text to be read is being displayed on a CRT
screen rather than from a document, in which case special functions are used to synchronize the camera 25
with the scan cycle of the CRT display. This functionality is described in more detail in a later section.
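Those special functions can be sketched as follows: measure the display's refresh period from the emission sensor pulses, then time each camera exposure to integrate over a whole illumination cycle. Only the sensor-driven synchronization itself comes from the patent; every name below is illustrative.

```python
def synchronized_capture(sensor_pulse_times, start_exposure):
    """Trigger camera exposures in step with the detected CRT refresh.

    `sensor_pulse_times` yields timestamps (seconds) of successive CRT
    illumination pulses from an emission sensor; `start_exposure` begins
    a camera integration lasting the supplied duration.
    """
    previous = next(sensor_pulse_times)
    for pulse in sensor_pulse_times:
        refresh_period = pulse - previous         # e.g. ~16.7 ms for a 60 Hz CRT
        start_exposure(duration=refresh_period)   # integrate one full scan cycle
        previous = pulse
```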
The buttons 11 and the audio speakers 13 are located on both the left and right lower lateral aspects
of the mouse. Their placement is determined by the ergonomics of hand placement and ease of use, which
will be considered in greater detail in a later section.
A mouse circuit board 207 contains the electronic components which interface between the IRA system
1 cord 5 and the internal components of the mouse 2. These components include the afore-mentioned
buttons 11, the speakers 13, the illuminators 15, the camera 25, and the CRT emissions sensor 27, as well
as a microphone 63, a mode scroll-up button 103, a mode scroll-down button 105, and a tracking display
53, whose functions will be described in a later section.
Overall Electronic Description
Fig. 12 presents the electronic block diagram of the IRA system 1. Elements in the right of the figure
are input/output components contained within the mouse 2. Elements in the left of the figure are
computational, power, control, and interface elements contained within the computer 3. The connections
between the elements, shown in the center of the figure, are contained within the cord 5.
Images are captured by the video camera 25 through the lens 23, and analog video signals are sent via
a cord element 177, contained within the cord 5, to a video frame-grabber 191 contained within the
computer 3. Video frame-grabbers are widely available from companies such as ImageNation of
Beaverton, Oregon (the PX104PLUS), Data Translation, Inc. (the DT3152), and Epix of Buffalo Grove,
Illinois. The frame-grabber 191 creates digitized bit-maps in memory from the video analog signal
supplied by the camera 25. These memory bit-maps are transmitted to a computer circuit board 193
through a high-speed bus 194. The bus 194 should have a high bandwidth, such as a PCI bus or a bus
conforming to the 1394 FireWire interface standard. The PCI bus is the standard communications bus
found on Pentium-class and PowerPC computers. Both PCI and 1394 FireWire communications standards
are characterized by high data rates and the ability to transfer data directly to computer memory without
the need for intensive main processor involvement. The computer circuit board 193 contains software for
analyzing and converting the image files into text information, including an OCR program 31, a
novel word detector 59, a speech synthesizer 61, and a tracking display driver 51. The computational
speed of the computer circuit board 193 is critical to the real-time operation of the IRA system 1, and a
processor with 133 MHz Pentium-class performance or better is preferred. The computer circuit board 193
produces digital data strings which are converted into analog waveforms by a D/A converter 185. These
waveforms are output to the audio speakers 13 through a cord element 175. A volume control 187,
located on the exterior of the computer 3, allows the operator to manually adjust the volume output
through the audio speakers 13 to a comfortable level. Alternatively, the user may insert earphones into
an earphone jack 189, which permits private reading by the operator, reading by those with hearing
impairment, or reading in environments with high ambient noise.
A field programmable gate array (FPGA) 173, located within the mouse 2, provides the interface
between the assorted elements of the mouse 2 and the computer circuit board 193. The FPGA 173
includes custom-designed circuitry comprising a high-speed serial interface, assorted timing elements, and
hardware driver elements necessary for video camera control, LED drivers, tactile stimulator drivers, mode
button interface, illumination control, and speak button interface. In the preferred embodiment, the FPGA
173 is a XC3030VQ64 FPGA, available from XILINX of San Jose, California, that is programmed upon
power-up by the computer circuit board 193, although many other commercially available circuit solutions
may provide equivalent function.
The FPGA 173 communicates with the computer circuit board 193 over a cord element 179 using a
high-speed bi-directional communications protocol. The gate array 173 interfaces with most of the
elements of the mouse 2. The output of the CRT emissions sensor 27 is amplified by an amplifier 125,
and the resultant signal is binarized by the gate array 173. When in CRT mode, this information is used
to modify camera control signals generated within the gate array 173, which drive the timing cycles of the
camera 25. In addition, signals are sent to the computer circuit board 193, informing the computer
whether the camera 25 is reading from a CRT display. The illuminators 15 are controlled by a hardware
driver 167, which is commanded by signals from the gate array 173. It should be noted that in different
IRA system 1 embodiments, the illuminators may be comprised of either white light lamps or variously
colored LEDs, depending on the type of color discrimination desired. In the case of multiple color LEDs,
the timing of the different illuminators is controlled through the FPGA 173, and reported over the cord
element 179 to the computer circuit board 193.
The tracking display 53 includes a plurality of LEDs 107, 109, 111, 113, 115, and 117, as well as a
plurality of tactile stimulators 65, 67, 69, 71, 73, and 75. Preferably, the upper LEDs 107 and 113 emit
red light, the middle LEDs 109 and 115 emit green light, and the lower LEDs 111 and 117 emit orange
light, so that low vision users can more easily distinguish which LEDs are illuminated. Both the LEDs
and the tactile stimulators are controlled by the FPGA 173 through a plurality of hardware drivers 169 for
the tactile stimulators and a plurality of hardware drivers 171 for the LEDs. The computer circuit board
193 sends signals to the FPGA 173 through the cord element 179 about the logical state of the tracking
device 53, including the active geometrical arrangement of tactile stimulators and LEDs, as well as their
frequencies and actuation strengths. This logical information is interpreted by the FPGA 173 and sent to
individual hardware drivers, one for each tactile stimulator and LED, determining their state of activation.
In the preferred mode, timing oscillators for the tactile stimulators are contained within the logic of the
FPGA 173. Alternatively, the computer circuit board 193 may internally decide the instantaneous state
of each LED and tactile stimulator, through its own internal oscillator, and transmit this state to the FPGA
173, which directs individual hardware drivers to their correct state.
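To illustrate the logical state sent to the FPGA, the sketch below selects which row of the two three-element columns to activate from the vertical position of the recognized line within the field of view. The thresholds, names, and the nominal frequency are assumptions; only the three-row arrangement and color assignments match the tracking display described above.

```python
def tracking_display_state(line_y: float, view_height: float) -> dict:
    """Choose upper/middle/lower indicators from the text line's position.

    `line_y` is the recognized baseline's vertical position within a field
    of view of height `view_height` (same units). Returns the logical state
    sent to the FPGA: which row is active, plus a nominal pulse frequency.
    """
    third = view_height / 3.0
    if line_y < third:
        row = "upper"     # LEDs 107/113 (red), upper stimulators
    elif line_y < 2.0 * third:
        row = "middle"    # LEDs 109/115 (green): text is centered
    else:
        row = "lower"     # LEDs 111/117 (orange), lower stimulators
    return {"active_row": row, "frequency_hz": 30.0}
```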
The state of the mode scroll-up button 103 and the mode scroll-down button 105 are detected by the
FPGA 173 and communicated to the single-board computer 193 through the cord element 179. Likewise,
the state of the buttons 11 is communicated to the computer circuit board 193 through the FPGA 173.
Fig. 17 depicts the circuit components on the mouse circuit board 207. The functions of this circuit
board are predominantly performed by the FPGA 173, which includes the functions surrounded by the
inner box in the figure. These conventional functions are created through standard software modules
provided by the manufacturer and are well-known in the art. These software modules are transferred from
the read-only memory resident in the single-board computer 193 and programmed into the FPGA 173 through
the wire 179 via a serial interface 275 upon each power-up of the IRA system 1. An FPGA clock 269
uses a crystal 271 to provide timing signals for the FPGA functions. A timer oscillator 273 feeds tactile
stimulator drive signals to a pin driver selector 277, whereupon individual drive signals are sent to the
tactile stimulator drivers 169. The tactile stimulator drivers 169 include standard bipolar or FET
transistors, sinking current through a plurality of solenoid coils 283, servicing each tactile stimulator. At
each moment of current disruption through the solenoid coils 283, a current is generated from field
collapse that illuminates the associated tactile stimulator LEDs 107, 109, 111, 113, 115, and 117. In this
configuration, the individual hardware drivers 171 for the tracking LEDs are not required, since the stored
energy in the coil 283 incidentally drives the associated LED.
A frequency counter 279 measures the pulsation frequency of light incident on the CRT sensor, whose
output is amplified by the amplifier 125. The hardware drivers 167 control the current through the
illuminators 15 through a plurality of resistors 285. A camera timing oscillator 281 generates the standard
integration refresh and clockout signals used in the camera 25. Such signals are used in both CMOS and
CCD cameras.
Tracking Display Description
Fig. 18 depicts an end-view cross-section of the tracking display 53. The solenoid coil 283 rests
within a cylindrical recess in a magnetic steel housing 289, and when energized, pulls a vibrating armature
287 upwards across the distance of an impact standoff 295, which is typically 0.015". As the vibrating
armature 287 comes to the end of this distance, it strikes the underside of a rubber-like sealing sheet 297,
whereupon shock-like displacements of the sheet are sensed by tactile sensors in a human operator's finger
305, which rests within a smooth finger trough 303 fashioned in the magnetic steel housing 289. The
rubber-like sealing sheet 297 is affixed to the magnetic steel housing 289 using a sealing adhesive 299
around its border, such that contamination does not interfere with the free motion of the vibrating armature
287. A bumper O-ring 291 prevents a metallic striking noise or buzz from emanating from the tactile
display 53 when the operator's finger 305 is not fully in contact with the rubber sheet 297.
When the solenoid coil 283 is de-energized, a combination of gravity and stored energy in the rubber
sealing sheet 297 and the bumper O-ring 291 moves the vibrating armature 287 downwards, where it
strikes a rubber rebound bumper 293. The rebound rubber prevents metallic noise emanation, as well as
providing a mechanically resonant amplification of the vibrating armature movement. The hardness
(measured in durometers) of the rubber rebound bumper 293, the bumper O-ring 291, and the rubber-like
sealing sheet 297 is selected to maximize the skin deflection and tactile sensation afforded by the
assembly.
The presence of the rubber-like sealing sheet 297 has additional benefits. In addition to sealing the
display, the sealing sheet 297 spreads the area of tactile vibration beyond the diameter of the tactile
stimulator vibrating armature 287, making the exact placement of the operator's finger 305 on the surface
of the tracking display 53 less critical to tactile sensitivity.

Of course, the mechanism for inducing skin deflection could be accomplished with means other than
the use of magnetic solenoids 283 actuating vibrating armatures 287. Alternatively, these other means
could include piezoelectric actuators, electro-rheological fluids, electro-tactile techniques, or memory-
metals.
The LEDs 107, 109, 111, 113, 115, and 117 are mounted in depressions on the lateral aspects of the
magnetic steel housing 289, outside of the finger trough 303, so that when the operator's finger 305 is
resting in the finger trough, the LEDs are visible on either side.
A plurality of electrical leads 301 connect the LEDs 107, 109, 111, 113, 115, and 117, as well as the
solenoid coils 283, to a conventional flexible wiring circuit board (not shown) for transmission to the
mouse circuit board 207.
Certain operators may suffer from neuropathy afflicting the tactile sensors in their fingertips. In such
cases, the use of the tracking device 53 is compromised. In its stead, auxiliary tactile tracking devices may
be employed. For example, a more widely spaced set of tactile stimulators may be strapped to the
operator's arm, and connected with the computer 3 directly to provide feedback information to the operator.
The wider spacing of tactile stimulators in such an auxiliary tactile display is necessitated by the wider
spacing of tactors in the general skin surface as contrasted with the fine spacing of tactors in fingertips.
System Power
In Fig. 12, a rechargeable battery 197 contains the electrical energy necessary to support the IRA
system 1. In the preferred embodiment, this battery is chosen to provide the highest energy density
possible, such as a metal hydride battery. Convenient batteries may be chosen from a variety of consumer
electronics devices, including those used for portable computers or portable video cameras.
The power from the battery 197 is converted into the various voltage levels needed for the electronic
devices in the IRA system 1 by a power conditioner 199. Outputs from the conditioner 199 are sent to
the electronics contained within the mouse 2 through a cord element 183. Other outputs are directed to
the computer circuit board 193 and other elements of the computer 3. An on/off switch 201 controls
power output from the rechargeable battery 197 and may be combined with the volume control 187. A
charge connector 203 provides charging means for the battery 197.
During periods of inactivity longer than a given threshold time, the computer circuit board 193 goes
into a "sleep mode" in order to preserve battery power. During this sleep mode, all power-consuming
elements in the mouse 2 are turned off, as well as elements in the computer 3. Activity in the computer
circuit board 193, shown in Fig. 12, is reduced to the minimum necessary to remain in an active state, in
which dynamic memory is retained. In addition to the communication of the button 11 through the FPGA
173, the buttons 11 have a direct link to the computer circuit board 193 through a cord element 181 to
provide a wake-up signal once pushed, which restores power to all IRA system 1 elements. The
microphones 63 are connected to an amplifier 32 in the circuit board 207 which connects through a cord
element 323 to an A-to-D converter 325 for communication to the computer circuit board 193, as shown
in Fig. 12.
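The sleep behavior described above amounts to an inactivity timer plus a hardware wake line from the buttons 11; a minimal sketch, with the threshold and all names chosen for illustration:

```python
import time

SLEEP_THRESHOLD_S = 120.0  # illustrative; the patent only says "a given threshold time"


class PowerManager:
    def __init__(self, power_down, power_up) -> None:
        self.power_down = power_down        # turns off mouse and computer loads
        self.power_up = power_up            # restores power to all elements
        self.last_activity = time.monotonic()
        self.asleep = False

    def note_activity(self) -> None:
        self.last_activity = time.monotonic()

    def tick(self) -> None:
        """Enter sleep mode when idle; dynamic memory stays powered."""
        if not self.asleep and time.monotonic() - self.last_activity > SLEEP_THRESHOLD_S:
            self.power_down()
            self.asleep = True

    def on_button_wakeup(self) -> None:
        """Buttons 11 have a direct wake-up link (cord element 181)."""
        if self.asleep:
            self.power_up()
            self.asleep = False
        self.note_activity()
```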
Ergonomic Design of Physical Components
Use of the IRA system 1 will be primarily among an elderly population with little familiarity with,
and often an aversion to, computer devices. The IRA system 1 needs to be easy for operators to use, and
such ease is determined both by the mechanical ergonomics of the hardware construction and the design
of its software user interface, both of which must make the IRA system 1 intuitive to use. This section
presents those novel mechanical aspects of the construction that contribute to its ease of use.
Fig. 13 presents an end view of the mouse 2. The tracking display 53 and its associated tactile
stimulators and LEDs are shown in the upper aspect, with the mode scroll-up button 103 and the mode
scroll-down button 105 on the forward and upper-forward aspects of the mouse.
Of note is a U-shaped slot 209 on the bottom of the mouse 2, that extends along the length of the
mouse 2 parallel to the longitudinal axis of the mouse 2. This slot assists users in aligning the mouse 2
parallel to the axis of cylindrical objects such as medical pill bottles and food cans. Using this feature,
the cylindrical object is placed with its axis parallel to the longitudinal axis of the mouse 2, and is placed
flush against the ridges flanking the U-shaped slot 209. In this orientation, the text on the cylindrical
object is located directly beneath the window 21. The operator may conveniently rotate or translate the
cylindrical object to be read within the U-shaped slot 209, using the slot 209 as a guide.
In Fig. 2, the end of the U-shaped slot 209 is shown terminated by a mouse cutout 213. The purpose
of the cutout 213 is to allow bottle caps and can rims to fit underneath the end of the mouse 2, so that text
near the end of the bottle or rim can be read through the window 21.
Many of the ergonomic features of the mouse 2 are easily appreciated in Fig. 6. The operator's index
or middle finger fits into the finger trough (which may be viewed in cross-section in Fig. 18) on the upper
surface of the tracking display 53. The thumb and either the middle or ring finger rest comfortably on
opposite sides of the mouse 2, over the buttons 11 on each side of the mouse. The tactile display
is located directly over the window 21, and therefore presents tactile information to the operator in
overlying alignment with the text to be read. With this overlying alignment, the kinesthetic
response necessary to track the mouse 2 along text is natural and intuitive, similar to tracing along a
vibrating line with an index finger. This is naturally performed by readers of Braille and mimics the
actions of small children learning to read.
Because the mouse 2 may be both translated and rotated while tracking text to correct skew alignments
of mouse and text, the bottom of the mouse 2 is made from a slick plastic with no preferred movement
directionality. In the preferred mode, the bottom surface of the mouse 2, corresponding to the ridges on
the sides of the U-shaped slot 209, is formed from a self-lubricating plastic, although other methods
may be employed, including the use of Teflon pad inserts.
The mouse 2 is wider in a posterior region 215, fitting under the ball of the operator's palm, than in a
forward region 217 gripped between the thumb and the middle or ring finger (see Fig. 1 and Fig. 6). This
tapering of shape has been shown to naturally counteract the skew rotation arising from forearm pivoting
when the device is scanned left or right along a line of text.
On the lower lateral aspect of the mouse 2 are a plurality of flared extensions 219 which are integral
with a plastic housing 211. The operator's thumb and middle or ring finger rest half on and half off these
flares during normal operation. This positioning of the fingers both provides the user rest and gives the
ability to modulate the friction and closely monitor the movement of the mouse over the surface to be
read.
The buttons 11 are located in positions such that the operating fingers rest naturally and comfortably
on them. The buttons 11 provide tactile feedback to the user when pressed through a distinctive tactile
click. The raised prominence of the buttons 11 allows easy finger positioning by operators, and the
elongated shape of the buttons 11 allows operators with different finger lengths and hand shapes to make
use of the buttons 11.
The speakers 13 within the mouse 2 are located in such a position that when the mouse 2 is
comfortably gripped with either hand, at least one speaker 13 projects towards the user through the gap
between the thumb and index finger. The audio speakers 13 are placed in the mouse 2, rather than in or
with the computer 3, so that the location of the sound output provides audio-spatial cues to the reader
about the location of text being read. In addition, this location minimizes the required audio volume, since
the operator's facial attention is naturally and directionally focused on the material being read.
The microphones 63 are placed in the front of the mouse 2, so that when the user wishes to speak into
either of the microphones 63, he merely raises the mouse 2 to his face, and one of the microphones 63
is naturally directed towards the operator's mouth.
All aspects of the external interface to the mouse 2, including the tracking display 53, the buttons 11,
and the speakers 13, are bilaterally symmetrical. This allows equal ease of use by either hand, so the
mouse may be used equally well by left- and right-handed users.
The use of the IRA system 1 is facilitated by the cord 5 having a limber construction without
physical memory. Such a cord is generally difficult to construct given the large number of
wires contained within the cord, and the nature of molded plastic sheaths, which stiffen with cold. Fig.
14 depicts a schematic of the cord 5 in the preferred embodiment. A braided cloth exterior 221 surrounds
an extension-limiting filament 225 and a plurality of wires 223, which carry the electrical signals between
the mouse 2 and the computer 3. The filament 225 and the sheath 221 are bonded at each end to each
other at a plurality of attachment points 227, with the filament 225 fully extended and the sheath 221
relaxed. It is a characteristic property of the braided sheath 221 that its diameter changes under tension,
and the attachment between the sheath 221 and the filament 225 is performed when the sheath 221 is not
under tension, so that its diameter is large and it does not cinch down on the filament 225 and the wires
223. The method of attaching the sheath 221 to the filament 225 at the attachment points 227 is by
thermoplastic fusion, although adhesive bonding or mechanical cinching are also practical.
Fig. 15 depicts the method of establishing the physical and electrical end connections for the cord 5,
constructed according to the method of Fig. 14. A connector 231 plugs into the computer 3 and transmits
signals through the wires 223. A connector bell 229 provides the mechanical interface and protective
means attaching the sheath 221 and the filament 225 through the attachment points 227. In the preferred
embodiment, the diameter of the strain relief 233 through which the cord 5 is threaded is smaller than the
diameter of the cord 5 at the attachment points 227, providing a secure physical grip on the cord. On the
other end of the cord 5, a strain relief 233 also provides an orifice which is smaller than the attachment
points 227, preventing the cord 5 from pulling through the strain relief 233. The internal wires 223 are
connected to a mouse end connector 235, which is fashioned to mate with a connector located on the
circuit board 207 located in the mouse 2.
While the IRA system 1 can be used as a device at a fixed location, it is generally intended for use
as a portable device. Fig. 1 depicts a number of features which contribute to its ease of portable use. The
computer 3 is normally protected and housed in a fabric cover 247 (as shown in Fig. 1), which is
comprised of a nylon cloth in the preferred embodiment, although other fabrics or materials such as leather
are suitable. The mouse 2 is protected and housed during periods of non-use in a mouse pocket 241, and
secured within the pocket 241 by a mouse pocket closure 243. The means of closure in the preferred
embodiment is a pair of mated Velcro strips, although adhesive bands, buttons, or mated snaps are also
suitable. The flexible cord 5, during periods of non-deployment of the mouse 2, is conveniently stowed
within a cord pocket 239. The pockets 239 and 241 are integrated into the construction of the cover 247, as
may be other storage pockets for miscellaneous utilitarian uses, such as for the user's wallet and keys. A
convertible strap 237 and a buckle 245 are connected to the fabric cover 247 in a reconfigurable manner,
such that the strap 237 may serve as either a belt or a shoulder strap for carrying the IRA system 1.
Computer Decoding of Text Images
Fig. 3 depicts a top-level software block diagram. Images of the text 7 are captured by the camera
25 and are passed to the computer 3 for processing. The computer 3 includes a number of separable
functional components, which may or may not be embodied in separate hardware components. The
camera image is initially placed into memory, either on the computer, or within a specialized hardware
component such as the aforementioned frame-grabber 191. This image includes an array of pixels, which
in the preferred embodiment are gray-scale pixels with a depth of 8 bits. In Fig. 3, the components
physically located within the computer 3 are located within the box designated as the computer 3.
An optional initial stage of image processing is pre-processing, which enhances the contrast of the bit
field, carried out by an image pre-processor 29. In the preferred embodiment, the image pre-processor
29 is implemented in software running on the computer circuit board 193, although it may alternatively
be implemented through a hardware pre-processor such as a XILINX FPGA or a digital signal processing
chip such as a Texas Instruments C80. This stage is needed because many optical character recognition (OCR)
software programs require high contrast images. A simple yet useful algorithm is to set a threshold value,
above which all pixels are set to white, and below which all pixel values are set to black. This works in
conjunction with the camera's automatic exposure control to create a generally satisfactory image for input
to an OCR program. The threshold value may be set at a fixed value, or may be dynamically adjusted to
accommodate variations in surface reflectivity or illumination intensity. The pre-processor 29 may be
either a software program carried out by the main processor of the computer 3, or may be a specialized
hardware image processor, such as a gate-array processor or other image processor specialized for such
a task.
The outcome of the pre-processing is an image with high contrast, which may optionally be
reduced in pixel depth from a gray-scale image to a single-bit binary image. Such binary images may be
very rapidly decoded by a variety of OCR programs.
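The thresholding step described above is simple enough to sketch directly. The following Python fragment is a minimal illustration only; NumPy, the function names, and the midpoint rule used for the dynamic threshold are assumptions of this sketch, not details given in the disclosure.

    import numpy as np

    def binarize(gray, threshold=128):
        # Map an 8-bit gray-scale image to a 1-bit black/white image:
        # pixels at or above the threshold become white (1), pixels
        # below it become black (0).
        return (gray >= threshold).astype(np.uint8)

    def midpoint_threshold(gray):
        # One possible dynamic adjustment: the midpoint between the
        # darkest and brightest pixels of the current frame, tracking
        # changes in surface reflectivity and illumination.
        return (int(gray.min()) + int(gray.max())) // 2

A binary image produced this way can be handed to the OCR stage as a compact bit field.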
The image is then presented to the OCR program 31 in order to decode the text. Such OCR programs
include the XIS OCR engine from Xerox and others that are widely available from a variety of software
vendors, including Caere Corporation, International Neural Machines, Mitek, and others. In the preferred
embodiment, we have used the Tiger OCR library available from Cognitive Technology Corporation. The
input to the OCR program 31 is a bitmap of the text to be read, and the output is a database of text
locational data 33 relating to text within the image. In the preferred embodiment, the text locational data
33 includes not only the identity of the individual text elements, but also the location of each text element,
the degree of confidence with which the character or symbology was identified, the font and point size
in which the text is rendered, and the degree (angle) of skew from the horizontal. In cases where the OCR
program does not provide all of this information, some of the missing information may be derived from
the provided information. For example, in the absence of skew information, the skew angle can be
computed by trigonometry from the relative locations of adjacent text elements.
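That trigonometric recovery amounts to taking the arctangent of the rise over run between neighboring text elements. A minimal sketch, with the coordinate conventions assumed:

    import math

    def skew_angle_degrees(x0, y0, x1, y1):
        # Estimate text skew from the baseline positions (x, y) of two
        # adjacent text elements; 0 degrees means a level line.
        return math.degrees(math.atan2(y1 - y0, x1 - x0))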
Symbols and characters which overlap the boundaries of the viewing field are discarded, in order to
prevent certain mistakes in OCR interpretation. For example, the character "O" which is at the boundary
of the viewing field may be truncated on its right side to produce an apparent "C". By
deleting characters that overlap the boundary of the viewing field, such mistakes are avoided. This
procedure is facilitated because the OCR program 31 returns the location of each text element.
In many cases, text will be presented to the IRA system 1 in an orthogonal rotation. For example,
a blind user picking up a document cannot easily determine whether the text is upside down or sideways
(in landscape mode). In such cases, it is of high utility for the IRA system 1 to alert the reader. To
perform this function, the pre-processor 29 in general computes a variability measure, which is
conveniently a sum of edges in the field of view 54. Moving across each row in the bitmap, a change
from a zero bit to a one bit, or from a one bit to a zero bit, increments a counter. This variability measure
may be computed by a variety of alternative means. If this variability measure exceeds a threshold, the
image can be said to contain contrast structure, generally either text or graphics. If alphanumeric
content is not recognized in the normal orientation by the OCR program 31, the bit-image is rotated in the
image pre-processor 29 by ninety degrees and presented again to the OCR program 31 for interpretation,
and so forth. If text content is not recognized in any of the four orthogonal rotations, then the user is
alerted by the vocalization of the word "graphics", for example. If the text is recognized in one of the
orientations other than the upright presentation, the user is alerted by the vocalization of the proper
rotation, such as "rotate left" or "upside down", to establish upright orientation.
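Both halves of this procedure, the edge-count variability measure and the rotate-and-retry loop, are easy to express in a few lines. The sketch below assumes NumPy and a stand-in ocr callable; neither name is part of the disclosure.

    import numpy as np

    def variability(bitmap):
        # Count horizontal bit transitions (edges) across every row of
        # a binary image; a total above some threshold suggests the
        # field of view contains contrast structure (text or graphics).
        edges = 0
        for row in bitmap:
            for a, b in zip(row, row[1:]):
                if a != b:
                    edges += 1
        return edges

    def find_orientation(bitmap, ocr):
        # Try the four orthogonal rotations until the OCR call reports
        # recognized text; return the number of quarter turns needed,
        # or None when all four fail (i.e., the image is graphics).
        for quarter_turns in range(4):
            if ocr(np.rot90(bitmap, quarter_turns)):
                return quarter_turns
        return None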
The IRA system 1 has two fundamental modes of operation: a search mode and a track mode. When
neither button 11 is pressed (thus, the system is in the search mode), software switch 41 is in position 40
to transmit signals from a center text Y-locator 35 to the tracking display 53. When the track mode is
selected by pressing and holding either of the buttons 11, the software switch 41 moves to position 42
and a software switch 43 moves from a null or unconnected position to position 44. The center text Y-
locator 35 determines and outputs the vertical location ("Y value") of the most vertically-centered line of
text. If multiple lines of text are located within the image, then the vertical positions of the letters in the
line closest to the vertical center of the field of view 54 are averaged. The OCR program 31 formats the
text into lines, along with the vertical position of each line within the field of view 54. The center text
Y-locator 35 determines which of these line positions is closest to that of the center of the field of view
54, and outputs that line's vertical position.
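The selection rule of the center text Y-locator 35 reduces to a nearest-to-center search over the line positions the OCR stage reports. A minimal sketch, with the function and parameter names assumed:

    def center_text_y(line_y_positions, field_height):
        # Return the vertical position of the line whose averaged
        # letter Y value lies closest to the center of the field of
        # view; returns None when no lines are present.
        if not line_y_positions:
            return None
        center = field_height / 2.0
        return min(line_y_positions, key=lambda y: abs(y - center))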
With either of the buttons 11 pressed so that the system is in the track mode, as each new image frame
is analyzed, a track line updater 36 determines if text patterns match those of previous elements in the
current tracking line. If so, redundant elements within the two patterns are eliminated, and the track line
identity is enlarged to include the new elements. For example, if the current track line includes the text
"This is the tim", and the newest frame includes the text "e time for all", the track line updater 36 tests
different registrations of the current track line and the newest frame for maximum correlation. At the
registration which yields the largest correlation, the lines are merged to form the new track line and
duplicate characters are dropped, and in this case the track line is amended to be "This is the time for all".
It should be noted that the OCR program 31 may return characters with a low certainty of
interpretation, and the OCR program 31 reports this lack of certainty. In such a case, the track line
updater 36 will replace the isolated instances of low certainty characters with a "wild-card" symbol,
permitting the track line updater 36 to match this wild-card symbol with any other symbol or text element.
This permits the correlation and merging activity to continue even in the presence of occasional low
certainty characters.
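A registration search of this kind can be sketched compactly. The scoring rule below (a count of matching characters, with the wild-card matching anything) is an assumption standing in for whatever correlation measure an implementation would use:

    WILDCARD = "?"

    def merge_track_line(track, frame):
        # Slide the new frame text over the current track line, keep
        # the registration with the most character matches, then merge
        # so that overlapping (duplicate) characters appear only once.
        def matches(a, b):
            return a == b or WILDCARD in (a, b)

        best_offset, best_score = len(track), 0
        for offset in range(len(track)):
            overlap = min(len(track) - offset, len(frame))
            score = sum(matches(track[offset + i], frame[i])
                        for i in range(overlap))
            if score > best_score:
                best_offset, best_score = offset, score
        return track[:best_offset] + frame

    print(merge_track_line("This is the tim", "e time for all"))
    # -> "This is the time for all"

When no registration scores any matches, the frame is simply appended, which is one plausible behavior for non-overlapping captures.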
The advantages of the technique used by the track line updater 36 for sensing text string overlap are
substantial. Instead of having to merge image fragments in image space to capture entire lines, the IRA
system 1 merges symbology strings in ASCII or "content" space, which requires many orders of
magnitude less computation than merging in image space.
The essential logic of tracking lines of text is to find strings of symbols which are contextually related
to one another. This relationship is defined, for purposes of the algorithm, to always involve physical
contiguity, that is, letters adjacent to each other or separated by language constructs such as punctuation
or small spaces between words. When the space between words exceeds some multiple of the average
symbol width, a break in the track line is registered. The incorporation of this logic into text line tracking
allows the IRA system 1 to disambiguate multiple columns of text, since the IRA system 1 stops talking and
tracking when excessive space is encountered during the course of a single push of the button.
Operationally, tracking is terminated by a space exceeding a multiple of the average symbol width, where
the multiple is generally between 3 and 4 text spaces wide. In the case of proportional text, where symbol
width is variable, the spacing multiple is increased to account for this intrinsic variation.
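The break test itself is a one-line comparison. In the sketch below, the 3.5x default sits inside the 3-to-4 range stated above, and the widening factor for proportional fonts is an illustrative assumption:

    def is_line_break(gap_width, avg_symbol_width,
                      multiple=3.5, proportional=False):
        # Register a track-line break when the gap between words
        # exceeds a multiple of the average symbol width; proportional
        # fonts get a wider tolerance because their symbol widths vary.
        if proportional:
            multiple *= 1.5
        return gap_width > multiple * avg_symbol_width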
A tracking text Y locator 37 outputs the average Y value of the text elements corresponding to the
current track line within the current image. The tracking text Y locator 37 uses the text position
information obtained from the OCR program 31 to determine the average vertical position of the current
tracking line in the field of view 54.
A track line outputter 39 outputs text elements newly added to the track line. When the new frame
text and the current track line text are merged, new text elements which are added to the current track line
are output.
In overview, when the button 11 is pressed and held, the IRA system 1 enters a mode ("track mode")
in which the central line of text is used and remembered as the tracking line. As long as the button 11
remains pressed, and the track line remains within the field of view 54, the IRA system 1 will attempt to
assemble new elements to that line, outputting their vertical coordinates to the tracking display driver 51,
and vocalizing any complete words which are encountered. This is accomplished through the software
switch 41 and the software switch 43, which are electronically manipulated via the button 11.
When the button 11 is not pressed, the IRA system 1 is in the search mode, in which the software
switch 41 channels the Y values of the most vertically centered line from the center text locator 35 to the
tracking display driver 51. This driver 51, in a manner to be described shortly, includes both software and
hardware elements that deliver the Y values of the most centrally located text in the window 21 to the user
through the tracking display 53, so that he may explore the locations of text elements relative to the current
mouse position. When the button 11 is pressed to enter the track mode, the software switch 41 channels
the Y values of the current tracking line from the track line Y locator 37 to the tracking display driver 51,
so that the user may manipulate the mouse 2 through control of his hand 57 to continue tracking the text
line chosen at button press.
In the search mode, the switch 43 deactivates any text output to the speech synthesizer 61. In the
track mode, the switch 43 channels text elements found in the current tracking line from the track line text
outputter 39 to the novel word detector 59. The novel word detector 59 determines when a word has been
defined by virtue of having a space or punctuation before and after, and further determines whether the
word is novel and has not been vocalized immediately prior (that is, repetitive with the previously vocalized
word). If both conditions are met, the letter string is transmitted to the speech synthesizer 61, a
software program, which may have hardware assistance, that computes and generates an electronic
waveform which, when played through the audio speakers 13, is heard by a human operator 55 as the novel
word. Such a speech synthesizer might include a Digital Equipment Corporation DECtalk hardware
device, or software programs such as AT&T's "Watson" that may be played through widely available
audio-output hardware such as a Sound Blaster or a digital-to-analog output device 185 connected to the
speakers 13.
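The two conditions the novel word detector 59 checks, a completed word and no immediate repetition, can be captured in a small state machine. This is a sketch only; the class name, the tokenization by whitespace and punctuation, and the speak callback are assumptions:

    import string

    DELIMITERS = set(string.whitespace + string.punctuation)

    class NovelWordDetector:
        # Buffers characters until a delimiter closes a word, then
        # passes the word on unless it repeats the word vocalized
        # immediately before it.
        def __init__(self, speak):
            self.speak = speak        # e.g. a speech-synthesizer call
            self.buffer = ""
            self.last_spoken = None

        def feed(self, char):
            if char not in DELIMITERS:
                self.buffer += char
                return
            word, self.buffer = self.buffer, ""
            if word and word != self.last_spoken:
                self.last_spoken = word
                self.speak(word)

    detector = NovelWordDetector(print)
    for c in "the quick quick brown ":
        detector.feed(c)              # prints: the, quick, brown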
The tracking display driver 51 presents information to the human operator 55, through tactile, visual
and audio feedback, to allow the directed manipulation of the mouse 2 through hand control 57 in
horizontal, vertical, and angular registration to the material being read. The inputs to the tracking display
driver 51 include the output from the center text locator 35, the track line Y locator 37, and a skew
detector 47. The tracking display driver 51 includes a software program that determines which tactile
stimulators and LEDs should be active and at which frequency each should be activated. The
tracking display driver 51 also includes hardware components within the mouse 2 which physically activate the
tactile stimulators and LEDs through electronic oscillators and transistor drivers in accordance with serial
commands from the computer 3. This combination of tactile, visual and audio stimulus serves to provide
the user with an intuitively understandable feedback mechanism to locate and track individual lines of text.
The tracking display 53 includes a set of six solenoids which actuate vibrating pins, hereafter known
as tactile stimulators, 65, 67, 69, 71, 73, and 75, that excite propriosensory responses in the distal pad of
the finger of the human operator 55. It is well noted in the scientific literature that different
proprioceptors in the fingertip respond to different frequency stimuli, and have different spatial
discrimination capabilities. By using an impulse displacement stimulation which excites more than one
type of proprioceptor, perception of the pins with maximum sensitivity and spatial discrimination is
achieved. In addition, a set of six LEDs, 107, 109, 111, 113, 115, and 117, corresponding to and mounted
adjacent to the tactile stimulators 65, 67, 69, 71, 73 and 75, are collaterally energized to provide visual
augmentation of the tactile stimulator feedback. Both the tactile stimulators and the LEDs are arranged
in two columns of three elements (see Fig. 6 and Fig. 13). Information is provided to the human operator
55 through both the pattern of tactile stimulators energized, as well as the frequency at which the tactile
stimulators vibrate. A collateral effect is the audio artifact generated by the vibration of the tactile
stimulators, which is perceived by the human operator as a variable frequency soft-toned "buzz." The
frequency of this buzzing provides additional centering cues to the human operator. In a later section, the
relationship of the patterns of tactile stimulator and LED feedback to the text locational data will be
presented in detail. It should be noted that alternative arrangements of tactile stimulators are within the
spirit of the invention. A higher density of tactile stimulators may provide more detailed information to
operators, and a single column of tactile stimulators has been demonstrated in a prototype to communicate
acceptable positional information. Even a single tactile stimulator provides important positional
information that is of sufficient assistance to operators for detecting and tracking text.
The tracking display driver 51 transmits "chirps" through the audio speakers 13 in response to the
skew detector 47, which detects when the text angle exceeds a threshold angle necessary for accurate
interpretation by the optical character recognition program 31. In the preferred embodiment, a threshold
angle of five degrees is used, although this angle is highly dependent on the specific OCR program 31
used. If such a threshold is exceeded, the human operator is informed through tactile, visual and audio
displays via the audio speakers 13 and the tracking display 53. The human operator 55 can then correct
the skew angle of the mouse 2 to the text 7 so as to eliminate the audio feedback.
Feedback Algorithm
Fig. 4 presents the preferred response of the tracking display 53 to text alignments within the field of
view 54 which is depicted schematically in alignment under the tracking display 53 when the button 11
is not pressed, so that the IRA system 1 is in the search mode. The text 7 is shown in a fixed position
in the figures, while the mouse is moved over the text by the human operator 55. The tracking display
53 is located on the mouse 2, whose field of view 54 moves relative to the text 7 in conjunction with the
mouse 2. When the button is not pressed, the IRA system 1 is in the search mode, in which the most
central recognized text in the field of view 54 is tracked. In this figure, only the tactile stimulators 65,
67, 69, 71, 73, and 75 are depicted, although corresponding LEDs would energize in coordination with
the tactile stimulators. The dashed grids designate the image sectors of the field of view 54 corresponding
to each pin. In Fig. 4a, no recognized alphanumeric text is present within the field of view 54. In such
an instance, no tactile stimulator is energized, as indicated by the unfilled pin symbols.
In Fig. 4b, alphanumeric text is recognized only within the right zone of the field of view 54. The
center-text Y locator 35 determines the Y value of the text located most closely to the vertical center. In
this case, the text substring "THE" of "THE QUICK BROWN FOX JUMPED" is bolded to indicate it is
within the field of view 54, and it is the text closest to the vertical center. The tracking display driver 51
directs the right-lower tactile stimulator 75 to energize, indicated by the designated tactile stimulator. In
all cases, the frequency of the tactile stimulator energization is inversely varied in relation to its distance
from the vertical centerline. In testing, it has been discovered that a minimum frequency of six Hertz (Hz)
when detection occurs at either the top or bottom extremes of the field of view 54 is a good choice for
the lowest threshold frequency. As the text elements move closer to the center of the field of view 54,
the frequency is exponentially increased to a maximum of sixty-five Hz to provide good tactile-frequency-
based center discrimination.
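An exponential mapping with those endpoints is straightforward to write down. In the sketch below, the 6 Hz and 65 Hz endpoints come from the text, while the particular exponential curve between them is an assumption:

    def stimulator_frequency(distance_from_center, half_height,
                             f_min=6.0, f_max=65.0):
        # Map a line's vertical distance from the center of the field
        # of view to a vibration frequency: f_max at dead center,
        # decaying exponentially to f_min at the top or bottom edge.
        fraction = min(abs(distance_from_center) / half_height, 1.0)
        return f_max * (f_min / f_max) ** fraction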
In Fig. 4c, alphanumeric text is recognized only within the right zone of the field of view 54.
However, the substring "THE Q" is bolded to indicate that it is the most vertically centered text, now
within the central-right zone, so that the tracking display driver 51 directs the energization of the right-
central tactile stimulator 73. Note that in this algorithm, even though text is identified in more than one
zone of the field of view 54, only one tactile stimulator of a column will be energized at one time. This
algorithm allows the human operator to focus on centrally-located lines, removing other, potentially
distracting tactile stimulus from their consideration. The stimulator tactile frequency, the audio frequency,
the stimulator position, and the LED color and position provide somewhat redundant sensory information.
These stimuli are fused together within the mind of the user, providing a more robust and intuitive
centering response. Another advantage is that the centering response of the operator is not eliminated even
if one of the sensory modalities is disrupted (e.g., by a poorly-centered finger pad on the tracking display
53).
In Fig. 4d, alphanumeric text is recognized in both the left and right zones of the field of view 54. The
most vertically centered text substring, "UICK BROWN F", spans both left and right in the field of view
54, and so both the left-central tactile stimulator 67 and the right-central tactile stimulator 73 are energized.
In this case, tactile stimulators in both left and right columns are energized, but as mentioned previously,
only one tactile stimulator in each column may be energized.
In Fig. 4e, alphanumeric text is recognized in both the left and right zones of the field of view 54. The
mouse 2 has moved lower on the text 7, so that the central substring of text now becomes "AZY DOG.",
present within both the left-central and right-central zones, and so both the left-central tactile stimulator
67 and the right-central tactile stimulator 73 are energized. In the search mode, the IRA system 1 freely
changes the text being tracked to match the identity of the most centrally located text within the field of
view 54. Thus, the change of the central line from "THE QUICK BROWN FOX JUMPED" to "OVER
THE LAZY DOG." will have been felt and heard by the operator as a ripple in stimulation frequency, but
it has no effect on which pins are energized in the tracking display 53 during that transition.
In Fig. 4f, alphanumeric text is recognized only within the left zone of the field of view 54. The
center-text Y locator 35 determines the Y value of the text located most closely to the vertical center, in
this case the substring "ULTY.", which is located in the upper left zone of the field of view 54. The
tracking display driver 51 directs the left-upper tactile stimulator 65 to energize.
As described above, the human operator 55 is alerted through audio signals to excessive text skew
angles identified by the skew detector 47. Alternatively, or in addition to this audio signal, the stimulation
pattern on the tactile display 53 may alert the human operator that the skew angle has been exceeded. In
Fig. 4g, although the text is centered, there is a skewed presentation angle. As before, the centered text
causes the left-central tactile stimulator 67 and the right-central tactile stimulator 73 to energize.
Additionally, the tracking display driver 51, with input from the skew detector 47, energizes the tactile
stimulators 69 and 71 to indicate the skew of the text. In order to distinguish the skew correction signal
in these tactile stimulators 69 and 71 from the central text locator signal in the other tactile stimulators 67 and
73, the energizing action of these tactile stimulators 69 and 71 is a periodic pulse coordinated with a
"chirp" audio output from the speakers, rather than a continuous vibration, as indicated by the unique
energization symbols for tactile stimulators 69 and 71.
Fig. 5 presents the preferred response of the tracking display 53 to text alignments within the field of
view 54 when the button 11 is pressed, so that the IRA system 1 is in the track mode. The text 7 is in
a fixed position in the figures, while the mouse 2 is moved over the text by the human operator 55. The
tracking display 53 is located on the mouse 2, whose field of view 54 moves relative to the text 7 in
conjunction with the mouse. In the track mode, the operator 55 has chosen to read a line of text, and has
indicated this desire to the IRA system 1 by pressing and holding the button 11 while drawing the mouse
2 across the text. To assist the operator in reading the line of text, at the moment of button press the most
central line in the field of view 54 is designated as the tracking line. The IRA system 1 maintains this
designation as long as the tracking line of text remains within the field of view 54. Through feedback to
the user through the tracking display, the IRA system 1 directs the user to move the mouse 2 so as to
maintain the tracking line centrally in the field of view 54.
In Fig. 5a, the text substring "THE Q" is located in the most central vertical position, as determined
by the track line Y locator 37, which transmits this information to the tracking display driver 51. Pressing
and holding the button 11 designates this line as the tracking line, indicated by the bold font of the text.
Because the tracking line text is located only within the right-central field of view 54, only the right-
central tactile stimulator 73 is energized. The speaker 13 will have spoken "the."
In Fig. 5b, the tracking line "THE QUICK BROWN FOX JUMPED" spans both left and right central
zones of the field of view 54, and is recognized by virtue of the substring "UICK BROWN F" within the
field of view 54. Thus, the left-central tactile stimulator 67 and the right-central tactile stimulator 73 are
energized. As the track line retains elements captured since the button 11 push, "THE Q"
is also bolded. The speakers 13 will have spoken "the quick brown."
In Fig. 5c, the tracking line "THE QUICK BROWN FOX JUMPED" spans both left and right upper
zones of the field of view 54. Thus, the left-upper tactile stimulator 65 and the right-upper tactile
stimulator 71 are energized. Note that in the track mode, the tracking display 53 responds only to the line
designated and being constructed as the track line, for as long as this track line continues to remain in the
field of view 54, even though "OVER THE LAZY DOG." is currently in a more vertically central
location. The advantage of this algorithm is that it allows the user to recover from large mis-tracking
errors and return the track line to a central position in the field of view 54, so that he may read this line
continuously to its end. The speakers 13 will have spoken "the quick brown fox."
In Fig. 5d, the tracking line "THE QUICK BROWN FOX JUMPED" is now located only within the
left-upper zone of the field of view 54. Thus, the left-upper tactile stimulator 65 is energized. Note that
the tactile stimulator is energized even though only a single letter remains in the field of view 54. The
speakers 13 will have spoken "the quick brown fox jumped."
Obtaining a response from only a single tactile stimulator alerts the operator that the field of view 54
contains either the beginning or end of a single line, or a small, isolated text element. In Fig. 5a, the
response from a single tactile stimulator in the right column suggests that the field of view 54 contains
the beginning of the line of text. In the tracking display response shown in Fig. 5d, the response from
a single tactile stimulator in the left column suggests that the field of view 54 contains the end of a line
of text.
The tactile display 53 alerts the operator that the skew angle has been exceeded in track mode also.
In the tracking display response shown in Fig. 5e, the track line is in the central zone of the field of view
54, but there is a skewed presentation angle. As before, the track line text has been bolded and causes the
left-central tactile stimulator 67 and the right-central tactile stimulator 73 to energize. Additionally, the
tracking display driver 51, with input from the skew detector 47, energizes in a periodic pulse mode the
tactile stimulators 69 and 71 to indicate the skew of the track line text. As with the search mode, in order
to distinguish the skew correction signal in the tactile stimulators 69 and 71 from the central text locator
signal in the tactile stimulators 67 and 73, the energizing action of the tactile stimulators 69 and 71 is a
periodic pulse coordinated with a "chirp" audio output from the speakers 13, rather than a continuous
vibration, as indicated by the unique energization symbols for tactile stimulators 69 and 71.
When multiple lines of text are being read, the operator 55 will want to be able to find the next line
of text following the one which the IRA system 1 has been tracking. In order to do this, he must by some
means reposition the mouse 2 one carriage space below the beginning of the previously tracked line.
This is accomplished by a variety of means. In the first (manual) means, the operator 55 may leave a
positioning finger of the hand not operating the mouse 2 at the beginning of the currently tracked line.
When the end of the line is reached, the mouse 2 is repositioned relative to this finger and then shifted
down one line. Before this new line is tracked, the positioning finger is recentered on the current line with
the IRA system 1 mouse. This technique, though manual, is very intuitive and can be fairly compared
with a child's beginning reading technique.
In the second (computer-assisted) means, the IRA system 1 may intelligently assist this manual
technique with an indication of the beginning of a line previously read. When the button 11 is pressed,
the IRA system 1 commences to remember the entire line of text read during the button push, including
the first word of the line. When the button is released, the IRA system 1 goes into the search mode, as
described above. During this search mode, if the first words of the most recent track line, which are stored
in IRA system 1 memory, are encountered, the operator is alerted with a combination of background-level
audio, visual, and tactile stimuli. These stimuli include a brief flash of all LEDs, a brief and unique chirp
audio output from the speakers 13, and a brief pulse from all tactile stimulators on the tracking display
53. This background is overlaid on the conventional tracking driver output. It is understood that within
the spirit of this invention, these stimuli may take many forms. When the operator encounters and
recognizes the first word of the line previously tracked, he may then index the mouse 2 down to the next
adjacent line to begin reading text from this next line.
It is understood that different information can be communicated from the tactile stimulators to the
operator through different vibratory modes. For example, in order to distinguish between the central tactile
stimulators 67 and 73 and the other tactile stimulators, the vibrational frequency of the central tactile
stimulators may be different from that of the other tactile stimulators. This additional communication
mode enhances the human operator's ability to intuitively respond to the information, particularly since
propriosensation in the fingertips has low spatial discrimination. In our experience, frequencies in the
range of 6 to 65 Hz offer both authoritative sensation as well as a range which may be easily discriminated
by users. Square-pulse energization provides an impact wave which appears to be well sensed by users
in both position and frequency.
Alternative Modes of Operation
In different circumstances, the operator may wish the IRA system 1 to specialize its function. This
requires the IRA system 1 to enter different modes of operation. In order for the operator to select
between these different operational modes, a number of different input control means are available. The
preferred methods of operator mode selection are the use of mode selector buttons on the mouse 2, as well
as verbal mode selection governed through the computer 3 via a voice recognition program.
Fig. 6 depicts an isometric view of the mouse 2. The tracking display is located on the upper forward
aspect of the mouse 2, and forms a curved depression into which the operator 55 places his index or
middle finger. The distal pad of the finger rests on the tactile stimulators 65, 67, 69, 71, 73, and 75, in
such a manner that the LEDs 107, 109, 111, 113, 115, and 117 may be viewed on each side of the finger.
The finger can be extended to depress the mode scroll buttons 103 and 105. The scroll buttons 103
and 105 control the selection of alternative modes of operation. The mode scroll-up button 103 steps
forward through the available modes, while the mode scroll-down button 105 steps back through the
available modes. Pushing both buttons simultaneously returns to the normal read mode, as has been
previously described. As alternative modes are selected, the entry of each mode is vocalized through the
audio speakers 13 by naming the mode entered.
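The scroll-and-announce behavior can be modeled as a small stateful dispatcher. This sketch is illustrative only; the mode list follows the section headings below, and the class and callback names are assumptions:

    MODES = ["normal read", "spell out", "bar code",
             "big print headline", "currency denomination",
             "medicinal", "read color", "braille read out",
             "cctv", "special search", "continuous read"]

    class ModeSelector:
        # Steps through the available modes and vocalizes each entry.
        def __init__(self, vocalize):
            self.index = 0
            self.vocalize = vocalize

        def scroll(self, step):
            # step is +1 for the scroll-up button, -1 for scroll-down.
            self.index = (self.index + step) % len(MODES)
            self.vocalize(MODES[self.index])

        def reset(self):
            # Both scroll buttons pushed together: normal read mode.
            self.index = 0
            self.vocalize(MODES[0])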
Alternatively, modes can be chosen vocally. The microphones 63 are located on the
forward, lateral aspect of the mouse 2. As the operator vocalizes the name of a mode, his voice is picked
up by the noise-canceling microphones 63 and sent via the cable 5 to the computer 3, where it is analyzed
by a voice recognition program such as the "Watson" program from AT&T. When the name of a new
mode is detected, the IRA system 1 confirms the mode selection by vocalizing the name of the mode
through the speakers 13.
Fig. 7 presents a flow diagram of the control and functionality of the alternative modes of operation.
Selection of alternative modes is made, as described above, through either the mode scroll-up button 103
and the mode scroll-down button 105, or through a voice recognition program 45 which uses input from
the microphones 63. This information is integrated through a software mode selector 145. Depending on
the mode selected, IRA system 1 function is altered as described below.
Spell Out Mode
In spell-out mode, the letters in a word are spelled out, rather than enunciated, by the IRA system 1.
This mode is of particular use when reading serial numbers, garbled text, foreign text, technical
documentation, or names. The speech synthesis algorithms in current use may fail with unusual spellings
or with the highly technical, often non-phonetic language frequently found in English text. When the
operator encounters a word which is not familiar to or understood by the operator, the
operator can have the IRA system 1 spell out the word. The internal operation of this mode is activated
through the mode selector 145, which changes the way in which the novel word detector 59 operates.
Instead of sending entire words to the speech synthesizer 61, individual letters are rapidly vocalized.
Bar Code Mode
In bar code mode, images from the image pre-processor 29 are sent not to the optical character
recognition program 31, but instead to a separate program which is specialized for the
interpretation of bar and space codes, such as are commonly provided on retail products, forms, etc. Bar
code interpretation programs, such as the PDF417 software produced by Symbol Technologies of
Bohemia, NY, are widely available, and may be adapted for incorporation into the IRA system 1; the
interpretation may also be performed utilizing a special library for the optical character recognition program 31. Using this
alternative means, the mode selector 145 alters one of a plurality of OCR recognition libraries 159 so that
the OCR program 31 is optimized to interpret bar code labels. In such a mode, the tactile stimulators
respond only when a bar code is within the field of view 54. The bar code mode of operation can be
coupled with a database, so that instead of simply returning the code digits to the operator, the identity
of the manufacturer and product are vocalized. Such a mode of operation is particularly useful for blind
operators in grocery and department store shopping, and may be modified for use by blind operators in
a variety of occupational contexts. It should be noted that alternative means of bar code decoding are
within the spirit of the invention.
Big Print Headline Mode
The optical character recognition program 31 is optimized in most circumstances for a field of view
54 that contains a multiplicity of letters. In addition, the optical character recognition program 31
generally recognizes letters within a certain range of font sizes. When reading very large text, such as is
found in newspaper or magazine headlines, book titles, building directories, restaurant menu headings,
elevator button designations, or supermarket displays, the fonts will be very large, and may be so large
that the individual symbols will not fit within the contact field of view 54 of the IRA system 1 camera
25. In cases where the letters fit within the field of view 54, big print headline mode will switch optical
character recognition program parameters so as to be able to interpret the large fonts encountered. In
certain cases, an image pre-processing software program may scale the image in order to make the
symbology more easily interpreted by the OCR program 31. In cases where the letters do not fit within
the contact field of view 54 of the IRA system 1 camera, the operator 55 may draw the mouse 2 back
from the surface, modify the focus of the lens 23 of the camera 25, and thus expand the field of view 54
so as to enable the letters to fit. The natural field of view 54 divergence within the preferred lens system
23 is thirty degrees, so that the area covered expands with increasing distance from the window. The OCR
recognition library 159 is altered in both cases through the mode selector 145 so that the OCR program
31 is optimized to interpret large font text.
Adjusting the focal plane of the IRA system 1 lens 23 may be accomplished by two means. First,
a lever 160 extending to the exterior of the mouse 2 may be provided and attached to the lens 23. This
allows the user to rotate the lens 23, which has a coarsely-threaded barrel, so as to adjust the position of the
lens 23 relative to the camera 25 to change its focus. Alternatively, the lever may pull additional optical
elements into or out of the optical path, thereby adjusting the optical focal distance.
Currency Denomination Mode
In many business and social contexts, blind and low-vision users will need to handle paper currency.
Such currency does not have standard font symbology that is interpretable by the OCR program 31. In
such cases, special symbology interpretation libraries must be utilized to distinguish between different
currency denominations. Thus, in the preferred embodiment, in currency denomination mode, the mode
selector 145 alters the OCR recognition library 159 so that the OCR program 31 is optimized to interpret
currency denominations. It should be noted that alternative means of currency denomination identification
are within the spirit of the invention. Instead of utilizing the alternative OCR recognition library 159,
images may be transferred to a currency identification program entirely separate from the OCR program
31. Such a program could, for example, scan for images which correlate with stored images of different
currency denominations.
Medicinal Mode
OCR programs are never 100% accurate, and in some cases they may correct for such inaccuracies
by checking individual words against dictionaries of words in common usage. Due to the necessity of
accurate and reliable interpretation of medical packaging information, it is desirable in this instance to
check the information read on medicinal packaging against standard medicinal usage. In medicinal mode,
the mode selector 145 alters an OCR program word-checking dictionary 157 to specialize in medicinal
vocabulary, or activates the OCR program dictionary 157 if such dictionary verification is not otherwise
used. In addition or alternatively, the mode selector 145 alters a plurality of OCR program parameters
155 to increase the confidence level at which the OCR program matches symbols, so that weak matches
are not vocalized to the user, preventing the transmission of false information.
Read Color Mode
Blind and low-vision users have expressed a strong interest in color discrimination and identification.
Many vision-impaired individuals wish to ensure that their dress is unobtrusive, and therefore desire to
have clothes that are color-coordinated or socks that match. Read color mode disables the optical character
recognition program; in its place, the IRA system 1 analyzes the color information, such as hue and
density, of the image in the field of view 54. This is accomplished, in a manner described in more detail
below, with a black and white camera using methods such as switched single-color illumination. In this
method, reflected light is quantified while the illuminators 15 are switched between different colors of
illumination. By comparing the light reflected under differently colored illumination, the hue and density of
the surface can be determined. The information is compared with a library of color information, and the
color designation is vocalized to the operator via the speech synthesizer 61 and the audio speakers 13.
In read color mode, the mode selector 145 activates a color discriminator 147 which, as a color is
identified, vocalizes the name of the color from a standard library of color terms through the speech
synthesizer 61 to the audio speakers 13.
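The comparison against a library of color information can be as simple as a nearest-neighbor match over reflectance measurements. In this sketch, the three-illuminant sample, the squared-distance metric, and the example library entries are all assumptions:

    def classify_color(reflect_red, reflect_green, reflect_blue,
                       color_library):
        # Pick the named library entry whose stored reflectance
        # triple lies closest to the measured sample.
        sample = (reflect_red, reflect_green, reflect_blue)

        def distance(entry):
            name, triple = entry
            return sum((a - b) ** 2 for a, b in zip(sample, triple))

        name, _ = min(color_library, key=distance)
        return name

    library = [("red", (0.8, 0.1, 0.1)), ("white", (0.9, 0.9, 0.9)),
               ("navy", (0.05, 0.05, 0.3))]
    print(classify_color(0.07, 0.06, 0.28, library))  # -> navy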
Braille Read Out Mode
Tactile stimulators 65, 67, 69, 71, 73, and 75 may function to display individual Braille characters,
since the standard Braille cell includes two columns of three rows, corresponding to the layout of the
tactile stimulators. In the preferred mode, when Braille read out mode is selected, the IRA system 1
searches and tracks text in the normal manner described above. After a line of text is tracked with the
button 11 depressed, the operator may access this information in Braille by depressing the button 11
rapidly twice ("double-clicking"), or alternatively by pushing both buttons 11 simultaneously, in which case
the letters of the previous track line are translated into Braille-equivalent patterns of vibrating tactile
stimulators and presented serially to the operator. In our experience, transmission rates of three to four
letters per second are practical for skilled Braille readers using a single vibro-tactile Braille cell, and these
rates are acceptable to these readers. In Braille read out mode, the mode selector 145 directs an output
switch 149 to redirect text output to a software Braille translator 151. The Braille translator 151 pauses
until the double-click input of the button 11 indicates that the operator is ready for Braille output, at which
time the Braille translator 151 sends the sequence of letter Braille-equivalents for output to the tracking
display driver 51.
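The letter-to-stimulator translation is a direct table lookup. In the sketch below, the standard six-dot patterns for the letters shown are factual, but the assignment of dots 1-3 to stimulators 65, 67, 69 and dots 4-6 to stimulators 71, 73, 75 is an assumption for illustration:

    # Dot numbering of a standard six-dot Braille cell:
    #   1 4
    #   2 5
    #   3 6
    BRAILLE_DOTS = {          # a few letters for illustration
        "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5},
        "e": {1, 5}, "f": {1, 2, 4},
    }

    PIN_FOR_DOT = {1: 65, 2: 67, 3: 69, 4: 71, 5: 73, 6: 75}

    def braille_pattern(letter):
        # Return the set of tactile stimulators to vibrate for one
        # letter of the previous track line.
        return {PIN_FOR_DOT[d] for d in BRAILLE_DOTS[letter.lower()]}

    print(sorted(braille_pattern("c")))  # -> [65, 71]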
CCTV Mode
For low-vision users, table-mounted electronic magnification devices have been found to be quite
useful for reading flat documents, and are available from a number of manufacturers such as Xerox,
Telesensory and HumanWare. Portable electronic-magnification units are also available from companies
such as Magni-Cam, but suffer from the difficulty that hand tremors from individuals utilizing the devices
cause magnified movements in the field of view 54. Such tremors are particularly frequent in the elderly
population suffering vision loss from macular degeneration and other diseases of advanced age. If the
unwanted movements in the image resulting from tremors could be eliminated, such a portable
magnification system as part of the IRA system 1 would allow low-vision users to interpret photographs,
graphs, handwriting or fonts which are uninterpretable by the OCR program. Furthermore, the device
could serve medical purposes, such as self-examination.
In CCTV mode, the video signal is transmitted as in normal operation by the IRA system 1 camera
25 to the image pre-processor 29, which digitizes the image, converting the camera signal from analog to
digital information. In CCTV mode, the digital image, in addition to being transferred to the optical
character recognition program 31, is additionally transmitted to a CCTV driver 143 for display on a flat-
panel display 119 that is mounted to the face of the computer 3. In other embodiments, the flat panel
display 119 may alternatively be a computer CRT monitor or a standard television that is connected by
a video transmission cable to a connector on the computer 3. Fig. 8 depicts an Independent Reading Aid
with the flat panel display 119 mounted on the lateral aspect of the computer 3. A subject image 121 is
placed underneath the mouse 2, and the camera 25 transmits this image 121 to the computer 3. The
image 121 is received by the image pre-processor 29, where it is digitized, and in addition to being sent
for input to the optical character recognition program 31, the signal is also sent to the flat panel display
119, where a magnified image of the subject image 121 appears. The activation of the flat panel display
119 is normally controlled via software mode selection, although a hardware flat panel display switch 123
on the computer 3 case offers an alternative method of engaging magnified image augmentation.
The digitized image, before being transmitted to the flat panel display 119, may be modified by a
number of image enhancement algorithms. Such algorithms may include contrast enhancement, image
inversion, or color balancing, such as developed specifically for low-vision applications using the CE-3000
processor developed by Digivision of San Diego, CA. Because many elderly people suffer from hand
tremors, software within the IRA system 1 computer may also be used to stabilize the image, using
algorithms that are in common application in video surveillance, military target tracking, and video
camcorders.
Special Search Mode
In many circumstances, a user needs to search through a quantity of text in order to identify a specific
piece of information. In general, such information is accompanied by a special word or symbol. For
example, when shopping for clothes at a department store, the price will be the only information on a tag
preceded by a dollar sign -- through this search feature, prices could be rapidly located and vocalized.
In another example, a utility bill is characterized by a complex combination of text, tables, and
considerable extraneous information. Yet the amount to pay is commonly preceded by a dollar sign or
the word "pay", which can serve as a beacon for the user in finding this information. The special search
mode speeds the identification of sections of text that may subsequently be read in detail.
A list of default special words may be stored in or custom programmed into the IRA system 1 by the
user. When in the above-described normal search mode, the IRA system 1 captures each image frame
and interprets the image using the OCR program 31. Each set of OCR-interpreted text is scanned for
correlation with the list of default special words. Whereas in normal search mode no text is vocalized and
the primary feedback to the user is through the tracking display 53, in special search mode, when one of
the special words is encountered, its presence is announced to the operator by vocalization of the word
through the speech synthesizer 61 and the audio speakers 13. This allows the operator to rapidly screen
through text for symbols or words of particular interest. In special search mode, the mode selector 145
directs the novel word detector 59 to disregard any novel words unless they are also present in a special
search dictionary 153. Additionally, the mode selector 145 activates the software switch 43, so that text
from the track line text outputter 39 is continuously fed to the novel word detector 59, even when the
operator is operating the IRA system 1 in the search mode. If the user wishes to override or add to the
words or symbols in the special search dictionary 153, he may spell a special word or say a special symbol
into the microphone 63, which through the voice recognition program 45 is interpreted and transmitted
to the special search dictionary 153.
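The screening itself is a membership test over each frame's recognized words. A minimal sketch; the default word list here is an illustrative assumption:

    SPECIAL_WORDS = {"$", "pay", "total", "due"}

    def screen_frame(ocr_words, vocalize,
                     special_words=SPECIAL_WORDS):
        # Announce any recognized word that appears in the special
        # search dictionary while the operator sweeps over text.
        for word in ocr_words:
            if word.lower() in special_words:
                vocalize(word)

    screen_frame(["Amount", "to", "pay", "$", "42.17"], print)
    # prints: pay, then $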
Alternative special-search modes can discriminate on any basis other than content, including any
distinction that can be discriminated by the computer. Such distinctions could include font type, color,
or formatting, such as italicization or bolding.
Continuous Read Mode
When quickly scanning through a complicated or voluminous text, the user is often initially screening
the page for presentational word content, rather than tracking entire lines of text. In such a case, when
the IRA system 1 is in continuous read mode and the button 11 is depressed, the IRA system 1 does
not attempt to track lines of information, but rather speaks individual words whenever novel words are
encountered in the center zone of the field of view 54. In such a case, the IRA system 1 vocalizes words
as in track mode, but the tracking display 53 operates as if it is in search mode. In continuous read mode,
the mode selector 145 directs the track line text outputter 39 so that all complete words are output to the
novel word detector 59, whether or not they are located within the current contiguous track line.
Computer Interface Using The IRA System
The IRA system 1 may be used to access computer software and digital information from sources such
as the World Wide Web and electronic mail. This will be of great usefulness in the employment of blind
individuals, as computers are increasingly important in employment and everyday life. In addition,
emission sensor 27 is located in the mouse 2 within direct view of the text 7 to be read. The CRT
emission sensor 27 includes a photo diode 141 connected to the amplifier 125, which inputs the ambient
light measured by the photo diode 141 to a conventional CRT scan detection circuit 127 (constructed from
a phase lock loop IC CD4046 from National Semiconductor Corporation of Santa Clara, California) that
is adjusted to respond only to the common screen refresh rates of sixty to seventy-five Hz. This circuit
screens for periodic variations in the input light that are characteristic of common CRT screen refresh
rates. Televisions and video monitors typically have a characteristic and standardized illumination
frequency and duty-cycle. If the time variations correspond to the illumination signature of a cathode-ray
display terminal, a timing signal is transmitted to a camera timing driver 129, which extinguishes the
illuminators 15, so that this illumination does not compete with the CRT screen illumination, and
synchronizes the camera 25 image capture with the illumination period of the CRT scan. The
synchronization is accomplished by delaying image integration 135 until one of a plurality of text
illumination periods 131 is detected, as described below.
Fig. 10 graphically depicts timing relationships between the detected signal of the photo diode 141 and
the camera timing signals output by the camera timing driver 129. The vertical axis of the graph
represents the light intensity in the IRA system 1 field of view 54 detected by the photo diode 141. Given
the small field of view 54 of the camera 25, the view is illuminated for only a small fraction of the CRT
scan. The illumination periods 131 are interspersed with a plurality of quiescent periods 133, when the
field of view 54 is dark. During one illumination period 131, corresponding to the image capture period
135, the camera timing driver 129 directs the integration of the optical image by the camera 25. During
subsequent quiescent periods 133, the camera image is read out to the image pre-processor 29, followed
each time by a camera reset period 139. It should be noted that at common CRT refresh rates, the camera
25 can at most capture every second refresh of the CRT screen, so that during some illumination periods,
the camera 25 does not capture images.
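The hardware performs this detection with a phase-locked loop, but the decision logic can be illustrated in software: estimate the dominant flicker frequency of the photodiode signal and accept it only if it falls in the common 60-75 Hz refresh band. The sketch below is a frequency-domain approximation of that behavior, not a model of the CD4046 circuit itself.

    import numpy as np

    def detect_crt_refresh(brightness, sample_rate_hz):
        # Find the dominant flicker frequency in photodiode samples and
        # accept it only if it lies in the 60-75 Hz CRT refresh band.
        spectrum = np.abs(np.fft.rfft(brightness - np.mean(brightness)))
        freqs = np.fft.rfftfreq(len(brightness), d=1.0 / sample_rate_hz)
        peak = freqs[1 + np.argmax(spectrum[1:])]  # skip the DC bin
        return peak if 60.0 <= peak <= 75.0 else None

    # A synthetic 70 Hz flicker sampled at 1 kHz is detected:
    t = np.arange(1000) / 1000.0
    print(detect_crt_refresh(np.sin(2 * np.pi * 35 * t) ** 2, 1000))  # -> 70.0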
When the CRT emission sensor 27 no longer detects scan-characteristic illumination signatures, the
camera timing driver 129 re-energizes the illuminators 15 and returns the camera timing to normal image
capture mode. The use of the CRT emission sensor 27 and the camera timing driver 129 allows the IRA
system 1 to read text from CRT screens and other pulsed-illumination screens. The mode of IRA
system 1 operation with textual information displayed on CRT screens in this manner is identical to that
with textual information that the IRA system 1 would encounter on conventional printed surfaces.
In a second mode of operation, the IRA system 1 is physically connected to a target or host
computer so that it can both read and provide a point-click-and-drag capability like a computer mouse.
In this case, the IRA system 1 functions as a peripheral input/output device through a serial interface,
which is shown as a system depiction in Fig. 11. The computer 3 is connected through a communication
cable 161 to a target computer 163. The target computer 163 must be loaded with IRA system 1 interface
software to allow the interactions which will be described below. While the communications cable 161
may conveniently be a serial communications cable, so that it is more compatible with standard computer
mouse drivers, the communications cable may also be a parallel or other mode of communication so as
to take advantage of higher communication rates.
A coordinate pad 165 is imprinted with a regular grid of unique alphanumeric symbols. Such symbols
could be pairs of letters, in which all pairs in the first row have "A" as the first letter, and in the second
row, all pairs have "B" as the first letter, and so forth. In the first column, all pairs have "A" as the
second letter, and in the second column, all pairs have "B" as the second letter, and so forth. This two-
letter designation works well, since it can be easily interpreted by the OCR program 31, but
a variety of other alphanumeric and graphical symbologies are possible. The symbologies may be
graphical, in which case either a unique program must be utilized to interpret them, or the OCR
recognition library 159 must allow the OCR program 31 to discriminate the symbols.
The arrangement of symbols on the coordinate pad 165 is such that whenever the mouse 2 is located
on the pad, at least one of the unique symbols is fully within its field of view 54 and able to be translated
using the OCR program 31. As the mouse 2 is translated over the coordinate pad 165, the OCR program
31 determines the identity, location, and skew of the symbol within the field of view 54. Using this
information, the IRA system 1 can compute the location of the center of its field of view 54 relative to the
coordinate pad 165 frame of reference. This information is used as an absolute cursor positioning
coordinate, which is transmitted to the target computer 163 over the communications cable 161. As the
operator translates the mouse 2 over the coordinate pad 165, the position of the cursor on the target
computer 163 is continuously updated. It should be noted that this mode of operation is different from
that of most computer mouses, which function as relative positioning devices. The IRA system 1
computer input as described above functions more like that of a digitizing tablet, which sends
absolute positions to the target computer. Absolute positioning is important for blind and low-vision
computer users, since they do not receive visual feedback on the current location of the mouse.
Therefore, absolute positioning allows such users to rely on kinesthetic positioning cues, because the
position of the cursor on the screen is directly related to the haptic or tactile-kinesthetic-sensed position
of the mouse 2 on the physical coordinate pad.
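The geometry of the two-letter scheme makes the position computation straightforward. The following is a hedged sketch using a cell-center approximation only; the patent also uses the symbol's pixel location and skew within the field of view 54, which is omitted here.

    def pad_symbol_to_normalized_xy(symbol, rows=26, cols=26):
        # First letter encodes the row, second letter the column, per the
        # scheme described above; returns coordinates normalized to [0, 1].
        row = ord(symbol[0].upper()) - ord("A")
        col = ord(symbol[1].upper()) - ord("A")
        return (col + 0.5) / cols, (row + 0.5) / rows

    print(pad_symbol_to_normalized_xy("BC"))  # second row, third column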
So that the IRA system 1 can provide additional information about the current status of the graphical user
interface located on the target computer, it is preferred that the target computer 163 transmit pixel
information from the immediate vicinity of the cursor to the computer 3. The computer 3 will treat this
image data in the same manner as that gathered from operation of the camera 25. This means that the text
underneath the cursor can be converted into speech feedback for the operator, and further, that lines of text
can be tracked by the operator in conjunction with the operation of the tracking display 53. While it is
often the case that lines of text are already known to the target computer in line-oriented ASCII text format,
available for speech output on the target computer, an increasing amount of text information, such as that
located on the World Wide Web, is graphical in nature (bit-mapped), and unavailable as ASCII text to
normal target computer operation. Therefore, even when the target computer has software optimized for
the use of blind and low-vision operators, the described methods provide the only known access to these
modern information sources.
Use of the IRA system 1 as an input/output interface for graphical user interfaces on target computers
is enhanced by the use of the special OCR recognition libraries 159 that recognize specific features of the
graphical user interface, in the manner of the special search mode previously described. In addition to
standard fonts, these recognition libraries recognize such features as radio buttons, scroll bars, window title
bars, and special toolbar button icons. Special search mode in this case is activated in the manner
described above for alternative modes of operation. Note that this system as described allows blind or
low-vision users to identify objects, drag and place objects, pull down and select menus, and locate
and edit existing text, irrespective of the presence of adaptations of the operating system of the target
computer to assist low-vision or blind users.
The computer 3 determines the position of the mouse 2 on the encoded pad 165 and sends X-Y
coordinates, which are normalized to the pad dimensions, to the target computer 163. The target computer
163 knows the size of its own screen (in display pixels) and can compute (through direct scaling) the
corresponding X-Y position on its screen. Optionally, the target computer 163 may place its cursor at this
location.
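The direct scaling mentioned here amounts to a one-line mapping from normalized pad coordinates to screen pixels; a minimal sketch, with hypothetical names:

    def normalized_to_screen(x_norm, y_norm, screen_w, screen_h):
        # Direct scaling of normalized pad coordinates to display pixels.
        return round(x_norm * (screen_w - 1)), round(y_norm * (screen_h - 1))

    print(normalized_to_screen(0.25, 0.5, 1024, 768))  # -> (256, 384)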
The target computer 163 then provides the video data corresponding to each of the pixel values in
the nearby vicinity of the cursor (e.g., an array of pixels centered around the cursor position, proportional
in size to the field of view 54) to the computer 3. The computer 3 (through an alternate image mode of
the image pre-processor 29) enters this video data into the OCR program in between uses of the OCR
program to recognize the symbology on the coordinate pad 165. The interpreted data from this video data
is used to drive the tactile display 53 and to provide words and symbology (e.g., icons) for the IRA system
1 to process into audible words for the user to hear. In between the reading and interpretation of the video
data from the target computer 163, the OCR program's interpretations of the coordinates or coded
symbology on the coordinate pad 165 are not voiced or used to drive the tactile display 53. They are used
solely to derive the cursor position coordinates for the target computer 163.
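The alternation described in this paragraph can be summarized as a routing rule on OCR results. The frame tagging below is a hypothetical simplification of how the computer 3 might keep the two uses of the OCR program separate.

    def route_ocr_frames(frames):
        # Pad frames drive only cursor positioning (never voiced); screen
        # frames from the target computer feed speech and the tactile display.
        for source, ocr_text in frames:
            if source == "pad":
                yield ("cursor_only", ocr_text)
            else:
                yield ("voice_and_tactile", ocr_text)

    for action in route_ocr_frames([("pad", "BC"), ("screen", "File  Edit")]):
        print(action)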
Switched Single-Color Illumination
In many cases it is an advantage to be able to detect colored text on colored backgrounds, such as in
food packaging or magazine advertisements. Such text may be difficult to discriminate in white-light
imaging systems. In such a case, it is advantageous to use a multiplicity of colored LED lamps, such as red,
green, blue, and infra-red, which combine to function as the illuminators 15. Such a collection of colored
lamps may be used together or with one color at a time. For example, the computer may determine the
average contrast of the image elements using different color illuminations, and choose for further
processing those images with the highest contrast. In addition, the use of switched single-color
illumination provides the ability to distinguish colors within the image on the basis of differential
reflectivity in different color illumination.
Fig. 16 depicts a schematic of using switched single-color illumination to enhance the reading of
colored text. Illuminators 15 are replaced in this embodiment by banks of colored LEDs, comprising a
plurality of red LEDs 249, blue LEDs 251, green LEDs 253, and infra-red LEDs 255. A synch-stripper
257 extracts the timing information corresponding to exposure periods from the camera 25 by using
widely-available integrated circuitry, available as Part No. LM1881 from National Semiconductor
Corporation of Santa Clara, California, as well as auxiliary output from the frame-grabber 7. This information
is transmitted to a colored LED sequencer 259, which activates the illumination of the colored LEDs 249,
251, 253, and 255 to synchronize with the camera 25 exposure periods. The sequencer 259 is programmed
to activate one color set of colored LEDs for a period of time during each successive exposure period, so
that during each exposure, only a single color of illumination is used. This illumination period may be
adjusted to compensate for the reflectivity of the surface on which the text is illuminated, but in general
will be timed to be less than three milliseconds so as to prevent excessive blurring of the image from
movement of the mouse during the time of exposure. The sequencer 259 may be a separate electronic
circuit, but is conveniently embedded within the functions of the FPGA 173 located within the mouse 2.
Image output from the camera 25 is sent to the image pre-processor 29, which stores separately a red
field 261, a blue field 263, a green field 265, and an infra-red field 267. The pre-processor 29 assesses
the relative variations in intensity in the different fields, which correspond roughly to the contrast values
of the images. The better-contrast images are then sent to the OCR program 31 for OCR interpretation.
Further, the relative total brightness recorded in the fields 261, 263, 265, and 267 is used to determine
the dominant color in the field of view 54.
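The contrast assessment can be approximated by a simple intensity-variation measure per color field. Standard deviation is used below as a rough stand-in for whatever contrast metric the pre-processor 29 actually applies, which the patent does not specify.

    import numpy as np

    def best_contrast_field(fields):
        # Pick the illumination color whose captured field shows the
        # largest intensity variation (a rough proxy for contrast).
        return max(fields, key=lambda color: float(np.std(fields[color])))

    fields = {
        "red": np.array([[0, 255], [255, 0]]),        # high contrast
        "green": np.array([[120, 130], [125, 128]]),  # low contrast
    }
    print(best_contrast_field(fields))  # -> red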
The foregoing description is considered as illustrative only of the principles of the invention.
Furthermore, since numerous modifications and changes will readily occur to those skilled in the art, it
is not desired to limit the invention to the exact construction and process shown and described above.
Accordingly, all suitable modifications and equivalents may be resorted to that fall within the scope of the
invention as defined by the claims which follow.
Benefits and Advantages of the IRA System 1
The invention provides a number of advantages to low-vision and blind users:

• The IRA system 1 is portable, allowing users to read labels in food stores, price tags in department
stores, menus in restaurants, recipes and package instructions in kitchens, medicine bottles in
bathrooms, currency denominations in taxicabs, and schedules for buses. The IRA system 1 is
particularly amenable to miniaturization since it requires only inch-sized images to operate.
• The IRA system 1 furnishes users with spatial feedback on the location of text, as well as the text
content, thereby providing users with crucial information embodied within the layout of the text. For
example, using only information about the spatial distribution of words and not their specific identities,
users can tell the number of columns, the paragraph structure, the existence of isolated words (as in
a title), the location of a page number, the location of page headers, and more. This information is
critical to rapidly determine the general page content, whether the user would want to read the page,
and where the information lies. Consider a utility bill: this special format contains volumes of
information without interest to the typical user, who only wants to know the amount to pay. Using
the IRA system 1, the user may quickly "feel out" the page and read only the desired information.
Furthermore, to scan rapidly through a book for a specific page, the IRA system 1 does not require
the user to scan entire pages; the user can tactilely guide himself directly to the location in which the
page number is printed. The IRA system 1 does not simply read the text; it interacts with the user to
determine from spatial locations where the text that he wants to read is located. Current print reading
devices force the user to manually scan through the text by listening to large amounts of irrelevant
text.
• The physical shape and character of the mouse 2 allows the user to read text from curved and angular
surfaces on a variety of objects, such as medicine bottles, food cans, and soft packages, as well as flat
objects.
• If text is located on a fixed or difficult-to-move object, such as the product label on an appliance, the
buttons on a microwave oven, or a tag on a chair, the IRA system 1 can be taken to the text, rather
than requiring the text to be taken to the scanner.
• The IRA system 1 can read computer screens, allowing users to access such increasingly common
devices as automated teller machines, computerized library catalogs, and computer kiosks, not to
mention enabling access to personal computers without the need for handicapped-access software,
which, when available, is often inadequate.
• The IRA system 1 may be operated with one hand, leaving the other hand free to manipulate the
object being read. This coordination between the user and the object to be read results in significant
gains in both speed and usability.
• The IRA system 1 interface is intuitive in the presentation of tactile and aural feedback to the user,
coordinated with the location and content of the text. The camera 25 and speakers 13 are located in
the mouse 2, placed to capitalize on the user's natural attention being focused on the text. This placement
also provides aural as well as tactile or haptic feedback as to the location of the information. This
intuitive channeling of feedback is critical in making the IRA system 1 easy to learn and natural to
operate.
• Because the IRA system 1 requires no keyboard, no screen, and no large page-scanning apparatus, it
is inexpensive to produce, which is a particularly important characteristic for a device serving a
handicapped population having modest income.
The features and advantages listed above are, to our knowledge, not available in any existing device.
It is significant that this combination of features may contribute greatly to the employability and
independent-living capability of vision-impaired individuals.
An IRA system 1 prototype has been constructed with many of the features listed. In tests with both
young and elderly, and with both low-vision and totally blind individuals, the device was quickly learned
and accepted, allowing users to read common print on flat and curved surfaces. In testing of the device
sponsored by the Department of Education, four young, blind users gained the ability to use the device
on both flat paper and cylindrical objects within approximately 20 minutes of training. In testing
supported by the National Institute of Aging, seventeen users averaging 80 years of age evaluated the
device during 2-hour training sessions. Of these subjects, 76% found the device mostly or completely
easy to use, and 82% felt that the device would be useful in daily life and would wish to own one when
fully developed.
It should be apparent to one skilled in the art that the above-mentioned embodiments are merely
illustrations of a few of the many possible specific embodiments of the present invention. Numerous and
varied other arrangements can be readily devised by those skilled in the art without departing from the
spirit and scope of the invention, which is defined by the following claims.

Representative Drawing
A single figure representing a drawing that illustrates the invention.
Administrative Status


Event History

Description    Date
Inactive: IPC expired    2022-01-01
Inactive: IPC expired    2022-01-01
Inactive: IPC expired    2022-01-01
Inactive: IPC expired    2013-01-01
Inactive: IPC from MCD    2006-03-12
Inactive: IPC from MCD    2006-03-12
Inactive: IPC from MCD    2006-03-12
Inactive: IPC from MCD    2006-03-12
Application not reinstated by deadline    2001-02-12
Time limit for reversal expired    2001-02-12
Deemed abandoned - failure to respond to maintenance fee notice    2000-02-11
Amendment received - voluntary amendment    1999-03-22
Inactive: Delete abandonment    1999-02-22
Inactive: Abandoned - no reply to official letter    1999-01-19
Inactive: Acknowledgment of request for examination - no prior art document request    1998-12-14
Classification symbol changed    1998-11-06
Inactive: IPC assigned    1998-11-06
Inactive: IPC assigned    1998-11-06
Inactive: First IPC assigned    1998-11-06
Inactive: Notice - National entry - No request for examination    1998-10-19
Application received - PCT    1998-10-13
Request for examination requirements determined compliant    1998-10-13
All requirements for examination determined compliant    1998-10-13
Request for examination received    1998-10-13
Application published (open to public inspection)    1997-08-21

Abandonment History

Abandonment Date    Reason    Reinstatement Date
2000-02-11

Maintenance Fees

The last payment was received on 1998-12-07.

Notice: If full payment has not been received on or before the date indicated, a further fee may be charged, being one of the following:

  • reinstatement fee;
  • late payment fee; or
  • additional fee to reverse a deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type    Anniversary    Due Date    Date Paid
Basic national fee - small            1998-08-07
Request for examination - small            1998-10-13
MF (application, 2nd anniv.) - small    02    1999-02-11    1998-12-07
Owners on Record

The current and past owners on record are shown in alphabetical order.

Current owners on record
JAMES T. SEARS

Past owners on record
N/A

Past owners not appearing in the "Owners on Record" list will appear in other documentation within the file.
Documents

Document Description    Date (yyyy-mm-dd)    Number of Pages    Image Size (KB)
Description    1998-08-06    37    2,472
Abstract    1998-08-06    1    65
Claims    1998-08-06    4    199
Drawings    1998-08-06    15    469
Representative drawing    1998-11-11    1    16
Maintenance fee reminder    1998-10-18    1    110
Notice of national entry    1998-10-18    1    192
Acknowledgment of request for examination    1998-12-13    1    172
Courtesy - Abandonment letter (maintenance fee)    2000-03-12    1    183
PCT    1998-08-06    4    163
PCT    1998-09-28    4    160
Fees    1998-12-06    2    82