Patent 1127764 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 1127764
(21) Application Number: 1127764
(54) English Title: SPEECH RECOGNITION SYSTEM
(54) French Title: SYSTEME D'IDENTIFICATION DE LA PAROLE
Status: Term Expired - Post Grant
Bibliographic Data
(51) International Patent Classification (IPC):
  • G10L 15/20 (2006.01)
  • G10L 25/78 (2013.01)
(72) Inventors :
  • SAKOE, HIROAKI (Japan)
(73) Owners :
(71) Applicants :
(74) Agent: SMART & BIGGAR LP
(74) Associate agent:
(45) Issued: 1982-07-13
(22) Filed Date: 1978-12-22
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data:
Application No. Country/Territory Date
157968/1977 (Japan) 1977-12-28
158819/1977 (Japan) 1977-12-27

Abstracts

English Abstract


Abstract of the Disclosure
In a speech recognition system for programming or commanding
computers and other machines, there is described a recognition system which
is less susceptible to breathing and other background noises. Speech-like
signal durations are detected from input signal waves from all sources. The
waveforms of the input signals are analyzed and recognition parameters are
extracted. A spectral change detecting unit detects the magnitude of short-
time spectral changes from the speech-like signal durations, determining
thereby whether the speech-like signal is speech. A recognition unit re-
cognizes speech patterns from signals within the speech-like signal durations
and rejects those signals which the spectral-change detecting unit indicates
do not contain speech. The parameter extracting means and the spectral change
detecting unit receive controlling signals from a control unit.


Claims

Note: Claims are shown in the official language in which they were submitted.


THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A speech recognition system comprising: a speech detecting unit for
detecting speech-like signal durations from input signal waves; means for
analyzing the waveforms of said input signal waves to extract recognition
parameters; a spectral-change detecting unit for detecting the magnitude of
short-time spectral changes in said speech-like signal durations and deciding
whether or not the integrated value of a short-time spectral change exceeds a
threshold value to thereby determine whether or not each speech-like signal
duration is speech and to deliver its output signal if the threshold value is
exceeded; a recognition unit which recognizes on the basis of said recognition
parameters speech patterns from signals fed from within said speech-like signal
durations and which rejects some recognition results on the basis of the output
signal received from said spectral-change detecting unit which indicates that
said speech-like signal duration does not include speech; and a control unit
for supplying control signals to said parameter extracting means and spectral-
change detecting unit.
2. A speech recognition system comprising: a speech detecting unit
for detecting speech-like signal durations from input signal waves; means for
analyzing the waveforms of said input signal waves to extract recognition
parameters; a voiced-speech duration detecting unit for determining whether or
not voiced-speech durations are present in each speech-like signal duration and
to thereby deliver the determination results as confirmation signals; and a
recognition unit which recognizes, on the basis of said recognition parameters,
speech patterns from signals input within said speech-like signal durations
and which rejects such recognition result on the basis of the confirmation signal
received from said voiced-speech duration detecting unit.

Description

Note: Descriptions are shown in the official language in which they were submitted.


This invention relates to the improvement in a speech recognition
system.
Speech recognition systems are useful not only as data input means
for computers but also as command means for various machines. They are
actually used, for instance, as a means to feed routing information into
automatic package sorting machines or as means to feed inspection data into
computers used in automobile factories and elsewhere, as described by Thomas B.
Martin in his article entitled "Practical Applications of Voice Input
Machines" published in the Proceedings of the IEEE, Vol. 64, No. 4, April
issue, 1976, pp. 487-501 (Reference 1).
One of the advantages resulting from the use of a speech recognition
system is that the operator can input information while simultaneously
doing something else with his hands and/or feet. If a wireless microphone
is used, he can operate the system even while walking around. These
advantages, which are not possible with manually operated conventional data
input means, such as typewriters, are unique to speech recognition systems.
However, if one tries to orally input data while doing something
else, breathing noises inevitably become louder and begin to adversely
affect the function of the speech recognition system. In a speech recognition
system, speech signal durations are detected by monitoring the amplitude
levels of a sound signal which is picked up and converted into electrical
signals by a microphone. Durations of sound signals where the amplitude
levels are higher than a predetermined threshold value are detected as speech
signal durations to be recognized and treated as such. However, when
breathing noise is so loud that its amplitude level exceeds the threshold
value, it will be detected as a speech signal duration. Accordingly, it is
desirable that breathing noises erroneously detected be rejected as not being a
real speech signal. There are other instances, however, where other types of
noise, including common background noises, are mistaken for real speech signals
having some meaning (hereinafter referred to as simply "speech signal" or
"speech"), and which noise signals may also cause faulty operation of the speech
recognition system.
An object of the present invention is to provide a speech recognition
system which is less vulnerable to breathing or background noise.
The present system comprises a speech detecting unit for detecting
speech-like signal durations from input signal waves; means for analyzing the
waveforms of said input signal waves to extract recognition parameters; a
spectral-change detecting unit for detecting the magnitude of short-time spectral
changes in said speech-like signal durations and deciding whether or not the
integrated value of a short-time spectral change exceeds a threshold value to
thereby determine whether or not each speech-like signal duration is speech and
to deliver its output signal if the threshold value is exceeded; a recognition
unit which recognizes on the basis of said recognition parameters speech patterns
from signals fed from within said speech-like signal durations and which rejects
some recognition results on the basis of the output signal received from said
spectral-change detecting unit which indicates that said speech-like signal
duration does not include speech; and a control unit for supplying control signals
to said parameter extracting means and spectral-change detecting unit.
The present system alternatively comprises a speech detecting unit for
detecting speech-like signal durations from input signal waves; means for analyz-
ing the waveforms of said input signal waves to extract recognition parameters;
a voiced-speech duration detecting unit for determining whether or not
voiced-speech durations are present in each speech-like signal duration and to thereby
deliver the determination results as confirmation signals; and a recognition
unit which recognizes, on the basis of said recognition parameters,
speech patterns from signals input within said
speech-like signal durations and which rejects such recognition result on the
basis of the confirmation signal received from said voiced-speech duration
detecting unit.
The present invention will be described in detail in conjunction
with the accompanying drawings in which:
Figure 1 is a block diagram of a first embodiment of this invention;
Figure 2 illustrates in further detail a part of the first embodi-
ment;
Figure 3 is a time chart illustrating the operation of the circuit
of Figure 2;
Figures 4 and 5 are block diagrams of other examples of a
spectral-change detecting unit of Figure 1;
Figure 6 is a graphic representation of the spectrum pattern of
voiced speech;
Figure 7 is a graphic representation of a typical short-time
autocorrelation function of voiced speech;
Figure 8 is a block diagram of a second embodiment of this inven-
tion;
Figure 9 shows in detail a block diagram of a voiced-speech dura-
tion detecting unit;
Figure 10 is a waveform illustrating the operation of the peak
detecting unit of Figure 9;
Figures 11 (a) and (b) are a series of waveforms illustrating the
recognizing operation of the second embodiment;
Figure 12 is a more detailed drawing of the peak detecting unit
shown in Figure 9; and
Figures 13 and 14 are flow charts of the operation of the recognition
unit used in the first and second embodiments.
A first embodiment of this invention will now be described, which
uses spectrum information as a reference for determining whether a given
input signal is speech or noise.
Referring to Figure 1, the present system is composed of a speech
detecting unit 20 for detecting speech-like signal (referred to as "SLS"
hereunder) durations from input signal waves, means 10 for analyzing the
waveforms of said input signal waves to extract recognition parameters, a
recognition unit 30 for recognizing speech signals in the detected SLS dura-
tions, means 40 for determining whether or not each of the SLS durations
is a speech signal duration by examining the variation of short-time spectrum
information in said SLS duration and by supplying the determination result
to said recognition unit, and a pulse generator 5 for supplying control
signals to said means 10 and 40.
Short-time spectrum information mentioned above is not limited
to what short-time spectrum is usually considered to include, but also
includes such parameters as short-time autocorrelation co-efficients,
short-time spectrum or linear predictive co-efficients which are similar
to short-time spectrum co-efficients, and other parameters such as formant
frequencies which are closely related to short-time spectrum.
Meaningful words such as geographical or personal names are
combinations of consonants and vowels, and their short-time spectra usually
manifest significant temporal changes. In contrast, breathing noises are
generated by the friction of air within the respiratory system, and the
short-time spectral change of their duration is comparatively small.
Background noises including the sounds of motors, welding or winds have relatively
constant spectra. Therefore, the adverse effect of breathing and background
noises can be effectively eliminated by the present speech recognition
system which comprises built-in means to determine short-time spectral changes.
Returning to Figure 1, an input analog sound signal s is subjected
to, for instance, 10-channel short-time frequency analysis by the spectrum
analyzer 10. An example of such an analyzer would be one composed of 10
band pass filters, 10 rectifiers, 10 smoothing filter circuits, a multiplexer
and analog-to-digital converter so that the spectrum envelope of the input
signal s can be described in 10 digital parameters. Now, the result of
short-time spectral analysis at the point of time i is represented by a
10-dimensional vector of
$$a_i = (a_{1i}, a_{2i}, \ldots, a_{ni}, \ldots, a_{10\,i}) \qquad (1)$$
Digital vector signals like Equation (1) are supplied by the
spectrum analyzer 10 at predetermined frame intervals (of 10 milliseconds,
for instance). The analyzer 10 may be the one illustrated in Figure 1
of the article entitled "Real-Time Recognition of Spoken Words" by Louis
C.W. Pols, published in IEEE Transactions on Computers, Vol. C-20, No. 9,
September issue, 1971 (Reference 2).
Thus, the result of the spectral analysis is multiplexed by said
multiplexer synchronously with a frame sync pulse φ1 generated by the control
unit 5 of Figure 1, and is converted into a digital signal by the
analog-to-digital converter. Incidentally, in every drawing hereinafter referred to,
thick lines represent the paths for 12-bit parallel binary signals and thin
lines represent either analog signals or one-bit binary signals. Signal
paths and signals may at times be represented by the same terms.
The analog input signal s is fed to the speech detecting unit
20, which is of the type described in U.S. Patent 3,712,959 which issued to
Ettore Fariello on January 23, 1973 (Reference 3). The unit 20 picks out
as an SLS duration a time segment that can be regarded as a sound signal by
determining the amplitude, zero-crossing and other characteristics of the
input signal s. The digital output signal p of the unit 20 is supposed to
have a value "1" within the SLS duration and a value "0" otherwise. The
time indicator i is hereinafter counted on the premise that it becomes
equal to "1" at the starting point of the speech-like signal duration. As
a result, signals in the SLS duration are a time series of the vector a_i
represented by:
$$A = a_1, a_2, \ldots, a_i, \ldots, a_I \qquad (2)$$
where I stands for the length of this SLS duration. Hereinafter, the
signals represented by Equation (2) will be called the input pattern A.
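The amplitude-threshold part of this detection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be one amplitude level per frame, the function name and the threshold value are hypothetical, and the zero-crossing and other cues used by the unit of Reference 3 are left out.

```python
def detect_sls_durations(levels, threshold):
    """Return (start, end) frame-index pairs for which the level exceeds threshold.

    `end` is exclusive. `levels` holds one amplitude level per frame, which is
    an assumed representation, not the circuit described in Reference 3.
    """
    durations = []
    start = None
    for i, level in enumerate(levels):
        p = 1 if level > threshold else 0      # detection signal p
        if p == 1 and start is None:
            start = i                          # p rises from 0 to 1
        elif p == 0 and start is not None:
            durations.append((start, i))       # p falls from 1 to 0
            start = None
    if start is not None:                      # input ended while p was still 1
        durations.append((start, len(levels)))
    return durations


levels = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0, 0.6, 0.05]
print(detect_sls_durations(levels, threshold=0.5))
```

Each returned pair delimits one input pattern A; loud breathing would produce such a pair just as speech does, which is the problem the remaining units address.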
The recognition unit 30 functions to recognize the signals in
the SLS duration designated by the signal p from the detecting unit 20,
i.e., the input pattern A of Equation (2), and determines the word
represented by the pattern. Though many different recognition principles
have been proposed for this recognition unit 30, any of them is applicable
to this invention. One conceivable example is the known pattern matching
method. By this method, a set of words to be recognized are determined
in advance, and the individual words in suitable parameters are stored as
reference patterns. As an SLS duration is detected and input, it is
described in the parameters thereby forming an input pattern. This input
pattern is subjected to pattern matching operation, i.e., compared with
said reference patterns and the reference pattern closest to the input
pattern is selected so that the input pattern can be identified with the
word represented by the selected reference pattern. The recognition result
is given as an output signal n. Said recognition unit 30 can be composed
in the same way as the MINICOMPUTER section in Figure 5 on page 492 of
Reference 1.
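As a rough illustration of the pattern matching method, the sketch below picks the reference pattern closest to an input pattern. The text does not say how patterns of different lengths are compared; dynamic time warping is used here purely as an assumption, and the function names and toy reference patterns are hypothetical.

```python
import math


def frame_distance(u, v):
    """Euclidean distance between two parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def dtw_distance(a, b):
    """Dynamic-time-warping distance between two sequences of vectors."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(input_pattern, references):
    """Return the word whose reference pattern is closest to the input pattern."""
    return min(references, key=lambda word: dtw_distance(input_pattern, references[word]))


# Toy two-channel reference patterns (illustrative values only).
references = {
    "one": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
    "two": [(5.0, 5.0), (5.0, 4.0), (4.0, 4.0)],
}
print(recognize([(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)], references))
```

Note that a matcher of this kind always returns some word, even for noise, which is why a separate speech/noise decision is needed before the result n is output.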
The spectral-change detecting unit 40 calculates the amount of
temporal change in the short-time spectrum signal supplied by the spectrum
detecting unit 10, i.e., the vector a_i in Equation (2). If the total
transition in the SLS duration, i.e., while the value of i varies from
1 to I, is greater than a predetermined threshold value, this SLS duration
is regarded as speech and a detection signal q is generated.
Next, the details of the spectral-change detecting unit 40 are
described referring to a time chart shown in Figure 3. The vector a_i output
from the spectrum analyzer 10 synchronously with the frame sync pulse φ1
is stored in a first register 41. Simultaneously the vector a_{i-1} which
had been stored in the first register 41 until immediately before is shifted
to a second register 42. A vector-to-vector distance calculating unit 43
calculates, as unit-time transition, the distance between the vector a_i
stored by the first register 41 and the vector a_{i-1} stored by the second
register 42, and gives a resultant signal d as output. Though different
definitions of the distance may be given, the Euclidean distance is used
herein. This unit-time transition d is integrated by the integrator 44
synchronously with a pulse signal φ2 which has the same frequency as said
frame sync pulse φ1 and is behind it in phase.
As soon as the signal p of said speech detecting unit 20 has
changed from "0" to "1", i.e., at the starting time of an SLS duration, a pulse
generating circuit 46 generates a starting pulse p1. The integrator 44 is
reset to "0" in response to the starting pulse p1. Next, the distance d
between the vectors a_i and a_{i-1} in the SLS duration, i.e., the short-time
spectral change, is integrated by the integrator 44. When the SLS duration
has been terminated, i.e., when said signal p has changed from "1" to "0",
the pulse generating circuit 46 generates an ending pulse p2. The
integrated value D retained by said integrator 44 is compared at a comparator
circuit 45 with a predetermined threshold value θ. It is supposed that
when D is greater than θ, the output signal k takes on the value "1" and
when D either is smaller than or equals θ, k is held to "0".
The logical product of this signal k and said ending pulse p2 is calculated
at an AND gate 47, and is supplied as the detection signal q to the
minicomputer unit built into the recognition unit 30. Incorporated into the
minicomputer unit included within recognition unit 30 is a program which
functions to output the recognition result signal n when triggered by the
input of the detection signal q, as indicated in Figure 13. The program
also functions to reject the recognition result, when no pulse q is input,
thereby preventing the output of a signal n.
Thus, a speech recognition system which is non-responsive to
breathing or background noise is achieved based on the short-time spectral
change in the SLS duration.
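The decision made by the unit of Figure 2 can be imitated in software as follows. This is a minimal sketch, not the circuit itself: frames are plain tuples, the function names and the threshold value are hypothetical, and the sum starts from the second frame of the SLS duration as a simplification.

```python
import math


def euclidean(u, v):
    """Unit-time transition d: Euclidean distance between consecutive frames."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def integrated_spectral_change(frames):
    """D: sum of the transitions d over one SLS duration (role of integrator 44)."""
    return sum(euclidean(frames[i], frames[i - 1]) for i in range(1, len(frames)))


def detection_signal_q(frames, theta):
    """Return 1 if D exceeds the threshold theta (comparator 45), else 0."""
    return 1 if integrated_spectral_change(frames) > theta else 0


steady_noise = [(1.0, 1.0)] * 5                   # constant spectrum, D is 0
speech_like = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]  # spectrum changes every frame

for name, frames in (("noise", steady_noise), ("speech", speech_like)):
    q = detection_signal_q(frames, theta=1.0)      # theta value is illustrative
    print(name, "accepted" if q == 1 else "rejected")
```

The recognition result would be passed on only for the pattern that yields q = 1.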
In the above-mentioned embodiment, since the spectral change
is evaluated in terms of the time-integrated value of the unit-time
transition d, the spectral change tends to increase with the length of
the SLS duration. The short-time spectrum of breathing or background
noise is always changing, though only slightly, on a unit-time basis.
Therefore, if breathing or background noise continues for a long time,
the integrated value of its unit-time transitions d tends to become greater
and the noise may be mistaken for speech.
Referring to Figure 4 which illustrates another structure of the
spectral-change detecting unit free from said shortcoming, the first register
41, second register 42, vector-to-vector distance calculating unit 43,
integrator 44, comparator circuit 45, pulse generating circuit 46 and AND
gate 47 function similarly to their respective counterparts in Figure 2.
An additional counter 48 is reset by the starting pulse p1 output from said
pulse generating circuit 46, and then counts the frame sync pulses φ1. As
a result, at the time when an SLS duration is terminated and the ending pulse
p2 is generated, a quantity ℓ, proportional to the time length of the SLS
duration, is stored in the counter 48. The integrated value D of the
unit-time transitions d stored in the integrator 44 is divided by the time length
signal ℓ counted by the counter 48. The average transition D' thus obtained
is supplied to the comparator circuit 45 to be compared with the threshold
value θ.
In this way, a spectral change averaged over a period of time is
used as the basis of distinction, thereby preventing continuing breathing
or background noises from being mistaken for speech. Incidentally, said
calculating unit 43 is composed of a subtractor to obtain the difference
between the input vectors a_i and a_{i-1} and an integrator to integrate the
absolute value of this difference.
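The effect of dividing by the duration can be seen in a small numerical sketch. The frame values and thresholds below are invented for illustration, the frame count stands in for the counter value, and the Euclidean distance is assumed as the unit-time transition.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def integrated_transition(frames):
    """D: sum of unit-time transitions d over the SLS duration."""
    return sum(euclidean(frames[i], frames[i - 1]) for i in range(1, len(frames)))


def average_transition(frames):
    """D' = D divided by the duration length, taken here as the frame count."""
    if not frames:
        return 0.0
    return integrated_transition(frames) / len(frames)


# Long noise whose spectrum jitters slightly on every frame.
noise = [(1.0, 1.0) if i % 2 == 0 else (1.1, 1.0) for i in range(100)]
# Short utterance whose spectrum changes strongly.
speech = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]

print("D  noise:", integrated_transition(noise), " speech:", integrated_transition(speech))
print("D' noise:", average_transition(noise), " speech:", average_transition(speech))
```

With these numbers the integrated value D of the long noise is larger than that of the short utterance, whereas the averaged value D' of the noise stays well below it, so a single threshold on D' separates the two.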
However, if spectral changes are detected in terms of the average
of unit-time transitions over a long period of time, changes which are
relatively great in particular parts are averaged with others, resulting in
a failure of the speech detection. For example, although the spectrum
changes relatively significantly between /i:/ and /z/ in the word "ease"
[i:z], the overall spectral change is not so great because the major part
of the speech duration consists of the sustaining vowel /i:/. Consequently,
the speech may be rejected.
Figure 5 shows another spectral-change detecting unit 40 designed
to overcome this problem. In the figure, the structural elements 41 through
49 are the same as their respective counterparts in Figure 4. The unit-time
spectral transition d is given to an additional comparator circuit 50 to be
compared with a second predetermined threshold value. The output signal of the
circuit 50 is "1" whenever the transition d is greater than this threshold
value; otherwise it is "0". A set-reset type flip-flop 51 is reset at the
beginning of an SLS duration by a starting pulse p1 generated by the pulse
generating circuit 46. Whenever said transition d exceeds the threshold
value of the circuit 50, the output of the circuit 50 becomes "1" and
consequently the flip-flop 51 is set, so that its output signal k' becomes "1".
This signal k' is led by way of an OR circuit 52 to the AND gate 47. Therefore,
if in an SLS duration there is even one point of time where the spectral change
is great, a detection signal q is generated. Thus, the spectral change detection
with high sensitivity can be achieved by the use of the circuit of Figure 5.
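The combined decision of Figure 5 can be sketched as the logical OR of the averaged test and a per-frame test. The names `theta` and `frame_threshold`, their values, and the toy frames imitating a long vowel followed by a short consonant are assumptions made for this example.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def detection_signal_q(frames, theta, frame_threshold):
    """Return 1 if the average transition exceeds theta OR any single
    unit-time transition exceeds frame_threshold; otherwise return 0."""
    if len(frames) < 2:
        return 0
    transitions = [euclidean(frames[i], frames[i - 1]) for i in range(1, len(frames))]
    k = 1 if sum(transitions) / len(frames) > theta else 0                  # comparator 45
    k_prime = 1 if any(d > frame_threshold for d in transitions) else 0     # flip-flop 51
    return k | k_prime                                                      # OR circuit 52


# Toy pattern: 50 frames of a sustained vowel, then 5 frames of a consonant.
ease_like = [(1.0, 0.0)] * 50 + [(0.0, 1.0)] * 5
steady_hum = [(1.0, 1.0)] * 55

print(detection_signal_q(ease_like, theta=0.5, frame_threshold=1.0))
print(detection_signal_q(steady_hum, theta=0.5, frame_threshold=1.0))
```

The vowel-dominated pattern fails the averaged test but passes because of its single large transition, while the steady hum fails both tests.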
The above description of the present invention with reference to
an embodiment thereof is not intended to limit the applicable range of this
invention. In particular, it is possible to improve the detecting performance
by inserting a low pass filter having a suitable time constant between the
vector-to-vector distance calculating unit 43 and the threshold value circuit
50 of Figure 5. Also, it is obvious that a combination of the spectral-
change detecting units of Figures 2 and 5 can be effectively used to comprise
this invention. Although the examples of Figures 3 through 5 are composed
of digital circuits, analog circuits can as well be used to achieve the same
function.
A second embodiment of the present invention will now be described
which distinguishes between speech and noise by determining whether or not
an SLS duration detected by the speech detecting unit 20 includes a voiced
component. Voiced sounds here refer to sounds generated by excitation of the
vocal tract by the oscillating wave of the vocal cords, and the usual vowels
and nasal phonemes are all voiced sounds. In contrast, unvoiced sounds are
excited by the friction or plosion of air flow in the vocal tract and are not
accompanied by the vibration of the vocal cords.
Because all meaningful words, such as numerals, geographical names
and personal names, contain vowels, any usual vocal signal contains a voiced
duration. On the other hand, breathing noise generated by the friction of
air flow in the mouth or nostrils is essentially unvoiced. Accordingly, by
detecting the presence or absence of a voiced speech duration in an SLS
duration, it is possible to determine whether the duration is speech or
breathing noise. The spectrum of a voiced sound excited by the vibration
of the vocal cords has a harmonic structure. As illustrated in Figure 6,
the structure has the frequency of vocal cords vibration as its fundamental
frequency (known as pitch frequency). In connection with this fact, the
short-time autocorrelation function of a voiced sound, as indicated in Figure
7, has a relatively high peak corresponding to the period of vocal cords
vibration or the pitch period. The spectrum of usual indoor noise in which
correlation is almost absent, has no harmonic structure, and no conspicuous
peak is observed in its autocorrelation function. Thus, non-correlated back-
ground noise closely resembles unvoiced sounds, and accordingly, can be
distinguished from meaningful sounds. Background noise does sometimes
include sounds having a harmonic structure such as those resulting from the
revolution of a motor. Such components of background noise can be
distinguished from speech to some extent. The pitch frequencies of normal human
voices are known to be within the range of 100 Hz to 350 Hz. Therefore, if
the pitch frequency of an input signal lies outside of this range, it is
different from a voiced sound in the usual sense of the term.
Figure 8 is a block diagram of a second embodiment of this inven-
tion based on the above-explained principle. In Figure 8, the same reference
numerals represent the same structural elements as in Figure 1, respectively.
Segments of the analog input wave s whose amplitude levels are higher than
a predetermined threshold level are detected by the speech detecting unit as
SLS durations. For each of these SLS durations, the digital detection
signal p is set to "1", and is reset to "0" in response to the termination of
the duration.
The recognition system 2 which includes the spectrum analyzer 10
and recognition unit 30 recognizes the SLS durations in said signal wave
s, i.e., the segments where the detection signal p is "1", and determines
the recognition results. A voiced-speech duration detecting unit 3 detects
the presence or absence of a voiced sound in each of the SLS durations in
the signal wave s, i.e., the segments where the detection signal p is "1",
and gives a confirmation signal q' of "1" or "0" corresponding to the presence
or absence of a voiced signal.
Figure 9 is a block diagram of the detecting unit 3 based on the
autocorrelation method, one of various detecting methods. Immediately after the
detection signal p of the detecting unit 20 has risen from "0" to "1",
a flip-flop 34 is reset to "0". The input signal wave s is low-passed by an
analog low-pass filter 31 having a cut-off frequency of 350 Hz, and supplied
to an autocorrelator 32. This autocorrelator 32 may be composed of the
analyzer illustrated in Figure 8 of the article by M.R. Schroeder entitled
"Vocoders: Analysis and Synthesis of Speech", published in the Proceedings
of the IEEE, Vol. 54, No. 5 (May issue, 1966) (Reference 4). This
autocorrelator 32 calculates the short-time autocorrelation function of the
input signal. The short-time autocorrelation function is defined as follows:
$$\phi(t, \tau) = \int_{t-\Delta}^{t} s(t')\, s(t' - \tau)\, dt' \qquad (3)$$
where s(t) represents the value of the input signal wave at the time t;
Δ, the integrated length of time; and τ, the delay time. Since the
pitch frequency normally ranges between 100 Hz (hertz) and 350 Hz, the pitch
period is from 2.9 to 10 milliseconds. Accordingly, the autocorrelation
function of Equation (3) is calculated with respect to the delay time τ
within the range of
$$2.9 \le \tau \le 10 \ \text{(milliseconds)} \qquad (4)$$
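The bounds of Equation (4) follow from taking the reciprocal of the pitch frequency limits. The short sketch below reproduces that arithmetic and converts the range into whole sample lags; the 8000 Hz sampling rate and the function name are assumptions introduced only for this example.

```python
import math


def pitch_lag_range(f_min_hz=100.0, f_max_hz=350.0, sample_rate_hz=8000):
    """Return (tau_min_ms, tau_max_ms, lag_min, lag_max) for a pitch range.

    The delay bounds are the periods of the highest and lowest pitch
    frequencies; the lags are those delays expressed in whole samples at the
    assumed sampling rate.
    """
    tau_min_ms = 1000.0 / f_max_hz
    tau_max_ms = 1000.0 / f_min_hz
    lag_min = math.ceil(sample_rate_hz / f_max_hz)
    lag_max = math.floor(sample_rate_hz / f_min_hz)
    return tau_min_ms, tau_max_ms, lag_min, lag_max


tau_min, tau_max, lag_min, lag_max = pitch_lag_range()
print(round(tau_min, 1), tau_max, lag_min, lag_max)
```

The shortest delay, 1000/350 milliseconds, rounds to the 2.9 milliseconds quoted in the text.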
This short-time autocorrelation function is calculated at every point
of time t (at a sample point of time in actual practice), and is fed as a signal
x to a peak detecting unit 33. A multiplexer is built into the output
section of the autocorrelator 32 which scans the autocorrelation function φ(t, τ)
with respect to the delay time τ and outputs its result. Therefore, the
input signal x of the peak detecting unit 33 has the waveform of Figure 10.
Referring to Figure 12, which shows the detailed structure of peak
detecting unit 33, the input signal x is divided by resistors R1 and R2,
and supplied to an AND gate 120 as a signal y. The resistances of the
resistors R1 and R2 are so set that the level of the signal y may become equal
to the turn-on threshold value of the AND gate 120 when the level of the
input signal x becomes equal to θ. A pulse series signal v is input
through a signal line 121, and a signal which becomes "1" within the range
defined in Equation (4) is given through a signal line 122. The circuit of
Figure 12 allows a pulse to be supplied as a peak detection signal m only
when the delay time τ is within the range defined in Equation (4) and the
input signal x has exceeded the threshold value θ. Therefore, if the
threshold value θ is appropriately set, a pulse signal is supplied as the peak
detection signal m only when the input sound signal s is a voiced sound.
The flip-flop 34 of Figure 9 is set to "1" by this pulse. Thus, if at least
one voiced part is present in an SLS duration, the output of the flip-flop
34, namely the confirmation signal q', assumes the value "1". Conversely,
if said short-time autocorrelation function φ(t, τ) does not exceed the
threshold value θ, the pulse of the peak detection signal m is not generated
and consequently the flip-flop 34 remains at "0". Therefore, when an SLS
duration is terminated and the detection signal p assumes the value "0", if
a voiced part is present in this SLS duration, the flip-flop 34 is in a
set state and the confirmation signal q' becomes "1". If no voiced component
is present, the flip-flop 34 remains reset and the confirmation signal q'
remains "0".
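A discrete-time sketch of this voiced-speech test is given below. It is not the analog circuit of Figures 9 and 12: the 350 Hz low-pass filter is omitted, Equation (3) is replaced by a sum over a window of samples, and the sampling rate, window length, threshold value and test signals are all assumptions chosen for illustration.

```python
import math
import random


def short_time_autocorr(s, t, lag, window):
    """Discrete analogue of Equation (3): sum of s[k] * s[k - lag] over the
    `window` samples ending at sample t."""
    return sum(s[k] * s[k - lag] for k in range(t - window + 1, t + 1))


def confirmation_signal(s, sample_rate_hz, theta, window):
    """Return q' = 1 if the autocorrelation exceeds theta at any lag that
    corresponds to a pitch of 100 Hz to 350 Hz, else 0."""
    lag_min = math.ceil(sample_rate_hz / 350.0)
    lag_max = math.floor(sample_rate_hz / 100.0)
    q_prime = 0                                   # flip-flop 34 starts reset
    for t in range(window - 1 + lag_max, len(s)):
        for lag in range(lag_min, lag_max + 1):
            if short_time_autocorr(s, t, lag, window) > theta:
                q_prime = 1                       # peak detected: set and stop
                return q_prime
    return q_prime


rate = 8000                                       # assumed sampling rate
tone = [math.sin(2 * math.pi * 200 * k / rate) for k in range(320)]  # 200 Hz "voiced" tone
rng = random.Random(0)
hiss = [rng.uniform(-1.0, 1.0) for _ in range(320)]                  # uncorrelated noise

print(confirmation_signal(tone, rate, theta=20.0, window=80))
print(confirmation_signal(hiss, rate, theta=20.0, window=80))
```

Because the raw autocorrelation scales with signal amplitude, a practical threshold would have to be set relative to the input level; the fixed value above only suits these unit-amplitude test signals.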
The program shown in Figure 14 is loaded into the minicomputer
used in the recognition unit 30 of Figure 8. The minicomputer receives the
confirmation signal q' after an SLS duration is terminated and said detection
signal p becomes "0". If q' is "1", this SLS duration is judged to be
speech, and the results of this recognition are fed to a signal line n.
If q' is "0", this SLS duration is judged to be noise such as breathing
noise, and a rejecting code is fed to the signal line n.
For the timed interrelationship between the detection signal p,
the operation of the recognition system and the confirmation signal q', two
processing procedures are adopted as illustrated in Figures 11 (a) and (b).
In accordance with the time chart of Figure 11 (a), the recognition unit
performs the recognizing procedure as soon as an SLS duration is detected
and the detection signal p is "1". Soon after the speech-like signal
duration is terminated and the recognizing procedure is completed (the
confirmation signal q' turns "1"), the recognition result is fed into the signal
line n.
Referring to the time chart of Figure 11 (b), the recognizing
procedure is performed only in the event that the confirmation signal q' is
"1" when an SLS duration is terminated and the detection signal p turns "0".
The recognition result is given as soon as the recognizing procedure is
completed. If the confirmation signal q' is "0" at the end of an SLS
duration, no recognizing procedure is performed.
The procedure of Figure 11 (a) has the advantage of being able
to be performed by a low-speed operation circuit because it permits a
comparatively long time to be taken for the recognizing procedure.
The procedure of Figure 11 (b), which is allowed little time for
the recognizing procedure when the recognition result has to be supplied
promptly after the termination of a voice duration, requires a recognition
unit composed of a high-speed operation circuit. However, when the
confirmation signal q' is "0", i.e., a given SLS duration has been determined
not to be speech, there is no need to operate the recognition unit. This
permits such blank periods of the recognition unit to be utilized commonly
by more than one recognition system.
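The difference between the two procedures can be summarized in a short sketch. The rejecting code value, the function names and the toy recognizer are hypothetical; the point is only the order in which the recognizer and the confirmation signal q' are consulted.

```python
REJECT = "REJECT"  # hypothetical rejecting code


def procedure_a(input_pattern, q_prime, recognize):
    """Figure 11 (a): recognition always runs; q' decides afterwards
    whether the result is output or replaced by the rejecting code."""
    result = recognize(input_pattern)
    return result if q_prime == 1 else REJECT


def procedure_b(input_pattern, q_prime, recognize):
    """Figure 11 (b): q' is checked first, so the recognizer is never
    invoked for a duration judged not to be speech."""
    if q_prime == 0:
        return REJECT
    return recognize(input_pattern)


calls = []


def toy_recognizer(pattern):
    calls.append(pattern)                 # record each use of the recognizer
    return "word-%d" % len(pattern)


print(procedure_a([1, 2, 3], 0, toy_recognizer), "recognizer calls:", len(calls))
calls.clear()
print(procedure_b([1, 2, 3], 0, toy_recognizer), "recognizer calls:", len(calls))
```

In procedure (b) the recognizer stays idle for rejected durations, which is the idle time the text proposes to share among several recognition systems.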
Although the present invention has hitherto been described with
reference to the embodiments, the structure of the voiced-speech duration
detecting unit 3 is not limited to that shown in Figure 9. It is possible
to distinguish between voiced and unvoiced sounds, for instance, depending
on the ratio between the spectrum of a range where a pitch is present (100
Hz - 350 Hz) and that of a whole range (100 Hz - 6,000 Hz, for example).
This method contributes to simplification of a speech recognizing system
using a filter bank analysis. The recognition result given in a speech-like
signal duration without speech can be outputted either by giving no signal
at all or by giving a rejecting code.
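The spectrum-ratio alternative might be sketched as follows for a filter bank that reports one energy value per channel. The channel centre frequencies, energies and ratio threshold are invented for illustration, and treating a high ratio as the sign of a voiced sound is an assumption; the text states only that the decision depends on the ratio.

```python
def pitch_band_ratio(band_energies, pitch_band=(100.0, 350.0), full_band=(100.0, 6000.0)):
    """Ratio of the energy in the pitch range to the energy in the whole range.

    `band_energies` is a list of (centre_frequency_hz, energy) pairs, an
    assumed representation of filter-bank output.
    """
    pitch = sum(e for f, e in band_energies if pitch_band[0] <= f <= pitch_band[1])
    total = sum(e for f, e in band_energies if full_band[0] <= f <= full_band[1])
    return pitch / total if total > 0 else 0.0


def is_voiced_by_ratio(band_energies, ratio_threshold):
    """Assumed decision rule: voiced when the pitch-band share is large."""
    return pitch_band_ratio(band_energies) > ratio_threshold


vowel_like = [(200.0, 8.0), (500.0, 4.0), (1500.0, 2.0), (4000.0, 0.5)]
breath_like = [(200.0, 0.5), (500.0, 1.0), (1500.0, 3.0), (4000.0, 4.0)]

print(is_voiced_by_ratio(vowel_like, ratio_threshold=0.3))
print(is_voiced_by_ratio(breath_like, ratio_threshold=0.3))
```

Since the channel energies are already produced by a filter-bank analyzer, this check needs no separate autocorrelator, which matches the simplification mentioned in the text.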

Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refers to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer , as well as the definitions for Patent , Event History , Maintenance Fee  and Payment History  should be consulted.

Event History

Description Date
Inactive: IPC assigned 2013-08-12
Inactive: IPC assigned 2013-08-12
Inactive: First IPC assigned 2013-08-12
Inactive: Expired (old Act Patent) latest possible expiry date 1999-07-13
Inactive: IPC removed 1984-12-31
Grant by Issuance 1982-07-13

Abandonment History

There is no abandonment history.

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
None
Past Owners on Record
HIROAKI SAKOE
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


Document Description    Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
Abstract                1994-02-21           1                  20
Claims                  1994-02-21           2                  43
Drawings                1994-02-21           8                  120
Descriptions            1994-02-21           16                 569