Note: Descriptions are presented in the official language in which they were submitted.
METHOD AND MEANS FOR PROCESSING SPEECH
This invention relates generally to speech processing,
and more particularly the invention relates to a
method and means for amplifying speech, such as for
the hard of hearing, without adversely affecting the
signal intelligence thereof.
It is well recognized that persons having sensorineural
hearing impairment generally have a very limited
dynamic range, that is, very little difference between
the intensity level of the softest speech of which
they are aware (e.g. speech awareness threshold or
SAT), the intensity level of speech which is most
comfortable for them (most comfortable level or MCL),
and the intensity at which speech becomes too loud to
be tolerated (uncomfortable loudness level or UCL).
It is generally agreed that it would be highly desirable
to reduce the wide range of speech intensity levels to
a more restricted range suitable for the sensorineural
hearing impairment of each individual listener.
Speech compression systems are known which employ
automatic gain control. However, prior art systems
employing peak clipping and instantaneous compression
produce harmonic distortion which tends to emphasize
the stronger, low-frequency components of speech and
to obscure the higher frequencies. A comprehensive
survey is presented by Braida et al. in "Hearing
Aids - A Review of Past Research on Linear Amplification,
Amplitude Compression, and Frequency Lowering",
American Speech-Language-Hearing Association, Rockville,
Maryland, April 1979. This survey provides an extensive
critical review of the compression literature in
conjunction with a tutorial on compression concepts.
The survey suggests that the lack of benefits from
compression as shown in the survey literature reflects
more a failure of researchers to adequately grasp the
concepts and complexity of compression, in theory and
implementation, rather than the potential benefit of
amplitude compression itself.
It is recognized that the acoustical patterns of
speech can be systematically analyzed in three primary
time-domain components: (1) a fine-temporal pattern
reflecting the spectral distribution of each brief
acoustic segment, (2) a gross-temporal pattern reflecting
the durations of the various acoustic segments
based on changes in fine-temporal patterns, and (3) a
time-varying amplitude pattern. The fine-temporal
cues from segments of speech as short as five or ten
milliseconds will often provide a listener with sufficient
information to identify the place of articulation
for consonants. Similarly, the gross-temporal pattern
will often provide sufficient information regarding
the manner of articulation, especially among the
classes of fricatives, affricates and stop-plosives.
The time-varying amplitude pattern, or "speech envelope",
is the natural result of a speech production process
but may convey mostly redundant information already
conveyed by a gross-temporal pattern. Robinson and
Huntington, in a talk before the Acoustical Society of
America in April, 1973, recognized that conventional
compression amplification introduces undesirable
distortion when brief time constants are utilized, and
reacts too sluggishly for longer time constants. A
method was proposed in which the average power of the
speech waveform over intervals of several tens of
milliseconds is measured continuously and is used to
determine the gain to be applied to the waveform at
the center of each interval, with the resulting amplitude
compressed signal being delayed by one-half the
length of the averaging interval. Preliminary results
from a computer simulation suggested that speech
intelligibility could be improved by this process.
However, further work was not undertaken by Robinson
and Huntington to develop the process.
An object of the present invention is an improved
method of processing speech to facilitate reception
without distorting the intelligible content thereof.
Another object of the invention is apparatus for
compressing speech patterns whereby the variations in
time-varying amplitude pattern or envelope are minimized
without adversely affecting the fine-temporal and
gross-temporal patterns of the speech.
The present invention is directed to a method and
apparatus for processing speech in which a time-varying
averaged or root-mean-square (RMS) amplitude
pattern is obtained and used to normalize the time-varying
amplitude pattern of speech and provide a
compressed speech pattern positioned between the
speech awareness threshold and the uncomfortable
loudness level, ideally at the listener's most comfortable
level. Spectral shaping is employed to emphasize
the high frequency content. The invention can be
implemented in a single channel or multi-channel
system. Suitable microphone means is employed to pick
up a speech pattern, and the speech pattern from the
microphone is preamplified and then processed by a
suitable shaping filter which emphasizes the high
frequency content thereof. The root-mean-square of
the amplitude of the spectrally shaped signal is then determined
over a specific time period, and the inverse of the root-mean-square
is then used to modulate the spectrally shaped signal, thus
producing a normalized amplitude. Importantly, the shaped signal
is delayed for a sufficient time period to compensate for the
time delay involved in the root-mean-square determination prior to
the amplitude compression. The resulting signal is thus compressed
and then adjusted to the desired hearing range with the spectral
shaping providing a retention of the fine-temporal pattern and the
gross-temporal pattern.
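The normalization just described can be illustrated with a minimal sketch in sampled form. The apparatus described herein is analog, so the function name, the trailing rectangular averaging window, and the parameters window, delay and floor are assumptions made for illustration only and are not taken from the description.

```python
import math

def compress_with_delay(samples, window, delay, floor=1e-6):
    """Divide a delayed copy of the signal by a running RMS value.

    window: number of trailing samples averaged by the RMS measure (assumed).
    delay:  number of samples by which the signal path is delayed (assumed).
    floor:  lower bound on the RMS value, avoiding division by zero (assumed).
    """
    out = []
    square_sum = 0.0
    for i, x in enumerate(samples):
        # Maintain the sum of squares over the most recent `window` samples.
        square_sum += x * x
        if i >= window:
            square_sum -= samples[i - window] ** 2
        count = min(i + 1, window)
        rms = math.sqrt(max(square_sum, 0.0) / count)
        # The signal path is delayed so the RMS value "sees ahead" of it.
        delayed = samples[i - delay] if i >= delay else 0.0
        out.append(delayed / max(rms, floor))
    return out
```

Under these assumptions a steady tone emerges at the same level whatever its input amplitude, which is the normalization the paragraph above refers to.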
Thus, in accordance with a broad aspect of the invention,
there is provided apparatus for enhancing speech comprising microphone
means for receiving audio signals and generating electrical
signals in response thereto, high frequency emphasis means connected
to said microphone means for amplifying said electrical signals,
amplitude detection means connected to said high frequency emphasis
means for receiving amplified signals and obtaining a root
mean square (RMS) amplitude of said amplified signals over a
selected period of time, delay means connected to said high frequency
emphasis means for receiving and delaying spectrally shaped
electrical signals for a portion of said selected period of time,
and signal compression means connected to said delay means and to
said amplitude detection means and compressing the delayed amplified
electrical signals by a ratio of at least 10:1 in accordance
with said root-mean-square average value whereby a constant amplitude
output signal without significant signal distortion is obtained.
In accordance with another broad aspect of the invention
there is provided, in speech enhancement apparatus, a method of
compressing the amplitude of audio signals without speech distortion
comprising the steps of obtaining a measure of amplitude of
said audio signals over a selected time period, delaying said signals
by said selected time period, and compressing said delayed
signals corresponding to said selected period of time by a ratio
of at least 10:1 in accordance with said measure of amplitude.
In accordance with another broad aspect of the invention
there is provided apparatus for processing audio signals comprising
a plurality of band pass filters for receiving and filtering
audio signals into a plurality of limited frequency bands, a
plurality of amplitude compressor means each connected with a band
pass filter, each of said amplitude compressor means including
means for obtaining a measure of amplitude of audio signals during
a selected period of time, means for delaying audio signals for
said selected period of time, and means for compressing the delayed
audio signals during said period of time based on said measure of
amplitude, and summing means connected to said plurality of amplitude
compressor means for receiving and summing compression of audio signals.
The invention and objects and features thereof will be more
readily apparent from the following detailed description and appended
claims when taken with the drawing, in which:
Figure 1 is a functional block diagram of a single channel
speech processing apparatus in accordance with one embodiment of
the present invention.
Figure 2 is a graph illustrating the compression of speech
in accordance with the present invention.
Figure 3 is a functional block diagram of a multichannel
embodiment of speech processing apparatus in accordance with the
invention.
Figures 4A and 4B are functional block diagrams of a tape
recording system in accordance with other embodiments of the invention.
Referring now to the drawings, Figure 1 is a functional
block diagram of a single channel speech processor in accordance
with one embodiment of the invention which has been built using
conventional, commercially available components. In this embodiment
a microphone 10
having a broad frequency response (e.g. an electret
microphone having a response of 100 Hertz to 10K
Hertz such as a Knowles E~ 1934) picks up audio signals
and transmits electrical signals to a pre-amplifier 12
having 26 dB of gain between 100 Hertz and 10K Hertz.
The amplified signal is then passed to high frequency
emphasis circuitry 14 (e.g. TL064 quad amplifier) which
provides 6 dB/octave gain over the range from 100
Hertz to two kilohertz and a flat response above two
kilohertz. An auxiliary input is provided at 16
whereby signals from a radio receiver, for example,
can be applied to the high frequency emphasis circuitry
14.
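As an illustration of the emphasis characteristic stated above, the following sketch evaluates the magnitude response of a first-order shelving network with corner frequencies at 100 Hertz and two kilohertz. The description gives only the slope and the two corner frequencies; the first-order shelf model and the function name are assumptions, not the actual circuit of the embodiment.

```python
import math

def emphasis_gain_db(freq_hz, f_low=100.0, f_high=2000.0):
    """Gain in dB of an assumed first-order shelf.

    The response rises at roughly 6 dB/octave between f_low and f_high
    and levels off above f_high.
    """
    rising = math.hypot(1.0, freq_hz / f_low)        # zero at f_low
    levelling = math.hypot(1.0, freq_hz / f_high)    # pole at f_high
    return 20.0 * math.log10(rising / levelling)
```

With these corner frequencies the shelf approaches 20·log10(2000/100), or about 26 dB, of high-frequency boost relative to the lowest frequencies.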
The signal from circuitry 14 is then passed to an RMS
detector 16. The signal from high-frequency emphasis circuitry 14 is also
provided to delay circuitry 18 having a delay equal to
the time constant of the RMS detector 16. In one
embodiment the RMS detector comprised an Analog Devices
AD 536A and the delay circuitry 18 comprised a Reticon
SAD 4096 bucket brigade device operated from an 80
kilohertz digital clock 20.
The delayed signal from the delay device 18 is then
applied as the numerator in a divider circuit 20 (e.g.
Analog Devices AD 535 precision divider) and the RMS
amplitude of the delayed signal is applied to the
divider 20 as a denominator. Accordingly, the output
from the divider 20 is a delayed amplitude compressed
signal which is applied to the receiver 22 (Knowles ED
1925).
Figure 2 is a plot of the compressed output level in
dB SPL for the signal applied to receiver 22 versus
the input level in dB SPL of the signal from the
microphone 10. For input levels below about 45 dB,
the output level is attenuated. At an input level of
45 dB, the output level is compressed and maintained
uniform at approximately 100 dB SPL which is the MCL
level. The compression ratio remains at 10:1 or
greater for input levels above 45 dB.
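The static input/output relation of Figure 2 can be sketched as a simple piecewise function. The 45 dB knee, the 100 dB SPL output level and the 10:1 ratio are taken from the text; the function name and, in particular, the fixed-gain (1:1) behaviour assumed below the knee are illustrative assumptions, since the description states only that lower levels are attenuated.

```python
def output_level_db(input_db, knee_db=45.0, mcl_db=100.0, ratio=10.0):
    """Approximate output level in dB SPL for a given input level in dB SPL."""
    if input_db >= knee_db:
        # Above the knee each `ratio` dB of input adds 1 dB of output.
        return mcl_db + (input_db - knee_db) / ratio
    # Below the knee: assumed fixed gain, so the output falls dB for dB.
    return mcl_db - (knee_db - input_db)
```

For example, under these assumptions a 50 dB rise in input above the knee raises the output by only 5 dB.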
Figure 3 is a multi-channel signal compression system
in accordance with another embodiment of the invention
in which signals are filtered and compressed in a
plurality of frequency bands. In this embodiment
signals from the microphone 30 are applied to a low
band (100-400 Hz) filter 32, a middle band (400-1,600
Hz) filter 34, and a high band (1,600-6,400 Hz) filter
36. Signals from each of the filters are passed to
amplitude compressor circuitry 38, 40, and 42. Each
of the compressor circuits includes delay circuitry,
RMS detector circuitry, and divider circuitry as
illustrated in Figure 1. Because each channel includes
a narrow band of frequencies, the high frequency
emphasis circuitry of Figure 1 is not required. The
compressed signals are then applied to a summing
amplifier 44 with the composite summed signal then
applied to the receiver 46.
Figures 4A and 4B are functional block diagrams of
other embodiments of the invention useful with tape
recorders and in which the compressed signal and the
detected RMS value are both recorded in time sequence
with a tape recorder. In Figure 4A signals from the
microphone 50 or other audio source are applied to
amplitude compressor 52 which may be a single channel
device as in Figure 1 or a multiple channel device as
in Figure 3. The compressed audio signal is then
recorded in an analog channel of the tape recorder 54,
and the detected RMS value is recorded in an FM channel
of the recorder 54. Thereafter, the recorded compressed
audio signal and the recorded RMS value can be applied
to a multiplier 56 from which the original audio
signal and the original dynamic range are produced.
The resulting decompressed signal is applied through
frequency de-emphasis circuit 58 to the receiver 59.
Figure 4B is a sampled digital recording system similar
to the analog system of Figure 4A. In this embodiment
signals from microphone 60 are applied to the amplitude
compressor 62, as in Figure 4A, and then the compressed
audio signal and the RMS value are converted to digital
form by analog to digital circuits 63 and 65. The
digital signals are then stored in digital recorder
64. The recorded signals are converted back to analog
signals by digital to analog converter 57 and multiplying
DAC 66. The decompressed signals from DAC 66
are frequency de-emphasized at 68 and then applied to
the receiver 69.
These embodiments of the invention are particularly
advantageous since tape recorders typically have a
limited dynamic range. Thus, by recording the compressed
audio signal and the RMS on the recorder, the full
dynamic range of the recorded signal can be reconstructed
in the multiplier 56 and multiplier 66.
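The record-and-reconstruct principle of Figures 4A and 4B can be sketched as follows, assuming sampled signals and an RMS track aligned sample-for-sample with the audio. The function names and the floor parameter are hypothetical, and the frequency de-emphasis stage that follows the multiplier is not modelled.

```python
def compress_for_recording(samples, rms_track, floor=1e-6):
    """Divide each sample by its RMS value before recording (assumed alignment)."""
    return [x / max(r, floor) for x, r in zip(samples, rms_track)]

def reconstruct(compressed, rms_track, floor=1e-6):
    """Multiply the recorded compressed signal by the recorded RMS value."""
    return [c * max(r, floor) for c, r in zip(compressed, rms_track)]
```

Because the multiplication undoes the division, the original amplitudes are recovered even though only the narrow-range compressed signal and the slowly varying RMS value passed through the recorder.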
In the preferred embodiments described herein, an RMS
detector has been employed. However, other measures
of the signal amplitude over a period of time, including
an average value and an approximation of the RMS
value, can be employed. As used herein, RMS value
includes suitable approximations thereof. Further,
while a divider has been employed in the preferred
embodiments for obtaining the compressed signal, a
logarithmic measure of the detected RMS or averaged
value can be employed for obtaining the compressed
value.
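One way a logarithmic measure might be used is sketched below: the detected level is expressed in decibels, a gain is computed in the logarithmic domain, and the result is converted back to a linear multiplier. The function name, the reference level and the floor are assumptions for illustration; the description does not specify how the logarithmic measure would be applied.

```python
import math

def gain_from_log_level(rms, ratio=10.0, reference=1.0, floor=1e-6):
    """Linear gain computed from the detected level expressed in dB.

    ratio:     compression ratio (input dB change per output dB change).
    reference: level left unchanged by the compressor (assumed).
    """
    level_db = 20.0 * math.log10(max(rms, floor) / reference)
    # Remove all but 1/ratio of the level's excursion from the reference.
    gain_db = -(1.0 - 1.0 / ratio) * level_db
    return 10.0 ** (gain_db / 20.0)
```

With ratio set to math.inf the gain reduces to reference/rms, which corresponds to the divider of the preferred embodiments; a finite ratio such as 10:1 leaves a small residual level variation.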
The invention has broad applications including, for
example, hearing aids and audio storage media (as
described herein), sampled digital storage systems,
broadcast systems, public address systems, and general
voice communication including telephones. The invention
is especially useful for communication in a noisy
environment and through a noisy communication link
such as in field applications.
Thus, while the invention has been described with
reference to specific embodiments, the description is
illustrative of the invention and is not to be construed
as limiting the invention. Various modifications and
applications may occur to those skilled in the art
without departing from the true spirit and scope of
the invention as defined by the appended claims.