1175362
This invention relates to a method and
apparatus for enhancing the perception of acoustic
sound. More specifically, this invention relates to
a method and apparatus for enhancing sound from stereo
recordings.
Since the development of binaural or stereo
sound, many techniques have been proposed to enhance
the quality of the sound as it is perceived by the
listener. Commencing with the 1933 British Patent
394,325 to Blumlein, a sound enhancing technique is
described employing a pair of spaced microphones
whose output signals are applied to sum and difference
networks. The difference signal is applied to a
reactive network for the purpose of shifting the
phase of low frequencies to such an extent (about 90
degrees) that the phase differences at low frequencies
of the microphone signals are converted to amplitude
differences on the output of the Blumlein network.
The phase to amplitude conversion in the
Blumlein circuit is achieved by employing a substantial
phase shift at low frequencies. Such phase shift
corresponds with a large time delay. At high frequencies,
such as above about 700 Hz, the phase shift becomes
negligible and the Blumlein network reproduces the
input signals with substantially the same relative
amplitude relationship.
In British Patent 781,186 to Philip B.
VanDerLyn, a pair of input signals such as obtained
from spaced apart microphones are subjected to an
attenuating network whereby low frequencies are emphasized
without phase shift. In U.S. Patent 3,560,656 a
circuit is described whereby a monaural signal is
split into two separate components, one of which is
phase shifted from the other by a substantial amount
over a wide frequency range to simulate a stereophonic
signal.
The U.S. Patents 3,892,624 to Shimada and
4,069,394 to Doi et al describe stereophonic sound
reproducing systems wherein portions of the input
signals are cross-fed in out-of-phase relationship.
The '624 patent describes a specific technique
applicable when the speakers are located close to each
other in a common cabinet. The '394 patent teaches a
cross-feed network which also introduces a large
phase shift. Similarly, the U.S. Patent 4,194,036 to
Okamoto et al provides a cross-feed circuit
applicable to a stereo reproduction system employing
substantial phase shifts and thus also substantial
time delays. The U.S. Patent 4,027,101 to Feitas et
al teaches an audio system wherein concert hall
reverberation effects are simulated with substantial
time delays.
Many of these prior art techniques achieve
various degrees of quality in stereophonic reproduction.
Quality is a subjective standard because human
perception of sound and music is subjective. The
human hearing system, however, is on the average able
to distinguish various stereophonic reproduction
systems and assign quality evaluations to these. For
example, the sound may appear flat or full, directional
or omnidirectional and full of ambience. The sound
may appear blurred or the instruments clearly
distinguishable. On the whole the sound may appear
pleasant or disturbing. When experts provide rigid
technical criteria to characterize what kind of
sound is pleasing, effective or beautiful, they
usually rely upon a statistical sample of different
persons to establish their conclusions.
Various observations have been made and
confirmed by experimenters. For example, it is well
known and accepted that generally below about 700 Hz
the human hearing system derives directivity from
phase differences of the sound reaching a person.
Above that frequency directivity is derived from
amplitude differences. In the Blumlein patent,
advantage is taken of that information to enhance the
phase shift of microphone signals artificially at low
frequencies and thus achieve an alleged enhancement
in the quality of the sound.
The effects of sound upon a person and the
ability to discern different echoes and instruments and
recreate a mental image of the sound involve a
complex interplay between the hearing elements, i.e.
the earflap, canal, and ear drum, and the mind which
processes the information. Observations or experimental
investigations often, therefore, involve a large
number of observers to obtain a dependable statistically
valid conclusion. Examples of the types of analyses
of human acoustic responses can be found in publications
such as those of the Journal of The Audio Engineering
Society, see for instance the Haas effect article
republished at page 146 of the March 1972 (Volume 20)
issue, and a publication entitled Music, Sound and
Sensation by Fritz Winckel and published by Dover
Publications, 1967, a translation of a work by Max
Verlag. Of particular interest in the latter
publication is a graph shown in Fig. 111 wherein various
psychoacoustic effects are characterized as a
function of frequency and time. Thus, the refractory
period (the time while nerves do not respond to
stimuli) of the hearing nerves generally extends from
about 0.6 to about 1.3 milliseconds (ms) and stereo
delay times from about one to three ms.
With a network in accordance with the in-
vention, stereo signals may be processed in a particular
manner to achieve a remarkable enhancement in the
perception of the sound when it is projected by audio
transducers such as loudspeakers or earphones. The
enhancement of the stereo signals results in the
perception of a substantial spreading of the sound
wherein the instruments can be more clearly distinguished
and a highly pleasing effect is obtained.
As described herein for a particular
embodiment in accordance with the invention, an enhancement
in acoustic imagery is obtained by combining stereo
signals with delayed versions thereof and with a
predetermined cross-feed of out-of-phase portions.
The delays employed are of short duration, generally
selected to significantly enhance the spatial image
perceived from the audio transducers, and may generally
be shorter than the refractory period of
the human ear. The delays are preferably made
applicable to a low frequency segment of the stereo signals
to effect a subtle enhancement of the sound emanating
from the audio transducers. The cross-feed of the
out-of-phase portions is further selected in a
predetermined manner within a particular range to achieve
a highly pleasing effect whereby sound is perceived
as omnidirectional.
With signal enhancement techniques in
accordance with the invention, composite audio output
signals, in one embodiment thereof, are composed of
the original input signals, slightly delayed versions
thereof with low frequency emphasis and out-of-phase
cross-feed portions of the delayed and original input
signals. These composite signals may then be used to
make a recording such as on a magnetic tape or disc
medium or can be applied to audio transducers such as
speakers or earphones for acoustic projection or be
used for transmission over radio or television channels.
The enhancement of the acoustic imagery
perceived with the invention is remarkable and has
been immediately recognized by listeners. Descriptions
of their perceptions include: "sound appears
everywhere", "overall clarity is increased",
"retrieves detail from records that would not otherwise
be heard", "a clean bright sound", "it makes every
note sound necessary", "makes the sound come alive",
"the sound surrounds one", "creates a bright clean
listening experience", "the sound appears warmer and
at the same time brighter", "it brings out every
nuance of every instrument", "it adds a new dimension
to the sound and appears to 'fill up' the entire
room", "any stereo system can be improved with it".
It is, therefore, an object of the invention
to provide a method and apparatus for modifying audio
signals to enhance the acoustic imagery perceived
when the modified signals are projected by audio
transducers. It is a further object of the invention
to provide a method for making a recording of audio
signals which have been modified to enhance the
acoustic perception of audio played back from the
recording. It is still further an object of the
invention to provide an apparatus for enhancing the
playback of a stereo recording.
These and other objects and advantages of
the invention can be understood from the following
description of several embodiments described in
conjunction with the following drawings.
Fig. 1 is a schematic block diagram of an
apparatus in accordance with the invention;
Figs. 2 and 3 are plots of various
characteristics of an apparatus as shown in Fig. 1, with
Fig. 2 illustrating plots as a function of frequency
and Fig. 3 showing plots as a function of the
desirability of the sound enhancement effect achieved with the
invention;
Fig. 4 is a schematic block diagram of
another form of an apparatus in accordance with the
invention;
Fig. 5 is a passive circuit for enhancing
stereo signals in accordance with the invention; and
Fig. 6 is a diagrammatic plan view of a
sound projection system employing the invention.
With reference to Fig. 1, an apparatus 10
in accordance with the invention is shown. A source
12 of audio signals such as from a stereo receiver,
tape or disc playback device or the like provides a
pair of stereo signals A and B on lines 14, 16. The
source 12 is generally well known and need not be
described with further detail.
The stereo signals A, B are applied to
apparatus 10 which includes networks whose functions
are similar to those described in the aforementioned
Blumlein patent, except that certain parameters have
been selected to achieve the remarkable sound
enhancement in accordance with the invention.
Thus, the A, B audio signals are applied to
a summing network 18 which produces the sum A + B on
output line 20 and to a difference network 22 to
generate the difference A - B on output line 24. The
sum signal A + B is altered in magnitude by a modifier
network 26 having a transfer function G1 to produce
on output line 28 a signal V2. In the embodiment of
Fig. 1, the transfer function G1 is equal to about a
half so that the audio signal V2 has a value equal to
the average between A and B.
The difference signal A - B is applied to a
modifier network 30 having a transfer function G2 to
produce an audio signal V4 on output line 32. The
modifier network 30 has a transfer function selected
to introduce a small time delay, TD, for low
frequencies, a low frequency amplitude emphasis of a
predetermined amount and a predetermined gain. The
modified audio signals V2, V4 are applied to a summing
network 34 and difference network 36 to reconstruct
composite audio signals C, D on output lines 38, 40
formed of the original signals A, B, delayed versions
thereof and out-of-phase cross-feed portions.
The composite audio signals C, D are shown
applied to a dual channel amplifier 42 to drive a
pair of loudspeakers 44, 46. Alternatively, the
composite audio signals may be recorded on a tape or
master phonograph record from which duplicates may be
made following well known techniques.
The modifier network 30 introduces a time
delay for low frequencies following a frequency curve
generally as shown by curve 50 in Fig. 2. This shows
a time delay of the order of about 200 microseconds
below about 100 Hz and a gradual fall-off above 100
Hz to negligible values. Such time delay may be
achieved with a phase shift which varies in the
manner as shown with curve 52 in Fig. 2. The phase
shift is illustrated to vary as a function of frequency
from a low value of the order of about 2 degrees at
about 30 Hz to a maximum at 54 of about 9 degrees at
250 Hz and then back to a low level above that
frequency.
Modifier network 30 further provides a low
frequency emphasis or boost in a form as suggested by
curve 56 in Fig. 2. This low frequency boost is
preferably of the order of about 3 dB as compared
between signal levels at 50 Hz and about 1000 Hz with
a gradual roll-off as illustrated starting at about
100 Hz.
Network 30 further has a gain of such
magnitude that the output composite audio signals C and D
are provided with substantial portions of time
delayed and low frequency emphasized versions of the
original audio signals A and B respectively as well
as substantial portions of out-of-phase cross-feeds.
This aspect of apparatus 10 may be particularly
appreciated from the following analysis in which G1
and G2 are respectively the transfer functions for
modifier networks 26 and 30. Thus, the composite
signals C and D may be expressed as:
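$$C = G_1 (A + B) + G_2 (A - B) \qquad (1)$$
$$D = G_1 (A + B) + G_2 (B - A) \qquad (2)$$
which can be rewritten as
$$C = G_1 \left[ A \left( 1 + \frac{G_2}{G_1} \right) + B \left( 1 - \frac{G_2}{G_1} \right) \right] \qquad (3)$$
$$D = G_1 \left[ B \left( 1 + \frac{G_2}{G_1} \right) + A \left( 1 - \frac{G_2}{G_1} \right) \right] \qquad (4)$$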
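The equivalence of relationships (1)-(2) and (3)-(4) can be checked numerically. The following Python sketch is illustrative only: the function names and the sample values chosen for A, B, G1 and G2 are assumptions made for the check, not values taken from the described apparatus.

```python
import cmath
import math


def composite_direct(a, b, g1, g2):
    """Relationships (1) and (2): modified sum and difference recombined."""
    c = g1 * (a + b) + g2 * (a - b)
    d = g1 * (a + b) + g2 * (b - a)
    return c, d


def composite_factored(a, b, g1, g2):
    """Relationships (3) and (4): the same signals written in terms of G2/G1."""
    r = g2 / g1
    c = g1 * (a * (1 + r) + b * (1 - r))
    d = g1 * (b * (1 + r) + a * (1 - r))
    return c, d


# Hypothetical phasor values for A and B at a single frequency.
a, b = 1.0 + 0.0j, 0.3 - 0.2j
# Hypothetical transfer functions: G1 real, G2 complex with a small phase lag.
g1 = 0.5
g2 = 1.35 * cmath.exp(-1j * math.radians(9))

print(composite_direct(a, b, g1, g2))
print(composite_factored(a, b, g1, g2))
```

Both calls should print the same pair of complex values up to rounding, since (3) and (4) are only an algebraic regrouping of (1) and (2).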
In the embodiment shown in Fig. 1, the
transfer function G1 may be a real number and G2
complex to obtain the desired time delay. The ratio
G2/G1 is so selected that each composite audio signal
C and D includes substantial portions of out-of-phase
cross-feeds as well as delayed versions of the
original audio signals. Thus, composite signal C is
formed of the original audio signal A, its delayed
version as determined by the ratio G2/G1, and a
cross-feed of the B audio signal. The latter
cross-feed includes some of the original B audio
signal on line 16 and an out-of-phase portion whose
magnitude is determined by the ratio G2/G1.
In the embodiment of Fig. 1, the preferred
ratio of G2/G1 is a complex number whose values
depend upon the relative phase shift between G2 and G1.
Since the phase shift as illustrated with curve 52 in
Fig. 2 is small at about 1000 Hz, the ratio G2/G1 at
that frequency is primarily determined by the
absolute ratio of the amplitudes of the transfer
functions.
The selection of the low frequency boost,
the magnitude of the time delay, and the proportion
of delayed portions and out-of-phase portions (the
ratio G2/G1 at 1000 Hz) have been found to greatly
affect the imagery perceived of the projected sound.
The selection of these parameters preferably is made
on a subjective basis by comparison with a conventional
stereo acoustic projection of the original audio
signals A and B. Since the impression of such
comparison may vary between individuals, the optimum
values for the circuit parameters of apparatus 10 may
vary; however, the trend of such comparison does show
a peak in the desirability or pleasing nature of the
effect of the invention upon the listener.
Thus, with reference to Fig. 3, a plurality
of curves are shown wherein the independent variable
varies along the abscissa and the ordinate represents
a subjective impression of the perceived acoustic
imagery in terms of most effective at 100% and
diminishing effectiveness below that in increments of
25%.
When the ratio G2/G1 (relative to 1000 Hz)
was varied, an optimum range with a peak was found to
exist as illustrated with curve 60. Good acoustic
imagery was perceived with ratios for G2/G1 between
about 1.6 and about 3.2. Acceptable acoustic
enhancement may occur over a wider range of ratios,
say from about 1.4 to about 4.3 corresponding to an
estimated 25% achievement of the optimum enhancement.
This range of ratios provides that the cross-feed
portion (all of the B audio signal in the C composite
signal and all of the A signal in the D audio signal)
occupies from about 20% to about 65% in the composite
signals, with a preferred range from about 25% to
about 50%. At these G2/G1 ratios, the delayed and
undelayed versions of the audio input signals provide
from about 80% to 35% for a broad range and 75%
to 50% for a preferred range of the composite signals
with the balance provided primarily by the cross-feed.
At the lower G2/G1 ratios such as 1, normal
stereo projection is obtained while at higher values,
above about 4.3, special effects occur with
substantial balance shifts and generally result in sound
which is unacceptable to most people. A peak or most
pleasing effect was obtained at a G2/G1 ratio of
about 2.7 and with about 60 percent of the composite
signals represented by delayed and undelayed versions
of the original audio input signals and the balance
of about 40% contributed primarily by the out-of-phase
cross-feed portion.
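The dependence of the cross-feed on the G2/G1 ratio can be read off relationship (3), in which A is weighted by (1 + G2/G1) and B by (1 - G2/G1). The Python sketch below tabulates those two weights for the ratios mentioned in the text. It is a simplified illustration: the ratio is treated as a real number (phase shift ignored, as is approximately the case near 1000 Hz), G1 is normalized to 1, and the function name `channel_weights` is hypothetical. The last printed column is only one possible way to express the relative size of the cross-feed and is not claimed to reproduce the percentages quoted above.

```python
def channel_weights(ratio, g1=1.0):
    """Weights of A (direct) and B (cross-feed) in composite signal C per (3).

    `ratio` is G2/G1, assumed real here; `g1` is an assumed normalization.
    """
    direct = g1 * (1 + ratio)
    cross = g1 * (1 - ratio)
    return direct, cross


for ratio in (1.0, 1.4, 1.6, 2.7, 3.2, 4.3):
    direct, cross = channel_weights(ratio)
    print(
        f"G2/G1 = {ratio:3.1f}: A weight = {direct:+.2f}, "
        f"B weight = {cross:+.2f}, |B/A| = {abs(cross) / direct:.2f}"
    )
```

At a ratio of 1 the B weight is zero, which agrees with the statement that normal stereo projection is obtained there. For ratios above 1 the B weight is negative, i.e. the cross-feed is out of phase, and it grows in magnitude as the ratio increases.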
Substituting the optimum G2/G1 values in
the relationships (3) and (4) above yields in each
composite audio signal a substantial portion of an
out-of-phase time-delayed cross-feed. The amplitude
of the cross-feed may be in the range from about
fifty to about seventy-five percent of the magnitude
of the other part of the composite signal.
The slight time delay introduced in the
composite audio signals appears to add a significant
three dimensional perception of the sound. Enhanced
acoustic imagery can be obtained without time delay,
but preferably the delay is present for both the
in-phase and out-of-phase audio signal portions in
the composite signals. When the time delay is
increased beyond a value of the order of about a half
millisecond at a low frequency of about 30 Hz such as
with curve 50.1 in Fig. 2, the acoustic enhancement
effect appears to diminish. Similarly, below time
delays of about fifty microseconds, curve 50.2,
little acoustic enhancement can be perceived. Since
the increased time delays were accompanied by a
change in the frequency position of curve 56 in Fig.
2, i.e. higher for longer time delays, the longer
time delays also caused a noticeable undesired shift
in the tonal balance.
The perceived acoustic enhancement was
found sensitive to both the amount of low frequency
boost and the frequency below which the boost appeared.
Curve 62 illustrates a preferred frequency range of
between about 50 Hz and 200 Hz for the upper limit
of low frequency emphasis with about 100 Hz the optimum
value. The preferred magnitude of the low frequency
boost appeared as shown by curve 64 to be in the
range from about 2 to about 4 dB with 3 dB an optimum
value. Above 4 dB the frequency balance appears
upset, leaving a generally unpleasant effect. Below
2 dB, because of the nature of the circuit used, the
beneficial effect of the time delay is reduced.
With an apparatus 10 formed in accordance
with the invention and using the optimum parameters
as described, a significant enhancement in the
perception of the acoustic sound is obtained. When the
apparatus is used with audio signals from a stereo
record, it is as if previously unknown recorded
stereo information has been unlocked and a full
breadth of available sound information projected to
the listener. The modifier networks 26, 30 employed
in the apparatus 10 of Fig. 1 may be as illustrated.
Thus, modifier 26 is formed with an operational
amplifier 70 with a feedback resistor 72 of about 55
K ohms and an input resistor 74 of 100 K ohms,
yielding a transfer function value of G1 of 0.55.
The modifier network 30 can be formed with
an operational amplifier 78 having a reactive feedback
network 80. The latter network has an overall gain
controlling resistor 82, of the order of 100K ohms, in
parallel with a series circuit formed by resistor 84
and a capacitor 86. An input resistor 88 of the
order of 47.5K ohms sets the particular gain of the
amplifier 78. Resistor 84 controls the amount of low
frequency boost and phase shift and is of the order
of 250K ohms. The size of the capacitor determines
the frequency below which the boost occurs and to
some extent the amount of the time delay. The
capacitor value was 2200 picofarads (10^-12 farads). With a
network having these circuit values, optimum performance
curves such as 50 (time delay), 52 (phase shift), 56
(amplitude), and the optimum points on perception
curves of Fig. 3 were obtained. The time delay
curves 50.1, 50.2 and 50.3 were obtained by changing
the value of capacitor 86 respectively to 4400, 550
and 110 picofarads. When these different capacitor
values were used, the phase shift curve 52 retained
its general shape as shown in Fig. 2, but its
position in the frequency domain was shifted towards
lower frequencies for higher values of C and vice
versa.
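The behaviour of modifier network 30 with the listed component values can be estimated with a short calculation. The Python sketch below is a rough model under stated assumptions: the operational amplifier is taken as ideal and connected as an inverting stage, so that the gain magnitude is the impedance of feedback network 80 divided by input resistor 88, and the sign inversion of the stage is ignored. The text does not spell out this topology, so the model and the function names are illustrative rather than a description of the actual circuit.

```python
import cmath
import math

R82 = 100e3      # gain controlling feedback resistor 82, ohms
R84 = 250e3      # resistor 84 in series with capacitor 86, ohms
R88 = 47.5e3     # input resistor 88, ohms
C86 = 2200e-12   # capacitor 86, farads


def g2(freq_hz, c86=C86, r88=R88):
    """Assumed transfer function: feedback impedance / R88 (inversion ignored)."""
    s = 2j * math.pi * freq_hz
    z_series = R84 + 1 / (s * c86)            # resistor 84 in series with capacitor 86
    z_fb = R82 * z_series / (R82 + z_series)  # in parallel with resistor 82
    return z_fb / r88


def delay_seconds(freq_hz, c86=C86):
    """Phase lag at freq_hz expressed as an equivalent time delay."""
    return -cmath.phase(g2(freq_hz, c86)) / (2 * math.pi * freq_hz)


def boost_db(f_low=50.0, f_high=1000.0):
    """Low frequency emphasis: level at f_low relative to f_high, in dB."""
    return 20 * math.log10(abs(g2(f_low)) / abs(g2(f_high)))


for f in (30, 100, 250, 1000):
    h = g2(f)
    print(f"{f:5d} Hz: |G2| = {abs(h):.2f}, "
          f"phase = {math.degrees(cmath.phase(h)):.1f} deg, "
          f"delay = {delay_seconds(f) * 1e6:.0f} us")
print(f"boost 50 Hz vs 1000 Hz: {boost_db():.1f} dB")
```

Under these assumptions the model gives a delay of roughly 0.2 ms at 30 Hz, a phase lag of a little under 10 degrees near 250 Hz, and a boost somewhat under 3 dB between 50 Hz and 1000 Hz, which is broadly in line with curves 50, 52 and 56 as described. Calling `delay_seconds(30, 4400e-12)` or `delay_seconds(30, 550e-12)` shows how the delay scales with capacitor 86.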
Fig. 4 shows an alternate apparatus 90 in
accordance with the invention. In this device both
input audio signals A and B on input lines 14, 16 are
applied to difference circuits 92, 94 to generate
respectively difference signals A-B on output 96 and
B-A on output 98. Both difference signals are then
applied to modifier networks such as 30 having a
transfer function G2. The modified outputs on lines
102, 104 are combined with amplitude modified versions
of original audio signals A and B in summing
networks 106, 108. The latter's outputs 110, 112 bear
composite audio signals C' and D' similar to C and D
in Fig. 1. The composite audio signals may be used,
for example, to form a master record 114 on a
recording apparatus 116 for subsequent mass production.
The composite signals C' and D' may be
expressed in accordance with the following
relationship:
$$C' = G_1 \left[ A \left( 1 + \frac{G_2}{G_1} \right) - \frac{G_2}{G_1} B \right] \qquad (5)$$
$$D' = G_1 \left[ B \left( 1 + \frac{G_2}{G_1} \right) - \frac{G_2}{G_1} A \right] \qquad (6)$$
These relationships differ
from (3) and (4) which are applicable to the apparatus
10 of Fig. 1. However, to a substantial extent, the
same acoustic enhancement is achieved with the
embodiment of Fig. 4, though perhaps with somewhat
reduced omnidirectional effect.
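The difference between the Fig. 1 and Fig. 4 arrangements can be made concrete with a small comparison. The Python sketch below assumes that the "amplitude modified versions" of A and B in Fig. 4 are scaled by G1, as relationships (5) and (6) imply; the function names and the sample values are hypothetical.

```python
def fig1_outputs(a, b, g1, g2):
    """Composite signals C, D per relationships (1) and (2)."""
    c = g1 * (a + b) + g2 * (a - b)
    d = g1 * (a + b) + g2 * (b - a)
    return c, d


def fig4_outputs(a, b, g1, g2):
    """Composite signals C', D' per relationships (5) and (6), expanded."""
    c = g1 * a + g2 * (a - b)
    d = g1 * b + g2 * (b - a)
    return c, d


# Hypothetical sample values for one instant of the two input signals.
a, b = 1.0, 0.25
g1, g2 = 0.55, 1.5

c, d = fig1_outputs(a, b, g1, g2)
c4, d4 = fig4_outputs(a, b, g1, g2)
print("Fig. 1:", c, d)
print("Fig. 4:", c4, d4)
print("difference:", c - c4, d - d4)
```

The two arrangements differ only by an in-phase term: C exceeds C' by G1 times B, and D exceeds D' by G1 times A. The out-of-phase cross-feed contributed by G2 is the same in both.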
Fig. 5 illustrates passive networks 120,
122 operating on sum and difference audio signals A +
B and A - B as on lines 20, 24 in Fig. 1. Networks
120, 122 are the equivalents of modifiers 26, 30
respectively and their outputs 124, 126 are applied,
as in Fig. 1, to sum and difference circuits 34, 36.
Having thus described several embodiments
of the invention for enhancing the acoustic imagery
of audio signals, the remarkable effects of the invention
can be appreciated. For example, the invention, by
virtue of its ability to create a spatial spread-out
perception of the projected sound, acts as a much
improved time to spatial angle converter. This
advantageous characteristic can be explained with reference
to Fig. 6 in which a pair of spaced apart loudspeakers
44, 46 are shown in a horizontal plane spaced from a
centrally stationed listener, L.
In a conventional sound projection system
without the use of the invention, spatial perception
of the sound from the speakers 44, 46 is primarily a
function of relative amplitudes of the sound. When
the ratio A/B of the input audio signals is one, the
sound appears centered and as the ratio is increased,
the sound moves towards speaker 44. Generally, at a
ratio in excess of 10 (20 dB) the sound is at a
maximum spatial angle relative to listener L.
This maximum angle is 45 degrees when the speakers
44, 46, as shown in Fig. 6, are oriented at an angle
of 90 degrees relative to each other. The angle is
generally independent of the duration of the input
signals A and B.
With the inclusion of an acoustic image
enhancing device in accordance with the invention,
the perceived spatial angles rapidly exceed 45 degrees
at relatively low input audio signal ratio levels.
This perception increases significantly as the duration
of the sound increases. For example, when the input
signal has a duration generally longer than about
several milliseconds, the perceived spatial angle may
approach 90 degrees and in some cases even greater at
relatively low input signal ratios. As a result,
sustained or reverberated sounds, which in an original
recording were imaged at, for example, 45 degrees,
may now be perceived to be imaged at 90 degrees or
more. This time to angle conversion is pleasing to
the listener who is able to perceive subtleties of
many reflections and sustained waveforms and thus
receives an impression of enhanced clarity in the
sound and is able to more clearly distinguish differ-
ent instruments.
The acoustic image enhancing effect may be
varied by a listener to, for example, adjust for
different desired signal separation effects by making
resistor 88 in Fig. 1 a potentiometer as suggested by
the dashed arrow 88a. As previously explained, a
change in resistor 88 alters the magnitude of G2 and
thus, in effect, enables a listener to move along the
G2/G1 ratio curve 60 of Fig. 3. The value of
potentiometer 88a may be selected so that, for example, the
desirability of the effect varies from a low G2/G1
value (about 1.4) corresponding to a desirability of
25% to a higher value (about 3.2) corresponding to a
desirability of 75%.
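A rough idea of the potentiometer range needed to span these ratios can be obtained from the component values given for Fig. 1. The Python sketch below assumes, as a simplification not stated in the text, that the magnitude of G2 equals the impedance of feedback network 80 divided by resistor 88, with the other components left at 100K ohms, 250K ohms and 2200 picofarads and G1 at 0.55. The function names are hypothetical.

```python
import math

R82 = 100e3      # feedback resistor 82, ohms
R84 = 250e3      # resistor 84, ohms
C86 = 2200e-12   # capacitor 86, farads
G1 = 0.55        # transfer function value of modifier 26


def feedback_impedance_magnitude(freq_hz):
    """Magnitude of R82 in parallel with (R84 in series with C86)."""
    s = 2j * math.pi * freq_hz
    z_series = R84 + 1 / (s * C86)
    return abs(R82 * z_series / (R82 + z_series))


def r88_for_ratio(ratio, freq_hz=1000.0):
    """Resistance 88 giving |G2|/G1 == ratio at freq_hz under the assumed model."""
    return feedback_impedance_magnitude(freq_hz) / (ratio * G1)


for ratio in (1.4, 2.7, 3.2):
    print(f"G2/G1 = {ratio}: R88 is about {r88_for_ratio(ratio) / 1e3:.1f} K ohms")
```

In this model a larger resistance 88 lowers the G2/G1 ratio, so the low end of the ratio range corresponds to the high end of the potentiometer travel.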
The described invention may be used with a
single monaural audio signal to provide a significant
improvement in the projection of the monaural signal.
Such arrangement involves merely connecting the
monaural audio signals to one of the inputs such as
14 in Fig. 1, leaving the other input 16 floating or
unconnected.
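The effect of a single monaural input can be illustrated with relationships (1) and (2). The Python sketch below assumes that the unconnected input contributes a zero signal (B = 0), and uses G1 = 0.55 with a real G2 of 1.5 as an approximate mid-frequency value; both the zero-input assumption and the G2 value are illustrative and not stated in the text.

```python
def mono_outputs(a, g1=0.55, g2=1.5):
    """Relationships (1) and (2) with B = 0 assumed for the unconnected input."""
    b = 0.0
    c = g1 * (a + b) + g2 * (a - b)   # reduces to (G1 + G2) * A
    d = g1 * (a + b) + g2 * (b - a)   # reduces to (G1 - G2) * A
    return c, d


c, d = mono_outputs(1.0)
print(c, d)
```

Under these assumptions the two outputs carry the monaural signal at different levels and, because G2 exceeds G1, in opposite polarity.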
Having thus described several embodiments
for practicing the invention, its advantages can be
appreciated. The more specific ranges and component
values described herein have been presented to
provide a precise teaching of the invention and
practical ranges in which the remarkable effect can
be discerned. The time delay may be provided by a
circuit which does not vary with frequency such as by
a digital time delay circuit. Since the perception
of the acoustic imagery may differ for different
persons, the indicated ranges may vary and the scope
of the invention should be determined by the
following claims.