Patent 2406926 Summary

(12) Patent Application: (11) CA 2406926
(54) English Title: MULTI-CHANNEL SURROUND SOUND MASTERING AND REPRODUCTION TECHNIQUES THAT PRESERVE SPATIAL HARMONICS IN THREE DIMENSIONS
(54) French Title: PRISE DE SON AMBIANT MULTI-CANAL ET TECHNIQUES DE REPRODUCTION QUI PRESERVENT LES HARMONIQUES SPATIALES EN TROIS DIMENSIONS
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04S 3/02 (2006.01)
(72) Inventors :
  • MOORER, JAMES A. (United States of America)
(73) Owners :
  • SNK TECH INVESTMENT L.L.C. (United States of America)
(71) Applicants :
  • SONIC SOLUTIONS (United States of America)
(74) Agent: BORDEN LADNER GERVAIS LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2000-10-06
(87) Open to Public Inspection: 2001-11-01
Examination requested: 2005-10-05
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2000/027851
(87) International Publication Number: WO2001/082651
(85) National Entry: 2002-10-17

(30) Application Priority Data:
Application No. Country/Territory Date
09/552,378 United States of America 2000-04-19

Abstracts

English Abstract




Techniques of making a recording of or transmitting a sound field from either
multiple monaural or directional sound signals that reproduce through multiple
discrete loud speakers a sound field with spatial harmonics that substantially
exactly match those of the original sound field. Monaural sound sources are
positioned during mastering to use contributions of all speaker channels in
order to preserve the spatial harmonics. If a particular arrangement of
speakers is different than what is assumed during mastering, the speaker
signals are rematrixed at the home, theater or other sound reproduction
location so that the spatial harmonics of the sound field reproduced by the
different speaker arrangement match those of the original sound field. An
alternative includes recording or transmitting directional microphone signals,
or their spatial harmonic components, and then matrixing these signals at the
sound reproduction location in a manner that takes into account the specific
speaker arrangement. The techniques are described for both a two dimensional
sound field and the more general three dimensional case, the latter based upon
using spherical harmonics.


French Abstract

L'invention concerne des techniques permettant d'enregistrer ou de transmettre un champ de signaux sonores monauraux ou directionnels multiples qui reproduisent par des haut-parleurs discrets multiples un champ sonore ayant des harmoniques spatiaux qui correspondent sensiblement exactement à celles du champ sonore d'origine. Des sources de sons monauraux sont placées pendant la prise de son pour utiliser des contributions de toutes les voies de locuteur afin de préserver les harmoniques spatiales. Si un ensemble particulier de locuteurs est différent de ce que l'on considère pendant la prise de son, les signaux locuteur sont rematricés à domicile, au théâtre ou autre lieu de reproduction sonore de telle façon que les harmoniques spatiales du champ sonore reproduit par les différents ensembles locuteurs correspondent à celles du champ sonore d'origine. Une alternative comporte l'enregistrement ou la transmission de signaux de microphone directionnel, ou de leurs composantes harmoniques spatiales, et ensuite le matriçage de ces signaux au niveau du lieu de reproduction sonore de telle manière que l'on tienne compte de l'ensemble locuteur spécifique. Les techniques sont décrites à la fois pour un champ sonore bidimensional et pour le cas plus général tridimensionnel, ce dernier étant basé sur l'utilisation d'harmoniques sphériques.

Claims

Note: Claims are shown in the official language in which they were submitted.





IT IS CLAIMED:
1. A method of processing a sound field for reproduction of the
sound field over a given frequency range through a surround sound system
having
a plurality of at least four channels individually feeding one of at least
four speakers,
comprising:
acquiring multiple signals of the sound field, and
directing the acquired sound field signals into individual ones of
plurality of the channels with a set of relative gains for the entire
frequency range
that is determined by solving a relationship that (1) includes selected
positions of the
speakers around a listening area not constrained to a regular geometric,
coplanar
pattern, and (2) substantially preserves individual ones of a plurality of
three
dimensional spatial harmonics of the sound field,
whereby a sound field reproduced from the speakers arranged in said
selected positions substantially reproduces the plurality of three dimensional
spatial
harmonics of the acquired sound field.
2. The method according to claim 1, wherein the number of
three dimensional spatial harmonics which are substantially preserved includes
only
zero and first order harmonics.
3. The method according to claim 1, wherein the number of
three dimensional spatial harmonics which are substantially preserved includes
zero
to nth harmonics, where n is an integer equal to or less than one less than
the square
root of the number of speakers.
4. The method according to claim 1, wherein acquiring multiple
signals of the sound field includes acquiring multiple monaural signals of
sounds
desired to be located at specific positions around the listening area, and
said
relationship includes such specific positions, whereby the sound field
reproduced
from the speakers additionally includes the monaural sounds at said specific
positions.
5. The method according to claim 1, wherein acquiring multiple
signals of the sound field includes positioning multiple directional
microphones in the
sound field.
6. The method according to claim 1, wherein the set of relative
gains is determined at least in part by the relationship that includes assumed
positions
of the speakers around some listening area.
7. The method according to claim 1, wherein the set of relative
gains is determined at least in part at a location adjacent the listening area
by the
relationship that includes actual positions of the speakers around the
listening area.
8. The method according to claim 1, wherein the set of relative
gains is additionally determined by that which causes a velocity and power
vectors
to be substantially aligned.
9. The method according to claim 1, wherein the set of relative
gains is additionally determined by that which causes second or higher of said
plurality of three dimensional spatial harmonics to be minimized.
10. The method according to any one of claims 1-9, wherein the
surround sound system has exactly six channels individually feeding a
different one
of exactly six speakers.
11. The method according to claim 10, wherein at least one of
said exactly six speakers is positioned to be non-coplanar with the other ones
of said
exactly six speakers.
12. A method of simulating a desired apparent three dimensional
position of a sound in a multi-channel surround sound system, comprising:
monaurally acquiring the sound for which a three dimensional
position is desired to be simulated, and
directing the acquired monaural sound into individual ones of the
multiple channels with a set of relative gains that is determined by solving a
relationship of a declination and an azimuth of the desired apparent position
of the
sound with respect to a point and a set of angular positions extending around
said
point that correspond to expected positions of speakers driven by individual
ones of
the multiple channel signals, said relationship being solved in a manner that
substantially preserves at least zero and first order three dimensional
harmonics of
the sound when reproduced through speakers at the expected positions as if the
monaural sound was actually present at said apparent position.
13. The method of claim 12, wherein speakers are actually
positioned with at least one of said speakers having an actual position
different from
that of the expected positions, and additionally comprising calculating a
modified set
of relative gains for driving the speakers by solving a second relationship
including
the actual positions of the speakers and in a manner that preserves individual
values
of at least zero and first order three dimensional harmonics of the sound when
reproduced through speakers at the actual positions as if the monaural sound
was
actually present at said apparent position.
14. The method according to either of claims 12 or 13, wherein
the set of relative gains is additionally determined by that which causes
velocity and
power vectors of a sound field reproduced through the speakers to be
substantially
aligned.
15. The method according to either of claims 12 or 13, wherein
the set of relative gains is additionally determined by that which causes
second and
higher three dimensional spatial harmonics of a sound field reproduced through
the
speakers to be minimized.
16. The method according to either of claims 12 or 13, wherein
the number of channels is four or more.
17. The method according to either of claims 12 or 13, wherein
the number of channels is exactly six.
18. The method according to claim 16, wherein at least one of the
expected positions of speakers is non-coplanar with the others ones of the
expected
positions of speakers.
19. A method of reproducing a three dimensional sound field
through four or more speakers positioned around a listening area, comprising:
acquiring a plurality of electrical signals representative of the sound
field,
processing said plurality of electrical signals in a manner to generate
signals of at least zero and first order three dimensional spatial harmonics
of said
sound field, and
processing the three dimensional spatial harmonic signals in a manner
to determine relative gains of signals fed to individual ones of the speakers
by
solving a relationship that includes terms of actual positions of the speakers
and,
when solved, substantially preserves at least the zero and first order three
dimensional harmonics of the sound field reproduced through the speakers as
respectively matching the zero and first order three dimensional harmonics of
the
acquired sound field.
20. The method according to claim 19, which additionally
comprises recording and playing back the plurality of electrical signals
representative
of the sound field.
21. The method according to claim 19, which additionally
comprises recording and playing back the signals of the sound field harmonics.
22. The method according to any one of claims 19-21, wherein
the sound field is reproduced through exactly six speakers.
23. The method according to claim 20, wherein at least one of
said exactly six speakers is positioned to be non-coplanar with the other ones
of said
exactly six speakers.
24. A sound reproduction system having an input to receive at
least four audio signals of an original sound field that are intended to be
reproduced
by respective ones of at least four speakers at certain assumed positions
surrounding
a listening area and outputs to drive at least four speakers at certain actual
positions
surrounding the listening area that are different from the assumed positions,
comprising:
an input that accepts information, including declination and azimuth,
of the speaker certain actual positions, and
an electronically implemented matrix responsive to inputted actual
speaker position information, including declination and azimuth, and to the
assumed
speaker positions to provide from the input signals other signals to the
outputs which
drive the speakers to reproduce the sound field with a number of three
dimensional
spatial harmonics that individually match substantially individual ones of the
same
number of three dimensional spatial harmonics in the original sound field.
25. The sound system according to claim 24, wherein the matrix
further includes:
a first part that develops, from the assumed speaker position
information and the input signals, individual signals corresponding to the
number of
three dimensional spatial harmonics, and
a second part that develops, from the three dimensional spatial
harmonic signals and the actual speaker position information, individual
signals for
the actual speakers.
26. The sound system according to either of claims 24 or 25,
wherein the number of matched three dimensional spatial harmonics includes
zero
and first order harmonics.
27. The sound system according to either of claims 24 or 25,
wherein the number of matched three dimensional spatial harmonics includes
only
zero and first order harmonics.
28. The sound system according to either of claims 24 or 25,
wherein the number of speakers at the actual speaker locations includes
exactly six.
29. The sound system according to claim 25, wherein at least one
of said actual speaker locations is positioned to be non-coplanar with the
other ones
of said actual speaker locations.
30. A sound system having an input to receive audio signals of
an original three dimensional sound field and outputs to drive at least four
loud
speakers at certain actual positions surrounding a listening area to reproduce
the
sound field, comprising:
an input that accepts information of the speaker actual positions, and
an electronically implemented matrix responsive to inputted
information of the actual speaker positions and input signals to provide
signals to the
outputs which drive the speakers to reproduce the sound field with a number of
three dimensional spatial harmonics that individually match substantially
corresponding ones of the same number of three dimensional spatial harmonics
in the
original sound field.

Description

Note: Descriptions are shown in the official language in which they were submitted.



CA 02406926 2002-10-17
WO 01/82651 PCT/US00/27851
MULTI-CHANNEL SURROUND SOUND MASTERING AND
REPRODUCTION TECHNIQUES THAT PRESERVE SPATIAL
HARMONICS IN THREE DIMENSIONS
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation in part of application Serial No.
08/936,636, filed September 24, 1997, which is hereby incorporated herein by
this
reference.
BACKGROUND OF THE INVENTION
This invention relates generally to the art of electronic sound
transmission, recording and reproduction, and, more specifically, to
improvements in surround sound techniques.
Improvements in the quality and realism of sound reproduction have
steadily been made during the past several decades. Stereo (two channel)
recording
and playback through spatially separated loud speakers significantly improved
the
realism of the reproduced sound, when compared to earlier monaural (one
channel)
sound reproduction. More recently, the audio signals have been encoded in the
two
channels in a manner to drive four or more loud speakers positioned to
surround the
listener. This surround sound has further added to the realism of the
reproduced
sound. Multi-channel (three or more channel) recording is used for the sound
tracks
of most movies, which provides some spectacular audio effects in theaters that
are
suitably equipped with a sound system that includes loud speakers positioned
around
its walls to surround the audience. Standards are currently emerging for
multiple
channel audio recording on small optical CDs (Compact Disks) that are expected
to become very popular for home use. A recent DVD (Digital Video Disk)
standard
provides for multiple channels of PCM (Pulse Code Modulation) audio on a CD
that
may or may not contain video.
Theoretically, the most accurate reproduction of an audio wavefront
would be obtained by recording and playing back an acoustic hologram. However,
tens of thousands, and even many millions, of separate channels would have to
be
recorded. A two dimensional array of speakers would have to be placed around
the
home or theater with a spacing no greater than one-half the wavelength of the
highest frequency desired to be reproduced, somewhat less than one centimeter
apart, in order to accurately reconstruct the original acoustic wavefront. A
separate
channel would have to be recorded for each of this very large number of
speakers,
involving use of a similar large number of microphones during the recording
process.
Such an accurate reconstruction of an audio wavefront is thus not at all
practical for
audio reproduction systems used in homes, theaters and the like.
When the desired reproduction is three dimensional and the speakers are
no longer coplanar, these complications correspondingly multiply and this sort
of
reproduction becomes even more impractical. The extension to three dimensions
allows for special effects, such as for movies or in mastering musical
recordings, as
well as for when an original sound source is not restricted to a plane. Even
in the
case of, say, a recording of musicians on a planar stage, the resultant
ambient sound
environment will have a three dimensional character due to reflections and
variations
in instrument placement which can be captured and reproduced. Although more
difficult to quantify than the localization of a sound source, the inclusion
of the third
dimension adds to this feeling of "spaciousness" and depth for the sound field
even
when the actual sources are localized in a coplanar arrangement.
Therefore, it is a primary and general object of the present invention
to provide techniques of reproducing sound with improved realism by multi-channel
recording, such as that provided in the emerging new audio standards, with
about
the same number of loud speakers as currently used in surround sound systems.
It is another object of the present invention to provide a method
and/or system for playing back recorded or transmitted multi-channel sound in
a
home, theater, or other listening location, that allows the user to set an
electronic
matrix at the listening location for the specific arrangement of loud speakers
being
used there.
It is a further objective of the present invention to extend these
techniques and methods to the capture and reproduction of a three dimensional
sound field where the loud speakers are placed in a non-coplanar arrangement.
SUMMARY OF THE INVENTION
These and additional objects are realized by the present invention,
wherein, briefly and generally, an audio field is acquired and reproduced by
multiple
signals through four or more loud speakers positioned to surround a listening
area,
the signals being processed in a manner that reproduces substantially exactly
a
specified number of spatial harmonics of the acquired audio field with
practically any
specific arrangement of the speakers around the listening area. This adds to
the
realism of the sound reproduction without any particular constraint being
imposed
upon the positions of the loud speakers.
Rather than requiring that the speakers be arranged in some particular
pattern before the system can reproduce the specified number of spatial
harmonics,
whatever speaker locations that exist are used as parameters in the electronic
encoding and/or decoding of the multiple channel sound signals to bring about
this
favorable result in a particular reproduction layout. If one or more of the
speakers
is moved, these parameters are changed to preserve the spatial harmonics in
the
reproduced sound. The use of five channels and five speakers is described below to
illustrate the various aspects of the present invention.
According to one specific aspect of the present invention, individual
monaural sounds are mixed together by use of a matrix that, when making a
recording or forming a sound transmission, angularly positions them, when
reproduced through an assumed speaker arrangement around the listener, with
improved realism. Rather than merely sending a given monaural sound to two
channels that drive speakers on each side of the location of the sound, as is
currently
done with standard panning techniques, all of the channels are potentially
involved
in order to reproduce the sound with the desired spatial harmonics. An example
application is in the mastering of a recording of several musicians playing
together.
The sound of each instrument is first recorded separately and then mixed in a
manner
to position the sound around the listening area upon reproduction. By using
all the
channels to maintain spatial harmonics, the reproduced sound field is closer
to that
which exists in the room where the musicians are playing.
According to another specific aspect of the present invention, the
multi-channel sound may be rematrixed at the home, theater or other location
where
being reproduced, in order to accommodate a different arrangement of speakers
than
was assumed when originally mastered. The desired spatial harmonics are
accurately
reproduced with the different actual arrangement of speakers. This allows
freedom
of speaker placement, particularly important in the home which often imposes
constraints on speaker placement, without losing the improved realism of the
sound.
According to a further specific aspect of the present invention, a
sound field is initially acquired with directional information by a use of
multiple
directional microphones. Either the microphone outputs, or spatial harmonic
signals
resulting from an initial partial matrixing of the microphone outputs, are
recorded
or transmitted to the listening location by separate channels. The transmitted
signals
are then matrixed in the home or other listening location in a manner that
takes into
account the actual speaker locations, in order to reproduce the recorded sound
field
with some number of spatial harmonics that are matched to those of the
recording
location.
These various aspects may use spatial harmonics in either two or
three dimensions. In the two dimensional case, the audio wave front is
reproduced
by an arrangement of loud speakers that is largely coplanar, whether the
initial
recordings were based on two dimensional spatial harmonics or through
projecting
three dimensional harmonics on to the plane of the speakers. In a three
dimensional
reproduction, one or more of the speakers is placed at a different elevation
than this
two dimensional plane. Similarly, the three dimensional sound field is
acquired by
a non-coplanar arrangement of the multiple directional microphones.
Additional objects, features and advantages of the various aspects of
the present invention will become apparent from the following description of
its
preferred embodiments, which embodiments should be taken in conjunction with
the
accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a plan view of the placement of multiple loud speakers
surrounding a listening area;
Figures 2A-D illustrate acoustic spatial frequencies of the sound
reproduction arrangement of Figure 1;
Figure 3 is a block diagram of a matrixing system for placing the
locations of monaural sounds;
Figure 4 is a block diagram for re-matrixing the signals matrixed in
Figure 3 in order to take into account a different position of the speakers
than
assumed when initially matrixing the signals;
Figures 5 and 6 are block diagrams that show alternate arrangements
for acquiring and reproducing sounds from multiple directional microphones;
Figure 7 provides more detail of the microphone matrix block in
Figures 5 and 6; and
Figure 8 shows an arrangement of three microphones as the source
of the audio signals to the systems of Figures 5 and 6.
Figure 9 illustrates the arrangement of the spherical coordinates.
Figure 10 shows an angular alignment for a three dimensional array
of four microphones.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
The discussion starts with the method of spatial harmonics in a two
dimensional plane. Some of the results of this methodology are: (1) a way of
recording surround sound that can be used to feed any number of speakers; (2)
a
way of panning monaural sounds so as to produce exactly a given set of spatial
harmonics; and (3) a way of storing or transmitting surround sound in three
channels
such that two of the channels are a standard stereo mix, and by use of the
third
channel, the surround feed may be recreated that preserves the original
spatial
harmonics.
Following the two dimensional discussion, this same theory is
extended to three dimensions. In two dimensions, the spatial harmonics are
based
on the Fourier sine and cosine series of a single variable, the angle φ.
Unfortunately,
the mathematics for the 3D version is not as clean and compact as for 2D.
There is
not any particularly good way to reduce the complexity and for this reason the
2D
version is presented first.
To extend the method of spatial harmonics to 3 dimensions, a brief
discussion of the Legendre functions and the spherical harmonics is then
given. In
some sense, this is a generalization of the Fourier sine and cosine series.
The Fourier
series is a function of one angle, φ. The series is periodic. It can be
thought of as a representation of functions on a circle. Spherical harmonics
are defined on the surface of a sphere and are functions of two angles, θ and
φ. φ is the azimuth, defined where zero degrees is straight ahead, 90° is to
the left, and 180° is directly behind. θ is the declination (up and down),
with zero degrees directly overhead, 90° as the horizontal plane, and 180°
being straight down. These are shown in Figure 9 for a point (θ, φ). Note that
the range of θ is zero to 180°, whereas the range of φ is zero to 360° (or,
alternately, -180° to 180°).
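The angle convention just described can be illustrated with a short sketch that converts a (declination, azimuth) pair into a unit direction vector. The axis assignment (x forward, y to the listener's left, z up) and the function name `direction_vector` are assumptions made for this illustration; the text fixes only the meaning of the two angles.

```python
import math

def direction_vector(declination_deg, azimuth_deg):
    """Return a unit vector (x, y, z) for a direction given in degrees.

    Assumed axes (not stated in the source): x points straight ahead,
    y points to the listener's left, z points straight up.
    Declination is measured down from overhead (0 = up, 90 = horizontal).
    Azimuth is measured from straight ahead toward the left.
    """
    theta = math.radians(declination_deg)
    phi = math.radians(azimuth_deg)
    x = math.sin(theta) * math.cos(phi)
    y = math.sin(theta) * math.sin(phi)
    z = math.cos(theta)
    return (x, y, z)

if __name__ == "__main__":
    # Straight ahead in the horizontal plane.
    print(direction_vector(90.0, 0.0))
    # Directly to the left in the horizontal plane.
    print(direction_vector(90.0, 90.0))
    # Directly overhead (azimuth is irrelevant there).
    print(direction_vector(0.0, 0.0))
```

Under these assumed axes, a coplanar speaker layout corresponds to every speaker having a declination of 90°, so that the z component of each direction is zero.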
Spatial Harmonics in Two Dimensions
A person 11 is shown in Figure 1 to be at the middle of a listening
area surrounded by loudspeakers SP1, SP2, SP3, SP4 and SP5 that are pointed to
direct their sounds toward the center. A system of angular coordinates is
established
for the purpose of the descriptions in this application. The forward direction
of the
listener 11, facing a front speaker SP1, is taken to be positioned at
(θ₁, φ₁) = (90°, 0°) as a reference. The angular positions of the remaining
speakers SP2 (front left), SP3 (rear left), SP4 (rear right) and SP5 (front
right) are respectively (θ₂, φ₂), (θ₃, φ₃), (θ₄, φ₄), and (θ₅, φ₅) from that
reference. Here the speakers are positioned in a typical arrangement defining
a surface that is substantially a plane, an example being the horizontal
planar surface of θ = 90° that is parallel to the floor of a room in which the
speakers are positioned. In this situation, each of θ₁-θ₅ is then 90° and
these θs will not be explicitly expressed for the time being and are omitted
from Figure 1. The elevation of one or more of the speakers above one or more
of the other speakers is not required but may be done in order to accommodate
a restricted space. The case of one or more of the θᵢ ≠ 90° is discussed below.
A monaural sound 13, such as one from a single musical instrument,
is desired to be positioned at an angle φ₀ from that zero reference, at a
position
where there is no speaker. There will usually be other monaural sounds that
are
desired to be simultaneously positioned at other angles but only the source 13
is
shown here for simplicity of explanation. For a multi-instrument musical
source, for
example, the sounds of the individual instruments will be positioned at
different
angles φ₀ around the listening area during the mastering process. The sound of
each
instrument is typically acquired by one or more microphones recorded
monaurally
on at least one separate channel. These monaural recordings serve as the
sources
of the sounds during the mastering process. Alternatively, the mastering may
be
performed in real time from the separate instrument microphones.
Before describing the mastering process, Figures 2A-D are referenced
to illustrate the concept of spatial frequencies. Figure 2A shows the space
surrounding the listening area of Figure 1 in terms of angular position. The
five
locations of each of the speakers SP1, SP2, SP3, SP4 and SP5 are shown, as is
the
desired location of the sound source 13. The sound 13 may be viewed as a
spatial
impulse which in turn may be expressed as a Fourier expansion, as follows:
$$f(\phi) = a_0 + \sum_{m=1}^{M} \left( a_m \cos m\phi + b_m \sin m\phi \right) \qquad (1)$$
where m is an integer number of the individual spatial harmonics, from 0 to
the number M of harmonics being reconstructed, aₘ is the coefficient of one
component of each harmonic and bₘ is a coefficient of an orthogonal component
of each harmonic. The value a₀ thus represents the value of the spatial
function's zero order.
The spatial zero order is shown in Figure 2B, having an equal
magnitude around the entire space that rises and falls with the magnitude of the
spatial
impulse sound source 13. Figure 2C shows a first order spatial function, being
a
maximum at the angle of the impulse 13 while having one complete cycle around
the
space. A second order spatial function, as illustrated in Figure 2D, has two
complete
cycles around the space. Mathematically, the spatial impulse 13 is accurately
represented by a large number of orders but the fact of only a few speakers
being
used places a limit upon the number of spatial harmonics that may be included
in the
reproduced sound field. If the number of speakers is equal to or greater than
(1 +
2n), where n here is the number of harmonics desired to be reproduced, then
spatial
harmonics zero through n of the reproduced sound field may be reproduced
substantially exactly as exist in the original sound field. Conversely, the
spatial
harmonics which can be reproduced exactly are harmonics zero through n, where
n
is the highest whole integer that is equal to or less than one-half of one
less than the
number of speakers positioned around a listening area. Alternately, fewer than
this
maximum number of possible spatial harmonics may be chosen to be reproduced as
in a particular system.
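The counting rule above can be written as a one-line calculation. The sketch below gives the highest two dimensional harmonic order for a given number of speakers, and, for comparison, the three dimensional bound recited in claim 3 (n no greater than one less than the square root of the number of speakers). The function names are hypothetical.

```python
import math

def max_order_2d(num_speakers):
    """Highest harmonic order n with 1 + 2n <= num_speakers."""
    return (num_speakers - 1) // 2

def max_order_3d(num_speakers):
    """Highest integer n with n <= sqrt(num_speakers) - 1 (claim 3)."""
    return math.isqrt(num_speakers) - 1

if __name__ == "__main__":
    for count in (4, 5, 6, 9):
        print(count, max_order_2d(count), max_order_3d(count))
```

For the five-speaker arrangement of Figure 1, the two dimensional rule allows harmonics zero through two.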
One specific aspect of the present invention is illustrated by Figure
3, which schematically shows certain functions of a sound console used to
master
multiple channel recordings. In this example, five signals S1, S2, S3, S4,
and S5 are
being recorded in five separate channels of a suitable recording medium such
as tape,
likely in digital form. Each of these signals is to drive an individual loud
speaker.
Two monaural sources 17 and 19 of sound are illustrated to be mixed into the
recorded signals S1-S5. The sources 17 and 19 can be, for example, either live
or
recorded signals of different musical instruments that are being blended
together.
One or both of the sources 17 and 19 can also be synthetically generated or
naturally
recorded sound effects, voices and the like. In practice, there are usually
far more
than two such signals used to make a recording. The individual signals may be
added to the recording tracks one at a time or mixed together for simultaneous
recording.
What is illustrated by Figure 3 is a technique of "positioning" the
monaural sounds. That is, the apparent location of each of the sources 17 and
19
of sound when the recording is played back through a surround sound system, is
set
during the mastering process, as described above with respect to Figure 1.
Currently, usual panning techniques of mastering consoles direct a monaural
sound into only two of the recorded signals S1-S5 that feed the speakers on
either side of the location desired for the sound, with relative amplitudes
that determine the apparent position to the listener of the source of the
sound. But this lacks certain
realism. Therefore, as shown in Figure 3, each source of sound is fed into
each of
the five channels with relative gains being set to construct a set of signals
that have
a certain number of spatial harmonics, at least the zero and first harmonics,
of a
sound field emanating from that location. One or more of the channels may
still
receive no portion of a particular signal but now because it is a result of
preserving
a given number of spatial harmonics, not because the signal is being
artificially
limited to only two of the channels.
The relative contributions of the source 17 signal to the five separate
channels S1-S5 are indicated by respective variable gain amplifiers 21, 22, 23,
24 and 25. Respective gains g1, g2, g3, g4 and g5 of these amplifiers are set
by control signals in circuits 27 from a control processor 29. Similarly, the
sound signal of the source 19 is directed into each of the channels S1-S5
through respective amplifiers 31, 32, 33, 34 and 35. Respective gains g1', g2',
g3', g4' and g5' of the amplifiers 31-35 are also set by the control processor
29 through circuits 37. These sets of gains are calculated by the control
processor 29 from inputs from a sound engineer through a control panel 45.
These inputs include angles φ0 (Figure 1) of the desired placement of the
sounds from the sources 17 and 19 and an assumed set of speaker placement
angles φ1-φ5. Calculated parameters may optionally also be provided through
circuits 47 to be recorded. Respective individual outputs of the amplifiers
21-25 are combined with those of the amplifiers 31-35 by respective summing
nodes 39, 40, 41, 42 and 43 to provide the five channel signals S1-S5. These
signals S1-S5 are eventually reproduced through respective ones of the
speakers SP1-SP5.
The control processor 29 includes a DSP (Digital Signal Processor)
operating to solve simultaneous equations from the inputted information to
calculate
a set of relative gains for each of the monaural sound sources. A principal
set of
linear equations that are solved for the placement of each separately located
sound
source may be represented as follows:
$$1 + 2\sum_{m \ge 1} \cos m(\phi_0 - \phi_i) = \sum_{j=1}^{N} g_j \left[ 1 + 2\sum_{m \ge 1} \cos m(\phi_j - \phi_i) \right] \qquad (2)$$
where φ0 represents the angle of the desired apparent position of the sound,
φi and φj represent the angular positions that correspond to placement of the
loudspeakers for the individual channels with each of i and j having values of
integers from 1 to the number of channels, m represents spatial harmonics that
extend from 0 (the leading term 1) to the number of harmonics being matched
upon reproduction with those of the original sound field, the sums running
over m of 1 and above, N is the total number of channels, and gj represents the
relative gains of the individual channels with j extending from 1 to the
number of channels. It is this set of relative gains for which the equations
are solved. Use of the i and j subscripts follows the usual mathematical
notation for a matrix, where i is a row number and j a column number of the
terms of the matrix.
In a specific example of the number of channels N, and also the
number of speakers, being equal to 5, and only the zero and first spatial
harmonics
are being reproduced exactly, the above linear equations may be expressed as
the
following matrix:
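$$
\begin{bmatrix}
1+2\cos(\phi_0-\phi_1)\\
1+2\cos(\phi_0-\phi_2)\\
1+2\cos(\phi_0-\phi_3)\\
1+2\cos(\phi_0-\phi_4)\\
1+2\cos(\phi_0-\phi_5)
\end{bmatrix}
=
\begin{bmatrix}
1+2\cos(\phi_1-\phi_1) & 1+2\cos(\phi_2-\phi_1) & 1+2\cos(\phi_3-\phi_1) & 1+2\cos(\phi_4-\phi_1) & 1+2\cos(\phi_5-\phi_1)\\
1+2\cos(\phi_1-\phi_2) & 1+2\cos(\phi_2-\phi_2) & 1+2\cos(\phi_3-\phi_2) & 1+2\cos(\phi_4-\phi_2) & 1+2\cos(\phi_5-\phi_2)\\
1+2\cos(\phi_1-\phi_3) & 1+2\cos(\phi_2-\phi_3) & 1+2\cos(\phi_3-\phi_3) & 1+2\cos(\phi_4-\phi_3) & 1+2\cos(\phi_5-\phi_3)\\
1+2\cos(\phi_1-\phi_4) & 1+2\cos(\phi_2-\phi_4) & 1+2\cos(\phi_3-\phi_4) & 1+2\cos(\phi_4-\phi_4) & 1+2\cos(\phi_5-\phi_4)\\
1+2\cos(\phi_1-\phi_5) & 1+2\cos(\phi_2-\phi_5) & 1+2\cos(\phi_3-\phi_5) & 1+2\cos(\phi_4-\phi_5) & 1+2\cos(\phi_5-\phi_5)
\end{bmatrix}
\begin{bmatrix} g_1\\ g_2\\ g_3\\ g_4\\ g_5 \end{bmatrix}
\qquad (3)
$$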
This general matrix is solved for the desired set of relative gains g1-g5.
This is a rank 3 matrix, meaning that there are a large number of
relative gain values that satisfy it. In order to provide a unique set of
gains, another
constraint is added. One such constraint is that the second spatial harmonic
is zero,
which causes the bottom two lines of the above matrix to be changed, as
follows:
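$$
\begin{bmatrix}
1+2\cos(\phi_0-\phi_1)\\
1+2\cos(\phi_0-\phi_2)\\
1+2\cos(\phi_0-\phi_3)\\
0\\
0
\end{bmatrix}
=
\begin{bmatrix}
1+2\cos(\phi_1-\phi_1) & 1+2\cos(\phi_2-\phi_1) & 1+2\cos(\phi_3-\phi_1) & 1+2\cos(\phi_4-\phi_1) & 1+2\cos(\phi_5-\phi_1)\\
1+2\cos(\phi_1-\phi_2) & 1+2\cos(\phi_2-\phi_2) & 1+2\cos(\phi_3-\phi_2) & 1+2\cos(\phi_4-\phi_2) & 1+2\cos(\phi_5-\phi_2)\\
1+2\cos(\phi_1-\phi_3) & 1+2\cos(\phi_2-\phi_3) & 1+2\cos(\phi_3-\phi_3) & 1+2\cos(\phi_4-\phi_3) & 1+2\cos(\phi_5-\phi_3)\\
\cos(2\phi_1) & \cos(2\phi_2) & \cos(2\phi_3) & \cos(2\phi_4) & \cos(2\phi_5)\\
\sin(2\phi_1) & \sin(2\phi_2) & \sin(2\phi_3) & \sin(2\phi_4) & \sin(2\phi_5)
\end{bmatrix}
\begin{bmatrix} g_1\\ g_2\\ g_3\\ g_4\\ g_5 \end{bmatrix}
\qquad (4)
$$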
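As a non-limiting sketch of how a processor might solve the system of matrix
(4), the following assumes a regular pentagon of speaker angles and a source
at zero degrees. The function names, the speaker layout and the use of
Gaussian elimination are assumptions made for illustration and are not taken
from the specification.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; speaker angles may coincide")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def pan_gains(source_deg, speaker_deg):
    """Gains g1..g5: rows 1-3 match harmonics zero and one, and
    rows 4-5 force the second harmonic to zero, as in matrix (4)."""
    if len(speaker_deg) != 5:
        raise ValueError("this sketch covers the five-speaker example only")
    phi0 = math.radians(source_deg)
    phi = [math.radians(d) for d in speaker_deg]
    rows, rhs = [], []
    for i in range(3):
        rows.append([1 + 2 * math.cos(pj - phi[i]) for pj in phi])
        rhs.append(1 + 2 * math.cos(phi0 - phi[i]))
    rows.append([math.cos(2 * pj) for pj in phi])
    rhs.append(0.0)
    rows.append([math.sin(2 * pj) for pj in phi])
    rhs.append(0.0)
    return solve_linear(rows, rhs)


# Assumed layout: five speakers evenly spaced, source straight ahead.
gains = pan_gains(0.0, [0.0, 72.0, 144.0, 216.0, 288.0])
```

With this assumed layout some gains come out negative, which is consistent
with every channel contributing to the preserved harmonics rather than only
the two speakers nearest the source.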


An alternate constraint which may be imposed on the solution of the
general matrix is to require that a velocity vector (for frequencies below a
transition
frequency within a range of about 750-1500 Hz.) and a power vector (for
frequencies above this transition) be substantially aligned. As is well known,
the
human ear discerns the direction of sound with different mechanisms in the
frequency ranges above and below this transition. Therefore, the apparent
position
of a sound that potentially extends into both frequency ranges is made to
appear to
the ear to be coming from the same place. This is obtained by equating the
expressions for the angular direction of each of these vectors, as follows:
$$\arctan\frac{\sum_i g_i \sin\phi_i}{\sum_i g_i \cos\phi_i} = \arctan\frac{\sum_i g_i^2 \sin\phi_i}{\sum_i g_i^2 \cos\phi_i} \qquad (5)$$
The definition of the velocity vector direction is on the left of the equal
sign and that
of the power vector on the right. For the power vector, taking the square of
the gain
terms is an approximation of a model of the way the human ear responds to the
higher frequency range, so can vary somewhat between individuals.
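The two directions being equated may be evaluated, purely as an illustrative
sketch, as shown below. The function names are hypothetical, and `atan2` is
used in place of the arctangent of a ratio so that the quadrant is kept, which
is an implementation choice not stated in the specification.

```python
import math


def velocity_direction(gains, speaker_deg):
    """Direction in degrees of the velocity vector (gains to the first power)."""
    x = sum(g * math.cos(math.radians(d)) for g, d in zip(gains, speaker_deg))
    y = sum(g * math.sin(math.radians(d)) for g, d in zip(gains, speaker_deg))
    return math.degrees(math.atan2(y, x))


def power_direction(gains, speaker_deg):
    """Direction in degrees of the power vector (gains squared)."""
    x = sum(g * g * math.cos(math.radians(d)) for g, d in zip(gains, speaker_deg))
    y = sum(g * g * math.sin(math.radians(d)) for g, d in zip(gains, speaker_deg))
    return math.degrees(math.atan2(y, x))


# Assumed two-speaker example: unequal gains pull the two vectors apart.
pair = [-30.0, 30.0]
v_dir = velocity_direction([0.8, 0.2], pair)
p_dir = power_direction([0.8, 0.2], pair)
```

A set of gains satisfies the constraint when the two returned directions
agree to within a chosen tolerance.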
Once a set of relative gains is calculated by the control processor 29
for each of the sounds to be positioned around the listener 11, the resulting
signals S1-S5 can be played back from the recording 15 and individually drive
one of the speakers SP1-SP5. If the speakers are located exactly in the
angular positions φ1-φ5 around the listener 11 that were assumed when
calculating the relative gains of
each sound source, or very close to those positions, then the locations of all
the
sound sources will appear to the listener to be exactly where the sound
engineer
intended them to be located. The zero, first and any higher order spatial
harmonics
included in these calculations will be faithfully reproduced.
However, physical constraints of the home, theater or other location
where the recording is to be played back often restrict where the speakers of
its sound system may be placed. If angularly positioned around the listening
area at angles different than those assumed during recording, the
spatialization of the individual sound sources may not be optimal. Therefore,
according to another aspect of the present invention, the signals S1-S5 are
rematrixed by the listener's sound system in a manner illustrated in Figure 4.
The sound channels S1-S5 played back from the recording 15 are, in a specific
implementation, initially converted to spatial harmonic signals a0 (zero
harmonic), a1 and b1 (first harmonic) by a harmonic matrix 51. The first
harmonic signals a1 and b1 are orthogonal to each other.
If more than the zero and first spatial harmonics are to be preserved,
two additional orthogonal signals for each further harmonic are generated by
the
matrix 51. These harmonic signals then serve as inputs to a speaker matrix 53
which converts them into a modified set of signals S1', S2', S3', S4' and S5'
that are used to drive the uniquely positioned speakers in a way to provide
the improved realism of the reproduced sound that was intended when the
recording 15 was initially mastered with different speaker positions assumed.
This is accomplished by relative gains being set in the matrices 51 and 53
through respective gain control circuits 55 and 57 from a control processor
59. The processor 59 calculates these gains from the mastering parameters that
have been recorded and played back with the sound tracks, primarily the
assumed speaker angles φ1, φ2, φ3, φ4 and φ5, and corresponding actual speaker
angles β1, β2, β3, β4 and β5 that are provided to the control processor by the
listener through a control panel 61.
The algorithm of the harmonic matrix 51 is illustrated by use of 15
variable gain amplifiers arranged in five sets of three each. Three of the
amplifiers are connected to receive each of the sound signals S1-S5 being
played back from the recording. Amplifiers 63, 64 and 65 receive the S1
signal, amplifiers 67, 68 and 69 the S2 signal, and so on. An output from one
amplifier of each of these five groups is connected with a summing node 81,
having the a0 output signal, an output from another amplifier of each of these
five groups is connected with a summing node 83, having the a1 output signal,
and an output from the third amplifier of each group is connected to a third
summing node 85, whose output is the b1 signal.
The matrix 51 calculates the intermediate signals a0, a1 and b1 from
only the audio signals S1-S5 being played back from the recording 15 and the
speaker angles φ1, φ2, φ3, φ4 and φ5 assumed during mastering, as follows:
$$
\begin{aligned}
a_0 &= S_1 + S_2 + S_3 + S_4 + S_5\\
a_1 &= S_1\cos\phi_1 + S_2\cos\phi_2 + S_3\cos\phi_3 + S_4\cos\phi_4 + S_5\cos\phi_5\\
b_1 &= S_1\sin\phi_1 + S_2\sin\phi_2 + S_3\sin\phi_3 + S_4\sin\phi_4 + S_5\sin\phi_5
\end{aligned}
\qquad (6)
$$
Thus, in the representation of this algorithm shown as the matrix 51, the
amplifiers
63, 67, 70, 73 and 76 have unity gain, the amplifiers 64, 68, 71, 74 and 77
have
gains less than one that are cosine functions of the assumed speaker angles,
and
amplifiers 65, 69, 72, 75 and 78 have gains less than one that are sine
functions of the assumed speaker angles.
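The sums of equation (6) may be sketched in software as follows, operating on
one sample per channel. The function name and the example layout are
assumptions made for illustration only.

```python
import math


def harmonic_matrix(signals, assumed_deg):
    """Return (a0, a1, b1) from channel samples and assumed speaker angles."""
    if len(signals) != len(assumed_deg):
        raise ValueError("one assumed angle is needed per channel")
    phi = [math.radians(d) for d in assumed_deg]
    a0 = sum(signals)                                         # unity gains
    a1 = sum(s * math.cos(p) for s, p in zip(signals, phi))   # cosine gains
    b1 = sum(s * math.sin(p) for s, p in zip(signals, phi))   # sine gains
    return a0, a1, b1


# Assumed example: only the channel whose speaker sits at 90 degrees is active.
a0, a1, b1 = harmonic_matrix([0.0, 1.0, 0.0, 0.0, 0.0],
                             [0.0, 90.0, 144.0, 216.0, 288.0])
```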
The matrix 53 takes these signals and provides new signals S1', S2',
S3', S4' and S5' to drive the speakers having unique positions surrounding a
listening area. The representation of the processing shown in Figure 4
includes 15 variable gain amplifiers 87-103 grouped with five amplifiers 87-91
receiving the signal a0, five amplifiers 92-97 receiving the signal a1, and
five amplifiers 98-103 receiving the signal b1. The output of a unique one of
the amplifiers of each of these three groups provides an input to a summing
node 105, the output of another of each of these groups provides an input to a
summing node 107, and other amplifiers have their outputs connected to nodes
109, 111 and 113 in a similar manner, as shown.
The relative gains of the amplifiers 87-103 are set to satisfy the
following set of simultaneous equations that depend upon the actual speaker
angles β:
$$\sum_{j=1}^{N}\left[1 + 2\cos(\beta_j - \beta_i)\right] S_j' = a_0 + a_1\cos\beta_i + b_1\sin\beta_i \qquad (7)$$
where N=5 in this example, resulting in i and j having values of 1, 2, 3, 4
and 5. The
result is the ability for the home, theater or other user to "dial in" the
particular
angles taken by the positions of the loud speakers, which can even be changed
from
time to time, to maintain the improved spatial performance that the mastering
technique provides.
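A matrix expression of the above simultaneous equations for the
actual speaker position angles β is as follows, where the condition of the
second spatial harmonics equaling zero is also imposed:
$$
\begin{bmatrix}
1+2\cos(\beta_1-\beta_1) & 1+2\cos(\beta_2-\beta_1) & 1+2\cos(\beta_3-\beta_1) & 1+2\cos(\beta_4-\beta_1) & 1+2\cos(\beta_5-\beta_1)\\
1+2\cos(\beta_1-\beta_2) & 1+2\cos(\beta_2-\beta_2) & 1+2\cos(\beta_3-\beta_2) & 1+2\cos(\beta_4-\beta_2) & 1+2\cos(\beta_5-\beta_2)\\
1+2\cos(\beta_1-\beta_3) & 1+2\cos(\beta_2-\beta_3) & 1+2\cos(\beta_3-\beta_3) & 1+2\cos(\beta_4-\beta_3) & 1+2\cos(\beta_5-\beta_3)\\
\cos(2\beta_1) & \cos(2\beta_2) & \cos(2\beta_3) & \cos(2\beta_4) & \cos(2\beta_5)\\
\sin(2\beta_1) & \sin(2\beta_2) & \sin(2\beta_3) & \sin(2\beta_4) & \sin(2\beta_5)
\end{bmatrix}
\begin{bmatrix} S_1'\\ S_2'\\ S_3'\\ S_4'\\ S_5' \end{bmatrix}
=
\begin{bmatrix}
a_0 + a_1\cos\beta_1 + b_1\sin\beta_1\\
a_0 + a_1\cos\beta_2 + b_1\sin\beta_2\\
a_0 + a_1\cos\beta_3 + b_1\sin\beta_3\\
0\\
0
\end{bmatrix}
\qquad (8)
$$
The values of relative gains of the amplifiers 87-103 are chosen to implement
the resulting coefficients of a0, a1 and b1 that result from solving the above
matrix for the output signals S1'-S5' of the circuit matrix 53 with a given
set of actual speaker position angles β1-β5.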
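A sketch of solving the above system for one set of harmonic samples is given
below. It solves the equations exactly as printed and takes no position on
how a0, a1 and b1 are scaled relative to equation (6). The function names,
the example speaker layout and the elimination routine are assumptions for
illustration only.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; speaker angles may coincide")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def speaker_feeds(a0, a1, b1, actual_deg):
    """Modified signals S1'..S5' for five actual speaker angles."""
    if len(actual_deg) != 5:
        raise ValueError("this sketch covers the five-speaker example only")
    beta = [math.radians(d) for d in actual_deg]
    rows, rhs = [], []
    for i in range(3):  # rows built from equation (7)
        rows.append([1 + 2 * math.cos(bj - beta[i]) for bj in beta])
        rhs.append(a0 + a1 * math.cos(beta[i]) + b1 * math.sin(beta[i]))
    rows.append([math.cos(2 * bj) for bj in beta])  # second harmonic = 0
    rhs.append(0.0)
    rows.append([math.sin(2 * bj) for bj in beta])
    rhs.append(0.0)
    return solve_linear(rows, rhs)


# Assumed irregular layout such as a listener might dial in.
layout = [0.0, 30.0, 110.0, 250.0, 330.0]
feeds = speaker_feeds(1.0, 1.0, 0.0, layout)
```

Because the system is linear in a0, a1 and b1, solving it once for each of
the three unit inputs would yield the fifteen fixed amplifier gains.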
The foregoing description has treated the mastering and reproducing
processes as involving a recording, as indicated by block 15 in each of
Figures 3 and
4. These processes may, however, also be used where there is a real time
transmission of the mastered sound through the block 15 to one or more
reproduction locations.
The description with respect to Figures 3 and 4 has been directed
primarily to mastering a three-dimensional sound field, or at least
contributing to one, from individual monaural sound sources. Referring to
Figure 5, a technique is illustrated for mastering a recording or sound
transmission from signals that represent a sound field in three dimensions.
Three microphones 121, 123 and
125
are of a type and positioned with respect to the sound field to produce audio
signals m1, m2 and m3 that contain information of the sound field that allows
it to be reproduced in a set of surround sound speakers. Positioning such
microphones in a symphony hall, for example, produces signals from which the
acoustic effect may be reconstructed with realistic directionality.
As indicated at 127, these three signals can immediately be recorded
or distributed by transmission in three channels. The m1, m2 and m3 signals
are then played back, processed and reproduced in the home, theater and/or
other location. The reproduction system includes a microphone matrix circuit
129 and a speaker matrix circuit 131 operated by a control processor 133
through respective circuits 135 and 137. This allows the microphone signals to
be controlled and processed at the listening location so that the signals
S1-S5 fed to the speakers are optimized to accurately reproduce the original
sound field with a specific unique arrangement of loud speakers around a
listening area. The matrix 129 develops the zero and first spatial harmonic
signals a0, a1 and b1 from the microphone signals m1, m2 and m3. The speaker
matrix 131 takes these signals and generates the individual speaker signals
S1-S5 with the same algorithm as described for the matrix 53 of Figure 4. A
control panel 139 allows the user at the listening location to specify the
exact speaker locations for use by the matrix 131, and any other parameters
required.
The arrangement of Figure 6 is very similar to that of Figure 5,
except that it differs in the signals that are recorded or transmitted.
Instead of
recording or transmitting the microphone signals at 127 (Figure 5), the
microphone
matrixing 129 is performed at the sound originating location (Figure 6) and
the
resulting spatial harmonics a0, a1 and b1 of the sound field are recorded or
transmitted at 127'. A control processor 141 and control panel 143 are used at
the
mastering location. A control processor 145 and control panel 147 are used at
the
listening location. An advantage of the system of Figure 6 is that the
recorded or
transmitted signals are independent of the type and arrangement of microphones
used, so information of this need not be known at the listening location.
An example of the microphone matrix 129 of Figures 5 and 6 is given
in Figure 7. Each of the three microphone signals m1, m2 and m3 is an input to
a bank of three variable gain amplifiers. The signal m1 is applied to
amplifiers 151-153, the signal m2 to amplifiers 154-156, and the signal m3 to
amplifiers 157-159. One output of each bank of amplifiers is connected to a
summing node that results in the zero spatial harmonic signal a0. Also,
another one of the amplifier outputs of each bank is connected to a summing
node 163, resulting in the first spatial harmonic signal a1. Further, outputs
of the third amplifier of each bank are connected together in a summing node
165, providing first harmonic signal b1.
The gains of the amplifiers 151-159 are individually set by the control
processor 133 or 141 (Figures 5 or 6) through circuits 135. These gains define
the transfer function of the microphone matrix 129. The transfer function that
is necessary depends upon the type and arrangement of the microphones 121, 123
and 125 being used. Figure 8 illustrates one specific arrangement of
microphones. They can be identical but need not be. No more than one of the
microphones can be omni-directional. As a specific example, each is a pressure
gradient type of microphone having a cardioid pattern. They are arranged in a
Y-pattern with axes of their major sensitivities being directed outward in the
directions of the arrows. The directions of the microphones 121 and 125 are
positioned at an angle α on opposite sides of the directional axis of the
other microphone 123.
In this specific example, the microphone signals can be expressed as
follows, where v is an angle of the sound source with respect to the
directional axis of the microphone 123:
$$
\begin{aligned}
m_1 &= 1 + \cos(v - \alpha)\\
m_2 &= 1 - \cos v\\
m_3 &= 1 + \cos(v + \alpha)
\end{aligned}
\qquad (9)
$$
The three spatial harmonic outputs of the matrix 129, in terms of its three
microphone signal inputs, are then:
$$
\begin{aligned}
a_0 &= \frac{\dfrac{m_1 + m_3}{2} + m_2\cos\alpha}{1 + \cos\alpha}\\
a_1 &= \frac{\dfrac{m_1 + m_3}{2} - m_2}{1 + \cos\alpha}\\
b_1 &= \frac{m_1 - m_3}{2\sin\alpha}
\end{aligned}
\qquad (10)
$$
Since these are linear equations, the gains of the amplifiers 151-159 are the
coefficients of each of the m1, m2 and m3 terms of these equations.
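As a check on the microphone matrix, the following sketch generates the three
cardioid signals of equation (9) for a single source and applies equation
(10). For a unit source at angle v the outputs are expected to be 1, cos v and
sin v. The function names and the example angles are assumptions, and the
sketch presumes an angle α for which sin α and 1 + cos α are non-zero.

```python
import math


def cardioid_signals(source_deg, alpha_deg):
    """Microphone signals m1, m2, m3 of equation (9) for a unit source."""
    v = math.radians(source_deg)
    alpha = math.radians(alpha_deg)
    m1 = 1 + math.cos(v - alpha)
    m2 = 1 - math.cos(v)
    m3 = 1 + math.cos(v + alpha)
    return m1, m2, m3


def microphone_matrix(m1, m2, m3, alpha_deg):
    """Spatial harmonics a0, a1, b1 of equation (10)."""
    alpha = math.radians(alpha_deg)
    a0 = ((m1 + m3) / 2 + m2 * math.cos(alpha)) / (1 + math.cos(alpha))
    a1 = ((m1 + m3) / 2 - m2) / (1 + math.cos(alpha))
    b1 = (m1 - m3) / (2 * math.sin(alpha))
    return a0, a1, b1


# Assumed example: microphones splayed at 60 degrees, source at 40 degrees.
mics = cardioid_signals(40.0, 60.0)
mic_a0, mic_a1, mic_b1 = microphone_matrix(*mics, 60.0)
```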
The various sound processing algorithms have been described in
terms of analog circuits for clarity of explanation. Although some or all of
the
matrices described can be implemented in this manner, it is more convenient to
implement these algorithms in commercially available digital sound mastering
consoles when encoding signals for recording or transmission, and in digital
circuitry
in playback equipment at the listening location. The matrices are then formed
within
the equipment in digital form in response to supplied software or firmware
code that
carries out the algorithms described above.
In both mastering and playback, the matrices are formed with
parameters that include either expected or actual speaker locations. Few
constraints
are placed upon these speaker locations. Whatever they are, they are taken
into
account as parameters in the various algorithms. Improved realism is obtained
without requiring specific speaker locations suggested by others to be
necessary,
such as use of diametrically opposed speaker pairs, speakers positioned at
floor and
ceiling corners of a rectangular room, other specific rectilinear
arrangements, and the
like. Rather, the processing of the present invention allows the speakers to
first be
placed where desired around a listening area, and those positions are then
used as
parameters in the signal processing to obtain signals that reproduce sound
through


those speakers with a specified number of spatial harmonics that are
substantially
exactly the same as those of the original audio wavefront.
The spatial harmonics being faithfully reproduced in the examples
given above are the zero and first harmonics but higher harmonics may also be
reproduced if there are enough speakers being used to do so. Further, the
signal
processing is the same for all frequencies being reproduced, a high quality
system
extending from a low of a few tens of Hertz to 20,000 Hz. or more. Separate
processing of the signals in two frequency bands is not required.
Three Dimensional Representation
So far the discussion has presented the method of spatial harmonics
in two dimensions by considering both the loud speakers and sound sources to
lie in a plane. This same theory may be extended to 3 dimensions. It then
requires 4 channels to transmit the 0th and 1st terms of the 3-dimensional
spatial harmonic
expansion. It has the same properties for matrixing, such that 2 channels may
carry
a standard stereo mix, and the other two channels may be used to create feeds
for
any number of speakers around the listener. Unfortunately, the mathematics for
the
3D version is not as clean and compact as for 2D. There is not any
particularly good
way to reduce the complexity.
To extend the method of spatial harmonics to three dimensions, a
brief discussion of the Legendre functions and the spherical harmonics is
needed. In
some sense, this is a generalization of the Fourier sine and cosine series.
The Fourier
series is a function of one angle, φ. The series is periodic and can be used
to
represent functions on a circle. Just as the Fourier sine and cosine series
are a
complete set of orthogonal functions on the circle, spherical harmonics are a
complete set of orthogonal functions defined on the surface of a sphere. As
such,
any function upon the sphere can be represented by spherical harmonics in a
generalized Fourier series.
The spherical harmonics are functions of two coordinates on the
sphere, the angles θ and φ. These are shown in Figure 9 where a point on the
surface of the sphere is represented by the pair (θ, φ). φ is azimuth. Zero
degrees is straight ahead. 90° is to the left. 180° is directly behind. θ is
declination (up and down). Zero degrees is directly overhead. 90° is the
horizontal plane, and 180° is straight down. Note that the range of θ is zero
to 180°, whereas the range of φ is zero to 360° (or -180° to 180°). In the
discussion in two dimensions, the angular variable θ has been suppressed and
taken as equal to 90°. More generally, both angles are included. For example,
the positions of speakers SP1, SP2, SP3, SP4 and SP5 in Figure 1 are now given
by the respective pairs of angles (θ1, φ1), (θ2, φ2), (θ3, φ3), (θ4, φ4), and
(θ5, φ5), where the θi now lie anywhere in the range of from 0° to 180°.
Figures 1 and 8 can be considered either as a coplanar arrangement of the
shown elements or a projection of the three dimensional situation onto a
particular planar subspace.
The common definition of spherical harmonics starts with the
Legendre polynomials, which are defined as follows:
$$P_n(\mu) = \frac{1}{2^n\, n!}\,\frac{d^n}{d\mu^n}\left(\mu^2 - 1\right)^n \qquad (11)$$
From these, we can define Legendre's associated functions, which are defined
as follows:
$$P_n^m(\mu) = (-1)^m \left(1 - \mu^2\right)^{m/2} \frac{d^m P_n(\mu)}{d\mu^m} \qquad (12)$$
where P_0(cos θ) = 1, P_1(cos θ) = cos θ, P_1^1(cos θ) = -sin θ, and so on. Both
the Legendre polynomials and the associated functions are orthogonal (but not
orthonormal). These specific definitions are given since some authors define
them slightly differently. If one of the alternate definitions is used, the
equations below must be altered appropriately.
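For illustration, the associated functions under the sign convention given
above (the factor (-1)^m included) may be evaluated with the standard upward
recurrence in n. The function name is hypothetical, and the recurrence is a
well-known numerical route rather than a method described in this
specification.

```python
import math


def assoc_legendre(n, m, mu):
    """P_n^m(mu) including the (-1)^m factor; m = 0 gives P_n(mu)."""
    if not 0 <= m <= n:
        raise ValueError("require 0 <= m <= n")
    if abs(mu) > 1.0:
        raise ValueError("mu must lie in [-1, 1]")
    # Start from P_m^m = (-1)^m (2m-1)!! (1 - mu^2)^(m/2).
    root = math.sqrt(1.0 - mu * mu)
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * root
    if n == m:
        return pmm
    # P_{m+1}^m = mu (2m+1) P_m^m.
    prev2, prev1 = pmm, mu * (2 * m + 1) * pmm
    # (k-m) P_k^m = (2k-1) mu P_{k-1}^m - (k+m-1) P_{k-2}^m.
    for k in range(m + 2, n + 1):
        cur = ((2 * k - 1) * mu * prev1 - (k + m - 1) * prev2) / (k - m)
        prev2, prev1 = prev1, cur
    return prev1


theta = 0.7  # an arbitrary declination in radians, chosen for the example
p11 = assoc_legendre(1, 1, math.cos(theta))
```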
Although these are polynomials, they are turned into periodic
functions with the following substitution:
$$\mu \equiv \cos\theta \qquad (13)$$
From these, an expansion of a function in polar coordinates can be made as
follows:
$$f(\theta,\phi) = \sum_{n=0}^{\infty}\left[ A_n P_n(\cos\theta) + \sum_{m=1}^{n}\left(A_{nm}\cos m\phi + B_{nm}\sin m\phi\right)P_n^m(\cos\theta)\right] \qquad (14)$$
The functions P_n(cos θ), cos mφ P_n^m(cos θ), and sin mφ P_n^m(cos θ) are
called spherical harmonics. This expansion has an equivalence to the Fourier
series of equation (1), but it is relatively messy to actually derive it. One
approach is to fix the value of θ at, say, 90°. The remaining terms collapse
into something that is equivalent to the Fourier sine and cosine series. The
coefficients (A_n, A_nm, B_nm) generalize the coefficients (a0, a_m, b_m) in
equation (1) for n ≠ 0.
For a function that is just defined on the circle, there are 1+2T
coefficients for a series that includes harmonics of order 0 through T. For
the spherical harmonic expansion, the total number of coefficients is (T+1)²
if harmonics through order T are included, with the square arising as the
sphere is a two dimensional surface. Thus, keeping the harmonics through first
order now requires the four terms of A0, A1, A11, and B11 instead of the three
terms of a0, a1, and b1.
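The two coefficient counts may be compared with the short sketch below, in
which the function names are hypothetical.

```python
def circle_coefficients(order: int) -> int:
    """Coefficients of a series on the circle through harmonic `order`."""
    return 1 + 2 * order


def sphere_coefficients(order: int) -> int:
    """Coefficients of a spherical harmonic expansion through `order`."""
    return (order + 1) ** 2


first_order_counts = (circle_coefficients(1), sphere_coefficients(1))
```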
When applied to sound, this can be thought of as the sound pressure
on the surface of a microscopic sphere at a point in space centered at the
location
of a listener. This expansion is used as a guide through the generation of pan
matrices and microphone processing for sounds that may originate in any
direction
around the listener.
As in the 2D discussion, the function on the sphere that we want to
approximate is taken to be a unit impulse in the direction (θ0, φ0) to the
listener, the additional coordinate θ now made explicit. For compactness,
define μ0 as follows:
$$\mu_0 \equiv \cos\theta_0 \qquad (15)$$
The expansion of a unit impulse in that direction can be calculated to be the
following:
$$f_0(\theta,\phi) = \sum_{n=0}^{\infty}\frac{2n+1}{2\pi}\left[\frac{1}{2}P_n(\mu_0)P_n(\mu) + \sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}\cos m(\phi-\phi_0)\,P_n^m(\mu_0)P_n^m(\mu)\right] \qquad (16)$$
For multiple point sources at a number of different positions (θ0, φ0) or for
a non-point source, this function is respectively replaced by a sum over these
points or an integral over the distribution.
Although the discussion here is given using the three dimensional
harmonics that arise from spherical coordinates, other sets of orthogonal
functions
in three dimensions could similarly be employed. The corresponding orthogonal
functions would then be used instead in equation (16) and the other equations.
For example, if the geometry of the three dimensional speaker placement in the
listening area lends itself to a particular coordinate system or if the
microscopic surface about the point corresponding to the listener is modelled
as non-spherical due to microphone placement or characteristics, one of the,
say, spheroidal coordinate systems and its corresponding orthogonal expansion
could be used.
Returning to Figure 1, consider N speakers around the listener at angles of
(θ1, φ1), (θ2, φ2), ..., (θN, φN), but now the exemplary values of N=5 and each
of the θi=90° are no longer used. The gains to each of the speakers, gi, are
sought so that the resulting sound field around a point at the center
corresponds to the desired sound field (f0(θ,φ) above) as well as possible.
These gains may be obtained by requiring the integrated square difference
between the resulting sound field and the desired sound field be as small as
possible. The result of this optimization is the following matrix equation
that generalizes equation (2) with the right and left hand sides switched:
$$BG = S \qquad (17)$$
where G is a column vector of the speaker gains:
$$G^T = [\,g_1 \;\cdots\; g_N\,] \qquad (18)$$
The components of the matrix B may be computed as follows:
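$$b_{ij} = \sum_{n=0}\frac{2n+1}{2\pi}\left[\frac{1}{2}P_n(\mu_i)P_n(\mu_j) + \sum_{m=1}^{n}(-1)^m\frac{(n-m)!}{(n+m)!}\cos m(\phi_i-\phi_j)\,P_n^m(\mu_i)P_n^m(\mu_j)\right] \qquad (19)$$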
and
$$S = [\,b_{10} \;\cdots\; b_{N0}\,]^T \qquad (20)$$
Note that equation (19) is similar to the expansion in equation (16)
for the unit impulse in a certain direction but for the term (-1)^m. Although
the first summation is written without an upper limit, in practice it will be
a finite summation. The rank of the matrix B depends on how many terms of the
expansion are retained. If the 0th and 1st terms are retained, the rank of B
will be 4. If one more term is taken, the rank will be 9. The rank of B also
determines the minimum number of speakers required to match that many terms of
the expansion.
Any number of speakers may be used, but the system of equations
will be under-determined if the number of speakers is not the perfect square
number (T+1)² corresponding to the Tth order harmonics. There are various ways
to solve the under-determined system. One way is to solve the system using the
pseudo-inverse of the matrix B. This is equivalent to choosing the
minimum-norm solution, and provides a perfectly acceptable solution. Another
way is to augment the system with equations that force some number of higher
harmonics to zero. This involves taking the minimum number of rows of B that
preserves its rank, then adding rows of the following form:
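$$[\,P_{n+1}(\mu_1) \;\cdots\; P_{n+1}(\mu_N)\,] = [0] \qquad (21a)$$
or
$$[\,\cos m\phi_1\,P_{n+1}^m(\mu_1) \;\cdots\; \cos m\phi_N\,P_{n+1}^m(\mu_N)\,] = [0] \qquad (21b)$$
or
$$[\,\sin m\phi_1\,P_{n+1}^m(\mu_1) \;\cdots\; \sin m\phi_N\,P_{n+1}^m(\mu_N)\,] = [0] \qquad (21c)$$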
These equations are generalizations of the process used to reduce equation (3)
to
equation (4) above. It does not make much difference exactly which of these
are
taken. Each additional row will augment the rank of the matrix until full rank
is
reached.
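The minimum-norm alternative mentioned above can be sketched for a simplified
coplanar case as follows. For brevity the sketch uses three constraint rows
(harmonics zero and one on the circle) rather than the full matrix B, and it
forms the pseudo-inverse as A^T (A A^T)^-1, which is valid when the rows are
independent. The function names, the six-speaker layout and these
simplifications are all assumptions made for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("constraint rows are not independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def min_norm_gains(source_deg, speaker_deg):
    """Minimum-norm gains matching harmonics zero and one on the circle."""
    phi0 = math.radians(source_deg)
    phi = [math.radians(d) for d in speaker_deg]
    n = len(phi)
    # Constraint rows: sum g = 1, sum g cos = cos(phi0), sum g sin = sin(phi0).
    a = [[1.0] * n, [math.cos(p) for p in phi], [math.sin(p) for p in phi]]
    b = [1.0, math.cos(phi0), math.sin(phi0)]
    # g = A^T y where (A A^T) y = b.
    aat = [[sum(a[r][k] * a[c][k] for k in range(n)) for c in range(3)]
           for r in range(3)]
    y = solve_linear(aat, b)
    return [sum(a[r][k] * y[r] for r in range(3)) for k in range(n)]


# Assumed layout: six evenly spaced speakers, source straight ahead.
hexagon = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
mn_gains = min_norm_gains(0.0, hexagon)
```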
Thus we have derived the matrix equation required to produce
speaker gains for panning a single (monophonic) sound source into multiple
speakers
that will preserve exactly some number of spatial harmonics in 3 dimensions.
Figures 3 and 4 illustrated the mastering and reconstruction process
for a coplanar example of two monaural sources mixed into five signals which
are
then converted into the spatial harmonics through first order and finally
matrixed
into a modified set of signals. As noted there, any of these specific choices
could be
taken differently, although the choices of five signals being recorded and
five
modified signals resulting as the output are convenient as a common
multichannel
arrangement is the 5.1 format of movie and home cinema soundtracks.
Alternative
multichannel recording and reproduction methods, for example that described in
the
co-pending U.S. patent application Ser. No. 09/505,556, filed February 17,
2000,
by James A. Moorer, entitled "CD Playback Augmentation" which is hereby
incorporated herein by this reference.
The arrangement of Figures 3 and 4 extends to incorporate three
dimensional harmonics, the main changes being that now (T+1)² signals instead
of (1+2T) signals are the output of harmonic matrix 51 if harmonics through T
are retained. Thus, keeping the harmonics through first order now requires the
four terms (A0, A1, A11, B11) instead of the three terms (a0, a1, b1).
Additionally, control processor 59 must now calculate the gains from pairs of
assumed speaker angles (θi, φi) and corresponding pairs of actual speaker
angles (γi, βi) instead of just the respective azimuthal angles φi and βi, the
(γi, βi) again being provided through a control panel 61. Finally, one
convenient choice for the three dimensional, non-coplanar case is to use six
signals S1-S6 and also a modified set of six signals S1'-S6'. In any case, at
least four non-coplanar speakers are required for the spherical harmonics just
as at least three non-collinear speakers are required in the 2D case, since at
least four non-coplanar points are needed to define a sphere and three
non-collinear points define a circle in a plane.
The reason six speakers is a convenient choice is that it allows for
four or five of the recorded or transmitted tracks on medium 15 to be mixed
for a
coplanar arrangement, with the remaining two or one tracks for speakers placed
off the plane. This allows a listener without elevated speakers or without
reproduction
equipment for the spherical harmonics to access and use only the four or five
coplanar tracks, while the remaining tracks are still available on the medium
for the
listener with full, three dimensional reproduction capabilities. This is
similar to the
situation described above in the 2D case where two channels can be used in a
traditional stereo reproduction, but the additional channels are available for
reproducing the sound field. In the 3D case of, say, six channels, two could
be used
for the stereo mix, augmented by two more for a four channel surround sound
recording, with the last two available to further augment reproduction through
six
channels to provide the three dimensional sound field. The listener could then
access
the number of channels needed from the medium stored, for example, as
described
in the co-pending application "CD Playback Augmentation" included by reference
above.
Returning to Figure 3, the modifications in this example then consist of
including an extra amplifier for each monaural source and an extra adder to
supply the additional signal S6 to the medium 15. The control panel 29 would
also then supply an additional gain for each of the sources, with all of the
gains now derived from the declination as well as the azimuthal location of
the assumed speaker placements. Similarly in Figure 4, each of the six signals
S1-S6 would feed four amplifiers in matrix 51, one for each of the four
summing nodes corresponding to A_0, A_1, A_11, and B_11 (or, more generally,
four independent linear combinations of these) to produce these four outputs
in this example using the 0th and 1st order harmonics. Matrix 53 now has six
amplifiers for each of these four harmonics to produce the set of six modified
signals S1'-S6'. Again, the declination as well as the azimuthal location of
the actual speaker placements is now used. More generally, control panel 61
could also supply control processor 59 with radial information on any speakers
not on the same spherical surface as the other speakers. The control processor
59 could then use this information in matrix 53 to produce corresponding
modified signals to compensate for any differing radii by introducing delay,
compensation for wave front spreading, or both.
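In this arrangement, the equivalent of equation (6) above becomes:
$$
\begin{aligned}
A_0 &= S_1 + S_2 + S_3 + S_4 + S_5 + S_6 \\
A_1 &= S_1\cos\theta_1 + S_2\cos\theta_2 + S_3\cos\theta_3 + S_4\cos\theta_4 + S_5\cos\theta_5 + S_6\cos\theta_6 \\
A_{11} &= S_1\cos\phi_1\sin\theta_1 + S_2\cos\phi_2\sin\theta_2 + S_3\cos\phi_3\sin\theta_3 \\
&\quad + S_4\cos\phi_4\sin\theta_4 + S_5\cos\phi_5\sin\theta_5 + S_6\cos\phi_6\sin\theta_6 \\
B_{11} &= S_1\sin\phi_1\sin\theta_1 + S_2\sin\phi_2\sin\theta_2 + S_3\sin\phi_3\sin\theta_3 \\
&\quad + S_4\sin\phi_4\sin\theta_4 + S_5\sin\phi_5\sin\theta_5 + S_6\sin\phi_6\sin\theta_6 .
\end{aligned}
\tag{6'}
$$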
In the case discussed above where four of the speakers, say S1-S4, are taken
to be in a typical, coplanar arrangement parallel to the floor of a room,
θ_1 through θ_4 are all 90° and equation (6') simplifies considerably.
Additionally, by having the full three dimensional representation, a two
dimensional projection onto any other plane in the listening area can be
realized by fixing the appropriate θs and φs.
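As an illustration of equation (6'), the four harmonics can be evaluated
directly from the six signals and their assumed speaker angles. This is only a
sketch: the function name, the speaker layout, and the signal values are
hypothetical, with S1-S4 placed in the horizontal plane (θ = 90°) and S5, S6
elevated.

```python
import math

def spatial_harmonics(feeds, angles_deg):
    """Evaluate equation (6'): return (A0, A1, A11, B11) for the given feeds.

    feeds      -- one sample value per speaker signal S1..Sn
    angles_deg -- matching (theta, phi) pairs in degrees
    """
    a0 = a1 = a11 = b11 = 0.0
    for s, (theta_deg, phi_deg) in zip(feeds, angles_deg):
        theta, phi = math.radians(theta_deg), math.radians(phi_deg)
        a0 += s
        a1 += s * math.cos(theta)
        a11 += s * math.cos(phi) * math.sin(theta)
        b11 += s * math.sin(phi) * math.sin(theta)
    return a0, a1, a11, b11

# Hypothetical layout: S1-S4 coplanar with the floor, S5 and S6 elevated.
ANGLES = [(90, 45), (90, 135), (90, -135), (90, -45), (45, 90), (45, -90)]
FEEDS = [1.0, 0.5, 0.25, 0.5, 0.2, 0.1]

A0, A1, A11, B11 = spatial_harmonics(FEEDS, ANGLES)
print(A0, A1, A11, B11)
```

Because cos 90° vanishes, the four coplanar signals drop out of A_1, which
then depends only on the elevated signals S5 and S6. This is the
simplification referred to above.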
A standard directional microphone has a pickup pattern that can be expressed
as the 0th and 1st spatial spherical harmonics. The equation for the pattern
of a standard pressure-gradient microphone is the following:
$$
m(\theta,\phi) = C + (1 - C)\{\cos\Theta\cos\theta + \sin\Theta\sin\theta\cos(\phi - \Phi)\} , \tag{22}
$$
where Θ and Φ are the angles in spherical coordinates of the principal axis of
the microphone. That is, they are the direction the microphone is "pointing."
Equation (22) is the more general form of equations (9). Those equations
correspond to, up to an overall factor of two, equation (22) with C = 1/2,
Θ = θ = 90°, and Φ = α, 180°, or −α for respective microphones m1, m2, or m3.
The constant C is called the
"directionality" of the microphone and is determined by the type of
microphone. C is one for an omni-directional microphone and is zero for a
"figure-eight" microphone. Intermediate values yield standard pickup patterns
such as cardioid (1/2), hyper-cardioid (1/4), super-cardioid (3/8), and
sub-cardioid (3/4). With four microphones, we may recover the 0th and 1st
spatial harmonics of the 3D sound field as follows:
$$
\begin{pmatrix} A_0 \\ A_1 \\ A_{11} \\ B_{11} \end{pmatrix}
= D \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{pmatrix} \tag{23}
$$
This equation corresponds to the 2D 0th and 1st spatial harmonics of equation
(10). The spatial harmonic coefficients on the left side of the equations are
sometimes called W, Y, Z and X in commercial sound-field microphones.
Representation of the 3-dimensional sound field by these four coefficients is
sometimes referred to as "B-format." (The nomenclature is just to distinguish
it from the direct microphone feeds, which are sometimes called "A-format.")
The terms m1, ..., mM refer to M pressure-gradient microphones with principal
axes at the angles (Θ_1, Φ_1), ..., (Θ_M, Φ_M). The matrix D may be defined by
its inverse as follows:
$$
D^{-1} = \begin{pmatrix}
C_1 & (1-C_1)\cos\Theta_1 & (1-C_1)\sin\Theta_1\cos\Phi_1 & (1-C_1)\sin\Theta_1\sin\Phi_1 \\
C_2 & (1-C_2)\cos\Theta_2 & (1-C_2)\sin\Theta_2\cos\Phi_2 & (1-C_2)\sin\Theta_2\sin\Phi_2 \\
C_3 & (1-C_3)\cos\Theta_3 & (1-C_3)\sin\Theta_3\cos\Phi_3 & (1-C_3)\sin\Theta_3\sin\Phi_3 \\
C_4 & (1-C_4)\cos\Theta_4 & (1-C_4)\sin\Theta_4\cos\Phi_4 & (1-C_4)\sin\Theta_4\sin\Phi_4
\end{pmatrix} . \tag{24}
$$
Each row of this matrix is just the directional pattern of one of the
microphones. Four microphones unambiguously determine all the coefficients for
the 0th and 1st order terms of the spherical harmonic expansion. The angles of
the microphones should be distinct (there should not be two microphones
pointing in the same direction) and non-coplanar (since that would provide
information only in one angular dimension and not two). In these cases, the
matrix is well-conditioned and has an inverse.
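The relationship between equations (23) and (24) can be sketched numerically:
build the inverse matrix row by row from the microphone patterns, invert it to
obtain D, and recover the harmonics from simulated feeds. The array below
(three cardioids on the equatorial plane with α = 60° and one pointing at the
pole), the helper names, and the small Gauss-Jordan inversion routine are all
assumptions made for illustration.

```python
import math

def pattern_row(c, theta_deg, phi_deg):
    """One row of the matrix in equation (24) for a microphone with
    directionality c and principal axis (Theta, Phi) in degrees."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    g = 1.0 - c
    return [c, g * math.cos(t), g * math.sin(t) * math.cos(p), g * math.sin(t) * math.sin(p)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: check the microphone angles")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matvec(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

# Hypothetical array: (C, Theta, Phi) for m1-m4.
ALPHA = 60.0
MICS = [(0.5, 90, ALPHA), (0.5, 90, 180), (0.5, 90, -ALPHA), (0.5, 0, 0)]
D_INV = [pattern_row(c, t, p) for c, t, p in MICS]   # equation (24)
D = invert(D_INV)

# Harmonics of a unit source arriving from (theta, phi), the feeds they
# produce (m = D^-1 A), and the harmonics recovered from them (A = D m).
theta, phi = math.radians(70), math.radians(25)
harmonics = [1.0, math.cos(theta),
             math.cos(phi) * math.sin(theta), math.sin(phi) * math.sin(theta)]
feeds = matvec(D_INV, harmonics)
recovered = matvec(D, feeds)
print(recovered)
```

If two microphones share a direction, or all four axes are coplanar, the
pivot test fails and the routine raises an error, which mirrors the
non-degeneracy conditions stated above.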
Corresponding changes will also be needed in Figures 5, 6, and 7. In Figures 5
and 6, the number of microphones will now be four, corresponding to m1-m4 in
equation (23), and the four harmonics (A_0, A_1, A_11, B_11, or four
independent linear combinations) replace the three terms (a_0, a_1, b_1). The
number of output signals will also be adjusted: in the example used above, S6
or S6' now being included. Additionally, the alignment of each microphone is
now specified by a pair of parameters, the angles (Θ, Φ) of the principal
axes, and each of the signals S1-S6 or S1'-S6' has a declination as well as an
azimuthal angle. The microphone matrix of Figure 7 will correspondingly now
have four sets of four amplifiers.
One possible arrangement of the four microphones of equations (23) and (24) is
to place m1-m3 as in Figure 8 on the equatorial plane with m4 at the north
pole of the sphere. This corresponds to (Θ_1, Φ_1), (Θ_3, Φ_3) = (90°, ±α),
(Θ_2, Φ_2) = (90°, 180°), Θ_4 = 0°. Another alternative is to place the
microphones with two rearward facing microphones as shown in Figure 10, with
m1 121 at (90°, α), m2 123 at (90°+δ, 180°), m3 125 at (90°, −α), and m4 126
at (90°−δ, 180°). Taking
α = δ = 60° then produces an approximately regular tetrahedral arrangement.
In some applications, one of the microphones may be placed at a different
radius for practical reasons, in which case some delay or advance of the
corresponding signal should be introduced. For example, if the rear-facing
microphone m2 of Figure 8 were displaced a ways to the rear, the recording
would be advanced about 1 ms for each foot of displacement to compensate for
the difference in propagation time.
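The figure of about 1 ms per foot follows from the speed of sound. A short
sketch of the arithmetic is given below; the speed of sound (343 m/s, roughly
room temperature), the 44.1 kHz sample rate, and the function names are
assumptions for illustration only.

```python
SPEED_OF_SOUND_M_PER_S = 343.0   # assumed: dry air at about 20 degrees C
METERS_PER_FOOT = 0.3048

def advance_ms(displacement_ft):
    """Time advance, in milliseconds, for a microphone moved
    displacement_ft feet farther from the center of the array."""
    return 1000.0 * displacement_ft * METERS_PER_FOOT / SPEED_OF_SOUND_M_PER_S

def advance_samples(displacement_ft, sample_rate_hz=44100):
    """The same advance rounded to whole samples at the given sample rate."""
    return round(advance_ms(displacement_ft) * sample_rate_hz / 1000.0)

print(advance_ms(1.0))        # a little under 1 ms per foot
print(advance_samples(3.0))   # whole-sample advance for a 3 ft displacement
```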
Equation (23) is valid for any set of four microphones, again
assuming no more than one of them is omni-directional. By looking at this
equation
for two different sets of microphones, the directional pattern of the pickup
can be
changed by matrixing these four signals. The starting point is equations (23)
and
(24) for two different sets of microphones and their corresponding matrix D.
The
actual microphones and matrix will be indicated by the letters m and D, with
the
rematrixed, "virtual" quantities indicated by a tilde.
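Given the formulation of equations (23) and (24), these microphone feeds may
be transformed into the set of "virtual" microphone feeds as follows:
$$
\begin{pmatrix} \tilde{m}_1 \\ \tilde{m}_2 \\ \tilde{m}_3 \\ \tilde{m}_4 \end{pmatrix}
= \tilde{D}^{-1} \begin{pmatrix} A_0 \\ A_1 \\ A_{11} \\ B_{11} \end{pmatrix}
= \tilde{D}^{-1} D \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{pmatrix} \tag{25}
$$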
The matrix D̃ represents the directionality and angles of the "virtual"
microphones. The result of this will be the sound that would have been
recorded if the virtual microphones had been present at the recording instead
of the ones that were used. This allows recordings to be made using a
"generic" sound-field microphone and then later matrixed into any set of
microphones. For instance, we might pick just the first two virtual
microphones, m̃1 and m̃2, and use them as a stereo pair for a standard CD
recording. m̃3 could then be added in for the sort of planar surround sound
recording described above, with m̃4 used for the full three dimensional
realization.
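The rematrixing of equation (25) can be sketched as a single 4x4 matrix
applied to the recorded feeds. In the example below the actual and virtual
microphone sets, the helper names, and the inversion routine are hypothetical
choices for illustration. The check at the end compares the rematrixed feeds
with what the virtual microphones would have picked up directly according to
equation (22).

```python
import math

def pattern_row(c, theta_deg, phi_deg):
    """Row of equation (24): coefficients of (A0, A1, A11, B11) for one microphone."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    g = 1.0 - c
    return [c, g * math.cos(t), g * math.sin(t) * math.cos(p), g * math.sin(t) * math.sin(p)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular: check the microphone angles")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    """Matrix-matrix product for nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def pickup(mic, src_theta_deg, src_phi_deg):
    """Equation (22): response of mic = (C, Theta, Phi) to a unit source direction."""
    c, axis_theta_deg, axis_phi_deg = mic
    big_t, big_p = math.radians(axis_theta_deg), math.radians(axis_phi_deg)
    t, p = math.radians(src_theta_deg), math.radians(src_phi_deg)
    return c + (1.0 - c) * (math.cos(big_t) * math.cos(t)
                            + math.sin(big_t) * math.sin(t) * math.cos(p - big_p))

# Hypothetical actual array and desired virtual set, each as (C, Theta, Phi).
ACTUAL = [(0.5, 90, 60), (0.5, 90, 180), (0.5, 90, -60), (0.5, 0, 0)]
VIRTUAL = [(0.25, 90, 45), (0.25, 90, -45), (0.5, 90, 180), (0.5, 0, 0)]

D = invert([pattern_row(*mic) for mic in ACTUAL])      # D of equation (23)
D_TILDE_INV = [pattern_row(*mic) for mic in VIRTUAL]   # virtual version of (24)
REMATRIX = matmul(D_TILDE_INV, D)                      # product used in (25)

# Simulate a unit source at (70, 25) degrees and rematrix the recorded feeds.
actual_feeds = [pickup(mic, 70, 25) for mic in ACTUAL]
virtual_feeds = matvec(REMATRIX, actual_feeds)
direct_feeds = [pickup(mic, 70, 25) for mic in VIRTUAL]
print(virtual_feeds)
print(direct_feeds)
```

Because the rematrixing matrix depends only on the two microphone sets, it can
be computed once and applied sample by sample.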
Any non-degenerate transformation of these four microphone feeds can be used
to create any other set of microphone feeds, or can be used to generate
speaker feeds for any number of speakers (greater than 4) that can recreate
exactly the 0th and 1st spatial harmonics of the original sound field. In
other words, the sound field microphone technique can be used to adjust the
directional characteristics and angles of the microphones after the recording
has been completed. Thus, by adding a third, rear-facing microphone in the 2D
case and a fourth, non-coplanar microphone in the 3D case, the microphones can
be revised through simple matrix operations. Whether the material is intended
to be released in multi-channel format or not, the recording of the third,
rear-facing channel allows increased freedom in a stereo release, with the
recording of a fourth, non-coplanar channel increasing freedom in both stereo
and planar surround-sound.
To matrix the microphone feeds into a number of speakers, we reformulate the
right-hand side of the matrix equation (17) for panning as follows:
$$
BG = R = R_1 D \begin{pmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \end{pmatrix} \tag{26}
$$
and
$$
R_1 = \begin{pmatrix}
P_0(\mu_1) & P_1(\mu_1) & -\cos\phi_1\,P_1^1(\mu_1) & -\sin\phi_1\,P_1^1(\mu_1) \\
\vdots & \vdots & \vdots & \vdots \\
P_0(\mu_N) & P_1(\mu_N) & -\cos\phi_N\,P_1^1(\mu_N) & -\sin\phi_N\,P_1^1(\mu_N)
\end{pmatrix} \tag{27}
$$
The matrix R_1 is simply the 0th and 1st order spherical harmonics evaluated
at the speaker positions. One must be careful to include the term (−1)^m,
since that is a direct result of the least-squares optimization required to
derive these equations.
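A row of R_1 can be sketched as follows. Two conventions are assumed here
rather than taken from this section: that μ is the cosine of the speaker
declination θ, and that the associated Legendre function carries the
Condon-Shortley phase, so that P_1^1(μ) = −√(1 − μ²). Under those assumptions
the explicit minus signs in equation (27) cancel the phase and each row
reduces to (1, cos θ, cos φ sin θ, sin φ sin θ). The function names are
hypothetical, and the sketch stops at the vector R; solving for the gains G
requires the matrix B of equation (17).

```python
import math

def legendre_terms(mu):
    """P_0, P_1 and the associated P_1^1 at mu, with the Condon-Shortley phase
    (assumed convention): P_1^1(mu) = -sqrt(1 - mu^2)."""
    return 1.0, mu, -math.sqrt(max(0.0, 1.0 - mu * mu))

def r1_row(theta_deg, phi_deg):
    """One row of R_1 in equation (27) for a speaker at (theta, phi) degrees,
    assuming mu = cos(theta)."""
    phi = math.radians(phi_deg)
    p0, p1, p11 = legendre_terms(math.cos(math.radians(theta_deg)))
    return [p0, p1, -math.cos(phi) * p11, -math.sin(phi) * p11]

def right_hand_side(speaker_angles, harmonics):
    """R = R_1 (A0, A1, A11, B11)^T, one entry per speaker, as in equation (26)."""
    return [sum(r * a for r, a in zip(r1_row(t, p), harmonics))
            for t, p in speaker_angles]

# Hypothetical check: a unit source located exactly at a speaker position.
src_theta, src_phi = math.radians(60), math.radians(90)
source = [1.0, math.cos(src_theta),
          math.cos(src_phi) * math.sin(src_theta),
          math.sin(src_phi) * math.sin(src_theta)]
R = right_hand_side([(60, 90), (90, 0)], source)
print(R)
```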
Returning to the recording of the sound field, the three or four channels of
(preferably uncompressed) audio material respectively corresponding to the 2D
and 3D sound field may be stored on the disk or other medium, and then
rematrixed to stereo or surround in a simple manner. By equation (25) (or its
2D reduction), there are an infinite number of non-degenerate transformations
of four channels into four other channels in a lossless fashion. Thus, instead
of storing spatial harmonics, two channels could store a suitable stereo mix,
the third could store a channel for a 2D surround mix, and the fourth a
channel for the 3D surround mix. In addition to the audio, the matrix D or its
inverse is also stored on the medium. For a stereo presentation, the player
simply ignores the third and fourth channels of audio and plays the other two
as the left and right feeds. For a 2D surround presentation, the inverse of
the matrix D is used to derive the 0th and 1st 2D spatial harmonics from the
first three channels. From the spatial harmonics, a matrix such as equation
(8) or the planar projection of equation (17) is formed and the speaker feeds
calculated. For the 3D surround presentation, the 3D harmonics are derived
from D using all four channels to form the matrix of equation (17) and
calculate the speaker feeds.
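The playback choices just described can be sketched as a simple dispatch on
the presentation mode. Everything named below is hypothetical: the function
`present`, the mode strings, and in particular the example channel encoding
(left = A_0 + B_11, right = A_0 − B_11, third = A_11, fourth = A_1), which is
merely one non-degenerate choice. The stored matrix is assumed to map the four
stored channels back to (A_0, A_1, A_11, B_11). The 2D surround case is
omitted because it depends on the 2D matrices defined earlier.

```python
def matvec(matrix, vec):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def present(channels, stored_matrix, mode):
    """Derive playback signals from four stored channel samples.

    channels      -- [ch1, ch2, ch3, ch4] as read from the medium
    stored_matrix -- 4x4 matrix read from the medium; assumed here to map the
                     stored channels to the harmonics (A0, A1, A11, B11)
    mode          -- "stereo" or "3d"
    """
    if mode == "stereo":
        # The third and fourth channels are simply ignored.
        return {"left": channels[0], "right": channels[1]}
    if mode == "3d":
        a0, a1, a11, b11 = matvec(stored_matrix, channels)
        return {"A0": a0, "A1": a1, "A11": a11, "B11": b11}
    raise ValueError("unsupported presentation mode: " + mode)

# Decoding matrix for the hypothetical encoding described in the lead-in.
STORED = [
    [0.5, 0.5, 0.0, 0.0],    # A0  = (left + right) / 2
    [0.0, 0.0, 0.0, 1.0],    # A1  = fourth channel
    [0.0, 0.0, 1.0, 0.0],    # A11 = third channel
    [0.5, -0.5, 0.0, 0.0],   # B11 = (left - right) / 2
]

# Channels encoded from harmonics (A0, A1, A11, B11) = (1.0, 0.2, 0.3, 0.4).
channels = [1.0 + 0.4, 1.0 - 0.4, 0.3, 0.2]
stereo = present(channels, STORED, "stereo")
full = present(channels, STORED, "3d")
print(stereo)
print(full)
```

The harmonics returned in the 3D case would then be used to form the matrix of
equation (17) and calculate the speaker feeds.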
Although the various aspects of the present invention have been
described with respect to their preferred embodiments, it will be understood
that the
present invention is entitled to protection within the full scope of the
appended
claims.

Administrative Status


| Title | Date |
| --- | --- |
| Forecasted Issue Date | Unavailable |
| (86) PCT Filing Date | 2000-10-06 |
| (87) PCT Publication Date | 2001-11-01 |
| (85) National Entry | 2002-10-17 |
| Examination Requested | 2005-10-05 |
| Dead Application | 2011-10-06 |

Abandonment History

| Abandonment Date | Reason | Reinstatement Date |
| --- | --- | --- |
| 2003-10-06 | FAILURE TO PAY APPLICATION MAINTENANCE FEE | 2003-10-31 |
| 2008-10-06 | FAILURE TO PAY APPLICATION MAINTENANCE FEE | 2009-01-15 |
| 2009-10-06 | FAILURE TO PAY APPLICATION MAINTENANCE FEE | 2009-11-20 |
| 2010-10-06 | FAILURE TO PAY APPLICATION MAINTENANCE FEE | |

Payment History

| Fee Type | Anniversary Year | Due Date | Amount Paid | Paid Date |
| --- | --- | --- | --- | --- |
| Application Fee | | | $300.00 | 2002-10-17 |
| Maintenance Fee - Application - New Act | 2 | 2002-10-07 | $100.00 | 2002-10-17 |
| Registration of a document - section 124 | | | $100.00 | 2003-01-10 |
| Reinstatement: Failure to Pay Application Maintenance Fees | | | $200.00 | 2003-10-31 |
| Maintenance Fee - Application - New Act | 3 | 2003-10-06 | $100.00 | 2003-10-31 |
| Maintenance Fee - Application - New Act | 4 | 2004-10-06 | $100.00 | 2004-10-06 |
| Maintenance Fee - Application - New Act | 5 | 2005-10-06 | $200.00 | 2005-10-03 |
| Request for Examination | | | $800.00 | 2005-10-05 |
| Maintenance Fee - Application - New Act | 6 | 2006-10-06 | $200.00 | 2006-10-06 |
| Maintenance Fee - Application - New Act | 7 | 2007-10-09 | $200.00 | 2007-10-09 |
| Reinstatement: Failure to Pay Application Maintenance Fees | | | $200.00 | 2009-01-15 |
| Maintenance Fee - Application - New Act | 8 | 2008-10-06 | $200.00 | 2009-01-15 |
| Registration of a document - section 124 | | | $100.00 | 2009-02-13 |
| Reinstatement: Failure to Pay Application Maintenance Fees | | | $200.00 | 2009-11-20 |
| Maintenance Fee - Application - New Act | 9 | 2009-10-06 | $200.00 | 2009-11-20 |
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
SNK TECH INVESTMENT L.L.C.
Past Owners on Record
MOORER, JAMES A.
SONIC SOLUTIONS
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

| Document Description | Date (yyyy-mm-dd) | Number of pages | Size of Image (KB) |
| --- | --- | --- | --- |
| Representative Drawing | 2002-10-17 | 1 | 19 |
| Cover Page | 2003-01-30 | 1 | 53 |
| Description | 2010-07-15 | 26 | 1,419 |
| Description | 2002-10-17 | 26 | 1,444 |
| Abstract | 2002-10-17 | 1 | 65 |
| Claims | 2002-10-17 | 6 | 245 |
| Drawings | 2002-10-17 | 6 | 142 |
| PCT | 2002-10-17 | 6 | 211 |
| Assignment | 2002-10-17 | 3 | 102 |
| Correspondence | 2003-01-28 | 1 | 26 |
| Fees | 2003-10-31 | 1 | 39 |
| Assignment | 2003-01-10 | 4 | 246 |
| Correspondence | 2005-03-14 | 11 | 294 |
| Prosecution-Amendment | 2005-10-05 | 1 | 42 |
| Correspondence | 2010-09-20 | 1 | 16 |
| Correspondence | 2010-09-20 | 1 | 20 |
| Prosecution-Amendment | 2006-03-22 | 1 | 56 |
| Fees | 2007-10-09 | 1 | 41 |
| Assignment | 2009-02-13 | 6 | 275 |
| Fees | 2009-01-15 | 1 | 37 |
| Prosecution-Amendment | 2010-01-21 | 2 | 47 |
| Prosecution-Amendment | 2010-07-15 | 5 | 169 |
| Correspondence | 2010-08-18 | 2 | 91 |