Patent 2058984 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 2058984
(54) English Title: POLYPHONIC CODING
(54) French Title: CODAGE POLYPHONIQUE
Status: Expired
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04S 1/00 (2006.01)
  • H04H 20/88 (2009.01)
  • H04M 3/56 (2006.01)
  • H04H 5/00 (2006.01)
(72) Inventors :
  • HOLT, CHRISTOPHER ELLIS (United Kingdom)
  • MUNDAY, EDWARD (United Kingdom)
  • CHEETHAM, BARRY MICHAEL GEORGE (United Kingdom)
(73) Owners :
  • BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY (United Kingdom)
(71) Applicants :
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate agent:
(45) Issued: 1998-12-01
(86) PCT Filing Date: 1990-06-15
(87) Open to Public Inspection: 1990-12-16
Examination requested: 1993-05-13
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/GB1990/000928
(87) International Publication Number: WO1990/016136
(85) National Entry: 1991-12-11

(30) Application Priority Data:
Application No. Country/Territory Date
8913758.2 United Kingdom 1989-06-15

Abstracts

English Abstract




A polyphonic (e.g. stereo) audioconferencing system, in which input left and right channels are time-aligned by variable
delay stages (10a, 10b), controlled by a delay calculator (9) (e.g. by deriving the maximum cross-correlation value), and then
summed in an adder (2) and subtracted in subtracter (3) to form sum and difference signals. The sum signal is transmitted in
relatively high quality; the difference signal is reconstructed at the decoder by predictions from the sum signal using an adaptive filter
(5). The decoder adaptive filter (5) is configured either by received filter coefficients or, using backwards adaptation, from a
received residual signal produced by a corresponding adaptive filter (4) in the coder, or both. Preferably, the adaptive filter (4) is a
lattice filter, employing a gradient algorithm for coefficient update. The complexity of the adaptive filter (4) is reduced by
pre-whitening, in the encoder, both the sum and difference signals using corresponding whitening filters (14a, 14b) derived from the
sum channel.


French Abstract

L'invention est un système audioconférence polyphonique (stéréophonique par exemple) dans lequel les canaux d'entrée gauche et droit sont ordonnancés temporellement par des étages à retardement variable (10a, 10b) et contrôlés par un calculateur de retard (9) (lequel, par exemple, peut effectuer ce contrôle en déterminant la corrélation maximale) et les signaux de ces canaux sont sommés dans un additionneur (2) et soustraits dans un soustracteur (3) pour former les signaux de somme et de différence. Le signal de somme est transmis avec une qualité relativement élevée; le signal de différence est reconstruit au décodeur à partir de prédictions fournies par le signal de somme, en utilisant un filtre adaptatif (5). Le filtre adaptatif (5) du décodeur est configuré par les coefficients de filtrage reçus ou, à l'aide d'une adaptation inverse, à partir d'un signal résiduel reçu qui est produit par un filtre adaptatif correspondant (4) dans le codeur, ou de ces deux façons. Dans la concrétisation privilégiée de l'invention, ce filtre adaptatif (4) est un filtre en treillis qui utilise un algorithme à gradient pour la mise à jour des coefficients. La complexité du filtre adaptatif (4) est réduite en préblanchissant dans le codeur les signaux de somme et de différence au moyen de filtres de blanchissement correspondants (14a, 14b) obtenus du canal du signal de somme.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS

1. Polyphonic signal coding apparatus comprising:
- means for receiving a first and at least one
second channel;
- means for filtering the first and second channel
in accordance with a filter approximating the spectral
inverse of the first channel to produce respective filtered
channels, the first said filtered channel thereby being
substantially spectrally whitened;
- means, connected to receive the filtered
channels, for periodically generating reconstruction data
enabling the formation, from the first channel, of an
estimate of the second channel, the generating means being
operable to generate a plurality of filter coefficients
which, if applied to a plural order predictor filter, would
enable the prediction of the second channel from the first
channel thus filtered;
- means for outputting data representing the said
first channel and the reconstruction data.

2. Apparatus according to claim 1, wherein said filtering
means comprises an adaptive, master, filter arranged to
filter the first channel so as to produce a whitened output,
and a slave filter arranged to filter said second channel,
the slave filter being configured so as to have an
equivalent response to the adaptive filter of the filtering
means.

3. Apparatus according to claim 1 or 2, in which the
generating means includes an adaptive filter connected to
receive the first filtered channel and produce a predicted
second channel therefrom; and means for producing a residual
signal representing the difference between the said
predicted second channel and the actual second filtered
channel, and in which the said reconstruction data comprises
data representing the said residual signal.

4. Apparatus according to claim 3 in which the adaptive
filter is controlled only by the said residual signal and
the said reconstruction data consists of the said residual
signal.

5. Apparatus according to claim 1, 2, or 3, wherein the
reconstruction data comprises the said filter coefficients.

6. Apparatus according to any one of claims 1 to 5,
further comprising:
- input means for receiving input signals; and
- means for producing the said channels therefrom,
the first channel being a sum channel representing the sum
of such input signals and the second or further channels
representing the differences therebetween.

7. Apparatus according to any one of claims 1 to 6,
including variable delay means for delaying at least one of
the channels, and means for controlling the differential
delay applied to the channels so as to increase the
correlation upstream of the generating means, the output
means being arranged to output also data representing the
said differential delay.

8. Apparatus according to claim 6, in which the input
means includes variable delay means for delaying at least
one of the input signals, and means for controlling the
differential delay applied to the signals so as to increase
the correlation upstream of the generating means, the output
means being arranged to output also data representing the
said differential delay.

9. Polyphonic signal decoding apparatus comprising:
- means for receiving data representing a sum
signal, and signal reconstruction data; and means operable
in response to the reconstruction data to modify the sum
signal so as to produce at least two output signals, the
modifying means comprising:
- a configurable plural order predictor filter for
receiving said signal reconstruction data and modifying its
coefficients in accordance therewith, the filter being
connected to receive the said sum signal and reconstruct
therefrom a difference signal;
- an adaptive, master, filter arranged to filter
the sum signal in accordance with approximately the spectral
inverse of the sum signal so as to produce a whitened
output, and a slave filter arranged to filter said
difference signal, the slave filter being configured so as
to have an equivalent response to the adaptive master
filter; and
- means for adding the filtered difference signal
to the filtered sum signal, and for subtracting the
reconstructed difference signal from the sum signal, so as
to produce at least two output signals.

10. Apparatus as claimed in claim 9, in which the
difference signal reconstruction data comprises residual
signal data and the apparatus includes means to add the
residual signal data to the output of the filter to form the
reconstructed difference signal.

11. Apparatus as claimed in claim 10, in which the
predictor filter is connected to receive the residual signal
data and to modify its coefficients in accordance therewith.

12. A method of coding polyphonic input signals comprising:
- Producing therefrom a sum signal representing the
sum of such signals; and reconstruction data to enable the
formation, from the sum signal, of a further one of the
input signals;
- Producing from the input signals at least one
difference signal representing a difference therebetween;
- analysing said sum and difference signals and
generating therefrom a plurality of coefficients which, if
applied to a multi-stage predictor filter, would enable the
prediction of the difference signal from the sum signal thus
filtered;
- the coded output comprising the said sum signal
and data enabling the reconstruction of the said difference
signal therefrom;
characterised by, before said analysis, filtering the sum
signal and difference signal in accordance with a filter
approximating the spectral inverse of the sum signal, the
sum signal thereby being substantially spectrally whitened.

Description

Note: Descriptions are shown in the official language in which they were submitted.


WO 90/16136 PCT/GB90/00928

POLYPHONIC CODING

This invention relates to polyphonic coding
techniques, particularly, but not exclusively, for coding
speech signals.
It is well known that polyphonic, specifically
stereophonic, sound is more perceptually appealing than
monophonic sound. Where several sound sources, say
within a conference room, are to be transmitted to a
second room, polyphonic sound allows a spatial
reconstruction of the original sound field with an image
of each sound source being perceived at an identifiable
point corresponding to its position in the original
conference room. This can eliminate confusion and
misunderstandings during audio-conference discussions
since each participant may be identified both by the sound
of his voice and by his perceived position within the
conference room.
Inevitably, polyphonic transmissions require an
increase in transmission capacity as compared with
monophonic transmissions. The conventional approach of
transmitting two independent channels, thus doubling the
required transmission capacity, imposes an unacceptably
high cost penalty in many applications and is not possible
in some cases because of the need to use existing channels
with fixed transmission capacities.
In stereophonic (i.e. two-channel polyphonic) systems,
two microphones (hereinafter referred to as left and right
microphones), at different positions, are used to pick up
sound generated within a room (for example by a person or
persons speaking). The signals picked up by the
microphones are in general different. Each microphone
signal (referred to hereinafter as xL(t) with Laplace
transform XL(s) and xR(t) with Laplace transform
XR(s) respectively) may be considered to be the
superposition of source signals processed by respective
acoustic transfer functions. These transfer functions are
strongly affected by the distances between the sound
sources and each microphone and also by the acoustic
properties of the room. Taking the case of a single
source, e.g. a single person speaking at some fixed point
within the room, the distances between the source and the
left and right microphones give rise to different delays,
and there will also be different degrees of attenuation.
In most practical environments such as conference rooms,
the signal reaching each microphone may have travelled via
many reflected paths (e.g. from walls or ceilings) as well
as directly, producing time spreading, frequency dependent
colouration due to resonances and antiresonances, and
perhaps discrete echoes.
From the foregoing, in theory, the signal from one
microphone may be formally related to that from the other
by designating an interchannel transfer function H say;
i.e. XR(s) = H(s) XL(s) where s is a complex frequency
parameter. This statement is based on an assumption of
linearity and time-invariance for the effect of room
acoustics on a sound signal as it travels from its source
to a microphone. However, in the absence of knowledge as
to the nature of H, this statement does no more than
postulate a correlation between the two signals. Such a
postulation seems inherently sensible, however, at least
in the special case of a single sound source, and
therefore one way of reducing the bit-rate needed to
represent stereo signals should be to reduce the
redundancy of one relative to the other (to reduce this
correlation) prior to transmission and re-introduce it
after reception.

In general, H(s) is not unique and can be signal- and
time-dependent. However when the source signals are
white and uncorrelated, i.e. when their autocorrelation
functions are zero except at t=0 and their
cross-correlation functions are zero for all t, H(s) will
depend on factors not subject to rapid change, such as
room acoustics and the positions of the microphones and
sound sources, rather than the nature of the source
signals which may be rapidly changing.
To realise such a system in physical form, the
fundamental problems of causality and stability must be
overcome. Consider for a moment a single source signal
which is delayed by dL seconds before reaching the left
microphone and by dR seconds before reaching the right
microphone (although the point to be made has more general
implications). If the source is near to, say, the left
microphone, then dL will be smaller than dR. The
interchannel transfer function H(s) must delay xL(t) by
the difference between the two delays, dR - dL, to
produce the right channel xR(t). Since dR - dL is
positive, H(s) will be causal. If the signal source is now
moved closer to the right microphone than to the left,
dR - dL becomes negative and H(s) becomes non-causal;
in other words, there is no causal relationship between
the right channel and the left channel, but rather the
reverse so the right channel can no longer be predicted
from the left channel, since a given event occurs first in
the right channel. It will therefore be realised that a
simple system in which one fixed channel is always
transmitted and the other is reconstructed from it is
impossible to realise in a direct sense.
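The causality argument can be made concrete with whole-sample delays. The following is a minimal sketch in Python and is not part of the specification; the helper names delay, required_lag and is_causal are hypothetical, and delays are assumed to be non-negative whole numbers of samples.

```python
def delay(signal, d):
    """Delay a sequence by d whole samples (d >= 0), padding the start with zeros."""
    return [0.0] * d + list(signal)


def required_lag(d_left, d_right):
    """Delay H must apply to the left channel to produce the right one: dR - dL."""
    return d_right - d_left


def is_causal(d_left, d_right):
    """H is a causal pure delay only when the right channel does not lead the left."""
    return required_lag(d_left, d_right) >= 0


source = [1.0, 2.0, 3.0, 4.0]

# Source nearer the left microphone: dL = 3, dR = 7, so H is a 4-sample delay.
left, right = delay(source, 3), delay(source, 7)
predicted_right = delay(left, required_lag(3, 7))

# Source nearer the right microphone: dR - dL is negative, so predicting the
# right channel from the left would need samples that have not yet arrived.
swapped_is_causal = is_causal(7, 3)
```

In the first case delaying the left channel reproduces the right channel exactly; in the second no causal delay can do so, which is the difficulty the paragraph above describes.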
According to a first aspect of the invention, there is
provided a polyphonic signal coding apparatus comprising:
- means for receiving at least two input channels
from different sources;
- means for producing a sum channel representing
the sum of such signals, and for producing at least one
difference channel representing a difference therebetween;
- means for periodically generating a plurality of
parametric coefficients which, if applied to a plural
order predictor filter, would enable the prediction of the
difference channel from the sum channel thus filtered; and
- means for outputting data representing the said
sum channel and data enabling the reconstruction of the
said difference channel therefrom.
In a first embodiment, the difference signal
reconstruction data are filter coefficients. In a second
embodiment, the residual signal representing the
difference between the difference signal and the sum
signal when thus filtered is formed at the transmitter,
and this is transmitted as the difference signal
reconstruction data. In this embodiment, the prediction
residual signal may be efficiently encoded to allow a
backward adaptation technique to be used at the decoder
for deriving the prediction filter coefficients. The
residual is also used as an error signal which is added to
the prediction filter's output at the decoder to correct
for inaccuracies in the prediction of the difference
channel from the sum channel. This "residual only"
embodiment is also useful where the left channel, say, is
predicted from the right channel (without forming sum and
difference signals) - provided suitable measures are taken
to ensure causality - to give high quality polyphonic
reproduction. In a third embodiment, both are transmitted.
Preferably, the means for generating the filter
coefficients is an adaptive filter, advantageously a
lattice filter. This type of filter also gives advantages
in non-sum and difference polyphonic systems.
In preferred embodiments, variable delay means are
disposed in at least one of the input signal paths, and
controlled to time align the two signals prior to forming
the sum and difference signals so that causal prediction
filters of reasonable order can be used.
This aspect of the invention has several important
advantages:
(i) The 'sum signal' is fully compatible with
monophonic encoding and is unaffected by the
polyphonic coding except for the introduction of an
imperceptible delay. In the event of loss of
stereo, monophonic back-up is thus available.
(ii) The sum signal may be transmitted by conventional
low bit-rate coding techniques (e.g. LPC) without
modification.
(iii) The encoding technique for the difference signals
can be varied to suit the application and the
available transmission capacity between the above
three embodiments. The type of residual signal and
prediction coefficients can also be selected in
various different ways, while still conforming to
the basic encoding principle.
(iv) Overall, the apparatus encodes polyphonic signals
with only a modest increase in bit-rate requirement
as compared with monophonic transmission.
(v) The encoding is digital and hence the performance
of the apparatus will be predictable, not subject
to ageing effects or component drift and easily
mass-produced.
A method of calculating approximations to H(s) when
the source signals are not white (which, of course,
includes all speech or music signals) is proposed in a
second aspect of the invention, using the idea of a
'prewhitening filter'.
According to a second aspect of the invention, there
is provided a polyphonic signal coding apparatus
comprising:-
- means for receiving at least two input channels;
- means for filtering each input channel in
accordance with a filter approximating the spectral
inverse of a first of said channels to produce respective
filtered channels, the first said filtered channel thereby
being substantially spectrally whitened;
- means for receiving said filtered channels and for
periodically generating parametric data for each filtered
channel (other than said first), which would enable the
prediction of each input channel from said first; and
- means for outputting data representing the first
channel, and data representing said parametric data.
This aspect of the invention provides, as above, the
advantages of a digital system compatible with existing
techniques and simplifies the process of modelling (at the
encoder) the required interchannel transfer function.
Broadly corresponding decoding apparatus is also
provided according to the invention, as are systems
including such encoding and decoding apparatus,
particularly in an audioconferencing application, but also
in a polyphonic recording application. Other aspects of
the invention are as claimed and disclosed herein.
The words "prediction" and "predictor" in this
specification include not only prediction of future data
from past data, but also estimation of present data of a
channel from past and present data of another channel.
The invention will now be illustrated, by way of
example only, with reference to the accompanying drawings
in which:
- Figure 1 illustrates generally an encoder
according to a first aspect of the invention;
- Figure 2 illustrates generally a corresponding
decoder;
- Figure 3a illustrates an encoder according to a
preferred embodiment of the invention;
- Figure 3b illustrates a corresponding decoder;
- Figures 4a and 4b show respectively a
corresponding encoder and decoder according to a second
aspect of the invention;
- Figures 5a and 5b illustrate an encoder and a
decoder according to a second aspect of the invention;
- Figure 6 illustrates part of an encoder according
to a yet further embodiment of the invention.
The embodiments illustrated are restricted to
2 channels (stereo) for ease of presentation, but the
invention may be generalised to any number of channels.
One possible way of removing the redundancy between
two input signals (or predicting one from the other) would
be to connect between the two channels an adaptive
predictor filter whose slowly changing parameters are
calculated by standard techniques (such as, for example,
block cross-correlation analysis or sequential lattice
adaptation). In an audioconferencing environment, the two
signals will originate from sound sources within a room,
and the acoustic transfer function between each source and
each microphone will be characterised typically by weak
poles (from room resonances) and strong zeros (due to
absorption and destructive interference). An all-zero
filter could therefore produce a reasonable approximation
to the acoustic transfer function between a source and a
microphone and such a filter could also be used to predict
say the left microphone signal xL(t) from xR(t) when
the source is close to the right microphone. However, if
the source were now moved away from the right microphone
and placed close to the left, the nature of the required
filter would be effectively inverted even when delays are
introduced to guarantee causality. The filter must now
model a transfer function with weak zeros and strong poles
- a difficult task for an all-zero filter. Other types of
filter are not, in general, inherently stable. The net
effect of this is to cause unequal degradation in the
reconstructed channel when the source shifts from one
microphone to the other. This further makes the
simplistic prediction of one channel (say, the left) from
the other (say, the right) hard to realise.
In a system according to the first aspect of the
invention, better results have been obtained by forming a
"sum signal" xS(t) = xL(t) + xR(t) and predicting
either a difference signal xD(t) = xL(t) - xR(t) or
simply xL(t) or xR(t) using an all-zero adaptive
digital filter.
In practice, xR(t) and xL(t) (or xS(t) and
xD(t)) will be processed in sampled data form as the
digital signals xR[n] and xL[n] (or xS[n] and
xD[n]) and it will be more convenient to use the
z-transform transfer function H(z) rather than H(s).
Referring to Figure 1, in its essential form the
invention comprises a pair of inputs 1a, 1b for receiving
a pair of speech signals, e.g. from left and right
microphones. The signals at the inputs, xR(t) and
xL(t), may be in digital form. It may be convenient at
this point to pre-process the signals, e.g. by band
limiting. Each signal is then supplied to an adder 2 and
a subtracter 3, the output of the adder being the sum
signal xS(t) = xR(t) + xL(t), and the output of the
subtracter 3 being the difference signal xD(t) = xR(t)
- xL(t), i.e. XD(s) = H(s) XS(s). The sum and
difference signals are then supplied to filter derivation
stage 4, which derives the coefficients of a multi-stage
prediction filter which, when driven with the sum signal,
will approximate the difference signal. The difference
between the approximated difference signal and the actual
difference signal, the prediction residual signal, will
usually also be produced (although this is not invariably
necessary). The sum signal is then encoded (preferably
using LPC or sub-band coding), for transmission or
storage, along with further data enabling reconstruction
of the difference signal. The filter coefficients may be
sent, or alternatively (as discussed further below), the
residual signal may be transmitted, the difference channel
being reconstituted by deriving the filter parameters at
the receiver using a backwards adaptive process known in
the art; or both may be transmitted.
Although it would be possible to calculate filter
parameters directly (using LPC analysis techniques), one
simple and effective way of providing the derivation
stage 4 is to use an adaptive filter (for example, an
adaptive transversal filter) receiving as input the sum
channel and modelling the difference channel so as to
reduce the prediction residual. Such general techniques
of filter adaptation are well-known in the art.
Our initial experiments with this structure have used
a transversal FIR filter with coefficient update by an
algorithm for minimising the mean square value of the
residual, which is simple to implement. The filter
coefficients change only slowly because the room acoustic
(and hence the interchannel transfer function) is
relatively stable.
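By way of illustration only, an adaptive transversal filter of the kind referred to above might be sketched as follows. The text does not name the update algorithm beyond saying that it minimises the mean square residual; the plain LMS rule, the function name lms_predict, the filter order and the step size mu used here are all assumptions.

```python
def lms_predict(sum_sig, diff_sig, order=8, mu=0.01):
    """Adapt a transversal (FIR) filter so that filtering sum_sig approximates diff_sig.

    Returns (final coefficients, prediction residual per sample).
    Plain LMS is assumed; order and mu are illustrative values only.
    """
    weights = [0.0] * order
    history = [0.0] * order          # most recent sum-channel samples, newest first
    residual = []
    for s, d in zip(sum_sig, diff_sig):
        history = [s] + history[:-1]
        prediction = sum(w * x for w, x in zip(weights, history))
        error = d - prediction
        # LMS step: move each coefficient along the negative gradient of error**2.
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, history)]
        residual.append(error)
    return weights, residual
```

The returned coefficients correspond to the filter parameters, and the returned residual to the prediction residual signal, either or both of which may be conveyed to the decoder.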
Referring to Figure 2, in a corresponding receiver,
the sum signal xS(t) is received together with either
the filter parameters or the residual signal, or both, for
the difference channel, and an adaptive filter 5
corresponding to that for which the parameters were
derived at the coder receives as input the sum signal and
produces as output the reconstructed difference signal
when configured either with the received parameters or
with parameters derived by backwards adaptation from the
received residual signal. Sum and difference signals are
then both fed to an adder 6 and a subtracter 7, which
produce as outputs respectively the reconstructed left and
right channels at output nodes 8a and 8b.
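The final adder and subtracter stage can be sketched as below. This is an illustrative sketch only: the difference is taken as left minus right (the opposite sign convention works equally well), and the halving at the decoder is an assumption, since the text does not state where any scaling is applied. The function names are hypothetical.

```python
def encode_sum_diff(left, right):
    """Form the sum and difference channels sample by sample."""
    sum_sig = [l + r for l, r in zip(left, right)]
    diff_sig = [l - r for l, r in zip(left, right)]
    return sum_sig, diff_sig


def decode_sum_diff(sum_sig, diff_sig):
    """Recover left and right with an adder and a subtracter.

    The factor of one half is an assumed normalisation.
    """
    left = [(s + d) / 2.0 for s, d in zip(sum_sig, diff_sig)]
    right = [(s - d) / 2.0 for s, d in zip(sum_sig, diff_sig)]
    return left, right
```

If the difference signal is unavailable, the sum channel alone still gives a monophonic signal.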
Since a high-quality sum signal is sent, the encoder
is fully mono-compatible. In the event of loss of stereo
information, monophonic back-up is thus available.
As discussed above, one component of the transfer
functions HL and HR is a delay component relating to
the direct distance between the signal source and each of
the microphones, and there is a corresponding delay
difference d. There is thus a strong cross-correlation
between one channel and the other when delayed by d.
This method, however, requires considerable processing
power.
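A delay estimate based on the cross-correlation peak might be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the specification: the function name estimate_delay and the search range max_lag are hypothetical, and the delay is assumed to be a whole number of samples.

```python
def estimate_delay(left, right, max_lag):
    """Return the lag d (in samples) that maximises sum over n of left[n] * right[n + d].

    A positive result means the right channel lags the left by d samples.
    """
    best_lag, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(len(left)):
            j = i + lag
            if 0 <= j < len(right):
                total += left[i] * right[j]
        if total > best_value:
            best_lag, best_value = lag, total
    return best_lag
```

Each block costs roughly (2 * max_lag + 1) * len(left) multiplications, which is consistent with the remark above about processing power.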
An alternative method of delay estimation found in
papers on sonar research is to use an adaptive filter.
The left channel input is delayed by half the filter
length and the coefficients are updated using the LMS
algorithm to minimise the mean-square error of the
output. The transversal filter coefficients will, in
theory, become the required cross-correlation
coefficients. This may seem like unnecessary repetition
of filter coefficient derivation were it not for the
property of this delay estimator that the maximum value of
the cross-correlation coefficient (at the position of the
maximum filter coefficient) is obtained some time before
the filter has converged. This method may be improved
further because spatial information is also available from
the relative amplitudes of the input channels; this could
be used to apply a weighting function to the filter
coefficients to speed convergence.
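The adaptive-filter delay estimator might be sketched as follows. It is assumed here that the filter input is the right channel and the desired signal is the left channel delayed by half the filter length; the text does not spell this out, and the function name lms_delay_estimate, the filter length and the step size are hypothetical.

```python
def lms_delay_estimate(left, right, length=16, mu=0.005):
    """Estimate the interchannel delay from the position of the largest LMS coefficient.

    The filter predicts left[n - length // 2] from recent right-channel samples.
    Returns the number of samples by which the right channel lags the left
    (negative if it leads).
    """
    half = length // 2
    weights = [0.0] * length
    history = [0.0] * length         # right-channel samples, newest first
    for n in range(len(right)):
        history = [right[n]] + history[:-1]
        desired = left[n - half] if n >= half else 0.0
        error = desired - sum(w * x for w, x in zip(weights, history))
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, history)]
    peak = max(range(length), key=lambda k: abs(weights[k]))
    return half - peak
```

Delaying the desired signal by half the filter length lets the peak fall on either side of the centre tap, so that both positive and negative delays can be detected.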
Referring to Figure 3a, in a preferred embodiment of
the invention, the complexity and length of the filter to
be calculated is therefore reduced by calculating the
required value of d in a delay calculator stage 9
(preferably employing one of the above methods), and then
bringing the channels into time alignment by delaying one
or other by d using, for example, a pair of variable
delays 10a, 10b (although one fixed and one variable delay
could be used) controlled by the delay calculator 9. With
the major part of the speech information in the channels
time aligned, the sum and difference signals are then
formed.
Referring to Figure 3b, the delay length d is
preferably transmitted to the decoder, so that after
reconstructing the difference channel and subsequently the
left and right channels, corresponding variable length
delay stages 11a, 11b in one or other of the channels can
restore the interchannel delay.
In the illustrated structure, the "sum" signal is thus
no longer quite the true sum xL(t) + xR(t); because
of the delay d it is xL(t) + xR(t-d). It may
therefore be preferred to locate the delays 10a, 10b (and,
possibly, the delay calculator) downstream of the adder
and subtracter 2 and 3; this gives, for practical
purposes, the same benefits of reducing the necessary
filter length.
In practice, the delay is generally imperceptible;
typically, up to 1.6 ms. Alternatively, a fixed delay,
sufficiently long to guarantee causality, may be used,
thus removing the need to encode the delay parameter.
In the first embodiment of the invention, as stated
above, only the filter parameters are transmitted as
difference signal data. With 16 bits per coefficient,
this means that a transmission capacity of 5120 bits/sec
is needed for the difference channel (plus 8 bits for the
delay parameter). This is well within the capacity of a
standard 64 kbit/sec transmission system which
allocates 48 kbits/sec to the sum channel (efficiently
transmitted by an existing monophonic encoding technique)
and offers 16 kbits/sec for other "overhead" data. This
mode of the embodiment gives a good signal to noise ratio
and the stereo image is present, although it is highly
dependent on the accuracy of the algorithm used to adapt
the predictive filter. Inaccuracies tend to cause the
stereo image to wander during the course of a conference
particularly when the conversation is passed from one
speaking person to another at some distance from the first.
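The figures quoted above can be cross-checked with simple arithmetic. The constant names below are illustrative only; the coefficient update rate is not stated in the text, so the 8-bit delay parameter is left out of the calculation.

```python
BITS_PER_COEFFICIENT = 16
DIFFERENCE_CHANNEL_BPS = 5120      # quoted capacity for the filter coefficients
SUM_CHANNEL_BPS = 48_000           # allocated to the sum channel
OVERHEAD_BPS = 16_000              # offered for other "overhead" data

# Number of 16-bit coefficients that 5120 bits/sec can carry each second.
coefficients_per_second = DIFFERENCE_CHANNEL_BPS // BITS_PER_COEFFICIENT

# The two allocations together make up the full channel rate.
total_bps = SUM_CHANNEL_BPS + OVERHEAD_BPS

# Overhead capacity left after the difference-channel coefficients are sent.
spare_overhead_bps = OVERHEAD_BPS - DIFFERENCE_CHANNEL_BPS
```

On these figures the coefficients occupy about a third of the overhead allocation.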
Referring to Figure 4a, in a second embodiment of the
invention, only the residual signal is transmitted as
difference signal data. The sum signal is encoded (12a)
using, for example, sub-band coding. It is also locally
decoded (13a) to provide a signal equivalent to that at
the decoder, for input to adaptive filter 4. The residual
difference channel is also encoded (possibly including
bandlimiting) by residual coder 12b, and a corresponding
local decoder 13b provides the signal minimised to adapt
filter 4. The advantage this creates is that inaccuracies
in generating the parameters cause an increase in the
dynamic range of the residual channel and a corresponding
decrease in SNR, but with no loss in stereo image.
Referring to Figure 4b, at the decoder, the analysis
filter parameters are recovered from the transmitted
residual by using a backwards-adapting replica filter 5 of
the adaptive filter 4 at the coder. Decoders 13c, 13d are
identical to local decoders 13a, 13b and so the filter 5
receives the same inputs, and thus produces the same
parameters, as that of encoder filter 4.
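The backward adaptation described above might be sketched as follows. This is an illustration under stated assumptions rather than the circuit of Figures 4a and 4b: a uniform quantiser stands in for the residual coder and decoder, plain LMS stands in for the adaptation rule, the sum channel is assumed to be identical at both ends, and all function names are hypothetical.

```python
def quantise(x, step=0.01):
    """Stand-in for the residual coder and its decoder: uniform quantisation."""
    return round(x / step) * step


def _predict(weights, history):
    return sum(w * x for w, x in zip(weights, history))


def _adapt(weights, history, error, mu):
    return [w + 2.0 * mu * error * x for w, x in zip(weights, history)]


def encode(sum_sig, diff_sig, order=4, mu=0.01):
    """Return the transmitted residuals and the encoder's final coefficients."""
    weights, history, residuals = [0.0] * order, [0.0] * order, []
    for s, d in zip(sum_sig, diff_sig):
        history = [s] + history[:-1]
        e = quantise(d - _predict(weights, history))   # value actually transmitted
        weights = _adapt(weights, history, e, mu)      # adapt on the decoded residual
        residuals.append(e)
    return residuals, weights


def decode(sum_sig, residuals, order=4, mu=0.01):
    """Rebuild the difference channel using only the sum channel and residuals."""
    weights, history, diff_out = [0.0] * order, [0.0] * order, []
    for s, e in zip(sum_sig, residuals):
        history = [s] + history[:-1]
        diff_out.append(_predict(weights, history) + e)
        weights = _adapt(weights, history, e, mu)      # same update as the encoder
    return diff_out, weights
```

Because the encoder adapts on the quantised residual rather than the exact one, both filters see identical inputs and stay in step without any coefficients being sent; the reconstruction error is bounded by the quantiser step.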

In a further embodiment (not shown), both filter
parameters and residual signal are transmitted as
side-information, overcoming many of the problems with the
residual-only embodiment because the important stereo
information in the first 2 kHz is preserved intact and the
relative amplitude information at higher frequencies is
largely retained by the filter parameters.
Both the above residual-only and hybrid (i.e. residual
plus parameters) embodiments are preferably employed, as
described, to predict the difference channel from the sum
channel. However, it is found that the same advantages of
retaining the stereo image (albeit with a decrease in SNR)
are found when the input channels are left and right,
rather than sum and difference, provided the problem of
causality is overcome in some manner (e.g. by inserting a
relatively long fixed delay in one or other path). The
scope of the invention therefore encompasses this also.
The parameter-only embodiment described above
preferably uses a single adaptive filter 4 to remove
redundancy between the sum and difference channels. An
effect discovered during testing was a curious
'whispering' effect if the coefficients were not sent at a
certain rate, which was far above what should have been
necessary to describe changes in the acoustic
environment. This was because the adaptive filter, in
addition to modelling the room acoustic transfer function,
was also trying to perform an LPC analysis of the speech.
This is solved in the second aspect of the invention
by whitening the spectra of the input signals to the
adaptive filter as shown in Figure 5, so as to reduce the
rapidly-changing speech component leaving principally the
room acoustic component.
In the second aspect of the invention, the adaptive
filter 4 which models the acoustic transfer functions may
be the same as before (for example, a lattice filter of
order 10). The sum channel is passed through a whitening
filter 14a (which may be lattice or a simple transversal
structure).
The master whitening filter 14a receives the sum
channel and adapts to derive an approximate spectral
inverse filter to the sum signal (or, at least, the speech
components thereof) by minimising its own output. The
output of the filter 14a is therefore substantially
white. The parameters derived by the master filter 14a
are supplied to the slave whitening filter 14b, which is
connected to receive and filter the difference signal.
The output of the slave whitening filter 14b is therefore
the difference signal filtered by the inverse of the sum
signal, which substantially removes common signal
components, reducing the correlation between the two and
leaving the output of 14b as consisting primarily of the
acoustic response of the room. It thus reduces the
dynamic range of the residual considerably.
The effect is to whiten the sum channel and to
partially whiten the difference channel without affecting
the spectral differences between them as a result of room
acoustics, so that the derived coefficients of adaptive
filter 4 are model parameters of the room acoustics.
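The master/slave arrangement can be sketched as follows. This is an illustrative model only: the transversal NLMS prediction-error filter, the order, the step size, the AR(1) "speech-like" source and the short "room" response are assumptions, not values taken from the specification. The master adapts by minimising its own output on the sum channel, and the slave applies the same coefficients, sample by sample, to the difference channel.

```python
import numpy as np


def master_slave_whiten(sum_sig, diff_sig, order=8, mu=0.05, eps=1e-6):
    """Master adapts on the sum channel; slave reuses its coefficients."""
    w = np.zeros(order)               # shared predictor coefficients
    s_hist = np.zeros(order)          # past sum samples
    d_hist = np.zeros(order)          # past difference samples
    sum_white = np.empty(len(sum_sig))
    diff_white = np.empty(len(diff_sig))
    for n in range(len(sum_sig)):
        # Master output: prediction error of the sum channel.
        e_s = sum_sig[n] - w @ s_hist
        # Slave output: same coefficients applied to the difference channel.
        e_d = diff_sig[n] - w @ d_hist
        sum_white[n], diff_white[n] = e_s, e_d
        # Only the master's own output drives the adaptation (NLMS step).
        w += mu * e_s * s_hist / (s_hist @ s_hist + eps)
        s_hist = np.roll(s_hist, 1)
        s_hist[0] = sum_sig[n]
        d_hist = np.roll(d_hist, 1)
        d_hist[0] = diff_sig[n]
    return sum_white, diff_white, w


# Toy data (assumptions): a strongly coloured source and a short room response.
rng = np.random.default_rng(1)
N = 20000
drive = rng.standard_normal(N)
src = np.zeros(N)
for n in range(1, N):
    src[n] = 0.9 * src[n - 1] + drive[n]
sum_sig = src
diff_sig = np.convolve(src, [0.5, 0.0, -0.2, 0.1])[:N]
sum_white, diff_white, w = master_slave_whiten(sum_sig, diff_sig)
```

Since both channels pass through the same filter, the spectral relationship between them is unchanged, which is why a subsequent adaptive filter would still see the room response in this toy setting.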
In one embodiment, the coefficients only are
transmitted and the decoder is simply that of Figure 2
(needing no further filters). In this embodiment, of
course, residual encoder 12b and decoder 13b are omitted.
An adaptive filter will generally not be long enough
to filter out long-term information, such as pitch
information in speech, so the sum channel will not be
completely "white". However, if a long-term predictor
(known in LPC coding) is additionally employed in filters
14a and 14b, then filter 4 could, in principle, be
connected to filter the difference channel alone, and thus
to model the inverse of the room acoustic.
Since this second aspect of the invention reduces the
dynamic range of the residual, it is particularly
advantageous to employ this whitening scheme with the
residual-only transmission described above. In this case,
prior to backwards adaptation at the decoder, it is
necessary to filter the residual using the inverse of the
whitening filter, or to filter the sum channel using the
whitening filter. Either filter can be derived from the
sum channel information which is transmitted.
Referring to Figure 5b, in residual-only transmission,
an adaptive whitening filter 24a (identical to 14a at the
encoder) receives the (decoded) sum channel and adapts to
whiten its output. A slave filter 24b (identical to 14b
at the encoder) receives the coefficients of 24a. Using
the whitened sum channel as its input, and adapting from
the (decoded) residual by backwards adaptation, adaptive
filter 5 regenerates a filtered signal which is added to
the (decoded) residual and the sum is filtered by slave
filter 24b to yield the difference channel. The sum and
difference channels are then processed (6, 7 not shown) to
yield the original left and right channels.
In a further embodiment (not shown), both residual and
coefficients are transmitted.
Although this pre-whitening aspect of the invention
has been described in relation to the preferred embodiment
of the invention using sum and difference channels, it is
also applicable where the two channels are 'left' and
'right' channels.
For a typical audioconferencing application, the
residual will have a bandwidth of 8 kHz and must be
quantised and transmitted using spare channel capacity of
about 16 kbit/s. The whitened residual will be, in
principle, small in mean square value, but will not be
optimally whitened since the copy pre-whitening filter 14b
through which the residual passes has coefficients derived
to whiten the sum channel and not necessarily the
difference channel. Typically, the dynamic range of the
filtered signal is reduced by 12 dB over the unfiltered
difference channel. One approach to this residual
quantisation problem is to reduce the bandwidth of the
residual signal. This allows downsampling to a lower
rate, with a consequential increase in bits per sample.
It is well known that most of the spatial information in a
stereo signal is contained within the 0-2 kHz band, and
therefore reducing the residual bandwidth from 8 kHz to a
value in excess of 2 kHz does not affect the perceived
stereo image appreciably. Results have shown that
reducing the residual bandwidth to 4 kHz (and taking the
upper 4 kHz band to be identical to that of the sum
channel) produces good quality stereophonic speech when
the reduced bandwidth residual is sub-band coded using a
standard technique.
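The bit-budget trade-off can be made concrete with a short calculation. It assumes critical (Nyquist) sampling at twice the residual bandwidth and ignores any side-information overhead; the function name is hypothetical. Under that assumption, 16 kbit/s gives about one bit per sample for an 8 kHz residual and about two bits per sample once the residual is band-limited to 4 kHz.

```python
def bits_per_sample(bit_rate_bps, bandwidth_hz):
    """Bits available per sample when sampling at twice the bandwidth."""
    sample_rate_hz = 2 * bandwidth_hz   # assumed critical (Nyquist) sampling
    return bit_rate_bps / sample_rate_hz


full_band = bits_per_sample(16_000, 8_000)   # residual at 8 kHz bandwidth
half_band = bits_per_sample(16_000, 4_000)   # residual band-limited to 4 kHz
```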
Experiments with various adaptive filters for the
filter 4 (and, where applicable, 12) showed that a
standard transversal FIR filter was slow to converge.
A faster performance can be obtained by using a lattice
structure, with coefficient update using a gradient
algorithm based on Burg's method, as shown in Figure 7.
The structure uses a lattice filter 14a to pre-whiten
the spectrum of the primary input. The decorrelated
backwards residual outputs are then used as inputs to a
simple linear combiner which attempts to model the input
spectrum of the secondary input. Although the modelling
process is the same as with the simple transversal FIR
filter, the effect of the lattice filter is to point the
error vector in the direction of the optimum LMS residual
solution. This speeds convergence considerably.
A lattice filter of order 20 is found effective in
practice.
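A lattice-plus-combiner structure of this general kind can be sketched as a gradient adaptive lattice feeding a normalised LMS combiner. The sketch does not reproduce Figure 7: the sign convention, the normalised gradient step on the sum of forward and backward residual powers (the quantity minimised in Burg's method), the step sizes `mu_k` and `mu_c`, the forgetting factor `lam`, the initial power `p0` and the toy signals are all assumptions.

```python
import numpy as np


def lattice_joint_process(x, d, order=20, mu_k=0.005, mu_c=0.02,
                          lam=0.99, p0=1.0):
    """Predict d from x with an adaptive lattice and a linear combiner."""
    k = np.zeros(order)               # reflection coefficients
    c = np.zeros(order + 1)           # combiner weights
    b_prev = np.zeros(order + 1)      # backward residuals at time n-1
    p_k = np.full(order, p0)          # stage input power estimates
    p_b = np.full(order + 1, p0)      # backward residual power estimates
    err = np.empty(len(x))
    for n in range(len(x)):
        f = x[n]
        b = np.empty(order + 1)
        b[0] = x[n]
        for m in range(order):
            f_in, b_in = f, b_prev[m]
            f = f_in - k[m] * b_in            # forward residual
            b[m + 1] = b_in - k[m] * f_in     # backward residual
            p_k[m] = lam * p_k[m] + (1 - lam) * (f_in ** 2 + b_in ** 2)
            # Normalised gradient step reducing f^2 + b^2 at this stage.
            k[m] += mu_k * (f * b_in + b[m + 1] * f_in) / p_k[m]
            k[m] = min(max(k[m], -0.999), 0.999)
        # Linear combiner on the decorrelated backward residuals.
        e = d[n] - c @ b
        p_b = lam * p_b + (1 - lam) * b ** 2
        c += mu_c * e * b / (p_b + 1e-9)
        b_prev = b
        err[n] = e
    return err, k, c


# Toy data (assumptions): coloured primary input, FIR-related secondary input.
rng = np.random.default_rng(2)
N = 8000
drive = rng.standard_normal(N)
x = np.zeros(N)
for n in range(1, N):
    x[n] = 0.9 * x[n - 1] + drive[n]
x /= x.std()
d = np.convolve(x, [0.5, -0.3, 0.2, 0.1])[:N]
err, k, c = lattice_joint_process(x, d)
```

Normalising each combiner weight by the power of its own backward residual is what lets all modes adapt at a similar rate in this sketch, since the lattice has already decorrelated the combiner inputs.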
The lattice filter structure is particularly useful as
described above, but could also be used in a system in
which, instead of forming sum and difference signals, a
(suitably delayed) left channel is predicted from the
right channel.
Although the embodiments described show a stereophonic
system, it will be appreciated that with, for example,
quadrophonic systems, the invention is implemented by
forming a sum signal and 3 difference signals, and
predicting each from the sum signal as above.
Whilst the invention has been described as applied to
a low bit-rate transmission system, e.g. for
teleconferencing, it is also useful, for example, for
digital storage of music on well known digital record
carriers such as Compact Discs, by providing a formatting
means for arranging the data in a format suitable for such
record carriers.
Conveniently, much or all of the signal processing
involved is realised in a single suitably programmed
digital signal processing (DSP) chip package; two channel
packages are also commercially available. Software to
implement adaptive filters, LPC analysis and
cross-correlations is well known.


Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History should be consulted.

Title Date
Forecasted Issue Date 1998-12-01
(86) PCT Filing Date 1990-06-15
(87) PCT Publication Date 1990-12-16
(85) National Entry 1991-12-11
Examination Requested 1993-05-13
(45) Issued 1998-12-01
Expired 2010-06-15

Abandonment History

There is no abandonment history.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $0.00 1991-12-11
Maintenance Fee - Application - New Act 2 1992-06-15 $100.00 1992-04-29
Registration of a document - section 124 $0.00 1992-09-04
Maintenance Fee - Application - New Act 3 1993-06-15 $100.00 1993-05-12
Maintenance Fee - Application - New Act 4 1994-06-15 $100.00 1994-04-20
Maintenance Fee - Application - New Act 5 1995-06-15 $150.00 1995-05-24
Maintenance Fee - Application - New Act 6 1996-06-17 $150.00 1996-05-01
Maintenance Fee - Application - New Act 7 1997-06-16 $150.00 1997-04-24
Maintenance Fee - Application - New Act 8 1998-06-15 $150.00 1998-05-07
Final Fee $300.00 1998-08-04
Maintenance Fee - Patent - New Act 9 1999-06-15 $150.00 1999-05-12
Maintenance Fee - Patent - New Act 10 2000-06-15 $200.00 2000-05-15
Maintenance Fee - Patent - New Act 11 2001-06-15 $200.00 2001-05-16
Maintenance Fee - Patent - New Act 12 2002-06-17 $200.00 2002-05-15
Maintenance Fee - Patent - New Act 13 2003-06-16 $200.00 2003-05-14
Maintenance Fee - Patent - New Act 14 2004-06-15 $250.00 2004-05-17
Maintenance Fee - Patent - New Act 15 2005-06-15 $450.00 2005-05-16
Maintenance Fee - Patent - New Act 16 2006-06-15 $450.00 2006-05-15
Maintenance Fee - Patent - New Act 17 2007-06-15 $450.00 2007-05-17
Maintenance Fee - Patent - New Act 18 2008-06-16 $450.00 2008-05-15
Maintenance Fee - Patent - New Act 19 2009-06-15 $450.00 2009-06-04
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY
Past Owners on Record
CHEETHAM, BARRY MICHAEL GEORGE
HOLT, CHRISTOPHER ELLIS
MUNDAY, EDWARD
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Cover Page 1998-11-10 2 70
Claims 1998-02-11 4 133
Abstract 1995-08-17 1 64
Cover Page 1994-07-09 1 17
Claims 1994-07-09 7 239
Drawings 1994-07-09 5 104
Description 1994-07-09 17 772
Representative Drawing 1998-11-10 1 4
Correspondence 1998-08-04 1 31
Fees 1997-04-24 1 47
Fees 1996-05-01 1 44
Fees 1995-05-24 1 46
Fees 1994-04-20 1 33
Fees 1993-05-12 1 25
Fees 1992-04-29 1 29
National Entry Request 1991-12-11 2 89
Prosecution Correspondence 1991-12-11 12 438
National Entry Request 1992-02-14 3 87
Office Letter 1993-06-16 1 35
Prosecution Correspondence 1993-05-13 1 33
Prosecution Correspondence 1997-12-12 1 29
Examiner Requisition 1997-06-13 2 61
International Preliminary Examination Report 1991-12-11 15 380