CA 02414636 2002-12-18
METHOD OF CAPTURING CONSTANT ECHO PATH INFORMATION IN A
FULL DUPLEX SPEAKERPHONE
Field of the Invention
The present invention relates in general to speakerphones and more
particularly to a method of capturing constant echo path information in a
full duplex handsfree (FDHF) speakerphone.
Background of the Invention
One of the most important performance indicators for full duplex
speakerphones is convergence time (i.e. the time required by the echo
cancellers within the speakerphone to reach an acceptable level of
cancellation). The convergence time of the speakerphone depends on both the
internal Line Echo Canceller (LEC) and Acoustic Echo Canceller (AEC)
convergence times. In order to converge quickly and properly, a speakerphone
echo canceller requires a reference signal with correct stochastic
properties. At the beginning of a call (start-up), the reference signal is
usually not sufficiently stochastic (e.g. the line signal typically
comprises narrow band tones such as dial tone) or speech is not present, so
that echo cancellation is unable to commence immediately. In such situations
the speakerphone loop may remain unstable for a noticeable period of time.
This can result in feedback or "howling" of the speakerphone during
start-up, especially when the speaker volume is high.
In order to prevent such feedback, it is an objective of speakerphone design
to ensure that the echo cancellers (LEC and AEC) converge rapidly to the
correct echo path models at start-up. Otherwise, the speaker volumes must be
reduced during start-up, which may be annoying to a user.
According to one prior art approach to reducing the problem of feedback
during speakerphone start-up, howling detection has been used (see ITU-T
Recommendation G.168) in combination with gain control. According to this
approach, the speaker volume (or loop gain) is reduced when howling is
detected. A drawback of this approach is that the gain switching is often
audible, which may be annoying to the user.
Another prior art solution involves operating the speakerphone in a half
duplex mode on start-up in order to prevent howling and echo from
interfering with communication. The speakerphone remains in the half duplex
mode until the LEC adapts sufficiently to ensure echo cancellation. A
drawback of this approach is that the speakerphone sometimes stays in the
half duplex mode for a long time, making communication between telephone
parties difficult or impossible.
Yet another prior art solution involves forcing the speakerphone to start
operation at a predetermined "acceptable" low volume level which guarantees
stability in the audio loop, and then gradually increasing the volume as
convergence of the echo canceller is achieved. A drawback of this approach
is that the volume adjustment is often noticeable to the user.
Since the LEC models a network echo path where the first echo reflection of
the near end hybrid is usually reasonably constant for each connection, and
the AEC models an acoustic echo path where direct acoustic coupling or
coupling through the plastic housing of the phone is always the same for a
given phone, both the LEC and AEC may be loaded initially with previously
captured and saved constant echo path models represented by default
coefficients, and then continue to converge toward the complete echo channel
models. This results in faster convergence time and more stability, as the
main, strongest echo reflections will already be cancelled using the default
coefficient models.
Thus, according to copending Canadian Patent Application No. 2,291,428, a
method is provided for improving the start-up convergence time of the LEC
filter, thereby resulting in a total reduced convergence time for the
speakerphone. This method is based on capturing the LEC coefficients once
the LEC has converged, and saving them as the default coefficients for the
next call. As a result, the echo-canceling algorithm does not have to wait
for a suitable reference signal to commence convergence. At start-up, the
echo canceller immediately begins canceling the line echo, based on the
previously stored LEC coefficients, thereby assisting the AEC algorithm by
eliminating residual line echo from the acoustic signal which the AEC
algorithm is required to converge to, and initially making the speakerphone
loop more stable. As indicated above, the same principle may also be applied
to the AEC for direct acoustic coupling or coupling through the speakerphone
housing plastic, which is always the same for a given phone. The default
coefficients in this case represent the constant acoustic echo path from
loudspeaker to microphone and may be reused for each new call. At start-up,
the AEC immediately starts canceling the echo caused by direct acoustic
coupling, while converging toward the complete acoustic echo path model that
represents the combination of direct coupling and the specific room echo
response.
The principle of saving default coefficients may also be applied to multiple
loudspeaker-to-microphone echo paths for multiple-microphone directional
systems, or even loudspeaker-to-beam echo paths for beamforming-based
systems that perform echo cancellation on the output signal of a beamformer.
In these cases, default coefficients can be reused from one instance of the
AEC to the next in each different direction (e.g. angular sectors).
In order for such systems to work properly, the coefficients must be saved
at appropriate times. If they are saved at arbitrary instants (e.g. at the
end of a call), then there is a risk that the full-duplex echo cancellation
algorithm will not be in a well-converged state at the instant of saving the
coefficients. For example, the echo cancellation algorithm may be in the
process of adapting to an echo path change related to the user moving
his/her hand towards the telephone to press a button for ending the call.
Saving the default coefficients in this case and reusing them at a later
stage (e.g. for the next call) may result in poor echo canceller performance
until it re-converges to a set of "good" coefficients.
As indicated above, the system set forth in Canadian Patent Application No.
2,291,428 tracks the degree of convergence of the full-duplex algorithm, and
saves the default coefficients each time the convergence reaches a
predetermined level. In one embodiment, the amount of echo actually
cancelled by the algorithm is measured, and the coefficients are saved each
time this amount increases by 3 dB from the previous save. One problem with
this method is that if the full-duplex algorithm is subjected to narrow-band
signals (e.g. in-band tones that are not detected fast enough), then it may
reach excellent levels of convergence with coefficients that are very
different from the useful wide-band echo-path coefficients. In such
situations the system may never reach such a good level of convergence again
with a wide-band signal, such that proper coefficients are never captured.
This may result in annoying echo bursts for the far-end user each time these
coefficients are used (for instance, at the beginning of each subsequent
call). Another problem is that if the telephone is moved to a different
location on a desk, where the direct echo path is more difficult to adapt
to, then it may never be able to capture coefficients corresponding to its
new location. It may therefore constantly reuse coefficients that do not
correspond to those characterizing the real echo path, resulting in mediocre
echo cancellation until the algorithm has a chance to re-converge to the
real echo path.
Summary of the Invention
According to the present invention, a method is provided for determining
when to save coefficients so as to ensure that the system always captures
coefficients that correspond to the best possible echo cancellation in its
current condition, and to recover from scenarios where "bad" default
coefficients are captured. Thus, the saving of coefficients occurs at
varying times depending on the amount of echo removed by the echo canceller
(EC).
More particularly, the capture of coefficients is triggered when the amount
of echo cancellation provided by the EC exceeds a certain threshold, where
the value of the threshold varies with time. The threshold is increased by a
certain amount each time the capture is triggered, and it is decreased by a
certain amount when the capture is not triggered despite the presence of
speech on the EC reference signal.
Brief Description of the Drawings
A detailed description of the prior art and of a preferred embodiment of the
invention is provided herein below with reference to the following drawings,
in which:
Figure 1 is a block diagram of a prior art speakerphone echo canceller
structure;
Figure 2 is a flow chart showing the steps of the echo cancellation method
according to the prior art; and
Figure 3 is a block diagram showing an adaptive filter structure for
implementing a method of triggering capture of coefficients according to the
present invention.
Detailed Description of Prior Art and Preferred Embodiment
As discussed briefly above, a speakerphone echo canceller (EC) comprises two
adaptive filters that attempt to converge to two different echo models
(acoustic and network echo) at the same time. As a result, speakerphones can
easily become unstable, especially during start-up.
A traditional speakerphone echo canceller is shown in Figure 1, wherein
essential speakerphone components that are not related to echo cancellation
(e.g. double talk detector, non-linear processor, etc.) have been omitted
for clarity and are not addressed herein since they are not germane to the
invention. The echo canceller attempts to model the transfer function of the
echo path by means of an LEC filter and an AEC filter. The received signal
(line or acoustic) is applied to the input of each filter (LEC and AEC) and
to the associated echo path (network or acoustic) such that the estimated
echo can be canceled by simply subtracting the signal which passes through
each echo canceller from the received signal. If the transfer function of
the model of the echo path is exactly the same as the transfer function of
the echo path, the echo signal component is completely canceled (i.e. the
error signal will be zero). The error signal is used for adaptation, so that
the echo canceller converges to the correct transfer function, as discussed
briefly above.
Typically, an algorithm such as the NLMS (Normalized-Least-Mean-Squared)
algorithm is used to approximate the echo path (see "C261 (UNIC) DSP
Re-engineering and Performance Report", Mitel Semiconductor, Document No.
C261AP13, Oct. 21, 1996).
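By way of illustration only, the NLMS adaptation referred to above may be
sketched as follows. This is a minimal textbook form of NLMS and is not taken
from the cited report; the function name nlms_identify, the step size mu, the
regularization constant eps and the three-tap echo path are all assumptions
chosen for the example.

```python
import random

def nlms_identify(reference, echo, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path model with NLMS.

    Returns the final coefficients and the error (residual echo) signal.
    """
    w = [0.0] * num_taps   # adaptive filter coefficients (echo-path model)
    x = [0.0] * num_taps   # most recent reference samples, newest first
    errors = []
    for r, d in zip(reference, echo):
        x = [r] + x[:-1]                              # shift in new sample
        y = sum(wi * xi for wi, xi in zip(w, x))      # estimated echo
        e = d - y                                     # error signal
        norm = sum(xi * xi for xi in x) + eps         # reference energy
        step = mu * e / norm                          # normalized step
        w = [wi + step * xi for wi, xi in zip(w, x)]  # coefficient update
        errors.append(e)
    return w, errors

# Hypothetical constant echo path and a white-noise reference signal.
random.seed(0)
true_path = [0.5, -0.3, 0.1]
ref = [random.gauss(0.0, 1.0) for _ in range(2000)]
echo = []
for n in range(len(ref)):
    acc = 0.0
    for k, h in enumerate(true_path):
        if n - k >= 0:
            acc += h * ref[n - k]
    echo.append(acc)

w, err = nlms_identify(ref, echo, num_taps=3)
```

With a wide-band (white) reference, the coefficients w approach the assumed
echo path; a narrow-band reference would not, in general, identify the path
uniquely.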
From Figure 1 it will be appreciated that the residual echo after imperfect
cancellation by the LEC will pass to the AEC reference signal. Since this
residual echo is not correlated to the AEC received signal, this can cause
the AEC filter to diverge. The extent to which the AEC filter diverges
depends on the level of the residual line echo. If the line echo is
sufficiently canceled, its effect on the AEC behavior will be negligible.
Echo Return Loss Enhancement (ERLE) is an indicator of the amount of echo
removed by an echo canceller. The ERLE is defined as:
ERLE(dB) = 10 log10[Power(ReceivedSignal)/Power(ErrorSignal)];
A generally acceptable LEC convergence time requires that the echo canceller
achieve 27 dB of ERLE in 0.5 sec (in ideal conditions).
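The ERLE definition above may be illustrated by the following short sketch.
The helper names power and erle_db are hypothetical, and power is assumed
here to mean the mean squared sample value over a frame; the description
does not specify how the power levels are estimated.

```python
import math

def power(samples):
    """Mean squared value of a frame of samples (assumed power estimate)."""
    return sum(s * s for s in samples) / len(samples)

def erle_db(received, error):
    """ERLE(dB) = 10 log10[Power(ReceivedSignal) / Power(ErrorSignal)]."""
    p_err = power(error)
    if p_err == 0.0:
        return math.inf   # perfect cancellation: error signal is zero
    return 10.0 * math.log10(power(received) / p_err)

# Example: the error amplitude is one tenth of the received amplitude,
# i.e. a power ratio of 100, which corresponds to 20 dB of ERLE.
received = [1.0, -1.0] * 4
error = [0.1, -0.1] * 4
value = erle_db(received, error)
```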
Since the telephone is always connected to the same local loop (i.e. to the
near-end Central Office (CO) or PBX), the impedance of the local loop
remains the same for each call and consequently the near-end echoes remain
fairly constant from call to call. Accordingly, the local loop echo
coefficients can be stored and re-used from call to call, thereby improving
the start-up ERLE of the LEC. Furthermore, since the direct acoustic
coupling through the plastic from loudspeaker to microphone is constant for
a given phone, the coefficients representing this part of the acoustic echo
path can also be stored and re-used from call to call or when the look
direction is changed in a directional speakerphone system, thereby improving
the start-up ERLE of the AEC.
Thus, with reference to the flowchart of Figure 2, which shows operation of
the method set forth in Canadian Patent Application No. 2,291,428, after
start-up of the echo canceller (step 200), any previously stored default LEC
coefficients are loaded into the LEC. Although Canadian Patent Application
No. 2,291,428 refers only to default coefficients being saved for the LEC,
as indicated above the same principles apply to the AEC coefficients. Thus,
the LEC (and/or AEC) begin(s) convergence using the well known NLMS
algorithm (or other). On initial power-up of the speakerphone (i.e. prior to
placing the first call), the initial coefficients are zero. Thus, the first
call after power-up will always be a "training" call that results in
capturing a suitable set of default coefficients for future calls. Next, at
step 201, the "Call" proceeds. Signal levels of the LEC (and/or AEC)
received signal and error signal are detected (step 203) and the ERLE is
calculated using the formula set forth above (step 205). When a
predetermined ERLE threshold level (Th) is reached (e.g. at least 24 dB of
echo is canceled), as determined at step 207, and provided that the best LEC
(and/or AEC) coefficients have not been previously saved during the
call-in-progress (step 209), then the LEC (and/or AEC) coefficients of the
(near) constant echo path are saved (step 211). Convergence of the LEC
(and/or AEC) then proceeds as per usual and the call is completed (step
213). Once saved, the default coefficients are not recalculated for the
duration of the call (i.e. a YES decision at step 209). However, the LEC
(and/or AEC) default coefficients will be calculated once per call to ensure
the best default set is captured for the next call.
At start-up of the next call, the previously stored LEC (and/or AEC)
coefficients are retrieved and used as the default coefficient set for the
LEC (and/or AEC) (step 200), instead of starting from zero.
The following pseudo code illustrates the principles of the above method in
greater detail, wherein "EC" is used to indicate both the LEC and AEC:
Power-up:   Default_coefficients = [000...0];
Start_Call: EC_coefficients = Default_coefficients;
Call:
    Execute EC algorithm;
    Calculate power level of received signal;
    Calculate power level of error signal;
    If (ERLE > Threshold) AND (Best default set not saved)
        Save near echo coefficients
    If Not(End of the Call) Go to Call;
    If New Call Go to Start_Call;
Thus, each call subsequent to the initial power-up "training" call is
provided with default coefficients that model the network and acoustic echo
paths and guarantee small LEC and AEC error. This improves the training and
tracking characteristic of the Full Duplex Handsfree Speakerphone (FDHF) and
eliminates feedback during start-up. The best results are achieved when the
training call uses a handset since there is no AEC-LEC loop instability and
the LEC and AEC can therefore converge quickly.
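The prior method described above (a fixed threshold, with at most one save
per call) may be sketched as follows. This is an illustrative reading only:
the function name run_call_fixed_threshold and the representation of a call
as per-frame ERLE values paired with coefficient snapshots are assumptions
made for the example.

```python
def run_call_fixed_threshold(erle_trace_db, coeff_trace, default_coeffs,
                             threshold_db=24.0):
    """Process one call; return the default coefficients for the next call.

    erle_trace_db: per-frame ERLE values in dB (hypothetical input).
    coeff_trace:   per-frame snapshots of the EC coefficients.
    """
    saved = False                        # "best default set not saved"
    for erle, coeffs in zip(erle_trace_db, coeff_trace):
        if erle > threshold_db and not saved:
            default_coeffs = list(coeffs)   # save near echo coefficients
            saved = True                    # no further saves this call
    return default_coeffs

# Training call after power-up: defaults start at zero.
defaults = run_call_fixed_threshold(
    erle_trace_db=[10.0, 25.0, 30.0],
    coeff_trace=[[0.1], [0.4], [0.5]],
    default_coeffs=[0.0],
)
```

In this example the coefficients are captured at the first frame whose ERLE
exceeds 24 dB and are not updated later in the call, even though a later
frame reaches a higher ERLE.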
According to the present invention, and in contrast with the prior method
set forth in Canadian Patent Application No. 2,291,428, instead of fixing
the threshold ERLE at a value of 24 dB, the "Threshold" value is varied to
provide optimum performance for any particular application. As discussed in
greater detail below, the "Threshold" value is increased by a factor,
denoted herein as ERLE_THRESHOLD_FACTOR_UP, each time a capture is
triggered, and decreased by another amount, denoted herein as
ERLE_THRESHOLD_FACTOR_DOWN, when the capture is not triggered even though
speech is present in the reference signal. The following pseudo code, in
combination with Figure 3, illustrates the principle of the present
invention:
Power-up:   Default_coefficients = [000...0];
Start_Call: EC_coefficients = Default_coefficients;
Call:
    Execute EC algorithm;
    Calculate power level of received signal;
    Calculate power level of error signal;
    If (ERLE > Threshold)
        Save near echo coefficients
        Increase Threshold
    Else if Voice present on EC reference signal
        If Threshold > THRESHOLD_MIN
            Decrease Threshold
    If Not(End of the Call) Go to Call;
    If New Call Go to Start_Call;
It should be noted that with the method set forth above, the capture of echo
coefficients might be triggered several times within the same call. The
rationale behind increasing the threshold with each capture is to try and
capture the coefficients corresponding to the best possible cancellation
performance of the algorithm in the given system. The rationale behind
decreasing the threshold when the capture is not triggered, even though
there is speech activity in the reference signal, is to avoid getting
"stuck" with bad coefficients captured as a result of a faulty scenario (for
instance on a narrow-band signal like a tone). A minimum value of the
threshold is defined to avoid capturing coefficients below a certain level
of echo cancellation.
As shown in Figure 3, on power up of the speakerphone, the echo canceller is
loaded with default coefficients (step 301). On commencement of a call (step
303), the power of the EC signal (i.e. LEC and/or AEC Received Signals in
Figure 1) and error signal (i.e. LEC Error and/or AEC Error in Figure 1) are
calculated. Next, at step 307, the ERLE is computed. The algorithm then
determines (at step 309) whether ERLE > Threshold. If yes, the default
coefficients are saved (step 311) and the Threshold is increased by an
amount ERLE_THRESHOLD_FACTOR_UP (step 312). If no, the algorithm checks for
voice in the reference signal (step 313). If voice is detected, then the
algorithm determines whether Threshold > THRESHOLD_MIN (step 315). If it is,
then Threshold is decreased by an amount ERLE_THRESHOLD_FACTOR_DOWN (step
317). After step 312, or if the determination at either of steps 313 or 315
is "No", the algorithm then determines whether the "Call" has ended (step
319). If not, the process is repeated at step 303. Otherwise, if the "Call"
has ended then the algorithm cycles until a new "Call" is initiated (step
321), whereupon the process is repeated at step 303.
The following are the values of constants and thresholds that were used in a
successful implementation of the invention:
Sampling rate = 8000 samples per second;
ERLE_THRESHOLD_FACTOR_DOWN = exp(-log(2)/(5*8000)) = 0.99998267147063
(resulting in a decrease of 3 dB every 5 seconds);
ERLE_THRESHOLD_FACTOR_UP = 2 (an increase of 3 dB);
THRESHOLD_MIN = 2^7 = 128 (21 dB, or 7 increases of 3 dB).
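The decibel figures quoted for these constants can be checked with a few
lines of arithmetic. The sketch below assumes that the threshold is a linear
power ratio and that the down factor is applied once per sample at the
stated sampling rate; the variable names are illustrative only.

```python
import math

SAMPLING_RATE = 8000   # samples per second
DECAY_SECONDS = 5      # time over which the threshold halves

# Per-sample multiplicative decay that halves the threshold in 5 seconds.
factor_down = math.exp(-math.log(2) / (DECAY_SECONDS * SAMPLING_RATE))

# Total change after 5 seconds of voiced samples, expressed in dB.
decay_db = 10 * math.log10(factor_down ** (DECAY_SECONDS * SAMPLING_RATE))

# A factor of 2 in power, and the minimum threshold 2^7, expressed in dB.
factor_up_db = 10 * math.log10(2)
threshold_min_db = 10 * math.log10(2 ** 7)

print(f"factor_down      = {factor_down:.14f}")
print(f"decay over 5 s   = {decay_db:.2f} dB")
print(f"factor_up        = {factor_up_db:.2f} dB")
print(f"threshold_min    = {threshold_min_db:.2f} dB")
```

The results are approximately -3.01 dB over 5 seconds, +3.01 dB per capture,
and 21.07 dB for the minimum threshold, consistent with the rounded figures
given above.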
Other embodiments and variations of the invention are possible. For example,
as discussed above the method of capture and use of the echo canceller
coefficients according to the present invention applies to improving the
echo canceller performance not only for new calls, but also to any system
where the echo canceller has to deal with variations in the echo paths that
are constant and repeatable for long intervals. When the EC resumes
operation on an echo path that is characterized by a constant response that
can be represented by the captured default coefficients, the method of the
present invention may be used to capture the coefficients. For example, in a
conferencing system (or a speakerphone) that uses directional microphones or
beamforming to enhance quality of the near-end speech, the echo canceller
default coefficients can be captured according to the method of this
invention for each look direction. All such modifications and variations are
possible within the sphere and scope of the invention as defined by the
claims appended hereto.