Patent 2505496 Summary

(12) Patent Application: (11) CA 2505496
(54) English Title: ROBUST LOCALIZATION AND TRACKING OF SIMULTANEOUSLY MOVING SOUND SOURCES USING BEAMFORMING AND PARTICLE FILTERING
(54) French Title: LOCALISATION ET SUIVI ROBUSTES DE SOURCES SONORES EN MOUVEMENT SIMULTANE UTILISANT LA FORMATION DE FAISCEAU ET LE FILTRAGE DE PARTICULES
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • G01S 3/80 (2006.01)
  • G10K 15/00 (2006.01)
  • H04R 1/40 (2006.01)
  • H04R 25/00 (2006.01)
(72) Inventors :
  • MICHAUD, FRANCOIS (Canada)
  • VALIN, JEAN-MARC (Canada)
  • ROUAT, JEAN (Canada)
(73) Owners :
  • SOCPRA SCIENCES ET GENIE S.E.C. (Canada)
(71) Applicants :
  • UNIVERSITE DE SHERBROOKE (Canada)
(74) Agent: BCF LLP
(74) Associate agent:
(45) Issued:
(22) Filed Date: 2005-04-27
(41) Open to Public Inspection: 2006-10-27
Examination requested: 2010-04-20
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data: None

Abstracts

English Abstract





The present invention relates to a system for localizing at least one sound source, comprising a set of spatially spaced apart sound sensors to detect sound from the at least one sound source and produce corresponding sound signals, and a frequency-domain beamformer responsive to the sound signals from the sound sensors and steered in a range of directions to localize, in a single step, the at least one sound source. The present invention is also concerned with a system for tracking a plurality of sound sources, comprising a set of spatially spaced apart sound sensors to detect sound from the sound sources and produce corresponding sound signals, and a sound source particle filtering tracker responsive to the sound signals from the sound sensors for simultaneously tracking the plurality of sound sources. The invention still further relates to a system for localizing and tracking a plurality of sound sources, comprising a set of spatially spaced apart sound sensors to detect sound from the sound sources and produce corresponding sound signals; a sound source detector responsive to the sound signals from the sound sensors and steered in a range of directions to localize the sound sources; and a particle filtering tracker connected to the sound source detector for simultaneously tracking the plurality of sound sources.


Claims

Note: Claims are shown in the official language in which they were submitted.




WHAT IS CLAIMED IS:
1. A system for localizing and tracking a plurality of sound sources, comprising:
a set of spatially spaced apart sound sensors to detect sound from the sound sources and produce corresponding sound signals;
a sound source detector responsive to the sound signals from the sound sensors and steered in a range of directions to localize the sound sources; and
a particle filtering tracker connected to the sound source detector for simultaneously tracking the plurality of sound sources.
2. A sound source localizing and tracking system as defined in claim 1, wherein the set of sound sensors comprises a predetermined number of omnidirectional microphones arranged in a predetermined array.

3. A sound source localizing and tracking system as defined in claim 1, wherein the sound source detector is a frequency-domain steered beamformer.

4. A sound source localizing and tracking system as defined in claim 3, wherein the steered beamformer comprises:
a calculator of sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
a calculator of cross-correlations by averaging the cross-power spectra over a given period of time;
a calculator of an output energy of the steered beamformer from the calculated cross-correlations; and
a finder of a loudest sound source localized in a given direction, the given direction of the loudest sound source being found by maximizing the output energy of the steered beamformer.
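As an editor's illustrative sketch, the elements of claim 4 can be mapped to a few lines of NumPy. This is not the patented implementation: the array shapes, the integer-sample delay convention, and the function name are assumptions made for illustration only.

```python
import numpy as np

def steered_beamformer_energy(frames, delays):
    """Sketch of claim 4: cross-power spectra over overlapping windows,
    averaged into cross-correlations, summed at steering delays.

    frames: (n_windows, n_mics, n_fft) windowed microphone samples.
    delays: (n_dirs, n_mics) integer TDOAs in samples per steering
            direction -- a hypothetical layout, for illustration.
    """
    X = np.fft.rfft(frames, axis=-1)              # per-window spectra
    n_mics = X.shape[1]
    n_fft = frames.shape[-1]
    energy = np.zeros(len(delays))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            # cross-power spectrum averaged over the overlapping windows
            cps = np.mean(X[:, i] * np.conj(X[:, j]), axis=0)
            # cross-correlation via inverse FFT
            xcorr = np.fft.irfft(cps, n=n_fft)
            # accumulate the cross-correlation at each direction's delay
            for d, tau in enumerate(delays[:, i] - delays[:, j]):
                energy[d] += xcorr[int(tau) % n_fft]
    return energy
```

The "finder" of claim 4 is then simply `np.argmax` over the returned energies.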



5. A sound source localizing and tracking system as defined in claim 4, wherein the calculator of cross-correlations comprises:
a calculator for computing, in the frequency domain, whitened cross-correlations; and
a weighting function applied to the calculated whitened cross-correlations to act as a mask based on a signal-to-noise ratio.

6. A sound source localizing and tracking system as defined in claim 5, wherein the weighting function is modified to include a reverberation term in a noise estimate in order to make the system more robust to reverberation.
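For illustration, the whitening and masking of claims 5 and 6 can be sketched as follows. The exact weighting function is not fixed by the claims; the soft mask `snr / (1 + snr)` and all variable names here are assumptions, not the patent's formula.

```python
import numpy as np

def masked_whitened_cps(Xi, Xj, noise_i, noise_j, rev_i, rev_j):
    """Whitened (phase-only) cross-power spectrum with an SNR-based mask.

    Xi, Xj: complex spectra of one window for microphones i and j.
    noise_*: stationary noise power estimates per frequency bin.
    rev_*: reverberation term folded into the noise estimate (claim 6).
    Illustrative only; the patent does not fix this exact form.
    """
    eps = 1e-12
    # whitening: keep only the phase, as in claim 5
    whitened = (Xi * np.conj(Xj)) / (np.abs(Xi) * np.abs(Xj) + eps)
    # per-bin SNR with the reverberation term added to the noise estimate
    snr_i = np.abs(Xi) ** 2 / (noise_i + rev_i + eps)
    snr_j = np.abs(Xj) ** 2 / (noise_j + rev_j + eps)
    snr = np.minimum(snr_i, snr_j)
    mask = snr / (1.0 + snr)          # soft mask in [0, 1)
    return mask * whitened
```

Averaging such masked spectra over time and inverse-transforming yields the cross-correlations used by the beamformer, with low-SNR and reverberant bins attenuated.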
7. A sound source localizing and tracking system as defined in claim 3, wherein the steered beamformer produces an output energy and comprises:
a uniform triangular grid for the surface of a sphere to define directions;
a calculator of sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
a calculator of cross-correlations by averaging the cross-power spectra over a given period of time;
a first algorithm for searching a best direction on the grid of the sphere;
a pre-computed table of time delays of arrival for each pair of sound sensors and each direction on the grid of the sphere; and
a finder of a loudest sound source in a direction of the grid of the sphere, the direction of the loudest sound source being found using the first algorithm and the pre-computed table by maximizing the output energy of the steered beamformer.
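The pre-computed TDOA table and grid search of claim 7 can be sketched as below. The far-field plane-wave delay model, the speed of sound constant, and the function names are illustrative assumptions; the claim itself only requires a table of delays per sensor pair and per grid direction.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, room temperature (assumed)

def tdoa_table(mic_pos, directions, fs):
    """Pre-computed table of claim 7: one integer delay (in samples)
    per microphone pair and per far-field direction on the grid."""
    n = len(mic_pos)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    table = np.empty((len(directions), len(pairs)), dtype=int)
    for d, u in enumerate(directions):
        for p, (i, j) in enumerate(pairs):
            # plane-wave delay between mics i and j for a source in direction u
            delay = np.dot(mic_pos[i] - mic_pos[j], u) / SPEED_OF_SOUND
            table[d, p] = int(round(delay * fs))
    return pairs, table

def loudest_direction(xcorrs, table):
    """First search algorithm of claim 7: pick the grid direction that
    maximizes the summed cross-correlations at the tabulated delays."""
    n_lag = xcorrs.shape[-1]
    energies = np.array([
        sum(xcorrs[p][tau % n_lag] for p, tau in enumerate(row))
        for row in table
    ])
    return int(np.argmax(energies)), energies
```

In the patent's setting the `directions` would come from the uniform triangular grid on the sphere; any unit-vector sampling works for the sketch.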
8. A sound source localizing and tracking system as defined in claim 7, further comprising a second algorithm for finding another sound source after having removed the contribution of the loudest sound source located by the finder.

9. A sound source localizing and tracking system as defined in claim 7, wherein the steered beamformer further comprises:
a refined grid for the surrounding of a point where a sound source was found in order to find a direction of localization of the found sound source with improved accuracy.
10. A sound source localizing and tracking system as defined in claim 1, wherein the particle filtering tracker models each sound source using a number of particles having respective directions and weights.

11. A sound source localizing and tracking system as defined in claim 1, wherein the particle filtering tracker comprises:
a calculator of a probability that a potential source is a real source.

12. A sound source localizing and tracking system as defined in claim 1, wherein the particle filtering tracker comprises:
a calculator of a probability that a real source corresponds to a potential source detected by the sound source detector.

13. A sound source localizing and tracking system as defined in claim 10, wherein the particle filtering tracker comprises:
a calculator of (a) at least one of a probability that a sound source is observed and a probability that a real sound source corresponds to a potential sound source, and (b) a probability density of observing a sound source at a given particle position; and
a calculator of updated particle weights in response to said probability density and said at least one probability.
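The particle model of claim 10 is simple to state concretely: each tracked source carries a cloud of particles, each with a direction and a weight. A minimal sketch, with the uniform initialization and dictionary layout chosen purely for illustration:

```python
import numpy as np

def init_source_particles(n_particles, rng):
    """Claim 10 representation sketch: one tracked source is a set of
    particles, each holding a direction (unit vector) and a weight."""
    v = rng.standard_normal((n_particles, 3))
    directions = v / np.linalg.norm(v, axis=1, keepdims=True)
    weights = np.full(n_particles, 1.0 / n_particles)  # uniform at start
    return {"directions": directions, "weights": weights}
```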
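The weight update of claim 13 can be illustrated as follows. The claim only requires combining an observation probability with a density of observing the source at each particle position; the von-Mises-style kernel, the mixture form, and `kappa` are assumptions of this sketch, not the patent's equations.

```python
import numpy as np

def update_particle_weights(directions, weights, obs_dir, p_obs, kappa=50.0):
    """Illustrative weight update for claim 13: mix the probability that
    the source is observed with a per-particle observation density."""
    cos_angle = directions @ obs_dir            # alignment with the observation
    density = np.exp(kappa * (cos_angle - 1.0)) # peaks at the observed direction
    # with probability p_obs the observation is informative, otherwise flat
    new_w = weights * (p_obs * density + (1.0 - p_obs))
    return new_w / new_w.sum()                  # renormalize to sum to 1
```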
14. A sound source localizing and tracking system as defined in claim 1, wherein the particle filtering tracker comprises:
an adder of a new source when a probability that the new source is real is higher than a first threshold.

15. A sound source localizing and tracking system as defined in claim 14, wherein the sound source localizing and tracking system assumes that the added new source exists if a probability of existence of said new source reaches a second threshold.

16. A sound source localizing and tracking system as defined in claim 1, wherein the particle filtering tracker comprises:
a subtractor of a source when the latter source has not been observed for a certain period of time.

17. A sound source localizing and tracking system as defined in claim 13, wherein the particle filtering tracker comprises:
an estimator of a position of each source as a weighted average of the positions of its particles, said estimator being responsive to the calculated, updated particle weights.
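The estimator of claim 17 reduces to a weighted average of the particle positions. A minimal sketch, assuming positions are unit direction vectors (as in claim 10) so the average is renormalized back onto the sphere:

```python
import numpy as np

def estimate_source_direction(directions, weights):
    """Claim 17 estimator: the source position (here a direction) is the
    weighted average of its particle positions, using the updated weights."""
    mean = weights @ directions            # weighted average of particles
    return mean / np.linalg.norm(mean)     # project back onto the unit sphere
```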

18. A system for localizing at least one sound source, comprising:
a set of spatially spaced apart sound sensors to detect sound from said at least one sound source and produce corresponding sound signals; and
a frequency-domain beamformer responsive to the sound signals from the sound sensors and steered in a range of directions to localize, in a single step, said at least one sound source.

19. A sound source localizing system as defined in claim 18, wherein the set of sound sensors comprises a predetermined number of omnidirectional microphones arranged in a predetermined array.

20. A sound source localizing system as defined in claim 18, wherein the steered beamformer comprises:
a calculator of sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
a calculator of cross-correlations by averaging the cross-power spectra over a given period of time;
a calculator of an output energy of the steered beamformer from the calculated cross-correlations; and
a finder of a loudest sound source localized in a given direction, the given direction of the loudest sound source being found by maximizing the output energy of the steered beamformer.

21. A sound source localizing system as defined in claim 20, wherein the calculator of cross-correlations comprises:
a calculator for computing, in the frequency domain, whitened cross-correlations; and
a weighting function applied to the calculated whitened cross-correlations to act as a mask based on a signal-to-noise ratio.

22. A sound source localizing system as defined in claim 21, wherein the weighting function is modified to include a reverberation term in a noise estimate in order to make the system more robust to reverberation.

23. A sound source localizing system as defined in claim 18, wherein the steered beamformer produces an output energy and comprises:
a uniform triangular grid for the surface of a sphere to define directions;
a calculator of sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
a calculator of cross-correlations by averaging the cross-power spectra over a given period of time;
a first algorithm for searching a best direction on the grid of the sphere;
a pre-computed table of time delays of arrival for each pair of sound sensors and each direction on the grid of the sphere; and
a finder of a loudest sound source in a direction of the grid of the sphere, the direction of the loudest sound source being found using the first algorithm and the pre-computed table by maximizing the output energy of the steered beamformer.

24. A sound source localizing system as defined in claim 23, further comprising a second algorithm for finding another sound source after having removed the contribution of the loudest sound source located by the finder.

25. A sound source localizing system as defined in claim 23, wherein the steered beamformer further comprises:
a refined grid for the surrounding of a point where a sound source was found in order to find a direction of localization of the found sound source with improved accuracy.

26. A system for tracking a plurality of sound sources, comprising:
a set of spatially spaced apart sound sensors to detect sound from the sound sources and produce corresponding sound signals; and
a sound source particle filtering tracker responsive to the sound signals from the sound sensors for simultaneously tracking the plurality of sound sources.

27. A sound source tracking system as defined in claim 26, wherein the particle filtering tracker models each sound source using a number of particles having respective directions and weights.

28. A sound source tracking system as defined in claim 26, wherein the particle filtering tracker comprises:
a calculator of a probability that a potential source is a real source.

29. A sound source tracking system as defined in claim 26, wherein the particle filtering tracker comprises:
a calculator of a probability that a real source corresponds to a potential source.

30. A sound source tracking system as defined in claim 27, wherein the particle filtering tracker comprises:
a calculator of (a) at least one of a probability that a sound source is observed and a probability that a real sound source corresponds to a potential sound source, and (b) a probability density of observing a sound source at a given particle position; and
a calculator of updated particle weights in response to said probability density and said at least one probability.

31. A sound source tracking system as defined in claim 26, wherein the particle filtering tracker comprises:
an adder of a new source when a probability that the new source is real is higher than a first threshold.

32. A sound source tracking system as defined in claim 31, wherein the sound source tracking system assumes that the added new source exists if a probability of existence of said new source reaches a second threshold.

33. A sound source tracking system as defined in claim 26, wherein the particle filtering tracker comprises:
a subtractor of a source when the latter source has not been observed for a certain period of time.

34. A sound source tracking system as defined in claim 30, wherein the particle filtering tracker comprises:
an estimator of a position of each source as a weighted average of the positions of its particles, said estimator being responsive to the calculated, updated particle weights.






35. A method for localizing and tracking a plurality of sound sources, comprising:
detecting sound from the sound sources through a set of spatially spaced apart sound sensors to produce corresponding sound signals;
localizing the sound sources in response to the sound signals, localizing the sound sources including steering in a range of directions a sound source detector having an output; and
simultaneously tracking the plurality of sound sources, using particle filtering, in relation to the output from the sound source detector.

36. A sound source localizing and tracking method as defined in claim 35, wherein steering a sound source detector comprises steering a frequency-domain beamformer.

37. A sound source localizing and tracking method as defined in claim 36, wherein localizing the sound sources comprises:
computing sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
computing cross-correlations by averaging the cross-power spectra over a given period of time;
computing an output energy of the steered beamformer from the calculated cross-correlations; and
finding a loudest sound source localized in a given direction, the given direction of the loudest sound source being found by maximizing the output energy of the steered beamformer.

38. A sound source localizing and tracking method as defined in claim 37, wherein computing the cross-correlations comprises:
computing, in the frequency domain, whitened cross-correlations; and
applying a weighting function to the computed whitened cross-correlations to act as a mask based on a signal-to-noise ratio.

39. A sound source localizing and tracking method as defined in claim 38, comprising modifying the weighting function by including a reverberation term in a noise estimate in order to make the method more robust to reverberation.

40. A sound source localizing and tracking method as defined in claim 36, wherein localizing the sound sources comprises:
defining a uniform triangular grid for the surface of a sphere to define directions;
computing sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
computing cross-correlations by averaging the cross-power spectra over a given period of time;
pre-computing a table of time delays of arrival for each pair of sound sensors and each direction on the grid of the sphere; and
finding a loudest sound source in a direction of the grid of the sphere, finding the loudest sound source comprising searching a best direction on the grid of the sphere using a first algorithm and the pre-computed table by maximizing an output energy of the steered beamformer.

41. A sound source localizing and tracking method as defined in claim 40, comprising finding another sound source, using a second algorithm, after having removed the contribution of the located, loudest sound source.

42. A sound source localizing and tracking method as defined in claim 40, wherein localizing the sound sources further comprises:
defining a refined grid for the surrounding of a point where a sound source was found in order to find a direction of localization of the found sound source with improved accuracy.







43. A sound source localizing and tracking method as defined in claim 35, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises modeling each sound source using a number of particles having respective directions and weights.

44. A sound source localizing and tracking method as defined in claim 35, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing a probability that a potential source is a real source.

45. A sound source localizing and tracking method as defined in claim 35, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing a probability that a real source corresponds to a potential source detected by the sound source detector.

46. A sound source localizing and tracking method as defined in claim 43, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing (a) at least one of a probability that a sound source is observed and a probability that a real sound source corresponds to a potential sound source, and (b) a probability density of observing a sound source at a given particle position; and
computing updated particle weights in response to said probability density and said at least one probability.

47. A sound source localizing and tracking method as defined in claim 35, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
adding a new source when a probability that the new source is real is higher than a first threshold.






48. A sound source localizing and tracking method as defined in claim 47, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises assuming that the added new source exists if a probability of existence of said new source reaches a second threshold.

49. A sound source localizing and tracking method as defined in claim 35, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
removing a sound source when the latter source has not been observed for a certain period of time.

50. A sound source localizing and tracking method as defined in claim 43, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
estimating a position of each source as a weighted average of the positions of its particles, said estimating being responsive to the calculated, updated particle weights.

51. A method for localizing at least one sound source, comprising:
detecting sound from said at least one sound source through a set of spatially spaced apart sound sensors to produce corresponding sound signals; and
localizing, in a single step, said at least one sound source in response to the sound signals, localizing said at least one sound source including steering a frequency-domain beamformer in a range of directions.

52. A sound source localizing method as defined in claim 51, wherein localizing, in a single step, said at least one sound source comprises:
computing sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
computing cross-correlations by averaging the cross-power spectra over a given period of time;
computing an output energy of the steered beamformer from the calculated cross-correlations; and
finding a loudest sound source localized in a given direction, the given direction of the loudest sound source being found by maximizing the output energy of the steered beamformer.

53. A sound source localizing method as defined in claim 52, wherein computing the cross-correlations comprises:
computing, in the frequency domain, whitened cross-correlations; and
applying a weighting function to the computed whitened cross-correlations to act as a mask based on a signal-to-noise ratio.

54. A sound source localizing method as defined in claim 53, comprising modifying the weighting function by including a reverberation term in a noise estimate in order to make the method more robust to reverberation.

55. A sound source localizing method as defined in claim 51, wherein localizing, in a single step, said at least one sound source comprises:
defining a uniform triangular grid for the surface of a sphere to define directions;
computing sound power spectra and cross-power spectra of sound signal samples in overlapping windows;
computing cross-correlations by averaging the cross-power spectra over a given period of time;
pre-computing a table of time delays of arrival for each pair of sound sensors and each direction on the grid of the sphere; and
finding a loudest sound source in a direction of the grid of the sphere, finding the loudest sound source comprising searching a best direction on the grid of the sphere using a first algorithm and the pre-computed table by maximizing an output energy of the steered beamformer.






56. A sound source localizing method as defined in claim 55, comprising finding another sound source, using a second algorithm, after having removed the contribution of the located, loudest sound source.

57. A sound source localizing method as defined in claim 55, wherein localizing, in a single step, said at least one sound source further comprises:
defining a refined grid for the surrounding of a point where a sound source was found in order to find a direction of localization of the found sound source with improved accuracy.

58. A method for tracking a plurality of sound sources, comprising:
detecting sound from the sound sources through a set of spatially spaced apart sound sensors to produce corresponding sound signals; and
simultaneously tracking the plurality of sound sources, using particle filtering responsive to the sound signals from the sound sensors.

59. A sound source tracking method as defined in claim 58, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises modeling each sound source using a number of particles having respective directions and weights.

60. A sound source tracking method as defined in claim 58, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing a probability that a potential source is a real source.

61. A sound source tracking method as defined in claim 58, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing a probability that a real source corresponds to a potential source detected by the sound source detector.

62. A sound source tracking method as defined in claim 59, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
computing (a) at least one of a probability that a sound source is observed and a probability that a real sound source corresponds to a potential sound source, and (b) a probability density of observing a sound source at a given particle position; and
computing updated particle weights in response to said probability density and said at least one probability.

63. A sound source tracking method as defined in claim 58, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
adding a new source when a probability that the new source is real is higher than a first threshold.

64. A sound source tracking method as defined in claim 63, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises assuming that the added new source exists if a probability of existence of said new source reaches a second threshold.

65. A sound source tracking method as defined in claim 58, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
removing a sound source when the latter source has not been observed for a certain period of time.






66. A sound source tracking method as defined in claim 59, wherein simultaneously tracking the plurality of sound sources, using particle filtering, comprises:
estimating a position of each source as a weighted average of the positions of its particles, said estimating being responsive to the calculated, updated particle weights.

Description

Note: Descriptions are shown in the official language in which they were submitted.



CA 02505496 2005-04-27
ROBUST LOCALIZATION AND TRACKING OF SIMULTANEOUSLY MOVING SOUND SOURCES USING BEAMFORMING AND PARTICLE FILTERING

FIELD OF THE INVENTION

The present invention relates to a sound source localizing method and system, a sound source tracking method and system, and a sound source localizing and tracking method and system.
BACKGROUND OF THE INVENTION

Sound source localization is defined as the determination of the coordinates of sound sources in relation to a point in space. The auditory system of living creatures provides vast amounts of information about the world, such as localization of sound sources. For example, human beings are able to focus their attention on surrounding events and changes, such as a cordless phone ringing, a vehicle honking, a person who is speaking, etc.
Hearing complements other senses such as vision since it is omnidirectional, capable of working in the dark, and not incapacitated by physical structures such as walls. Those who do not suffer from hearing impairments can hardly imagine spending a day without being able to hear, especially when moving in a dynamic and unpredictable world. Marschark [M. Marschark, "Raising and Educating a Deaf Child", Oxford University Press, 1998, http://www.rit.edu/memrtllcourse/interpretinglmoduleslmodulelist.htm] has even suggested that although deaf children have similar IQ results compared to other children, they do experience more learning difficulties in school. Obviously, intelligence manifested by autonomous robots would surely be improved by providing them with auditory capabilities.
To localize sound, the human brain combines timing (more specifically delay or phase) and amplitude information related to the sound perceived by the two ears, sometimes in addition to information from other senses. However, localizing sound sources using only two sensing inputs is a challenging task. The human auditory system is very complex and resolves the problem by taking into consideration the acoustic diffraction around the head and the ridges of the outer ear. Without this ability, localization of sound through a pair of microphones is limited to azimuth only, without distinguishing whether the sounds come from the front or the back. It is even more difficult to obtain high precision readings when the sound source and the two microphones are located along the same axis.
Fortunately, robots did not inherit the same limitations as living creatures; more than two microphones can be used. Using more than two microphones improves the reliability and accuracy in localizing sounds within three dimensions (azimuth and elevation). Also, detection of multiple signals provides additional redundancy, and reduces uncertainty caused by the noise and non-ideal conditions such as reverberation and imperfect microphones.
Signal processing research that addresses artificial audition is often geared toward specific tasks such as speaker tracking for videoconferencing [B. Mungamuru and P. Aarabi, "Enhanced sound localization", IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 34, no. 3, 2004, pp. 1526-1540]. For that reason, artificial audition on mobile robots is a research area still in its infancy, and most of the work has been done in relation to localization of sound sources, mostly using only two microphones. This is the case of the SIG robot, which uses both IPD (Inter-aural Phase Difference) and IID (Inter-aural Intensity Difference) to localize sound sources [K. Nakadai, D. Matsuura, H. G. Okuno, and H. Kitano, "Applying scattering theory to robot audition system: Robust sound source localization and extraction", in Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003, pp. 1147-1152]. The binaural approach has limitations for evaluating elevation and, usually, the front-back ambiguity cannot be resolved without resorting to active audition [K. Nakadai, T. Lourens, H. G. Okuno, and H. Kitano, "Active audition for humanoid", in Proceedings of the Seventeenth National Conference on Artificial Intelligence (AAAI), 2000, pp. 832-839].
More recently, approaches using more than two microphones have been developed. One of these approaches uses a circular array of eight microphones to locate sound sources [F. Asano, M. Goto, K. Itou, and H. Asoh, "Real-time source localization and separation system and its application to automatic speech recognition", in Proc. EUROSPEECH, 2001, pp. 1013-1016]. The article of [J.-M. Valin, F. Michaud, J. Rouat, and D. Letourneau, "Robust sound source localization using a microphone array on a mobile robot", in Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003, pp. 1228-1233] presents a method using eight microphones for localizing a single sound source, where TDOA (Time Delay Of Arrival) estimation was separated from DOA (Direction Of Arrival) estimation. Kagami et al. [S. Kagami, Y. Tamai, H. Mizoguchi, and T. Kanade, "Microphone array for 2D sound localization and capture", in Proceedings IEEE International Conference on Robotics and Automation, 2004, pp. 703-708] report a system using 128 microphones for 2D sound localization of sound sources: obviously, it would not be practical to include such a large number of microphones on a mobile robot.
Most of the work so far on localization of sound sources does not address
the problem of tracking moving sources. The article of [D. Bechler, M.
Schlosser,
and K. Kroschel, "System for robust 3D speaker tracking using microphone array
measurements", in Proceedings IEEElRSJ International Conference on Intelligent
Robots and Systems, 2004, pp. 2117-2122] has proposed to use a Kalman filter
for


CA 02505496 2005-04-27
4
tracking a moving source. However the proposed approach assumes that a single
source is present. In the past years, particle filtering [M. S. Arulampalam,
S.
Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online
nonlinear/non-Gaussian Bayesian tracking", IEEE Transactions on Signal
Processing, vol. 50, no. 2, pp. 174-188, 2002] (a sequential Monte Carlo
method)
has become increasingly popular for resolving object tracking problems. The
articles of
[D. B. Ward and R. C. Williamson, "Particle filtering beamforming for acoustic
source localization in a reverberant environment", in Proceedings IEEE
International
Conference on Acoustics, Speech, and Signal Processing, vol. II, 2002, pp.
1777-1780], [D. B. Ward, E. A. Lehmann, and R. C. Williamson, "Particle
filtering
algorithms for tracking an acoustic source in a reverberant environment", IEEE
Transactions on Speech and Audio Processing, vol. 11, no. 6, 2003] and [J.
Vermaak and A. Blake, "Nonlinear filtering for speaker tracking in noisy and
reverberant environments", in Proceedings IEEE International Conference on
Acoustics, Speech, and Signal Processing, vol. 5, 2001, pp. 3021-3024] use
this
technique for tracking single sound sources. Asoh et al. in [H. Asoh, F.
Asano, K.
Yamamoto, T. Yoshimura, Y. Motomura, N. Ichimura, I. Hara, and J. Ogata,
"An
application of a particle filter to Bayesian multiple sound source tracking
with audio
and video information fusion"] even suggested using this technique for mixing
audio and video data to track speakers. But again, the use of this technique
is
limited to a single source due to the problem of associating the localization
observation data to each of the sources being tracked. This problem is
referred to
as the source-observation assignment problem.
Some attempts have been made to define multi-modal particle filters in [J.
Vermaak, A. Doucet, and P. Pérez, "Maintaining multi-modality through mixture
tracking", in Proceedings International Conference on Computer Vision (ICCV),
2003, pp. 1950-1954], and the use of particle filtering for tracking multiple
targets is
demonstrated in [J. MacCormick and A. Blake, "A probabilistic exclusion
principle
for tracking multiple objects", International Journal of Computer Vision, vol.
39, no.


1, pp. 57-71, 2000], [C. Hue, J.-P. L. Cadre, and P. Perez, "A particle
filter to track
multiple objects", in Proceedings IEEE Workshop on Multi-Object Tracking,
2001,
pp. 61-68] and [J. Vermaak, S. Godsill, and P. Perez, "Monte Carlo filtering
for multi-
target tracking and data association", IEEE Transactions on Aerospace and
Electronic Systems, 2005]. However, so far, the technique has not been
applied to
sound source tracking.
SUMMARY OF THE INVENTION
In accordance with the present invention, there is provided a method for
localizing at least one sound source, comprising detecting sound from the at
least one
sound source through a set of spatially spaced apart sound sensors to produce
corresponding sound signals, and localizing, in a single step, the at least
one sound
source in response to the sound signals. Localizing the at least one sound
source
includes steering a frequency-domain beamformer in a range of directions.
In accordance with the present invention, there is also provided a method for
tracking a plurality of sound sources, comprising detecting sound from the
sound
sources through a set of spatially spaced apart sound sensors to produce
corresponding sound signals, and simultaneously tracking the plurality of
sound
sources, using particle filtering responsive to the sound signals from the
sound sensors.
In accordance with the present invention, there is further provided a method
for localizing and tracking a plurality of sound sources, comprising detecting
sound from
the sound sources through a set of spatially spaced apart sound sensors to
produce
corresponding sound signals, localizing the sound sources in response to the
sound
signals wherein localizing the sound sources includes steering in a range of
directions a
sound source detector having an output, and simultaneously tracking the
plurality of
sound sources, using particle filtering, in relation to the output from the
sound source
detector.


The present invention also relates to a system for localizing at least one
sound
source, comprising a set of spatially spaced apart sound sensors to detect
sound from
the at least one sound source and produce corresponding sound signals, and a
frequency-domain beamformer responsive to the sound signals from the sound
sensors
and steered in a range of directions to localize, in a single step, the at
least one sound
source.
The present invention further relates to a system for tracking a plurality of
sound
sources, comprising a set of spatially spaced apart sound sensors to detect
sound from
the sound sources and produce corresponding sound signals, and a sound source
particle filtering tracker responsive to the sound signals from the sound
sensors for
simultaneously tracking the plurality of sound sources.
The present invention still further relates to a system for localizing and
tracking a
plurality of sound sources, comprising a set of spatially spaced apart sound
sensors to
detect sound from the sound sources and produce corresponding sound signals, a
sound source detector responsive to the sound signals from the sound sensors
and
steered in a range of directions to localize the sound sources, and a particle
filtering
tracker connected to the sound source detector for simultaneously tracking the
plurality
of sound sources.
The foregoing and other objects, advantages and features of the present
invention will become more apparent upon reading of the following non-restrictive
description of an illustrative embodiment thereof, given with reference to the
accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the appended drawings:


Figure 1 is a schematic block diagram of a non-restrictive illustrative
embodiment of the system for localizing and tracking a plurality of sound
sources
according to the present invention;
Figure 2 is a schematic flow chart showing how the non-restrictive
illustrative
embodiment of the sound source localizing and tracking method according to the
present invention calculates the beamformer energy in the frequency domain;
Figure 3 is a schematic block diagram of a delay-and-sum beamformer
forming part of the non-restrictive illustrative embodiment of the sound
source
localizing and tracking system according to the present invention;
Figure 4 is a schematic flow chart showing how the non-restrictive
illustrative
embodiment of the sound source localizing and tracking method according to the
present invention calculates cross-correlations by averaging cross-power
spectra of
the sound signals over a time period;
Figure 5 is a schematic block diagram of a calculator of cross-correlations
forming part of the delay-and-sum beamformer of Figure 3;
Figure 6 is a schematic representation of a recursive subdivision (two levels)
of a triangular element in view of defining a uniform triangular grid on the
surface of
a sphere;
Figure 7 is a schematic flow chart showing how the non-restrictive
illustrative
embodiment of the sound source localizing and tracking method according to the
present invention searches for a direction on the spherical, triangular grid
of Figure
6;


Figure 8 is a schematic block diagram of a device for searching for a
direction on the spherical, triangular grid of Figure 6, forming part of the
non-
restrictive illustrative embodiment of the sound source localizing and
tracking system
according to the present invention;
Figure 9 is a graph of the beamformer output probabilities P_q for azimuth as
a function of time, with observations with P_q > 0.5, 0.2 < P_q < 0.5 and P_q < 0.2;
Figure 10 is a schematic flow chart showing particle-based tracking as used
in the non-restrictive illustrative embodiment of the sound source localizing
and
tracking method according to the present invention;
Figure 11 is a schematic block diagram of a particle-based sound source
tracker forming part of the non-restrictive illustrative embodiment of the
sound
source localizing and tracking system according to the present invention;
Figure 12 is a schematic diagram showing an example of assignment with
two sound sources observed, one new source and one false detection, wherein
the
assignment can be described as f({0,1,2,3}) = {1,-2,0,-1};
Figure 13a is a graph illustrating an example of tracking of four moving
sources, showing azimuth as a function of time with no delay;
Figure 13b is a graph illustrating an example of tracking of four moving
sources, showing azimuth as a function of time with delayed estimation (500
ms);
Figure 14a is a schematic diagram showing an example of sound source
trajectories wherein a robot is represented as an « x » and wherein the
sources are
moving;


Figure 14b is a schematic diagram showing an example of sound source
trajectories wherein the robot is represented as an « x » and the robot is
moving;
Figure 14c is a schematic diagram showing an example of sound source
trajectories wherein the robot is represented as an « x » and wherein the
trajectories of the sources intersect;
Figure 15a is a graph showing four speakers moving around a stationary
robot in a first environment (E1) and with a false detection shown at 81;
Figure 15b is a graph showing four speakers moving around a stationary
robot in a second environment (E2);
Figure 16a is a graph showing two stationary speakers with a moving robot
in the first environment (E1), wherein a false detection is indicated at 91;
Figure 16b is a graph showing two stationary speakers with a moving robot
in the second environment (E2), wherein a false detection is indicated at 92;
Figure 17a is a graph showing two speakers' trajectories intersecting in front
of a robot in the first environment (E1);
Figure 17b is a graph showing two speakers' trajectories intersecting in front
of the robot in the second environment (E2); and
Figure 18 is a set of four graphs showing tracking of four sound sources
using a predetermined configuration of microphones in the first environment
(E1),
for 4, 5, 6 and 7 microphones, respectively.


DETAILED DESCRIPTION OF THE ILLUSTRATIVE EMBODIMENT
The non-restrictive illustrative embodiment of the present invention will be
described hereinafter. This illustrative embodiment uses a non-restrictive
approach based on a beamformer, for example a frequency-domain
restrictive approach based on a beamformer, for example a frequency-domain
beamformer that is steered in a range of directions to detect sound sources.
Instead
of measuring TDOAs and then converting these TDOAs to a position, the
localization of sound is performed in a single step. This single step approach
makes
the localization more robust, especially when an obstacle prevents one or
more
sound sensors, for example microphones from properly receiving the sound
signals.
The results of the localization are then enhanced by probability-based post-
processing which prevents false detection of sound sources. This makes the
approach according to the non-restrictive illustrative embodiment sensitive
enough
for simultaneously localizing multiple moving sound sources. This approach
works
for both far-field and near-field sound sources. Detection reliability,
accuracy, and
tracking capabilities of the approach have been validated using a mobile
robot, with
different types of sound sources.
In other words, combining TDOA and DOA estimation in a single step
improves the system's robustness, while allowing localization of simultaneous
sound sources. It is also possible to track multiple sound sources using
particle
filters by solving the above-mentioned source-observation assignment problem.
An artificial sound source localization and tracking method and system for a
mobile robot can be used for three purposes:
1 ) localizing sound sources;
2) separating sound sources in order to process only signals that are
relevant to a particular event in the environment; and


3) processing sound sources to extract useful information from the
environment (like speech recognition).
1. System Overview
The artificial sound source localization and tracking system according to the
non-restrictive illustrative embodiment is composed, as shown in Figure 1, of
three
parts:
1) An array of microphones 1;
2) A steered beamformer including a memoryless localization algorithm 2
delivering an initial localization of the sound source(s) and a maximized
output energy 3; and
3) A particle filtering tracker 4 responsive to the initial sound source
localization and maximized output energy 3 for simultaneously tracking
all the sound sources, preventing false sound source detection, and
delivering the sound source positions 5.
The array of microphones 1 comprises a number, for example up to eight
omnidirectional microphones mounted on the robot. Since the sound source
localization and tracking system is designed for installation on a robot,
there is no
strict constraint on the position of the microphones 1. However, the positions
of the microphones relative to each other are known and measured with, for
example, an accuracy of ±0.5 cm.
The sound signals such as 6 from the microphones 1 are supplied to the
beamformer 2. The beamformer forms a spatial filter that is steered in all
possible
directions in order to maximize the output beamformer energy 3. The direction
corresponding to the maximized output beamformer energy is retained as the
direction or initial localization of the sound source or sources.


The initial localization performed by the steered beamformer 2, including the
maximized output beamformer energy 3 is then supplied to the input of a post-
processing stage, more specifically the particle filtering tracker 4 using a
particle
filter to simultaneously track all sound sources and prevent false detections.
The output (source positions 5) of the sound source localization and tracking
system of Figure 1 can be used to draw the robot's attention to the sound
source. It
can also be used as part of a source separation algorithm to isolate the sound
coming from a single source.
2. Localization Using a Steered Beamformer
The basic idea behind the steered beamformer approach to source
localization is to direct or steer a beamformer in a range of directions, for
example
all possible directions and look for maximal output. This can be done by
maximizing
the output energy of a simple delay-and-sum beamformer.
2.1 Delay-and-sum beamformer
Operation 21 (Figure 2)
The output of an M-microphone delay-and-sum beamformer is defined as:

y(n) = \sum_{m=0}^{M-1} x_m(n - \tau_m)    (1)

where x_m(n) is the signal from the m-th microphone and τ_m is the delay of
arrival for that microphone. The output energy of the beamformer over a frame
of length L is thus given by:


E = \sum_{n=0}^{L-1} y^2(n) = \sum_{n=0}^{L-1} \left[ x_0(n - \tau_0) + \cdots + x_{M-1}(n - \tau_{M-1}) \right]^2    (2)

Assuming that only one sound source is present, it can be seen that E is maximal
when the delays τ_m are such that the microphone signals are in phase, and
therefore add constructively.
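As a minimal NumPy sketch of Equations 1 and 2 (not part of the patent text; the function name, the circular-shift approximation of the steering delay and the toy signals are assumptions made here for illustration only):

```python
import numpy as np

def delay_and_sum_energy(frames, delays):
    """Time-domain energy of a delay-and-sum beamformer (Equations 1-2).

    frames : (M, L) array, one row of samples per microphone
    delays : (M,) integer arrival delays tau_m, in samples
    """
    M, L = frames.shape
    y = np.zeros(L)
    for m in range(M):
        # Circular shift compensates the arrival delay of each microphone
        y += np.roll(frames[m], -delays[m])
    # Output energy over the frame (Equation 2)
    return float(np.sum(y ** 2))

# Two copies of a sinusoid, one lagging by one sample: compensating the
# lag puts the signals in phase and maximizes E
t = np.arange(256)
x0 = np.sin(2 * np.pi * t / 16)
x1 = np.roll(x0, 1)                       # same signal, one sample late
frames = np.stack([x0, x1])
e_aligned = delay_and_sum_energy(frames, np.array([0, 1]))
e_misaligned = delay_and_sum_energy(frames, np.array([0, 9]))
assert e_aligned > e_misaligned
```

With the correct delays the two channels add constructively and the energy quadruples relative to a single channel; with a half-period error they cancel almost completely.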
A problem with this technique is that energy peaks are very wide [R.
Duraiswami, D. Zotkin, and L. Davis, "Active speech source localization by a
dual
coarse-to-fine search", in Proceedings IEEE International Conference on
Acoustics,
Speech, and Signal Processing, 2001, pp. 3309-3312], which means that the
resolution is poor. Moreover, in the case where multiple sources are present,
it is
likely that the two or more energy peaks overlap whereby it becomes impossible
to
differentiate one peak from the other(s). A method for narrowing the peaks is
to
whiten the microphone signals prior to calculating the energy [M. Omologo and
P. Svaizer, "Acoustic event localization using a crosspower spectrum phase based
technique", in Proceedings IEEE International Conference on Acoustics, Speech,
and Signal Processing, 1994, pp. II.273-II.276]. Unfortunately, the coarse-fine
search method as proposed in [R. Duraiswami, D. Zotkin, and L. Davis, "Active
speech source localization by a dual coarse-to-fine search", in Proceedings
IEEE
International Conference on Acoustics, Speech, and Signal Processing, 2001,
pp.
3309-3312] cannot be used in that case because the narrow peaks can be missed
during the coarse search. Therefore, a full fine search is used and
corresponding
computer power is required. It is possible to reduce the amount of computation
by
calculating the output beamformer energy in the frequency domain. This also
has
the advantage of making the whitening of the signal easier.


For that purpose, the beamformer output energy in Equation 2 can be
expanded as:

E = \sum_{m=0}^{M-1} \sum_{n=0}^{L-1} x_m^2(n - \tau_m) + 2 \sum_{m_1=0}^{M-1} \sum_{m_2=0}^{m_1-1} \sum_{n=0}^{L-1} x_{m_1}(n - \tau_{m_1})\, x_{m_2}(n - \tau_{m_2})    (3)

which in turn can be rewritten in terms of cross-correlations:

E = K + 2 \sum_{m_1=0}^{M-1} \sum_{m_2=0}^{m_1-1} R_{x_{m_1} x_{m_2}}(\tau_{m_1} - \tau_{m_2})    (4)

where K = \sum_{m=0}^{M-1} \sum_{n=0}^{L-1} x_m^2(n - \tau_m) is nearly constant with respect to the τ_m
delays and can thus be ignored when maximizing E. The cross-correlation
function can be approximated in the frequency domain as:

R_{ij}(\tau) \approx \sum_{k=0}^{L-1} X_i(k) X_j^*(k)\, e^{j 2\pi k \tau / L}    (5)

where X_i(k) is the discrete Fourier transform of x_i[n], X_i(k) X_j^*(k) is the cross-power
spectrum of x_i[n] and x_j[n], and (·)* denotes the complex conjugate.
Operation 22 (Figure 2)


A calculator 32 (Figure 3) computes the power spectra and cross-power
spectra in overlapping windows (50% overlap) of, for example, L = 1024 samples
at 48 kHz (see operation 22 of Figure 2 and calculator 32 of Figure 3).
Operation 23 (Figure 2)
A calculator 33 (Figure 3) then computes the cross-correlations R_ij(τ) by
averaging the cross-power spectra X_i(k) X_j^*(k) over, for example, a time
period of 4 frames (40 ms).
Operation 24 (Figure 2)
A calculator 34 (Figure 3) computes the beamformer output energy E from
the cross-correlations R_ij(τ) (see Equation 4). When the cross-correlations R_ij(τ)
are pre-computed, it is possible to compute the beamformer output energy E using
only M(M-1)/2 lookup and accumulation operations, whereas a time-domain
computation would require 2L(M + 2) operations. For M = 8 and 2562 directions, it
follows that the complexity of the search itself is reduced from 1.2 Gflops to only 1.7
Mflops. After counting all time-frequency transformations, the complexity is only
48.4 Mflops, 25 times less than a time-domain search with the same resolution.
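The frequency-domain approximation of Equation 5 amounts to taking the inverse DFT of the cross-power spectrum. A sketch, not part of the patent (the function name and test signal are illustrative):

```python
import numpy as np

def crosscorr_freq(xi, xj):
    """Cross-correlation R_ij(tau) computed in the frequency domain (Equation 5).

    Returns an array indexed by (circular) lag tau, obtained as the inverse
    DFT of the cross-power spectrum X_i(k) X_j(k)*.
    """
    Xi = np.fft.fft(xi)
    Xj = np.fft.fft(xj)
    # ifft realizes (1/L) sum_k Xi(k) Xj*(k) e^{j 2 pi k tau / L}
    return np.real(np.fft.ifft(Xi * np.conj(Xj)))

# A noise signal and a delayed copy: the correlation peaks at the true lag
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
lag = 37
y = np.roll(x, lag)            # y(n) = x(n - lag), circularly
R = crosscorr_freq(y, x)
assert int(np.argmax(R)) == lag
```

Once such per-pair correlation tables are in memory, evaluating the energy of one steering direction is indeed just one lookup and one addition per microphone pair, as stated above.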
2.2 Spectral weighting
Operation 42 (Figure 4)
A cross-correlation calculator 52 (Figure 5) computes, in the frequency
domain, whitened cross-correlations using the following expression:


R_{ij}^{(w)}(\tau) = \sum_{k=0}^{L-1} \frac{X_i(k) X_j^*(k)}{|X_i(k)|\,|X_j(k)|}\, e^{j 2\pi k \tau / L}    (6)

While it produces much sharper cross-correlation peaks, the whitened cross-
correlation has one drawback: each frequency bin of the spectrum contributes the
same amount to the final correlation, even if the signal at that frequency is
dominated by noise. This makes the system less robust to noise, while making
detection of voice (which has a narrow bandwidth) more difficult.
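The whitening of Equation 6 keeps only the phase of each frequency bin, which is what sharpens the peak. An illustrative sketch (names and the `eps` regularizer are assumptions, not from the patent):

```python
import numpy as np

def whitened_crosscorr(xi, xj, eps=1e-12):
    """Whitened cross-correlation R^(w)_ij(tau) of Equation 6.

    Dividing the cross-power spectrum by |X_i||X_j| discards magnitude and
    keeps phase only, sharpening the peak at the cost of noise robustness.
    """
    Xi = np.fft.fft(xi)
    Xj = np.fft.fft(xj)
    cross = Xi * np.conj(Xj)
    cross /= np.abs(Xi) * np.abs(Xj) + eps   # per-bin magnitude normalization
    return np.real(np.fft.ifft(cross))

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)
y = np.roll(x, 12)
Rw = whitened_crosscorr(y, x)
assert int(np.argmax(Rw)) == 12
# The whitened peak is far sharper than the rest of the correlation
assert Rw[12] > 5 * np.sort(Rw)[-2]
```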
Operation 43 (Figure 4)
In order to alleviate this problem, a weighting function 53 (Figure 5) is
applied to act as a mask based on the signal-to-noise ratio (SNR). For microphone
i, this weighting function 53 is defined as:

\zeta_i^n(k) = \frac{\xi_i^n(k)}{\xi_i^n(k) + 1}    (7)

where ξ_i^n(k) is an estimate of the a priori SNR at the i-th microphone, at time frame
n, for frequency k. This estimate of the a priori SNR can be computed using the
decision-directed approach proposed by Ephraim and Malah [Y. Ephraim and D.
Malah, "Speech enhancement using minimum mean-square error short-time
spectral amplitude estimator", IEEE Transactions on Acoustics, Speech and Signal
Processing, vol. ASSP-32, no. 6, pp. 1109-1121, 1984]:

\xi_i^n(k) = \frac{(1 - \alpha_d)\left[\zeta_i^{n-1}(k)\right]^2 \left|X_i^{n-1}(k)\right|^2 + \alpha_d \left|X_i^n(k)\right|^2}{\sigma_i^2(k)}    (8)


where α_d = 0.1 is an adaptation rate and σ_i^2(k) is a noise estimate for
microphone i. It is easy to estimate σ_i^2(k) using the Minima-Controlled
Recursive Average (MCRA) technique [I. Cohen and B. Berdugo, "Speech
enhancement for non-stationary noise environments", Signal Processing, vol. 81,
no. 2, pp. 2403-2418, 2001], which adapts the noise estimate during periods of low
energy.
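The decision-directed update of Equations 7 and 8 can be sketched per frame as below. This is an assumed illustration (function name, toy spectra and the constant noise floor are not from the patent; a real system would obtain σ² from MCRA):

```python
import numpy as np

def spectral_weight(X_prev, X_curr, zeta_prev, sigma2, alpha_d=0.1):
    """Decision-directed SNR mask (Equations 7-8), one frame update.

    X_prev, X_curr : complex spectra of the previous and current frames
    zeta_prev      : weighting function of the previous frame
    sigma2         : per-bin noise power estimate (e.g. from MCRA)
    Returns the updated weight zeta (Eq. 7) and a priori SNR xi (Eq. 8).
    """
    xi = ((1 - alpha_d) * (zeta_prev ** 2) * np.abs(X_prev) ** 2
          + alpha_d * np.abs(X_curr) ** 2) / sigma2          # Equation 8
    zeta = xi / (xi + 1)                                     # Equation 7
    return zeta, xi

# High-SNR bins approach weight 1, noise-dominated bins approach 0
sigma2 = np.ones(4)
X_prev = np.array([10.0, 10.0, 0.1, 0.1], dtype=complex)
X_curr = np.array([10.0, 10.0, 0.1, 0.1], dtype=complex)
zeta, xi = spectral_weight(X_prev, X_curr, np.ones(4), sigma2)
assert zeta[0] > 0.98 and zeta[2] < 0.1
```

Bins well above the noise floor thus pass almost unattenuated into Equation 10, while noisy bins are nearly masked out.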
Operation 44 (Figure 4)
It is also possible to make the system more robust to reverberation by
modifying the weighting function to include a reverberation term R_i(k) 54 (Figure
5) in the noise estimate. A simple reverberation model with exponential decay is
used:

R_i^n(k) = \gamma R_i^{n-1}(k) + (1 - \gamma)\,\delta \left|\zeta_i^{n-1}(k)\, X_i^{n-1}(k)\right|^2    (9)

where γ represents a reverberation decay for the room and δ is a level of
reverberation. In some sense, Equation 9 can be seen as modeling the precedence
effect [[J. Huang, N. Ohnishi, and N. Sugie, "Sound localization in reverberant
environment based on the model of the precedence effect", IEEE Transactions on
Instrumentation and Measurement, vol. 46, no. 4, pp. 842-846, 1997] and [J.
Huang, N. Ohnishi, X. Guo, and N. Sugie, "Echo avoidance in a computational
model of the precedence effect", Speech Communication, vol. 27, no. 3-4, pp. 223-
233, 1999]] in order to give less weight to frequency bins where a loud sound was
recently present. The resulting enhanced cross-correlation is defined as:

R_{ij}^{(e)}(\tau) = \sum_{k=0}^{L-1} \frac{\zeta_i^n(k) X_i(k)\, \zeta_j^n(k) X_j^*(k)}{|X_i(k)|\,|X_j(k)|}\, e^{j 2\pi k \tau / L}    (10)


2.3 Direction search on a spherical grid.
Operation 72 (Figure 7)
To reduce computation required and make the sound source localization and
tracking system isotropic, a uniform triangular grid 82 (Figure 8) for the
surface of a
sphere is created to define directions. To create the grid 82, an initial
icosahedral
grid is used [F. Giraldo, "Lagrange-galerkin methods on spherical geodesic
grids",
Journal of Computational Physics, vol. 136, pp. 197-213, 1997]. In the
illustrative
example of Figure 6, each triangle such as 61 in an initial 20-element grid 62
is
recursively subdivided into four smaller triangles such as 63 and, then, 64.
The
resulting grid is composed of 5120 triangles such as 64 and 2562 points such
as 65.
The beamformer energy is then computed for the hexagonal region such as 66
associated with each of these points 65. Each of the 2562 regions 66 covers a
radius of about 2.5° around its center, setting the resolution of the
search.
Operation 73 (Figure 7)
A calculator 83 (Figure 8) computes the cross-correlations R_{ij}^{(e)}(τ) using
Equation 10.
Operation 74 (Figure 7)
In this operation the following Algorithm 1 is defined.
Algorithm 1 Steered beamformer direction search
for all grid index d do
    E_d ← 0
    for all microphone pair ij do
        τ ← lookup(d, ij)
        E_d ← E_d + R_{ij}^{(e)}(τ)
    end for
end for
direction of source ← argmax_d E_d
Once the cross-correlations R_{ij}^{(e)}(τ) are computed, the search for the best
direction on the grid can be performed as described by Algorithm 1 (see 84 of
Figure 8).
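A minimal Python rendering of Algorithm 1 (illustrative only; the data layout with dicts for the correlation tables and the TDOA lookup is an assumption made here):

```python
import numpy as np

def steered_beamformer_search(R, lookup, n_dirs):
    """Algorithm 1: direction search over the spherical grid.

    R      : dict mapping pair (i, j) -> cross-correlation array R^(e)_ij(tau)
    lookup : dict mapping (d, i, j) -> precomputed TDOA in samples
    Returns (best direction index, its energy E_d).
    """
    best_d, best_e = -1, -np.inf
    for d in range(n_dirs):
        e_d = 0.0
        for (i, j), r in R.items():          # all microphone pairs
            e_d += r[lookup[(d, i, j)]]      # one lookup + one accumulation
        if e_d > best_e:
            best_d, best_e = d, e_d
    return best_d, best_e

# Toy example: one microphone pair, three candidate directions
R = {(0, 1): np.array([0.1, 0.9, 0.2, 0.05])}     # correlation peak at tau = 1
lookup = {(0, 0, 1): 0, (1, 0, 1): 1, (2, 0, 1): 3}
d, e = steered_beamformer_search(R, lookup, 3)
assert d == 1
```

The inner loop is exactly the M(M-1)/2 lookup-and-accumulate operations per direction discussed in Section 2.1.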
Operation 75 (Figure 7)
The lookup parameter of Algorithm 1 is a pre-computed table 85 (Figure 8)
of the TDOA for each pair of microphones and each direction on the grid on the
sphere. Using the far-field assumption [J.-M. Valin, F. Michaud, J. Rouat, and D.
Letourneau, "Robust sound source localization using a microphone array on a
mobile robot", in Proceedings IEEE/RSJ International Conference on Intelligent
Robots and Systems, 2003, pp. 1228-1233], the TDOA in samples is computed as:

\tau_{ij} = \frac{F_s}{c}\, (\vec{x}_j - \vec{x}_i) \cdot \vec{u}    (11)

where \vec{x}_i is the position of microphone i, \vec{u} is a unit vector that points in the
direction of the source, c is the speed of sound and F_s is the sampling rate.
Equation 11 assumes that the time delay is proportional to the distance between the
source and microphone. This is only true when there is no diffraction involved.
While this hypothesis is only verified for an "open" array (all microphones
are in line


of sight with the source), in practice it can be demonstrated experimentally
that the
approximation is sufficiently good for the sound source localization and
tracking
system to work for a "closed" array (in which there are obstacles within the
array).
For an array of M microphones and an N-element grid, Algorithm 1 requires
M(M-1)N table memory accesses and M(M-1)N/2 additions. In the proposed
configuration (N = 2562, M = 8), the accessed data can be made to fit entirely
in a
modern processor's L2 cache.
Operation 76 (Figure 7)
A finder 86 (Figure 8) uses Algorithm 1 and the lookup parameter table 85 to
localize the loudest sound source in a certain direction by maximizing the
output
energy of the steered beamformer.
Operation 77 (Figure 7)
In order to localize other sound sources that may be present, the process is
repeated by removing the contribution of the first source to the cross-correlations,
leading to Algorithm 2 (see 87 in Figure 8). Since the number of sound sources
is
unknown, the system is designed to look for a predetermined number of sound
sources, for example four sources which is then the maximum number of sources
the beamformer is able to locate at once. This situation leads to a high rate of false
detection, even when fewer than four sources are present. That problem is handled
by
the particle filter described in the following description.
Algorithm 2 Localization of multiple sources
for q = 1 to assumed number of sources do
    D_q ← Steered beamformer direction search
    for all microphone pair ij do
        τ ← lookup(D_q, ij)
        R_{ij}^{(e)}(τ) ← 0
    end for
end for
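Algorithm 2 can be sketched as a repeated application of the direction search, zeroing the winner's contribution each time (illustrative only; the dict-based data layout mirrors the Algorithm 1 sketch and is an assumption):

```python
import numpy as np

def localize_multiple(R, lookup, n_dirs, n_sources=4):
    """Algorithm 2 sketch: find several sources by repeated search.

    After each search the winning direction's contribution is zeroed out
    of the cross-correlations, so the next-loudest source can emerge.
    """
    directions = []
    for _ in range(n_sources):
        best_d, best_e = -1, -np.inf
        for d in range(n_dirs):
            e_d = sum(r[lookup[(d, i, j)]] for (i, j), r in R.items())
            if e_d > best_e:
                best_d, best_e = d, e_d
        directions.append(best_d)
        for (i, j), r in R.items():                 # remove this source
            r[lookup[(best_d, i, j)]] = 0.0
    return directions

R = {(0, 1): np.array([0.1, 0.9, 0.6, 0.05])}
lookup = {(0, 0, 1): 0, (1, 0, 1): 1, (2, 0, 1): 2}
dirs = localize_multiple(R, lookup, 3, n_sources=2)
assert dirs == [1, 2]
```

Because the loop always returns a fixed number of directions, the weakest ones may be spurious, which is exactly the false-detection behaviour the particle filter must absorb.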
Operation 78 (Figure 7)
When a source is located using Algorithm 1, the direction accuracy is limited
by the size of the grid being used. It is however possible, as an optional
operation,
to further refine the source location estimate. For that purpose, a refined
grid 88
(Figure 8) is defined for the surrounding of the point where a sound source
was
found. To take into account the near-field effects, the grid is refined in
three
dimensions: horizontally, vertically and over distance. For example, using
five points
in each direction, a 125-point local grid can be obtained with a maximum error
of
about 1°. For the near-field case, Equation 11 no longer holds, so it is necessary to
compute the TDOA of operation 75 using the following relation:

\tau_{ij} = \frac{F_s}{c} \left( \left\| d\,\vec{u} - \vec{x}_i \right\| - \left\| d\,\vec{u} - \vec{x}_j \right\| \right)    (12)

where d is the distance between the source and the center of the array. Equation
12 is evaluated for different distances d in order to find the direction of the source
with improved accuracy.
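The far-field and near-field delay models of Equations 11 and 12 can be sketched as below (illustrative; sign convention, constants and microphone coordinates are assumptions made here, and the two formulas must agree in the large-d limit):

```python
import numpy as np

C = 343.0      # speed of sound (m/s), assumed
FS = 48000.0   # sampling rate (Hz)

def tdoa_far_field(xi, xj, u):
    """Equation 11 sketch: far-field TDOA in samples for pair (i, j)."""
    return FS / C * np.dot(np.asarray(xj) - np.asarray(xi), u)

def tdoa_near_field(xi, xj, u, d):
    """Equation 12 sketch: TDOA for a source at distance d along u."""
    s = d * np.asarray(u)
    return FS / C * (np.linalg.norm(s - np.asarray(xi))
                     - np.linalg.norm(s - np.asarray(xj)))

# As d grows, the near-field delay converges to the far-field value
xi, xj = [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]
u = np.array([1.0, 0.0, 0.0])
far = tdoa_far_field(xi, xj, u)
assert abs(tdoa_near_field(xi, xj, u, 100.0) - far) < 0.01
```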
3. Particle-Based Tracking


The steered beamformer described hereinabove provides only
instantaneous, noisy information about the possible presence and position of
sound
sources but fails to provide information about the behaviour of the sound
source in
time (tracking). For that reason, it is desirable to use a probabilistic
temporal
integration to track different sound sources based on all measurements
available up
to the current time. Particle filters are an effective way of tracking sound
sources.
Using this approach, hypotheses about the state of each sound source are
represented as a set of particles to which different weights are assigned.
At time t, the case of sources j = 0, 1, ..., M-1, each modeled using N
particles of positions x_{j,i}^{(t)} and weights w_{j,i}^{(t)}, is considered. The state vector
for the particles is composed of six dimensions, three for position and three for its
derivative:

s_{j,i}^{(t)} = \begin{bmatrix} x_{j,i}^{(t)} \\ \dot{x}_{j,i}^{(t)} \end{bmatrix}    (13)
Since the position is constrained to lie on a unit sphere and the speed is
tangent to
the sphere, there are only four degrees of freedom. The particle filtering
outlined in
Figure 10 is generalized to an arbitrary and non-constant number of sources. It
does
so by maintaining a set of particles for each source being tracked and by
computing
the assignment between measurements and the sources being tracked. This is
different from the approach described in [J. Vermaak, A. Doucet, and P. Perez,
"Maintaining multi-modality through mixture tracking", in Proceedings
International
Conference on Computer Vision (ICCV), 2003, pp. 1950-1954] for preserving
multi-
modality because in the present case each mode has to be a different source.


Algorithm 3 Particle-based tracking algorithm
(1) Predict the state s^{(t)} from s^{(t-1)} for each source j
(2) Compute probabilities associated with the steered beamformer response
(3) Compute probabilities P_{q,j} associating beamformer peaks to sources being
tracked
(4) Add or remove sources if necessary
(5) Compute updated particle weights w_{j,i}^{(t)}
(6) Compute position estimate \hat{x}_j^{(t)} for each source
(7) Resample particles for each source if necessary
3.1 Prediction
Operation 101 (Figure 10)
During this operation, the state predictor 111 (Figure 11) predicts the state
s^{(t)} from the state s^{(t-1)} for each sound source j.
Operation 102 (Figure 10)
The excitation-damping model as proposed in [D. B. Ward, E. A. Lehmann,
and R. C. Williamson, "Particle filtering algorithms for tracking an acoustic
source in
a reverberant environment", IEEE Transactions on Speech and Audio Processing,
vol. 11, no. 6, 2003] is used as a predictor 112 (Figure 11):


\dot{x}_{j,i}^{(t)} = a\, \dot{x}_{j,i}^{(t-1)} + b\, F_x    (14)

x_{j,i}^{(t)} = x_{j,i}^{(t-1)} + \Delta T\, \dot{x}_{j,i}^{(t)}    (15)

where a = e^{-\alpha \Delta T} controls the damping term, b = \beta \sqrt{1 - a^2} controls the
excitation term, F_x is a normally distributed random variable of unit variance and
ΔT is the time interval between updates.
Operation 103 (Figure 10)
A means 113 (Figure 11) considers three possible states:
- Stationary source (α = 2, β = 0.04);
- Constant velocity source (α = 0.05, β = 0.2);
- Accelerated source (α = 0.5, β = 0.2);
and predicts the stationary, constant velocity or accelerated state of the sound
source.
Operation 104 (Figure 10)
A means 114 (Figure 11) conducts a normalization step to ensure that the
particle position x_{j,i}^{(t)} still lies on the unit sphere (||x_{j,i}^{(t)}|| = 1) after applying
Equations 14 and 15.
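The prediction step of Equations 14 and 15, followed by the renormalization onto the unit sphere, can be sketched as follows (illustrative; function name, seeded generator and the parameter choice are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

def predict_particle(x, xdot, alpha, beta, dT):
    """Excitation-damping prediction (Equations 14-15) with renormalization.

    x, xdot : particle position (on the unit sphere) and velocity, 3-vectors
    alpha   : damping rate; beta: excitation level; dT: update interval
    """
    a = np.exp(-alpha * dT)                # damping coefficient
    b = beta * np.sqrt(1.0 - a ** 2)       # excitation coefficient
    xdot = a * xdot + b * rng.standard_normal(3)    # Equation 14
    x = x + dT * xdot                               # Equation 15
    x /= np.linalg.norm(x)                 # project back onto the unit sphere
    return x, xdot

x = np.array([1.0, 0.0, 0.0])
xdot = np.zeros(3)
for _ in range(10):                        # "accelerated source" parameters
    x, xdot = predict_particle(x, xdot, alpha=0.5, beta=0.2, dT=0.04)
assert abs(np.linalg.norm(x) - 1.0) < 1e-12
```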
3.2 Probabilities from the beamformer response
Operation 105 (Figure 10)


During this operation, the calculator 115 calculates probabilities from the
beamformer response.
Operation 106 (Figure 10)
The above-described steered beamformer produces an observation O^{(t)} for
each time t. The observation O^{(t)} = [O_0^{(t)} ... O_{Q-1}^{(t)}] is composed of Q potential
source locations y_q found by Algorithm 2, as well as the energy E_0 (from
Algorithm 1) of the beamformer for the first (most likely) potential source q = 0.
O^{(1:t)} denotes the set of all observations up to time t.
A calculator 116 (Figure 11) computes a probability P_q that the potential
source q is real (not a false detection). The higher the beamformer energy, the
more likely a potential source is real. For q > 0, false alarms are very frequent and
independent of energy. With this in mind, the probability P_q is defined empirically
as:
P_q = \begin{cases} v^2/2, & q = 0,\ v \le 1 \\ 1 - v^{-2}/2, & q = 0,\ v > 1 \\ 0.3, & q = 1 \\ 0.16, & q = 2 \\ 0.03, & q = 3 \end{cases}    (16)


with v = E_0 / E_T, where E_T is a threshold that depends on the number of
microphones, the frame size and the analysis window used (for example E_T = 150
can be used). Figure 9 shows an example of P_q values for four moving sources
with azimuth as a function of time.
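The piecewise definition of Equation 16 is straightforward to encode (a sketch; the function name is illustrative and E_T = 150 is the example threshold given above):

```python
def source_probability(q, E0, ET=150.0):
    """Empirical probability P_q that potential source q is real (Equation 16).

    q  : rank of the potential source from Algorithm 2 (0 = loudest)
    E0 : beamformer energy of the first potential source
    """
    v = E0 / ET
    if q == 0:
        # Energy-dependent only for the loudest peak
        return v ** 2 / 2 if v <= 1 else 1 - v ** -2 / 2
    # Later peaks get fixed empirical priors, independent of energy
    return {1: 0.3, 2: 0.16, 3: 0.03}[q]

# A strong first peak is almost certainly real; weak peaks are doubtful
assert source_probability(0, E0=600.0) == 1 - (600.0 / 150.0) ** -2 / 2  # 0.96875
assert source_probability(0, E0=75.0) == 0.125
assert source_probability(2, E0=600.0) == 0.16
```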
Operation 107 (Figure 10)
A calculator 117 (Figure 11) computes, at time t, a probability density of
observing O_q^{(t)} for a source located at particle position x_{j,i}^{(t)} using the following
relation:

p\left(O_q^{(t)} \mid x_{j,i}^{(t)}\right) = N\left(y_q;\, x_{j,i}^{(t)};\, \sigma^2\right)    (17)

where N(y_q; x_{j,i}^{(t)}; σ^2) is a normal distribution centered at x_{j,i}^{(t)} with variance σ^2 that
corresponds to the accuracy of the steered beamformer. For example, σ = 0.05 is
used, which corresponds to an RMS error of 3 degrees for the location found by the
steered beamformer.
3.3 Probabilities for multiple sources
Operation 108 (Figure 10)
During this operation, probabilities for multiple sources are calculated.
Before deriving the update rule for the particle weights w_{j,i}^{(t)}, the concept of
source-observation assignment will be introduced. For each potential source q
detected by the steered beamformer, there are three possibilities:


- It is a false detection ($H_0$).
- It corresponds to one of the sources currently tracked ($H_1$).
- It corresponds to a new source that is not yet being tracked ($H_2$).

In the case of possibility $H_1$, it must be determined which real source $j$
corresponds to potential source $q$. First, it is assumed that a potential
source may correspond to at most one real source and that a real source may
correspond to at most one potential source.
Let $f : \{0, 1, \ldots, Q-1\} \to \{-2, -1, 0, 1, \ldots, M-1\}$ be a function
assigning observation $q$ to source $j$ (the value $-2$ is used for a false
detection and $-1$ is used for a new source). Figure 12 illustrates a
hypothetical case with four potential sources detected by the steered
beamformer and their assignment to the real sources. Knowing
$P(f \mid O^{(t)})$ for all possible $f$, a calculator 118 computes the
probability $P_{q,j}^{(t)}$ that the real source $j$ corresponds to the
potential source $q$ using the following expressions:

$$P_{q,j}^{(t)} = \sum_{f} \delta_{j,f(q)}\, P\left(f \mid O^{(t)}\right) \qquad (18)$$

$$P_q^{(t)}\left(H_0\right) = \sum_{f} \delta_{-2,f(q)}\, P\left(f \mid O^{(t)}\right) \qquad (19)$$

$$P_q^{(t)}\left(H_2\right) = \sum_{f} \delta_{-1,f(q)}\, P\left(f \mid O^{(t)}\right) \qquad (20)$$

where $\delta_{i,j}$ is the Kronecker delta.


Omitting $t$ for clarity, the calculator 118 also computes the probability
$P(f \mid O)$ that a certain mapping function $f$ is the correct assignment
function using the following relation:

$$P(f \mid O) = \frac{p(O \mid f)\, P(f)}{p(O)} \qquad (21)$$

Knowing that $\sum_{f} P(f \mid O) = 1$, computing the denominator $p(O)$ can be
avoided by using normalization. Assuming conditional independence of the
observations given the mapping function, we obtain:

$$p(O \mid f) = \prod_{q} p\left(O_q \mid f(q)\right) \qquad (22)$$
It is assumed that the distributions of the false detections ($H_0$) and the new
sources ($H_2$) are uniform, while the distribution for a tracked source is
derived from the particles, giving:

$$p\left(O_q \mid f(q)\right) = \begin{cases} 1/4\pi, & f(q) = -2 \\ 1/4\pi, & f(q) = -1 \\ \sum_{i} w_{f(q),i}\, p\left(O_q \mid x_{f(q),i}\right), & f(q) \ge 0 \end{cases} \qquad (23)$$

The a priori probability of the function $f$ being the correct assignment is
also assumed to come from independent individual components, so that:

$$P(f) = \prod_{q} P\left(f(q)\right) \qquad (24)$$
Q


with

$$P\left(f(q)\right) = \begin{cases} \left(1 - P_q\right) P_{\mathrm{false}}, & f(q) = -2 \\ P_q\, P_{\mathrm{new}}, & f(q) = -1 \\ P_q\, P\left(\mathrm{Obs}_j^{(t)} \mid O^{t-1}\right), & f(q) \ge 0 \end{cases} \qquad (25)$$

where $P_{\mathrm{new}}$ is the a priori probability that a new source appears
and $P_{\mathrm{false}}$ is the a priori probability of a false detection. The
probability $P(\mathrm{Obs}_j^{(t)} \mid O^{t-1})$ that source $j$ is observable
(i.e., that it exists and is active) at time $t$ is given by the following
relation:

$$P\left(\mathrm{Obs}_j^{(t)} \mid O^{t-1}\right) = P\left(E_j \mid O^{t-1}\right) P\left(A_j^{(t)} \mid O^{t-1}\right) \qquad (26)$$
where $E_j$ is the event that source $j$ actually exists and $A_j^{(t)}$ is the
event that it is active (but not necessarily detected) at time $t$. By active,
it is meant that the signal it emits is non-zero (for example, a speaker who is
not making a pause). The probability that the sound source exists is given by:

$$P\left(E_j \mid O^{t-1}\right) = P_j^{(t-1)} + \left(1 - P_j^{(t-1)}\right) \frac{P_0\, P\left(E_j \mid O^{t-2}\right)}{1 - \left(1 - P_0\right) P\left(E_j \mid O^{t-2}\right)} \qquad (27)$$

where $P_0$ is the a priori probability that a source is not observed (i.e.,
not detected by the steered beamformer) even if it exists (for example,
$P_0 = 0.2$ in the present case). The quantity
$P_j^{(t)} = \sum_{q} P_{q,j}^{(t)}$ is computed by the calculator 118 and
represents the probability that source $j$ is observed at time $t$ (assigned to
any of the potential sources).
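For small $Q$ and $M$, Equations 18-25 can be evaluated by brute-force enumeration of every admissible mapping function $f$, normalizing instead of computing $p(O)$. The helper below is a hypothetical sketch (function and argument names, and the default values of $P_{\mathrm{new}}$ and $P_{\mathrm{false}}$, are assumptions); it takes the quantities defined above as precomputed inputs:

```python
import itertools

def assignment_marginals(pq, p_obs_given_x, p_observable,
                         p_new=0.005, p_false=0.05):
    """Brute-force sketch of Equations 18-25.

    pq[q]               : probability that potential source q is real (Eq. 16)
    p_obs_given_x[q][j] : sum_i w_{j,i} p(O_q | x_{j,i}) for source j (Eq. 23)
    p_observable[j]     : P(Obs_j | O^(t-1)) (Eq. 26)
    Returns (P_qj, P_H0, P_H2), the marginal assignment probabilities.
    """
    Q, M = len(pq), len(p_observable)
    uniform = 1.0 / (4.0 * 3.141592653589793)  # uniform density on the sphere
    labels = [-2, -1] + list(range(M))
    post = {}
    for f in itertools.product(labels, repeat=Q):
        used = [j for j in f if j >= 0]
        if len(used) != len(set(used)):   # a real source matches at most one q
            continue
        w = 1.0
        for q, j in enumerate(f):
            if j == -2:    # false detection: uniform likelihood, prior (25)
                w *= uniform * (1.0 - pq[q]) * p_false
            elif j == -1:  # new source: uniform likelihood, prior (25)
                w *= uniform * pq[q] * p_new
            else:          # existing tracked source: particle-based likelihood
                w *= p_obs_given_x[q][j] * pq[q] * p_observable[j]
        post[f] = w
    Z = sum(post.values())                # normalization replaces p(O)
    P_qj = [[sum(w for f, w in post.items() if f[q] == j) / Z
             for j in range(M)] for q in range(Q)]
    P_H0 = [sum(w for f, w in post.items() if f[q] == -2) / Z for q in range(Q)]
    P_H2 = [sum(w for f, w in post.items() if f[q] == -1) / Z for q in range(Q)]
    return P_qj, P_H0, P_H2
```

Enumeration costs $O((M+2)^Q)$, which is acceptable for the $Q \le 4$ potential sources produced by the beamformer.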
Assuming a first-order Markov process, the following relation for the
probability of source activity can be written:

$$P\left(A_j^{(t)} \mid O^{t-1}\right) = P\left(A_j^{(t)} \mid A_j^{(t-1)}\right) P\left(A_j^{(t-1)} \mid O^{t-1}\right) + P\left(A_j^{(t)} \mid \neg A_j^{(t-1)}\right)\left[1 - P\left(A_j^{(t-1)} \mid O^{t-1}\right)\right] \qquad (28)$$

with $P(A_j^{(t)} \mid A_j^{(t-1)})$ the probability that an active source
remains active (for example, set to 0.95), and
$P(A_j^{(t)} \mid \neg A_j^{(t-1)})$ the probability that an inactive source
becomes active again (for example, set to 0.05). Assuming that the active and
inactive states are a priori equiprobable, the activity probability is computed
using Bayes' rule:

$$P\left(A_j^{(t)} \mid O^{t}\right) = \left[1 + \frac{1 - P\left(A_j^{(t)} \mid O^{t-1}\right)}{P\left(A_j^{(t)} \mid O^{t-1}\right)} \cdot \frac{1 - P_j^{(t)}}{P_j^{(t)}}\right]^{-1} \qquad (29)$$
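Equations 26-28 combine into a short update. The sketch below is an assumption in its naming and interface, with $P_0 = 0.2$, 0.95 and 0.05 as the example values suggested above; it returns $P(\mathrm{Obs}_j^{(t)} \mid O^{t-1})$:

```python
def observable_probability(P_j_prev, P_E_prev2, P_A_prev,
                           p0=0.2, p_stay=0.95, p_wake=0.05):
    """Sketch of Equations 26-28: probability that source j is observable.

    P_j_prev  : P_j^(t-1), probability the source was observed at t-1
    P_E_prev2 : P(E_j | O^(t-2)), previous existence probability
    P_A_prev  : P(A_j^(t-1) | O^(t-1)), previous activity probability
    """
    # Eq. 27: a detection confirms existence; a miss triggers a Bayes
    # update that discounts existence by the miss probability p0.
    p_E = P_j_prev + (1.0 - P_j_prev) * (p0 * P_E_prev2) / \
        (1.0 - (1.0 - p0) * P_E_prev2)
    # Eq. 28: first-order Markov prediction of activity.
    p_A = p_stay * P_A_prev + p_wake * (1.0 - P_A_prev)
    return p_E * p_A  # Eq. 26
```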
3.4 Weight update
Operation 109 (Figure 10)
A calculator 119 (Figure 11) computes updated particle weights $w_{j,i}^{(t)}$.
At time $t$, the new particle weights for source $j$ are defined as:


$$w_{j,i}^{(t)} = p\left(x_{j,i}^{(t)} \mid O^{t}\right) \qquad (30)$$

Assuming that the observations are conditionally independent given the source
position, and knowing that for a given source $j$,
$\sum_{i=1}^{N} w_{j,i}^{(t)} = 1$, Bayesian inference gives:

$$w_{j,i}^{(t)} = \frac{p\left(O^{(t)} \mid x_{j,i}^{(t)}\right) p\left(x_{j,i}^{(t)} \mid O^{t-1}\right)}{p\left(O^{(t)} \mid O^{t-1}\right)} = \frac{p\left(x_{j,i}^{(t)} \mid O^{(t)}\right)}{\sum_{i=1}^{N} p\left(x_{j,i}^{(t)} \mid O^{(t)}\right)} \qquad (31)$$
Let $I_j^{(t)}$ denote the event that source $j$ is observed at time $t$ and,
knowing that $P(I_j^{(t)}) = P_j^{(t)} = \sum_{q} P_{q,j}^{(t)}$, we obtain:

$$p\left(x_{j,i}^{(t)} \mid O^{(t)}\right) = \left(1 - P_j^{(t)}\right) p\left(x_{j,i}^{(t)} \mid O^{(t)}, \neg I_j^{(t)}\right) + P_j^{(t)}\, p\left(x_{j,i}^{(t)} \mid O^{(t)}, I_j^{(t)}\right) \qquad (32)$$

In the case where no observation matches the source, all particle positions
have the same probability of being observed, so we obtain:

$$p\left(x_{j,i}^{(t)} \mid O^{(t)}\right) = \left(1 - P_j^{(t)}\right) \frac{1}{N} + P_j^{(t)} \frac{\sum_{q} P_{q,j}^{(t)}\, p\left(O_q^{(t)} \mid x_{j,i}^{(t)}\right)}{\sum_{i=1}^{N} \sum_{q} P_{q,j}^{(t)}\, p\left(O_q^{(t)} \mid x_{j,i}^{(t)}\right)} \qquad (33)$$


where the denominator of the second term in Equation 33 ensures that
$\sum_{i=1}^{N} p\left(x_{j,i}^{(t)} \mid O^{(t)}, I_j^{(t)}\right) = 1$.
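Equation 33 translates directly into a weight update; the helper below is a sketch under the naming assumptions used in the earlier examples:

```python
def update_weights(P_j, P_qj, lik):
    """Sketch of Equation 33: posterior probability of each particle.

    P_j  : P_j^(t), probability that source j was observed at time t
    P_qj : P_qj[q], assignment probability of potential source q to source j
    lik  : lik[q][i] = p(O_q | x_{j,i}), per-particle observation likelihoods
    Returns the new (normalized) particle weights w_{j,i}.
    """
    N = len(lik[0])
    # Matched term: sum_q P_qj[q] * p(O_q | x_{j,i}) for each particle i.
    matched = [sum(P_qj[q] * lik[q][i] for q in range(len(lik)))
               for i in range(N)]
    Z = sum(matched)  # denominator of the second term in Eq. 33
    w = [(1.0 - P_j) / N + P_j * m / Z for m in matched]
    s = sum(w)
    return [wi / s for wi in w]  # enforce sum_i w_{j,i} = 1
```

By construction the two terms of Equation 33 already sum to one over the particles, so the final normalization is only a numerical safeguard.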
3.5 Adding or removing sources
Operation 110 (Figure 10)
During this operation, an adder/subtractor adds or removes sound sources.
Operation 121 (Figure 10)
In a real environment, sources may appear or disappear at any moment. If, at
any time, $P_q(H_2)$ is higher than a threshold set, for example, to 0.3, it is
considered that a new source is present. The adder 131 (Figure 11) then adds a
new source, and a set of particles is created for source $q$. Even when a new
source is created, it is only assumed to exist if its probability of existence
$P(E_j \mid O^{t})$ reaches a certain threshold, which is set, for example, to
0.98.
Operation 122 (Figure 10)
In the same manner, a time limit is set on sources. If a source has not been
observed ($P_j^{(t)} < T_{\mathrm{obs}}$) for a certain period of time, it is
considered that it no longer exists and the subtractor 132 (Figure 11) removes
this source. In that case, the corresponding particle filter is no longer
updated nor considered in future calculations.
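The add/remove bookkeeping described above can be sketched as follows; the 0.3 threshold for new sources mirrors the text, while `T_obs = 0.5`, the 50-frame timeout, and the whole interface are illustrative assumptions:

```python
def manage_sources(last_seen, P_H2_per_q, P_j, t,
                   new_thresh=0.3, T_obs=0.5, max_silent=50):
    """Hypothetical bookkeeping sketch for Section 3.5.

    last_seen  : dict j -> last frame at which source j was observed
    P_H2_per_q : P_q(H2) for each potential source q (Eq. 20)
    P_j        : dict j -> P_j^(t), observation probability at time t
    t          : current frame index
    Returns (births, deaths): potential sources that should spawn a new
    particle filter, and tracked sources that should be removed.
    """
    # A potential source with P_q(H2) above the threshold spawns a filter.
    births = [q for q, p in enumerate(P_H2_per_q) if p > new_thresh]
    deaths = []
    for j in list(last_seen):
        if P_j.get(j, 0.0) >= T_obs:
            last_seen[j] = t              # source observed: refresh timestamp
        elif t - last_seen[j] > max_silent:
            deaths.append(j)              # silent for too long: remove
            del last_seen[j]
    return births, deaths
```

A newly spawned filter would additionally be held provisional until its existence probability $P(E_j \mid O^{t})$ reaches the 0.98 threshold mentioned above.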
3.6 Parameter estimation


Operation 123 (Figure 10)
Parameter estimation is conducted during this operation.
More specifically, a parameter estimator 133 obtains an estimated position of
each source as a weighted average of the positions of its particles:

$$\overline{x}_j^{(t)} = \sum_{i=1}^{N} w_{j,i}^{(t)}\, x_{j,i}^{(t)} \qquad (34)$$
It is however possible to obtain better accuracy simply by adding a delay to
the algorithm. This can be achieved by augmenting the state vector with past
position values. At time $t$, the position at time $t - T$ is thus expressed
as:

$$\overline{x}_j^{(t-T)} = \sum_{i=1}^{N} w_{j,i}^{(t)}\, x_{j,i}^{(t-T)} \qquad (35)$$

Using the same example as in Figure 9, Figure 13 shows how the particle filter
is capable of removing the noise and producing smooth trajectories. The added
delay produces an even smoother result.
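Equations 34 and 35 are a single weighted average; a minimal sketch:

```python
def estimate_position(weights, positions):
    """Weighted average of particle positions (Eqs. 34-35).
    With current positions this gives Eq. 34; keeping each particle's
    position history and averaging the positions held at time t - T
    gives the delayed estimate of Eq. 35."""
    dim = len(positions[0])
    return tuple(sum(w * x[d] for w, x in zip(weights, positions))
                 for d in range(dim))
```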
3.7 Resampling
Operation 124 (Figure 10)
Resampling is performed by a resampler 134 (Figure 10) only when

$$N_{\mathrm{eff}} \approx \left[\sum_{i=1}^{N}\left(w_{j,i}^{(t)}\right)^{2}\right]^{-1} < N_{\min}$$

[A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling
methods for Bayesian filtering", Statistics and Computing, vol. 10, pp.
197-208, 2000], with $N_{\min} = 0.7N$. This criterion ensures that resampling
only occurs when new data is available for a certain source. Otherwise,
resampling would cause an unnecessary reduction in particle diversity, due to
some particles randomly disappearing.
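The resampling criterion and a resampler can be sketched as follows; systematic resampling is one common choice and is an assumption here, since the text only specifies when to resample, not how:

```python
import random

def effective_sample_size(weights):
    """N_eff ~= 1 / sum_i w_i^2 (Doucet et al., 2000)."""
    return 1.0 / sum(w * w for w in weights)

def maybe_resample(particles, weights, n_min_ratio=0.7, rng=random.random):
    """Resample only when N_eff < N_min = 0.7 * N; otherwise keep the
    particle set untouched to preserve diversity."""
    n = len(particles)
    if effective_sample_size(weights) >= n_min_ratio * n:
        return particles, weights
    # Cumulative distribution of the weights.
    cdf, acc = [], 0.0
    for w in weights:
        acc += w
        cdf.append(acc)
    # Systematic resampling: one random offset, n evenly spaced pointers.
    offset = rng() / n
    out, j = [], 0
    for i in range(n):
        p = offset + i / n
        while j < n - 1 and cdf[j] < p:
            j += 1
        out.append(particles[j])
    return out, [1.0 / n] * n
```

After an actual resampling step the weights are reset to $1/N$, since all surviving particles are then equally probable samples of the posterior.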
4. Results
The proposed sound source localization and tracking method and system
were tested using an array of omni-directional microphones, each composed of
an
electret cartridge mounted on a simple pre-amplifier. The array was composed
of
eight microphones since this is the maximum number of analog input channels on
commercially available soundcards; of course, it is within the scope of the
present
invention to use a number of microphones different from eight (8). Two array
configurations were used for the evaluation of the sound source localization
and
tracking method and system. The first configuration (C1 ) was an open array
and
included inexpensive microphones arranged on the summits of a 16 cm cube
mounted on top of the Spartacus robot (not shown). The second configuration
(C2)
was a closed array and used smaller, mid-priced microphones, placed
through holes at different locations on the body of the robot. For both
arrays, all
channels were sampled simultaneously using a RME Hammerfall Multiface DSP
connected to a laptop computer through a CardBus interface. Running the sound
source localization and tracking system in real-time currently required 25% of
a 1.6
GHz Pentium-M CPU. Due to the low complexity of the particle filtering
algorithm, it
was possible to use 1000 particles per source without any noticeable increase
in
complexity. This also means that the CPU time cost does not increase
significantly
with the number of sources present. For all tasks, configurations and
environments,
all parameters had the same value, except for the reverberation decay, which
was
set to 0.65 in the E1 environment and 0.85 in the E2 environment.


Experiments were conducted in two different environments. The first
environment (E1 ) was a medium-size room (10 m x 11 m, 2.5 m ceiling) with a
reverberation time (-60 dB) of 350 ms. The second environment (E2) was a hall
(16
m x 17 m, 3.1 m ceiling, connected to other rooms) with 1.0 s reverberation
time.
4.1 Characterization
The system was characterized in environment E1 in terms of detection
reliability and accuracy. Detection reliability is defined as the capacity to
detect and
localize sounds within 10 degrees, while accuracy is defined as the
localization
error for sources that are detected. Three different types of sound were used:
a
hand clap, the test sentence "Spartacus, come here", and a burst of white
noise
lasting 100 ms. The sounds were played from a speaker placed at different
locations around the robot and at three different heights: 0.1 m, 1 m, 1.4 m.
4.1.1 Detection Reliability
Detection reliability was tested at distances (measured from the center of the
array) ranging from 1 m (a normal distance for close interaction) to 7m
(limitations of
the room). Three indicators were computed: correct localization (within 10
degrees),
reflections (incorrect elevation due to roof or ceiling reflections), and other errors.
For all
For all
indicators, the number of occurrences divided by the number of sounds played
was
computed. This test included 1440 sounds at a 22.5° interval for 1 m
and 3 m and
360 sounds at a 90° interval for 5 m and 7 m.
Results are shown in Table 1 for both C1 and C2 configurations. In
configuration C1, results show near-perfect reliability even at seven meter
distance.
For C2, reliability depends on the sound type, so detailed results for
different
sounds are provided in Table 2.


Like most localization algorithms, the sound source localization and tracking
method and system was unable to detect pure tones. This behavior is explained
by
the fact that sinusoids occupy only a very small region of the spectrum and
thus
have a very small contribution to the cross-correlations with the proposed
weighting.
It must be noted that tones tend to be more difficult to localize even for the
human
auditory system.
Table 1
Detection reliability for C1 and C2 configurations

Distance   Correct (%)      Reflection (%)   Other error (%)
           C1      C2       C1      C2       C1      C2
1 m        100     94.2     0.0     7.0      0.0     1.3
3 m        99.4    80.6     0.0     21.0     0.3     0.1
5 m        98.3    89.4     0.0     0.0      0.0     1.1
7 m        100     85.0     0.6     1.0      0.0     1.6


Table 2
Correct localization rate as a function of sound type
and distance for C2 configuration

Distance   Hand clap (%)   Speech (%)   Noise (%)
1 m        88.3            98.3         95.8
3 m        84.8            97.9         92.9
5 m        71.7            98.3         98.3
7 m        61.7            95.0         98.3


4.1.2 Localization Accuracy


In order to measure the accuracy of the sound source localization and
tracking method and system, the same setup as for measuring reliability was
used,
with the exception that only distances of 1m and 3m were tested (1440 sounds
at a
22.5° interval) due to the limited space available in the testing
environment. Neither
distance nor sound type has significant impact on accuracy. The root mean
square
accuracy results are shown in Table 3 for configurations C1 and C2. Both
azimuth
and elevation are shown separately. According to [W. M. Hartmann, "Localization of
"Localization of
sounds in rooms", Journal of the Acoustical Society of America, vol. 74, pp.
1380-
1391, 1983] and [B. Rakerd and W. M. Hartmann, "Localization of noise in a
reverberant environment", in Proceedings 18th International Congress on
Acoustics,
2004], human sound localization accuracy ranges between two and four degrees
in
similar conditions. The localization accuracy of the sound source localization
and
tracking method and system is thus equivalent to or better than human
localization
accuracy.
Table 3
Localization accuracy (root mean square error)

Localization error   C1 (deg)   C2 (deg)
Azimuth              1.10       1.?
Elevation            0.89       1.41


4.2 Source Tracking
The tracking capabilities of the sound source localization and tracking
method and system for multiple sound sources were measured. These
measurements were performed using the C2 configuration in both E1 and E2
environments. In all cases, the distance between the robot and the sources was
approximately two meters. The azimuth is shown as a function of time for each


source. The elevation is not shown as it is almost the same for all sources
during
these tests. The trajectories for the three experiments are shown in Figures
14a,
14b and 14c.
4.2.1 Moving Sources
In a first experiment, four people were told to talk continuously (reading a
text with normal pauses between words) to the robot while moving, as shown in
Figure 14a. Each person walked 90 degrees towards the left of the robot before
walking 180 degrees towards the right.
Results are presented in Figure 15 for delayed estimation (500 ms). In both
environments, the source estimated trajectories are consistent with the
trajectories
of the four speakers.
4.2.2 Moving Robot
Tracking capabilities of the sound source localization and tracking method
and system were also evaluated in the context where the robot is moving, as
shown
in Figure 14b. In this experiment, two people are talking continuously to the
robot as
it is passing between them. The robot then makes a half turn to the left.
Results are
presented in Figure 16 for delayed estimation (500 ms). Once again, the
estimated
source trajectories are consistent with the trajectories of the sources
relative to the
robot for both environments.
4.2.3 Sources with Intersecting Trajectories
In this experiment, two moving speakers are talking continuously to the
robot, as shown in Figure 14c. They start from each side of the robot,
intersecting in
front of the robot before reaching the other side. Results in Figure 17 show
that the


particle filter is able to keep track of each source. This result is possible
because
the prediction step imposes some inertia to the sources.
4.2.4 Number of Microphones
These results evaluate how the number of microphones affects the system's
capabilities. For that purpose, the same recording as in Section 4.2.1 for C2 in E1
was used with only a subset of the microphone signals to perform localization.
Since a minimum of
four
microphones are necessary for localizing sounds without ambiguity, the sound
source localization and tracking method and system were evaluated using four
to
seven microphones (selected arbitrarily as microphones number 1 through N).
Comparing results from Figure 18 to those obtained in Figure 15 for E1, it can
be
observed that tracking capabilities degrade as microphones are removed. While
using seven microphones makes little difference compared to the baseline of
eight
microphones, the system was unable to reliably track more than two of the
sources
when only four microphones were used. Although there is no theoretical
relationship
between the number of microphones and the maximum number of sources that can
be tracked, this clearly shows how the redundancy added by using more
microphones can help in the context of sound source localization and tracking.
4.3 Localization and Tracking for Robot Control
This experiment is performed in real-time and consists of making the robot
follow the person speaking to it. At any time, only the source present for the
longest
time is considered. When the source is detected in front (within 10 degrees)
of the
robot, it moves forward. At the same time, regardless of the angle, the robot
turns
toward the source in such a way as to keep the source in front. Using this
simple
control system, it is possible to control the robot simply by talking to it,
even in noisy
and reverberant environments. This has been tested by controlling the robot
going
from environment E1 to environment E2, having to go through corridors and an


elevator, speaking to the robot with normal intensity at a distance ranging
from one
meter to two meters. The system worked in real-time, providing tracking data
at a
rate of 25 Hz (no delay on the estimator) with the reaction time dominated by
the
inertia of the robot.
Using an array of eight microphones, the system was able to localize and
track simultaneous moving sound sources in the presence of noise and
reverberation, at distances up to seven meters. It has been demonstrated that
the
system is capable of controlling in real-time the motion of a robot, using
only the
direction of sounds. It was demonstrated that the combination of a
frequency-
domain steered beamformer and a particle filter has multiple source tracking
capabilities. Moreover, the proposed solution regarding the source-observation
assignment problem is also applicable to other multiple object tracking
problems.
A robot using the proposed sound source localization and tracking method
and system has access to a rich, robust and useful set of information derived
from
its acoustic environment. This can certainly improve its ability to make
autonomous decisions in real-life settings and to exhibit more intelligent
behaviour.
Also,
because the system is able to localize multiple sound sources, it can be
exploited
by a sound-separating algorithm and enables speech recognition to be
performed.
This enables identification of the localized sound sources so that additional
relevant
information can be obtained from the acoustic environment.
Although the present invention has been described hereinabove with
reference to an illustrative embodiment thereof, this embodiment can be
modified at
will, within the scope of the appended claims, without departing from the
spirit and
nature of the present invention.

Administrative Status

Title Date
Forecasted Issue Date Unavailable
(22) Filed 2005-04-27
(41) Open to Public Inspection 2006-10-27
Examination Requested 2010-04-20
Dead Application 2012-04-27

Abandonment History

Abandonment Date Reason Reinstatement Date
2011-04-27 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $200.00 2005-04-27
Registration of a document - section 124 $100.00 2006-04-20
Expired 2019 - Corrective payment/Section 78.6 $200.00 2007-01-17
Maintenance Fee - Application - New Act 2 2007-04-27 $100.00 2007-02-23
Registration of a document - section 124 $100.00 2007-09-21
Maintenance Fee - Application - New Act 3 2008-04-28 $100.00 2008-04-22
Maintenance Fee - Application - New Act 4 2009-04-27 $100.00 2009-04-23
Maintenance Fee - Application - New Act 5 2010-04-27 $200.00 2010-03-24
Request for Examination $800.00 2010-04-20
Registration of a document - section 124 $100.00 2011-05-02
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
SOCPRA SCIENCES ET GENIE S.E.C.
Past Owners on Record
MICHAUD, FRANCOIS
ROUAT, JEAN
SOCIETE DE COMMERCIALISATION DES PRODUITS DE LA RECHERCHE APPLIQUEE - SO CPRA SCIENCES ET GENIE S.E.C
UNIVERSITE DE SHERBROOKE
VALIN, JEAN-MARC
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description | Date (yyyy-mm-dd) | Number of pages | Size of Image (KB)
Representative Drawing 2006-09-29 1 7
Abstract 2005-04-27 1 33
Description 2005-04-27 40 1,351
Claims 2005-04-27 15 522
Drawings 2005-04-27 18 278
Cover Page 2006-10-16 2 53
Assignment 2008-01-10 2 65
Fees 2008-04-22 1 31
Correspondence 2005-05-31 1 27
Assignment 2005-04-27 3 88
Assignment 2006-04-20 5 118
Prosecution-Amendment 2007-01-17 2 45
Correspondence 2007-01-27 1 15
Fees 2007-02-23 1 33
Assignment 2007-09-21 8 250
Correspondence 2007-01-31 5 164
Fees 2010-03-24 1 200
Fees 2009-04-23 1 36
Prosecution-Amendment 2010-04-20 1 40
Prosecution-Amendment 2010-11-23 2 45
Assignment 2011-05-02 5 175