Patent 2060310 Summary

(12) Patent:	(11) CA 2060310
(54) English Title:	DIGITAL SPEECH CODER WITH VECTOR EXCITATION SOURCE HAVING IMPROVED SPEECH QUALITY
(54) French Title:	CODEUR DE PAROLES NUMERIQUE A QUALITE VOCALE AMELIOREE UTILISANT UNE SOURCE D'EXCITATION VECTORIELLE
Status:	Expired

Bibliographic Data

(51) International Patent Classification (IPC):	G10L 19/12 (2006.01) G10L 19/00 (2006.01)
(72) Inventors :	GERSON, IRA ALAN (United States of America)
(73) Owners :	MOTOROLA, INC. (United States of America)
(71) Applicants :	MOTOROLA, INC. (United States of America)
(74) Agent:	GOWLING LAFLEUR HENDERSON LLP
(74) Associate agent:
(45) Issued:	2001-07-17
(86) PCT Filing Date:	1990-05-02
(87) Open to Public Inspection:	1991-02-07
Examination requested:	1996-05-24
Availability of licence:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	Yes
(86) PCT Filing Number:	PCT/US1990/002469
(87) International Publication Number:	WO1991/001545
(85) National Entry:	1991-12-13

(30) Application Priority Data:

Application No.	Country/Territory	Date
370,541	United States of America	1989-06-23

Abstracts

English Abstract

In a vector excitation source digital speech coder utilizing vector
excitation, candidate excitation sources (111, 121) are considered
independent of certain pitch parameters. Once a particular excitation
source has been selected, the excluded pitch parameter may then be
optimized, resulting in an overall improvement in speech quality.

Claims

Note: Claims are shown in the official language in which they were submitted.

1. A method of encoding a speech sample, comprising the steps of:
A) determining a pitch period parameter for the speech sample;
B) determining, independent of any pitch filter coefficient, a plurality
of coded excitation signals for the speech sample;

C) processing each of the plurality of coded excitation signals to
provide a plurality of processed candidate excitation signals,
wherein each of the plurality of processed candidate excitation
signals is comprised of information that is substantially
independent of information that is representable by a pitch filter
output that is derived, at least in part, as a function of the pitch
period parameter;

D) optimizing at least a pitch filter coefficient parameter for the
speech sample.

2. The method of claim 1 further characterized in that the step of
determining a coded excitation signal includes processing the plurality of
candidate excitation signals to orthogonalize the plurality of candidate
excitation signals with respect to a pitch filter output that is derived, at
least in part, as a function of the pitch period parameter.

3. The method of claim 1, further characterized
in that the step of determining the coded excitation
signal comprises the steps of:
B1) processing an excitation signal to
substantially remove components that are
representable, at least in part, by a reference that is
related, at least in part, to the pitch period parameter;
and
B2) determining an appropriate excitation
signal for the speech sample.

4. A method of claim 3 further characterized in
that the step of processing the excitation signal
includes processing the excitation signal to
orthogonalize the excitation signal with respect to a
pitch filter output that is derived, at least in part, as a
function of the pitch period parameter.
5. The method of claim 3, and further
characterized by the step of:
C1) processing a candidate excitation
signal to substantially remove components that are
representable at least in part, by:
a reference that is related, at least in
part, to the pitch period parameter; and
the appropriate excitation signal
determined in step B2.
6. The method of claim 5 further characterized
in that the step of processing a candidate excitation
signal includes processing the candidate excitation
signal to orthogonalize the candidate excitation signal
with respect to both the reference and the appropriate
excitation signal determined in step B2.

7. A method of encoding a signal sample using
at least two codebooks that include information
regarding candidate excitation signals, comprising the
steps of:
A) determining, using a first one of the
codebooks, a first excitation signal for the signal
sample; characterized by:
B) determining, using a second one of
the codebooks, a second excitation signal for the signal
sample, which second excitation signal is comprised of
information that is substantially independent of
information that is representable by the first excitation
signal;
C) using the first and second excitation
signals to represent, at least in part, the signal sample.

8. The method of claim 7 further characterized
in that the signal sample comprises a speech sample.
9. The method of claim 7 further characterized
in that the step of determining the second excitation
signal includes processing a candidate excitation signal
to orthogonalize the candidate excitation signal with
respect to the first excitation signal.

Description

Note: Descriptions are shown in the official language in which they were submitted.

WO 91/01545 PCT/US90/02469
2060310
DIGITAL SPEECH CODER WITH VECTOR EXCITATION
SOURCE HAVING IMPROVED SPEECH QUALITY
Technical Field
This invention relates generally to speech coders,
and more particularly to digital speech coders that use
vector excitation sources.
Background of the Invention
Speech coders are known in the art. Some speech
coders convert analog voice samples into digitized
representations, and subsequently represent the spectral
speech information through use of linear predictive
coding. Other speech coders improve upon ordinary
linear predictive coding techniques by providing an
excitation signal that is related to the original voice
signal. I have described, in previously issued U.S. Patent
No. 4,817,157, a digital speech coder having an improved
vector excitation source wherein a codebook of
excitation vectors is accessed to select an excitation
signal that best fits the available information, and hence
provides a recovered speech signal that closely
represents the original.

In general, the resultant decoded speech signal
will more closely represent the original unencoded
speech signal if there is a significant number of
candidate excitation vectors available for consideration
as the excitation source. Increasing performance in this
way, however, generally results in enlargement of the
codebook size, and this will usually increase processing
complexity and data rates.
A need therefore exists for a digital speech coder
that uses a vector excitation signal, wherein for a given
size codebook, the quality of the decoded speech signal
is substantially maximized with minimal increase in
complexity and substantially no increase in data rate.
Summary of the Invention
These needs and others are substantially met
through provision of the digital speech coder with vector
excitation source having improved speech quality
disclosed herein. Pursuant to this invention, when
encoding a signal sample, such as a speech sample, the
coder first determines a pitch period parameter for the
speech sample. Relying in part upon this pitch period
parameter, a particular coded excitation signal can be
determined independent of the pitch filter coefficient,
following which the pitch filter coefficient parameter
can be optimized for that particular speech sample.
This methodology allows candidate excitation signals to
be considered without requiring a commensurate
increase in processing complexity or data rates.
In one embodiment, the coded excitation signal is
determined substantially independent from any pitch
information. In particular, candidate excitation signals
as provided by a codebook are processed to substantially
remove components that are representable, at least in
part, by a reference component that is related, at least
in part, to the intermediate pitch vector. More
particularly, the vector component related to the
intermediate pitch vector is removed from the candidate
excitation signal (a process known as orthogonalizing).
The orthogonalized candidate excitation signals are then
compared with the unencoded speech sample to identify
the candidate excitation signal that best represents this
particular speech sample. The pitch information,
including a pitch filter coefficient parameter, can be
optimized later to best suit the selected excitation
signal to thereby yield an overall optimized coded
representation of the speech signal.
In another embodiment, a second codebook of
candidate excitation signals, wherein two excitation
signals are used to represent the speech sample, is
provided. The first excitation signal can be selected as
described above, and the second excitation signal can be
selected in a similar manner, wherein candidate second
excitation signals are first orthogonalized with respect
to both the intermediate pitch vector and the previously
selected first excitation signal.
Brief Description of the Drawings
Fig. 1 comprises a block diagrammatic depiction of
the invention; and
Fig. 2 comprises a simple vector diagram
representing one aspect of the invention.

Best Mode For Carrying Out The Invention
This invention can be embodied in a speech coder
that makes use of an appropriate digital signal processor
such as a Motorola DSP 56000 family device. The
computational functions of such a DSP embodiment are
represented in Fig. 1 as a block diagram equivalent
circuit.
A pitch period parameter (101) (determined in
accordance with prior art technique) is provided to a
pitch filter state (102) that comprises part of a pitch
filter. The resultant signal (103) comprises an
intermediate pitch vector that is provided to both a first
multiplier (104) and two orthogonalizing processes (106
and 107) as described below in more detail. This first
multiplier (104) functions to multiply the resultant
signal by a pitch filter coefficient (108) to yield a pitch
filter output (109). Selection of the pitch filter
coefficient (108) will be described below in more detail.
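
As a rough illustration of the signal flow just described, the following Python sketch assumes that the pitch filter state (102) simply holds past excitation samples and that the intermediate pitch vector (103) is read from that history at a delay equal to the pitch period (101). The text does not spell out this mechanism, so the function names, the history buffer, and the periodic extension used for short lags are all assumptions made for illustration.

```python
def intermediate_pitch_vector(history, lag, length):
    """Read `length` samples from the history, delayed by `lag` samples.

    Assumption: when the lag is shorter than the vector, the vector is
    extended periodically by reusing samples it has just produced.
    """
    if lag <= 0 or lag > len(history):
        raise ValueError("lag must be between 1 and len(history)")
    extended = list(history)
    for _ in range(length):
        extended.append(extended[-lag])
    return extended[len(history):]


def pitch_filter_output(pitch_vector, coefficient):
    """Multiplier (104): scale the pitch vector by the coefficient (108)."""
    return [coefficient * x for x in pitch_vector]


history = [1.0, 2.0, 3.0, 4.0]           # hypothetical past excitation
vector = intermediate_pitch_vector(history, lag=2, length=4)
output = pitch_filter_output(vector, 0.5)
```

Note that the pitch vector itself does not depend on the coefficient, which is what lets the coefficient be chosen later.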
A first codebook (111) includes a set of basis
vectors that can be linearly combined to form a plurality
of resultant excitation signals. Depending upon the size
of the memory utilized, and other factors appropriate to
the application, the number of possible resultant
excitation signals can be, for example, between 64 and
2,048, with more of course being possible when
appropriate to a particular application. The problem,
when encoding a particular speech sample, is to select
whichever of these excitation sources best represents
the corresponding component of the original speech
information.
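
One way to picture a codebook built from basis vectors is sketched below in Python. It assumes that each excitation signal is a sum of the basis vectors with a sign of +1 or -1 applied to each, so that M basis vectors yield 2^M signals (6 basis vectors give 64, and 11 give 2,048, matching the range quoted above). The text only says the basis vectors are "linearly combined", so the sign rule and the name build_codebook are illustrative assumptions.

```python
from itertools import product


def build_codebook(basis):
    """Return every +/-1 linear combination of the basis vectors."""
    n = len(basis[0])
    codebook = []
    for signs in product((-1.0, 1.0), repeat=len(basis)):
        vec = [sum(s * b[i] for s, b in zip(signs, basis)) for i in range(n)]
        codebook.append(vec)
    return codebook


# Three hypothetical basis vectors of four samples each.
basis = [
    [1.0, 0.0, 0.5, -0.5],
    [0.0, 1.0, -0.5, 0.5],
    [0.5, 0.5, 1.0, 0.0],
]
codebook = build_codebook(basis)  # 2**3 = 8 candidate excitation signals
```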

Pursuant to this invention, once a particular
resultant signal (103) has been determined, the
excitation signals formulated by the first codebook
(111) will be presented in seriatim fashion as candidate
excitation sources. Each candidate excitation source
will first be orthogonalized (106) with respect to the
resultant signal. For example, referring momentarily to
Fig. 2, if vector A were considered to represent the
resultant signal and vector B were to represent a
particular candidate excitation source, orthogonalization
of the candidate excitation source signal would result in
the vector denoted by reference character B'. (It should
be understood that in practice, the vector dimension
space is a function of the number of samples comprising
the vectors, which may be upwards of 40 samples or
more. It should also be noted that the candidate
excitation vectors may be readily orthogonalized by
orthogonalizing the basis vectors, wherein linear
combinations of the orthogonalized basis vectors with
one another will result in orthogonalized excitation
vectors.)
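
The orthogonalization of Fig. 2 can be sketched in a few lines of Python. The sketch assumes a plain Euclidean inner product, B' = B - (<A,B>/<A,A>) A; a practical coder may well work with filtered or weighted versions of the vectors, which the text does not detail here. The helper names are illustrative. The last lines check the parenthetical remark above: because the projection is linear, combining orthogonalized basis vectors gives the same result as orthogonalizing the combination.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(b, a):
    """Return b with its component along a removed (b' in Fig. 2)."""
    energy = dot(a, a)
    if energy == 0.0:
        return list(b)  # nothing to remove when a is the zero vector
    scale = dot(a, b) / energy
    return [bi - scale * ai for ai, bi in zip(a, b)]


a = [2.0, 0.0]                   # stands in for the resultant signal (103)
b = [1.0, 1.0]                   # stands in for a candidate excitation
b_prime = orthogonalize(b, a)    # [0.0, 1.0]

# Linearity check with two hypothetical basis vectors.
v1, v2 = [1.0, 2.0], [3.0, -1.0]
combined_first = orthogonalize([x - y for x, y in zip(v1, v2)], a)
basis_first = [x - y for x, y in zip(orthogonalize(v1, a), orthogonalize(v2, a))]
```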
Once orthogonalized, the resulting candidate
excitation source can be compared (112) with the
unencoded signal (113) (or an appropriate representative
signal based thereon) to determine the relative
similarity or disparity between the two. The process is
then repeated for each of the excitation sources of the
first codebook (111). A determination can then be made
as to which candidate excitation source most closely
aligns with the unencoded signal (113).
In this particular embodiment, a gain factor (114)
can also be used to modify each candidate excitation
source signal, as well understood in the art. In addition,
if desired, the excitation source selection and gain
compensation can both be accomplished in a
substantially simultaneous manner, as also well
understood in the art.
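
A minimal sketch of such a combined selection and gain computation follows, assuming a squared-error measure of similarity. For a candidate c and target p, the gain that minimizes the error is <p,c>/<c,c>, and the candidate that leaves the smallest error is the one that maximizes <p,c>^2/<c,c>. The text does not name the error measure, so this criterion and the function name search_codebook are assumptions; the candidates passed in are taken to be already orthogonalized.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebook(candidates, target):
    """Return (index, gain) of the candidate that best matches the target."""
    best_index, best_gain, best_score = -1, 0.0, -1.0
    for index, cand in enumerate(candidates):
        energy = dot(cand, cand)
        if energy == 0.0:
            continue  # a zero vector cannot represent anything
        corr = dot(target, cand)
        score = corr * corr / energy  # error reduction at the optimal gain
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


candidates = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
target = [2.0, 2.0, 0.5]          # stands in for the unencoded signal (113)
index, gain = search_codebook(candidates, target)  # index 2, gain 2.0
```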
Once an appropriate excitation source from the
first codebook (111) has been selected through this
process, the orthogonalizing process (106) can
thereafter be dispensed with and the exact excitation
source signal selected (116) through an appropriate
control mechanism (117). Thereafter, presuming a
single codebook coder, the pitch information can be
gated (117) and summed (118) together with the
selected excitation source, with the pitch filter
coefficient (108) and excitation gain (114) optimized
such that the combined excitation most closely aligns
with the unencoded signal (113). Once optimized, the pitch
period parameter, pitch filter coefficient, and particular
excitation source and gain are known, and appropriate
representations thereof may be utilized thereafter as
representative of the original speech sample.
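
The final optimization can be sketched as a small least-squares problem. Assuming a squared-error criterion (the text does not name one), the pitch filter coefficient beta and excitation gain gamma that minimize |p - beta*a - gamma*c|^2 satisfy a 2x2 set of normal equations, solved here by Cramer's rule. Here a is the intermediate pitch vector, c the selected (non-orthogonalized) excitation, and p the target; all names are illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def joint_gains(target, pitch_vec, excitation):
    """Solve for (beta, gamma) minimizing |target - beta*a - gamma*c|^2."""
    raa = dot(pitch_vec, pitch_vec)
    rcc = dot(excitation, excitation)
    rac = dot(pitch_vec, excitation)
    rpa = dot(target, pitch_vec)
    rpc = dot(target, excitation)
    det = raa * rcc - rac * rac
    if abs(det) < 1e-12:
        raise ValueError("pitch vector and excitation are linearly dependent")
    beta = (rpa * rcc - rpc * rac) / det
    gamma = (rpc * raa - rpa * rac) / det
    return beta, gamma


pitch_vec = [1.0, 0.0, 0.0]
excitation = [1.0, 1.0, 0.0]
target = [2.5, 2.0, 0.0]          # equals 0.5 * pitch_vec + 2.0 * excitation
beta, gamma = joint_gains(target, pitch_vec, excitation)  # 0.5 and 2.0
```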
If desired, and as depicted in Fig. 1, an additional
codebook (121) can be utilized, which second codebook
(121) again includes a plurality of basis vector derived
candidate excitation sources. The use of such multiple
codebooks is understood in the art. Pursuant to this
invention, however, once the first excitation source
from the first codebook (111) has been selected as
described above, the candidate excitation sources from
the second codebook (121) are orthogonalized (107) with
respect to both the resultant signal (103) and the
selected excitation source signal from the first
codebook (111). The selection process can then continue
as described above, with the orthogonalized candidate
excitation source signals from the second codebook
(121) being compared against a representative unencoded
signal (113) to identify the closest fit. Once this
excitation source has been selected, the pitch filter
coefficient (108) and excitation gains (114 and 120) can
then be optimized as described above.
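
Orthogonalizing a second-codebook candidate against two references at once can be sketched with a Gram-Schmidt procedure, shown below in Python. The references are first made mutually orthogonal, and then each is projected out of the candidate. Gram-Schmidt is one standard way to do this and is an assumption here, as are the function names; the text only states the result that is required.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def orthogonalize_against(v, references):
    """Remove from v every component lying in the span of the references."""
    ortho_refs = []
    for ref in references:
        r = list(ref)
        for q in ortho_refs:
            s = dot(q, r) / dot(q, q)
            r = [ri - s * qi for ri, qi in zip(r, q)]
        if dot(r, r) > 1e-12:       # skip references that add no new direction
            ortho_refs.append(r)
    out = list(v)
    for q in ortho_refs:
        s = dot(q, out) / dot(q, q)
        out = [oi - s * qi for oi, qi in zip(out, q)]
    return out


pitch_vec = [1.0, 0.0, 0.0]       # stands in for the resultant signal (103)
first_exc = [1.0, 1.0, 0.0]       # stands in for the first selected excitation
candidate = [1.0, 2.0, 3.0]       # hypothetical second-codebook candidate
result = orthogonalize_against(candidate, [pitch_vec, first_exc])
```

In this example the result has no component along either reference, so only the third sample of the candidate survives.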
What is claimed is:

Representative Drawing

A single figure which represents the drawing illustrating the invention.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History should be consulted.

Administrative Status

Title	Date
Forecasted Issue Date	2001-07-17
(86) PCT Filing Date	1990-05-02
(87) PCT Publication Date	1991-02-07
(85) National Entry	1991-12-13
Examination Requested	1996-05-24
(45) Issued	2001-07-17
Expired	2010-05-02

Abandonment History

There is no abandonment history.

Payment History

Fee Type	Anniversary Year	Due Date	Amount Paid	Paid Date
Application Fee			$0.00	1991-12-13
Maintenance Fee - Application - New Act	2	1992-05-04	$100.00	1992-03-27
Maintenance Fee - Application - New Act	3	1993-05-03	$100.00	1993-03-23
Registration of a document - section 124			$0.00	1993-06-22
Maintenance Fee - Application - New Act	4	1994-05-02	$100.00	1994-03-23
Maintenance Fee - Application - New Act	5	1995-05-02	$150.00	1995-03-24
Maintenance Fee - Application - New Act	6	1996-05-02	$150.00	1996-03-26
Request for Examination			$400.00	1996-05-24
Maintenance Fee - Application - New Act	7	1997-05-02	$150.00	1997-03-25
Maintenance Fee - Application - New Act	8	1998-05-04	$150.00	1998-04-08
Maintenance Fee - Application - New Act	9	1999-05-03	$150.00	1999-03-19
Maintenance Fee - Application - New Act	10	2000-05-02	$200.00	2000-03-23
Final Fee			$300.00	2001-02-22
Maintenance Fee - Application - New Act	11	2001-05-02	$200.00	2001-05-02
Maintenance Fee - Patent - New Act	12	2002-05-02	$200.00	2002-04-03
Maintenance Fee - Patent - New Act	13	2003-05-02	$200.00	2003-04-02
Maintenance Fee - Patent - New Act	14	2004-05-03	$250.00	2004-04-06
Maintenance Fee - Patent - New Act	15	2005-05-02	$450.00	2005-04-06
Maintenance Fee - Patent - New Act	16	2006-05-02	$450.00	2006-04-05
Maintenance Fee - Patent - New Act	17	2007-05-02	$450.00	2007-04-10
Maintenance Fee - Patent - New Act	18	2008-05-02	$450.00	2008-04-07
Maintenance Fee - Patent - New Act	19	2009-05-04	$450.00	2009-04-07

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
MOTOROLA, INC.

Past Owners on Record
GERSON, IRA ALAN

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

List of published and non-published patent-specific documents on the CPD.

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Drawings	2000-05-25	1	23
Abstract	1995-08-17	1	46
Description	1995-05-14	7	244
Claims	2000-05-25	6	103
Drawings	1994-05-14	1	32
Cover Page	1994-05-14	1	37
Claims	1994-05-14	6	160
Claims	2000-11-03	6	101
Cover Page	2001-07-10	1	38
Representative Drawing	2001-07-10	1	10
Representative Drawing	1999-02-11	1	32
Prosecution-Amendment	2000-01-25	2	7
Prosecution-Amendment	2000-05-25	10	226
Prosecution-Amendment	2000-07-18	1	34
Prosecution-Amendment	2000-11-03	3	83
Correspondence	2001-02-22	1	25
Assignment	1991-12-13	12	383
PCT	1991-12-13	9	282
Prosecution-Amendment	1996-05-24	2	69
Fees	1997-03-25	1	90
Fees	1996-03-26	1	91
Fees	1995-03-24	2	153
Fees	1994-03-23	1	103
Fees	1993-03-23	1	95
Fees	1992-03-27	1	47
Correspondence	1996-06-20	2	58
