0396
RELP VOCODER IMPLEMENTED IN DIGITAL SIGNAL PROCESSORS
BACKGROUND OF THE INVENTION
The present invention generally pertains to voice coders
(vocoders) and is particularly directed to Residual-Excited
Linear Prediction (RELP) vocoders. Vocoders convert speech
signals into digital form for transmission and synthesize speech
signals from these digital signals upon reception. Vocoders
typically operate at flexible binary data rates varying from
32 kbps (kilobits per second) down to about 2.4 kbps.
Vocoders traditionally are divided into two basic types,
waveform coders and pitch-excited source coders. Waveform
coders operate at high data rates (above 16 kbps) and produce
good quality natural sounding speech which is robust against
both acoustic and transmitted noise. Source coders operate at
low data rates (less than 4.8 kbps) in an analysis/synthesis
mode governed by a mathematical model of the human vocal
apparatus. Source vocoders typically sound robotic and do not
perform well under poor acoustic conditions.
The RELP vocoder was originally proposed by Un and Magill,
"The Residual-Excited Linear Prediction Vocoder with Transmission
Rate Below 9.6 kbits/s", IEEE Trans. Commun., 1975, pp. 1466-1473;
and an enhanced RELP vocoder was proposed by Dankberg and Wong,
"Development of a 4.8-9.6 kbps RELP vocoder", ICASSP-79. The
purpose of the RELP vocoder was to provide satisfactory performance
in the gap between the operating ranges of waveform coders
and source coders, to wit: 4.8 kbps to 16 kbps. The RELP
vocoder contains some features of both waveform coders and
source coders.
In prior art RELP vocoders, digital speech data signal
samples are analyzed over relatively short time segments
(typically in the range of 10-30 ms) by a linear predictive
coding (LPC) vocal tract modeling technique to provide LPC
coefficients for each block of samples. The LPC coefficients
represent the vocal tract, glottal flow and radiation of the
speech represented by the digital signal samples. Using the
LPC coefficients, the digital speech data signal samples are
inverse filtered by a time-invariant, all-pole recursive digital
filter over each short time segment to provide residual signal
(prediction error signal) samples. The time-variant character
of speech is handled by a succession of such filters with
different parameters.
The residual signal and the LPC coefficients are encoded
(quantized) and formatted for transmission. Upon reception,
speech is synthesized by processing the residual signal in
accordance with the LPC coefficients.
In prior art RELP vocoders, the residual signal samples
are band-limited and downsampled prior to quantization in order
to provide residual signal samples at a reduced data rate.
The upper band harmonics are generated during synthesis of the
speech signal when the downsampled residual signal is
upsampled and zeros are inserted between data points.
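The zero-insertion step described above can be sketched as follows. This is only an illustration of the general technique; the function name and the default ratio of two are assumptions, not details taken from the prior art references.

```python
def upsample_zero_insert(samples, ratio=2):
    """Insert (ratio - 1) zeros after every sample.

    Zero insertion raises the sample rate and creates spectral images of the
    baseband signal in the upper band. Name and default ratio are hypothetical.
    """
    out = []
    for x in samples:
        out.append(x)
        out.extend([0.0] * (ratio - 1))
    return out
```

For a ratio of two, every other output sample is zero and the original samples are preserved in the even positions.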
In the Un and Magill RELP vocoder the residual signal is
quantized prior to transmission by adaptive delta modulation.
Dankberg and Wong considered various other quantization techniques
and concluded that pitch predictive adaptive differential
pulse code modulation (PPADPCM) provided the best
signal-to-quantizing-noise ratio.
In accordance with the PPADPCM technique, the residual
signal samples are processed by pitch analysis to determine the
pitch delay, are processed by pitch predictor gain analysis to
determine the pitch predictor gain in accordance with the determined
pitch delay, are processed by gain analysis to provide a
maximum deviation quantizer gain, and are further processed by
PPADPCM in accordance with the quantizer gain, pitch predictor
gain and delay parameters to thereby provide the quantized
residual signal. The quantizer gain, pitch predictor gain and
the pitch delay parameters are combined with the quantized
residual signal and the quantized LPC coefficients for
transmission.
RELP vocoders of the prior art have required complex
hardware and have been so expensive to implement as to be
commercially impractical.
SUMMARY OF THE INVENTION
The present invention provides a commercially practical
RELP vocoder that is implemented by two digital signal processors,
one for a transmitter system and one for a remotely located
receiver system. The transmitter digital signal processor is
adapted for processing digital speech data signal samples to provide a
formatted transmission signal including (a) a quantized residual
signal generated by inverse filtering of the samples in accordance
with linear predictive coding (LPC) coefficients generated from
the samples, (b) quantized LPC coefficients and (c) pitch and
gain parameters generated during quantization of the residual
signal from the inverse filtered samples, all of which are
generated by the processor from the digital speech data samples.
The receiver digital signal processor is adapted for
processing the formatted transmission signal to synthesize
reconstructed digital speech data signal samples.
The transmitter digital signal processor is adapted for
performing a routine for generating the LPC coefficients; a
routine for generating the residual signal; and a routine for
quantizing the residual signal and the LPC coefficients. The
routine for generating the LPC coefficients includes a subroutine
for pre-emphasizing the samples in order to emphasize the high
frequencies of speech; a subroutine for defining an auto-correlation
function (ACF) from the pre-emphasized samples in order to
generate ACF coefficients; and a subroutine for generating the
LPC coefficients from the generated ACF coefficients. The
routine for generating the residual signal includes a subroutine
for inverse filtering the pre-emphasized samples in accordance
with the generated LPC coefficients; a subroutine for band-limiting
the residual signal by low-pass filtering in a manner which will
reduce the effects of quantization; and a subroutine
for downsampling the band-limited residual signal to reduce the
number of residual signal samples that are quantized and formatted
for transmission. The routine for quantizing the residual signal
and LPC coefficients includes a subroutine for quantizing the
LPC coefficients; a subroutine for estimating the pitch period
of the downsampled residual signal by ACF analysis of the current
downsampled residual signal frame in accordance with the ACF
coefficients generated for the previous frame to thereby provide
a pitch delay parameter for the current frame; a subroutine for
providing a pitch predictor gain parameter for each residual
signal frame in accordance with the estimated pitch delay parameter
for each corresponding frame; a subroutine for providing a
quantizer gain parameter for each residual signal frame in accordance
with the pitch delay and pitch predictor gain parameters for
each corresponding frame; and a subroutine for quantizing each
residual signal frame by pitch predictive adaptive differential
pulse code modulation (PPADPCM) in accordance with the pitch
delay, pitch predictor gain and quantizer gain parameters for
each corresponding frame.
The receiver digital signal processor is adapted for processing
the formatted transmission signal to synthesize reconstructed
digital speech data signal samples by performing a synthesis routine that
includes a subroutine for regenerating the LPC coefficients from
the quantized LPC coefficients included in the transmission signal;
a subroutine for decoding the quantized residual signal included
in the transmission signal in accordance with the pitch delay,
pitch predictor gain and quantizer gain parameters included in
the transmission signal to thereby provide a decoded downsampled
residual signal; a subroutine for spectrally regenerating a
full-band residual signal from the decoded downsampled residual
signal; a subroutine for regenerating pre-emphasized digital
speech data signal samples by auto-regressively filtering the
regenerated full-band residual signal in accordance with the
regenerated LPC coefficients; and a subroutine for de-emphasizing
the regenerated pre-emphasized samples in order to de-emphasize the
high frequencies of speech, to thereby provide the reconstructed
digital speech data signal samples. The decoding subroutine
includes a subroutine for scaling quantizer coefficients for
each quantized residual signal frame in accordance with the
quantizer gain parameter included in the transmission signal;
a subroutine for providing data samples from the quantized
residual signal included in the transmission signal in accordance
with the scaled quantizer coefficients; and a subroutine
for providing the decoded downsampled residual signal from the
data samples by pitch excitation in accordance with the pitch
delay and pitch predictor gain parameters.
Additional features of the present invention are discussed
in relation to the description of the preferred embodiment.
BRIEF DESCRIPTION OF THE DRAWING
Figure 1 is a functional block diagram illustrating the
process implemented by the transmitter digital signal processor
to code an input signal sample for transmission.
Figure 2 is a functional block diagram illustrating the
process implemented by the receiver signal processor to decode
a sample which is coded in accordance with the process illustrated
in Figure 1.
Figure 3 is a flow chart of the LPC coefficient generation
routine performed by the transmitter digital signal processor.
Figure 4 is a flow chart of the residual signal generation
routine performed by the transmitter digital signal processor.
Figure 5 is a flow chart of the quantization routine performed
by the transmitter digital signal processor.
Figure 6 is a diagram of a quantization filter implemented
during the PPADPCM quantization subroutine included in the routine
of Figure 5.
Figure 7 is a flow chart of the synthesis routine performed
by the receiver digital signal processor.
DESCRIPTION OF THE PREFERRED EMBODIMENT
In the preferred embodiment of the present invention, the
transmitter digital signal processor and receiver digital signal
processor are each Texas Instruments Model TMS32010
Digital Signal Processors. The TMS32010 processor is a 16-bit,
200 ns cycle time, stand-alone processor with a 32-bit ALU and
Accumulator. The processor has a four level stack for nested
subroutines; and arithmetic performance is enhanced by a hardware
16x16-bit parallel multiplier, which performs a pipelined
multiply/accumulate operation in 400 ns. The TMS32010 processor
has 144 16-bit words available as internal RAM which may be
augmented by addressing external RAM, for buffer storage, via
TBLR/TBLW (table read/write) commands. These commands allow a
trade-off between data memory requirements and speed of operation.
Program memory may be redefined as external data memory but its
access time is 600 ns. External program memory may be expanded
to 8K bytes at full speed. The two processors must perform
all operations of the RELP vocoder in real time. The processor
choice is constrained by two key factors: operating speed and
available internal RAM (especially important because frame
storage is required). The TMS32010 processor is chosen based
on its fast operating speed (5 MHz), data storage capabilities,
and extensive development tools.
The principal functions of the transmitter processor are
described with reference to Figure 1. Digital speech data signal
samples 10 are pre-emphasized 11 to improve the representation of
high frequencies during the subsequent LPC analysis. Pre-emphasized
samples 12 are subjected to LPC analysis 13 to provide LPC reflection
coefficients 14.
The LPC reflection coefficients 14 are quantized 15 to provide
quantized LPC reflection coefficients 16. The LPC reflection
coefficients 14 are quantized to minimize distortion during subsequent
transmission to the receiver. LPC coefficients 17 are
generated 18 from the quantized LPC reflection coefficients 16.
The pre-emphasized samples 12 are inverse filtered 19 in
accordance with the LPC coefficients 17 to provide a residual
signal 20. The residual signal 20 is band-limited 21 and
downsampled 22 to provide a baseband residual signal 23.
The baseband residual signal 23 is quantized by PPADPCM
quantization 24 in order to minimize the effects of distortion
during subsequent transmission of the quantized residual signal 25.
Three of the parameters of the PPADPCM quantization 24 are pitch
delay, pitch predictor gain and quantizer gain. These three
parameters are generated during PPADPCM quantization 24 and are
necessary to decode the quantized residual signal received
by the receiver system. Accordingly, a pitch delay signal is
provided on line 26, a pitch predictor gain signal is provided
on line 27 and a quantizer gain signal is provided on line 28
incident to the PPADPCM quantization 24 of the baseband residual
signal 23.
The quantized residual signal 25, the pitch
delay signal on line 26, the pitch predictor gain signal on line 27, the
quantizer gain signal on line 28 and the quantized LPC reflection
coefficients 16 are combined linearly by formatting 32 to provide
a transmission frame 34.
The principal functions of the receiver processor are
described with reference to Figure 2. The format of each received
data transmission frame 36 is decoded 37 to provide the quantized
residual signal 39, the pitch delay parameter 40, the pitch
predictor gain parameter 41, the quantizer gain parameter 42 and
the quantized LPC reflection coefficients 43.
The quantized residual signal 39 is decoded by PPADPCM
decoding 46 in accordance with the pitch delay 40, pitch predictor
gain 41 and quantizer gain 42 to provide a decoded baseband
residual signal 47. The decoded baseband residual signal 47 is
spectrally regenerated 48 to provide a full-band residual signal 49.
The quantized LPC reflection coefficients 43 are processed
50 to generate the LPC coefficients 51.
The full-band residual signal 49 is filtered 52 in accordance
with the generated LPC coefficients 51 to synthesize
decoded speech data signal samples 53. The decoded speech data
signal samples 53 are de-emphasized 54 to provide regenerated
digital speech data signal samples 55.
The processing routines performed by the transmitter
processor to perform the above-described signal processing
functions are described below with reference to the flow charts
of Figures 3, 4 and 5.
The processing routine represented by the flow chart of
Figure 3 generally pertains to LPC analysis. This routine generates
the LPC coefficients from a buffered frame of pre-emphasized
speech data signal samples. The routine of Figure 4 is generally
directed to generation of the residual signal; and the routine of
Figure 5 is generally directed to quantization of the residual
signal and the LPC coefficients.
The LPC analysis routine includes the subroutines of
initialization 58, sample input 59, pre-emphasis 61, ACF
generation 63, ACF normalization 65 and LPC analysis 66.
The sample input subroutine 59 reads in digital speech
data signal samples from an external data memory buffer.
The pre-emphasis subroutine 61 applies first-order digital
pre-emphasis to the input speech data signal samples. The input
to the algorithm is the input speech sample $s_n$ and the output is
the pre-emphasized speech sample $s'_n$, both located in internal
RAM. First-order digital pre-emphasis is applied to the input
speech signal to emphasize the high frequencies of speech. This
leads to a more accurate estimate of the vocal tract frequency
response, which is controlled by the LPC parameters. Pre-emphasis
uses a single-delay high-pass filter. Experimentation shows that
the choice of the pre-emphasis constant (a) is not critical and
it is normally set to 0.9375. The difference equation for the
filter is:
$s'_n = s_n - a\,s_{n-1}$ (Eq. 1)
The pre-emphasis function is complemented at the receiver system
by applying a de-emphasis function.
The pre-emphasized samples are stored in an external data
memory for use in the residual signal generation routine of
Figure 4.
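A minimal floating-point sketch of the first-order pre-emphasis filter and a matching de-emphasis filter is given below. The function names are hypothetical, and the de-emphasis form (the exact inverse recursion) is an assumption, since the text states only that de-emphasis complements pre-emphasis.

```python
def pre_emphasize(samples, a=0.9375, prev=0.0):
    """First-order pre-emphasis: out[n] = s[n] - a * s[n-1]."""
    out = []
    for s in samples:
        out.append(s - a * prev)
        prev = s  # single delay element
    return out


def de_emphasize(samples, a=0.9375, prev=0.0):
    """Assumed inverse of pre_emphasize: y[n] = x[n] + a * y[n-1]."""
    out = []
    for x in samples:
        prev = x + a * prev
        out.append(prev)
    return out
```

With a = 0.9375 (15/16), a constant input is strongly attenuated after the first sample, which is the expected high-pass behavior.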
The ACF generation subroutine 63 iteratively updates a
correlation buffer for each input speech data signal sample.
This buffer must be zeroed prior to the first call to the subroutine.
The output of this subroutine is a 32-bit precision
auto-correlation function (ACF) for delays between zero and ten
points.
In order to generate the LPC coefficients, an auto-correlation
function (ACF) must be defined from a windowed
buffer of pre-emphasized speech samples ($s_j$). The ACF of a
sequence is defined as:
$R_k = \sum_{j=0}^{N-1-k} x_j x_{j+k}, \quad k = 0, 1, \ldots, N-1$ (Eq. 2)
where $x_j = w_j s_j$ (Eq. 3)
The window ($w_j$) is chosen to be rectangular for ease of
implementation:
$w_n = 1$ for $n = 0, \ldots, N-1$
$w_n = 0$ elsewhere (Eq. 4)
A tenth-order LPC analysis requires the ACF coefficients
$R_0, \ldots, R_{10}$. These coefficients may be updated iteratively for
each input speech data signal sample:
$R_k(n+1) = R_k(n) + x_n x_{n-k}$ (Eq. 5)
where $R_k(n)$ is the nth iteration of the kth ACF coefficient.
This equation is implemented by the ACF generation subroutine
63. The coefficients $R_k$ are maintained with 32-bit accuracy
to remove round-off error problems. The algorithm is implemented
by creating a delay buffer that is initialized to zero and ripples
after each iteration. This implementation also
ensures that the 32-bit result will not overflow. The maximum
value of the ACF is the zero-delay element. If, for example,
each input sample has a maximum of 12-bit resolution, the maximum
value attained by the accumulator, for a data buffer of
180 samples, is:
$\log_2[2^{11} \cdot 2^{11} \cdot 180] = 29.492$ bits (Eq. 6)
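The per-sample update and the worst-case accumulator size can be sketched as follows. This is an illustration using Python integers rather than TMS32010 code; the names `acf_update`, `frame_acf` and `ORDER` are hypothetical.

```python
import math

ORDER = 10  # a tenth-order analysis needs the coefficients for delays 0..10


def acf_update(acf, delay, x):
    """Fold one new sample x into the running ACF.

    delay[k] holds the sample from k steps ago; the buffer is shifted
    ("rippled") by one position on every call.
    """
    delay.insert(0, x)
    delay.pop()
    for k in range(ORDER + 1):
        acf[k] += delay[0] * delay[k]


def frame_acf(samples):
    """Accumulate the ACF over one frame, starting from zeroed buffers."""
    acf = [0] * (ORDER + 1)
    delay = [0] * (ORDER + 1)
    for x in samples:
        acf_update(acf, delay, x)
    return acf


# Worst case for 12-bit samples (magnitude up to 2**11) over 180 samples.
worst_case_bits = math.log2(2**11 * 2**11 * 180)
```

Because the delay buffer starts at zero, products that would reach before the start of the frame contribute nothing, which matches the rectangular window.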
Upon completion of sample input, the 32-bit ACF result
must be converted to 16-bit coefficients. The ACF normalization
subroutine 65 performs all operations required to convert the
32-bit ACF to a 16-bit result. The LPC analysis subroutine 66
is transparent to a scaled ACF input. Therefore, to obtain the
maximum dynamic range of the 16-bit ACF, the 32-bit results are
scaled to the maximum, $R_0$, prior to truncation to 16 bits. The
optimal procedure for this would be to divide all coefficients
by $R_0$. However, execution efficiency is greatly improved by
simply left-shifting the 32-bit numbers to remove leading zeros
in the $R_0$ value.
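The shift-based normalization can be sketched as below. This assumes signed 32-bit inputs and signed 16-bit outputs; the function name is hypothetical and the exact TMS32010 shift sequence is not given in the text.

```python
def normalize_acf(acf32):
    """Left-shift all coefficients so the zero-delay term fills a signed
    32-bit word, then keep the upper 16 bits of each result."""
    r0 = acf32[0]
    if r0 <= 0:
        return [0] * len(acf32)
    shift = 31 - r0.bit_length()  # leading zeros below the sign bit
    return [(r << shift) >> 16 for r in acf32]
```

Since no coefficient exceeds the zero-delay term in magnitude, the same shift is safe for every coefficient.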
A decision 67 that the 32-bit correlation frame is complete
enables the processor to proceed to the ACF normalization
subroutine 65.
The LPC analysis subroutine 66 implements the Durbin
algorithm to generate the ten LPC coefficients and ten LPC
reflection coefficients 36. The Durbin algorithm's input is
the normalized 16-bit ACF.
The Durbin algorithm is an extremely efficient algorithm
for generating the LPC coefficients. See J. Makhoul, "Linear
Prediction: A Tutorial Review", Proc. IEEE, Vol. 63, pp. 561-80,
1975. The algorithm is suitable for fixed-point arithmetic
implementation and also generates, as a by-product, the reflection
coefficients, which may be used for quantization and coding
prior to transmission to the receiver.
Alternatively the LPC coefficients may be generated by
the Le Roux-Gueguen (LRG) recursion, which is described in
J. Le Roux and C. Gueguen, "A Fixed Point Computation of Partial
Correlation Coefficients in Linear Prediction", Proc. ICASSP-77,
pp. 742-3. The LRG recursion, although faster than the Durbin
algorithm, generates only the LPC reflection coefficients and
not the LPC coefficients, per se, which must be generated
separately.
Durbin's recursive procedure is as follows:
Initialization:
$E_0 = R_0$ (Eq. 7)
$k_1 = a_1^{(1)} = -R_1 / R_0$ (Eq. 8)
$E_1 = [1 - k_1^2]\,E_0$ (Eq. 9)
Recursion, for $i = 2, \ldots, P$:
$k_i = -\left(R_i + \sum_{j=1}^{i-1} a_j^{(i-1)} R_{i-j}\right) / E_{i-1}$ (Eq. 10)
$a_i^{(i)} = k_i$ (Eq. 11)
$a_j^{(i)} = a_j^{(i-1)} + k_i\,a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1$ (Eq. 12)
$E_i = [1 - k_i^2]\,E_{i-1}$ (Eq. 13)
Symbols defined:
$E_i$ is the prediction error energy
$R_i$ is the ith value of the auto-correlation function
$k_i$ is the ith reflection coefficient
$a_j^{(i)}$ is the jth LPC coefficient (ith iteration)
The order of the LPC analysis, P, is determined experimentally
and a 10th order analysis is sufficient to adequately model the
vocal tract frequency response.
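A floating-point sketch of Durbin's recursion is shown below for illustration. It uses the sign convention in which the inverse filter adds the weighted past samples; the function name is hypothetical, and the fixed-point scaling used on the TMS32010 is not reproduced.

```python
def durbin(r, order=10):
    """Return (a, k, err) from autocorrelation values r[0..order].

    a holds the LPC coefficients a_1..a_P, k the reflection coefficients
    k_1..k_P, and err the final prediction error energy.
    """
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    err = float(r[0])  # initial error energy is the zero-delay ACF value
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input, stop rather than divide by zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = -acc / err
        prev = a[:]  # coefficients from iteration i - 1
        a[i] = k[i]
        for j in range(1, i):
            a[j] = prev[j] + k[i] * prev[i - j]
        err *= 1.0 - k[i] * k[i]
    return a[1:], k[1:], err
```

For an autocorrelation sequence that decays by one half per lag, a second-order analysis yields a single non-zero coefficient of -0.5, as expected for a first-order process.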
The LPC parameters must be quantized and coded prior to
transmission and resynthesis of the digital speech data signal
at the receiver. However, the LPC coefficients, $a_i$, are sensitive
to quantization noise and introduce significant distortion
to the signal. A solution is to quantize and code the LPC
reflection coefficients, $k_i$, which are much less sensitive to
quantization noise. This operation is performed by an LPC
coefficient quantization subroutine 68, which is a part of the
quantization routine of Figure 5. At the receiver the LPC
coefficients may be recovered from the quantized reflection
coefficients using a subset of the recursion above.
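The subset of the recursion needed at the receiver (often called the step-up procedure) might look like the sketch below. The function name is hypothetical, and dequantization of the reflection coefficients is assumed to have been done already.

```python
def step_up(k):
    """Convert reflection coefficients k_1..k_P into LPC coefficients.

    Each stage updates the existing coefficients with the new reflection
    coefficient and then appends that reflection coefficient as the
    highest-order LPC coefficient.
    """
    a = []
    for ki in k:
        a = [a[j] + ki * a[len(a) - 1 - j] for j in range(len(a))] + [ki]
    return a
```

No autocorrelation values or error energies are required, which is why only the reflection coefficients need to be transmitted.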
The initialization subroutine 58 and the sample input
subroutine 59 are both contained in the main program for the
transmitter processor. The main program controls the calling
of the other subroutines in the LPC analysis routine of
Figure 3 in accordance with the following hierarchy: pre-emphasis 61,
ACF generation 63, ACF normalization 65 and
LPC analysis 66. The main program implements the LPC analysis
routine of Figure 3 to generate a frame of a predetermined
number of pre-emphasized speech data signal samples and the
ten LPC coefficients. The term "LPC coefficients" as used
herein refers to either LPC coefficients or LPC reflection
coefficients unless the latter is specified.
The residual signal generation routine is represented by
the flow chart of Figure 4. This routine includes the subroutines
of initialization 70, sample input 71, inverse filter 72,
band limit 73 and downsample 74.
The initialization subroutine 70 transfers second-order
section filter coefficients from external data memory to the
internal RAM of the transmitter processor for use during the
band limit subroutine 73.
The sample input subroutine 71 inputs the pre-emphasized
samples from a speech data buffer located in the external data
memory to the zero-delay position of a speech delay buffer,
which is located in the internal RAM of the transmitter processor.
The delay buffer is used for the implementation by the inverse
filter subroutine 72 of the all-zero Finite-Impulse-Response (FIR)
filter in accordance with the LPC coefficients.
The inverse filter subroutine 72 implements an all-zero
inverse filter in accordance with the LPC coefficients to generate
the residual signal 20 (Figure 1). The output from this subroutine
72 is provided to a residual signal data buffer which is
located in the external data memory.
The residual signal 20 is generated by inverse filtering
the pre-emphasized speech data signal samples 12 in accordance
with the LPC coefficients 17. (See Figure 1). The LPC coefficients
are formulated mathematically to estimate the transfer
function of the vocal tract. This function is represented by
H(z):
$H(z) = 1 / \left[1 + \sum_{k=1}^{P} a_k z^{-k}\right]$ (Eq. 14)
where $a_k$ is the kth LPC coefficient. The residual signal 20
is obtained by filtering the speech data signal samples 12 by
the all-zero filter $H(z)^{-1}$. If $x_n$ represents the input speech
sample at time n and $y_n$ represents the corresponding output
sample, the filter can be represented by the following difference
equation:
$y_n = x_n + a_1 x_{n-1} + a_2 x_{n-2} + \cdots + a_{10} x_{n-10}$ (Eq. 15)
The simplest way to implement this structure is to place the
coefficients $a_k$ in a fixed register and to implement the delay
buffer using a shift register. The TMS32010 micro-code is
optimized to perform this operation using the LTD/MPY commands:
the processor has a pipeline Multiply/Accumulate instruction
that executes in 400 ns.
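The fixed-coefficient, shift-register structure can be illustrated with the sketch below. It is a floating-point model, not TMS32010 code; the function name is hypothetical and the plus signs follow the difference equation for the inverse filter.

```python
def inverse_filter(samples, a):
    """All-zero inverse filter: y[n] = x[n] + a[0]*x[n-1] + ... + a[P-1]*x[n-P].

    The delay list plays the role of the shift register; it holds the P most
    recent inputs, newest first.
    """
    delay = [0.0] * len(a)
    out = []
    for x in samples:
        y = x + sum(c * d for c, d in zip(a, delay))
        delay = ([x] + delay)[:len(a)]  # shift in the newest sample
        out.append(y)
    return out
```

Feeding a geometrically decaying sequence through the matching first-order inverse filter leaves only the initial impulse, which is the behavior expected of a prediction error filter.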
The band limit subroutine 73 low-pass filters the residual
signal 20 by implementing an eighth-order elliptic half-band
filter, which in turn is implemented by using a cascade of four
second-order sections. The transfer function of the elliptic
filter is:
$H(z) = A(z) / B(z)$ (Eq. 16)
where $A(z) = \sum_{k=0}^{8} a_k z^{-k}$ (Eq. 17)
$B(z) = \sum_{k=0}^{8} b_k z^{-k}, \quad b_0 = 1$ (Eq. 18)
It is important to implement this filter in a manner which
will reduce the effects of coefficient quantization and finite
register length effects which are described in L. R. Rabiner
and B. Gold, "Theory and Application of Digital Signal
Processing", Prentice-Hall, 1975. This is best achieved by
factorizing H(z) into second order sections:
$H(z) = H_1(z)\,H_2(z)\,H_3(z)\,H_4(z)$ (Eq. 19)
where: $H_m(z) = \dfrac{a_{0m} + a_{1m} z^{-1} + a_{2m} z^{-2}}{1 + b_{1m} z^{-1} + b_{2m} z^{-2}}$ (Eq. 20)
The second-order polynomial $H_m(z)$ is implemented by a second-
order filter section. The second-order section is implemented
by an internal subroutine that is called four times to provide
a cascade of four second-order sections. A cascade of four
sections is equivalent to an eighth-order elliptic low-pass
filter. Each section uses a set of filter coefficients and
requires its own delay buffer, which must be shifted at each
iteration.
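A cascade of second-order sections, each with its own delay buffer, can be sketched as follows. The class name, the direct form II structure and any coefficient values passed in are assumptions for illustration; the actual elliptic filter coefficients are not given in the text.

```python
class SecondOrderSection:
    """One second-order section with numerator (a0, a1, a2) and
    denominator (1, b1, b2), realized in direct form II."""

    def __init__(self, a0, a1, a2, b1, b2):
        self.a = (a0, a1, a2)
        self.b = (b1, b2)
        self.w1 = 0.0  # delay buffer, one step back
        self.w2 = 0.0  # delay buffer, two steps back

    def step(self, x):
        w0 = x - self.b[0] * self.w1 - self.b[1] * self.w2
        y = self.a[0] * w0 + self.a[1] * self.w1 + self.a[2] * self.w2
        self.w2, self.w1 = self.w1, w0  # shift the delay buffer
        return y


def cascade(sections, samples):
    """Pass every sample through each section in turn."""
    out = []
    for x in samples:
        for section in sections:
            x = section.step(x)
        out.append(x)
    return out
```

Four such sections called in sequence for every sample give an eighth-order filter, mirroring the internal subroutine that is called four times.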
The downsample subroutine 74 implements downsampling by
discarding predetermined samples. The downsample algorithm
uses the frame counter to alternate between discarding the input
data point or scaling it to maintain the energy per frame. The
downsampling function reduces the filtered residual signal
sample data rate. This function is executed by a frame position
pointer. The sample is either discarded or magnitude-scaled
(multiplied by a predetermined factor) to maintain the average
frame energy of the residual signal. If, for example, the
downsampling ratio is two, the scaling factor is also two.
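The discard-or-scale behavior can be sketched in a few lines. The function name is hypothetical, and using the ratio itself as the scale factor for ratios other than two is an assumption extrapolated from the single example in the text.

```python
def downsample(frame, ratio=2):
    """Keep every ratio-th sample, scaled by ratio; discard the rest."""
    return [ratio * x for position, x in enumerate(frame) if position % ratio == 0]
```

For a 180-sample frame and a ratio of two, 90 scaled samples remain to be quantized.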
A decision 75 that the frame is complete concludes the
residual signal generation routine of Figure 4.
The sample input 71 and inverse filter 72 subroutines
and the decision 75 are integrated together and control the
calling hierarchy for the other subroutines in the residual
signal generation routine of Figure 4. The order of such
calling hierarchy is band limit 73 and downsample 74.
The quantization routine represented by the flow chart
of Figure 5 includes the following subroutines: LPC coefficient
quantization 68 (discussed above in relation to the LPC analysis
subroutine 66), pitch delay 78, pitch predictor gain 80, quantizer
gain 81, CRC 82, PPADPCM quantization 83 and data format 84.
The LPC coefficient quantization subroutine 68 quantizes
the ten LPC reflection coefficients 14. This subroutine obtains
its input data from the LPC reflection coefficients 14 and
quantizer look-up subroutine 68 during the operation of the LPC
analysis subroutine 66. This subroutine 68 is called by the
LPC analysis subroutine 66.
The reflection coefficients are quantized with a variable
number of bits per coefficient compatible with DOD standard
LPC-10 coding, which is described in T. E. Tremain, "The
Government Standard Linear Predictive Coding Algorithm: LPC-10",
Speech Technology, April 1982.
Data management is necessary because of the limited availability
of internal RAM in the TMS32010. Additional data
buffers may be located in external data memory, which has a
very slow access time (800 ns). A data management algorithm
performs buffer transfers between internal RAM and external data
memory to enable all routines to execute using internal RAM
memory.
The pitch delay subroutine 78 estimates the pitch period
to determine the pitch delay parameter T of the downsampled
residual signal 22 (Figure 1) used for the PPADPCM quantization
using an auto-correlation function (ACF) analysis of the signal
22. The inputs to the algorithm are the partial ACF of the
previous frame and the current residual signal frame. The output
from the algorithm is the estimated pitch delay T and the
updated partial ACF.
The pitch delay is updated at the frame rate. Pitch
analysis uses a simple auto-correlation detector:
$R(T) = \sum_{n=0}^{N-1} s_n s_{n-T}$ (Eq. 21)
The pitch delay, T, is chosen as the delay that maximizes R(T),
evaluating R(T) between $T_{min}$ and $T_{max}$. To enable an accurate
estimate of the pitch delay, the analysis must cover three
pitch periods, i.e., $N > 3T_{max}$. The limits of the pitch detection
are chosen experimentally using FORTRAN simulations of the
RELP vocoder algorithm; for example, $T_{min}$ is a 15 sample delay
and $T_{max}$ is a 40 sample delay. This corresponds to pitch
frequencies of 267 Hz and 100 Hz respectively if the downsampled
residual signal 22 has a sampling rate of 4 kHz. The value N
is therefore chosen to be two downsampled frames. The
auto-correlation detector, R(T), is evaluated as two partial ACFs,
$R_1(T)$ and $R_2(T)$, where:
$R_1(T) = \sum_{n=0}^{M-1} s_n s_{n-T}$ (Eq. 22)
$R_2(T) = \sum_{n=M}^{2M-1} s_n s_{n-T}$ (Eq. 23)
M is a single downsampled frame. R(T) is calculated by adding
the current frame's partial ACF, $R_2(T)$, and the previous frame's
partial ACF, $R_1(T)$, that was stored in external data memory.
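The two-frame search built from per-frame partial ACFs can be sketched as follows. The function names and the use of a separate history buffer holding the samples that precede each frame are assumptions made for illustration; the delay limits of 15 and 40 samples are the example values from the text.

```python
T_MIN, T_MAX = 15, 40  # candidate delays in samples (example values)


def partial_acf(history, frame):
    """Partial ACF of one frame for every candidate delay T.

    history must hold at least T_MAX samples immediately preceding the frame.
    """
    buf = list(history) + list(frame)
    off = len(history)
    return {
        T: sum(buf[off + n] * buf[off + n - T] for n in range(len(frame)))
        for T in range(T_MIN, T_MAX + 1)
    }


def pitch_delay(prev_partial, history, frame):
    """Return (T, current partial ACF), where T maximizes the sum of the
    previous frame's and the current frame's partial ACFs."""
    cur = partial_acf(history, frame)
    total = {T: prev_partial[T] + cur[T] for T in cur}
    return max(total, key=total.get), cur
```

The returned partial ACF is what would be stored for use with the next frame, so each frame's correlation sums are computed only once.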
The pitch predictor gain subroutine 80 evaluates the pitch
predictor gain parameter B for the PPADPCM quantization and
updates such evaluation at the frame rate. The pitch predictor
gain B is evaluated as:
$B = \sum_{n=0}^{N-1} s_n s_{n-T} \Big/ \sum_{n=0}^{N-1} s_{n-T} s_{n-T}$ (Eq. 24)
where N is a single downsampled frame and T is the pitch delay.
B is constrained between two limits:
If B > 1.0, then B = 1.0
If B < 0.1, then B = 0.0
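The gain computation and its two limits can be sketched as below. The function name and the history buffer convention (samples preceding the frame, at least T of them) are hypothetical.

```python
def pitch_predictor_gain(history, frame, T):
    """Ratio of the cross term to the delayed-signal energy over one frame,
    clamped as described in the text."""
    buf = list(history) + list(frame)
    off = len(history)
    num = sum(buf[off + n] * buf[off + n - T] for n in range(len(frame)))
    den = sum(buf[off + n - T] ** 2 for n in range(len(frame)))
    B = num / den if den > 0 else 0.0
    if B > 1.0:
        B = 1.0
    if B < 0.1:
        B = 0.0  # small or negative gains disable the predictor
    return B
```

Setting small gains to zero means that frames with no useful periodicity are quantized without pitch prediction.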
The quantizer gain subroutine 81 evaluates the quantizer
gain parameter qgain for the PPADPCM quantization and updates
such evaluation at the frame rate. This parameter is used to
scale the quantizer to the input signal level; each input and
output level of the quantizer is multiplied by qgain. The
parameter is chosen to be the maximum $Q_n$:
$Q_n = |s_n - B\,s_{n-T}|, \quad n = 0, \ldots, M-1$ (Eq. 25)
where M is a single downsampled frame, T is the
pitch delay, and B is the pitch predictor gain.
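The maximum-deviation gain can be sketched as follows, assuming the deviation is the magnitude of the pitch prediction error. The function name and the history buffer convention are hypothetical.

```python
def quantizer_gain(history, frame, T, B):
    """Largest magnitude of the pitch prediction error over one frame."""
    buf = list(history) + list(frame)
    off = len(history)
    return max(abs(buf[off + n] - B * buf[off + n - T]) for n in range(len(frame)))
```

Scaling the quantizer levels by this value keeps every open-loop prediction error in the frame within the quantizer range.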
The CRC subroutine 82 introduces an n-bit cyclic redundancy
code (CRC) on part of the transmission frame to enable
detection of bit errors during transmission. The code protects
the LPC coefficients and PPADPCM parameters. The input to the
subroutine is the relevant quantized coefficients. The output
from the subroutine is an n-bit CRC to be transmitted.
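A generic bit-serial CRC is sketched below for illustration. The text does not give the code width or generator polynomial, so the 8-bit width and the polynomial 0x07 used as defaults here are purely hypothetical placeholders, as is the function name.

```python
def crc_bits(data, width=8, poly=0x07):
    """Bit-serial CRC over a sequence of byte values, most significant bit first.

    width and poly are placeholder values, not taken from the patent.
    """
    reg = 0
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    for byte in data:
        for bit in range(7, -1, -1):
            inbit = (byte >> bit) & 1
            feedback = (1 if reg & top else 0) ^ inbit
            reg = (reg << 1) & mask
            if feedback:
                reg ^= poly
    return reg
```

The receiver would run the same computation over the protected parameters and compare the result with the transmitted code; any single-bit error changes the result.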
The PPADPCM subroutine 83 quantizes the downsampled
residual signal 22, using Pitch Predictive Adaptive Differential
Pulse Code Modulation (PPADPCM). The term "pitch predictive"
is misleading, however. The pitch predictor is used to remove
the dominant periodic frequency from the residual signal 22
prior to quantization. While this frequency is most commonly
the pitch period, the predictor may lock onto an alternate
frequency without degrading the operation of the quantizer.
Therefore a rigorous pitch extraction algorithm is not necessary.
The predictor removes the dominant periodicity of the waveform
to generate a "white noise" signal with a Gaussian probability
density function (pdf). This signal may then be quantized
using a classical Max quantizer, as described in J. Max,
"Quantizing for Minimum Distortion," IRE Trans. on Information
Theory, March 1960.
Figure 6 shows the structure of the PPADPCM quantizer.
The quantizer is embedded in the predictor loop so that the
error spectrum introduced by quantization is uniform. The
parameters of the quantizer are the pitch delay (T), the
quantizer gain (qgain), the pitch predictor gain (B), and the
order of the quantizer (Q). Experimentation determines that a
3-bit quantizer is adequate to ensure good subjective speech
quality at the receiver.
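Since Figure 6 is not reproduced here, the following is only a sketch of one common way to embed a quantizer in a pitch predictor loop. The function names are hypothetical, and the eight uniform levels are a placeholder standing in for the Max quantizer levels, which are not listed in the text.

```python
# Placeholder 3-bit (8-level) normalized output levels. The text calls for a
# Max quantizer; uniform levels are used only to keep the sketch short.
LEVELS = [-0.875, -0.625, -0.375, -0.125, 0.125, 0.375, 0.625, 0.875]


def nearest_level(value, qgain):
    """Index of the scaled level closest to value."""
    return min(range(len(LEVELS)), key=lambda i: abs(value - qgain * LEVELS[i]))


def ppadpcm_encode(frame, history, T, B, qgain):
    """Quantize the pitch prediction error inside the predictor loop.

    history holds at least T previously reconstructed samples, so the
    prediction uses the same values the decoder will have.
    """
    recon = list(history)
    codes = []
    for s in frame:
        pred = B * recon[-T]
        idx = nearest_level(s - pred, qgain)
        codes.append(idx)
        recon.append(pred + qgain * LEVELS[idx])
    return codes, recon[len(history):]


def ppadpcm_decode(codes, history, T, B, qgain):
    """Rebuild the signal from the codes by pitch excitation."""
    recon = list(history)
    for idx in codes:
        recon.append(B * recon[-T] + qgain * LEVELS[idx])
    return recon[len(history):]
```

Because the encoder predicts from reconstructed rather than original samples, the decoder reproduces the encoder's reconstruction exactly and quantization error does not accumulate through the predictor.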
The data format subroutine 84 formats a data frame 34
(Figure 1) for transmission. The input to the subroutine 84 is
a predetermined number of quantized residual signal samples 25,
the pitch delay parameter 26, the pitch predictor gain 27, the
quantizer gain 28, the quantized LPC coefficients 31 (Figure 1)
and the CRC. The output from the subroutine 84 is a transmission
data frame 34 which is placed in the output buffer.
A decision 85 that the frame is complete concludes the
quantization routine of Figure 5.
The calling hierarchy of the subroutines in the quantization
routine of Figure 5 is under the control of the main
program. The following subroutines are integrated together in
a subroutine designated PPQNT: pitch predictor gain 80, quantizer
gain 81 and PPADPCM quantization 83. The calling hierarchy is
as follows: pitch 78, PPQNT, CRC 82 and data format 84. The
subroutine 68 is called by the LPC analysis subroutine 66 in
the LPC analysis routine of Figure 3.
The receiver digital signal processor utilizes a synthesis
processing routine. Referring to Figure 7, the synthesis routine
includes the following subroutines: initialization 88, data
input 89, CRC check 90, LPC coefficient generation 91, PPADPCM
decoding 92, spectral regeneration 93, LPC synthesis filter 94,
de-emphasis 95, and speech output 97.
The initialization subroutine 88 is included in the main
program for the receiver processor. The initialization sub-
routine 88 initializes all registers and data locations within
the processor prior to the execution of each subroutine.
The data input subroutine 89 also is included in the main
program for the receiver processor. This subroutine inputs the
data transmission frame 36 received from the transmitter by
inputting the frame from a frame buffer in external data memory.
The CRC check subroutine 90 uses the received transmission
data frame to generate an n-bit CRC which it compares to the
n-bit CRC in the received transmission data frame to check for
transmission errors. If any errors are detected, a subset of
the LPC and PPADPCM parameters for the current frame is
discarded and a subset of the previous frame's parameters is
substituted. The input to this subroutine is an n-bit CRC word
from the data transmission frame. The output from this
subroutine is a flag indicating which set of parameters to use
during the rest of the synthesis routine.
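The check-and-substitute behavior can be illustrated as follows. The text specifies neither the CRC width n, the generator polynomial, nor which subset of parameters is replaced, so this sketch assumes an 8-bit CRC with polynomial 0x07 and, for simplicity, swaps the whole parameter set. The names `crc8` and `select_parameters` are hypothetical.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Bitwise CRC-8, MSB first, zero initial value (assumed parameters)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def select_parameters(payload: bytes, received_crc: int, current, previous):
    """Return (parameters_to_use, frame_ok).

    The flag corresponds to the subroutine output described in the text.
    Replacing every parameter on error is a simplification made here; the
    text only says that a subset is replaced.
    """
    frame_ok = crc8(payload) == received_crc
    return (current if frame_ok else previous), frame_ok
```

A caller would keep the parameters of the last good frame and pass them in as `previous` when decoding the next frame.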
The LPC coefficient generation subroutine 91 reads in the
transmitted quantized LPC parameters, calls a subroutine
IQRC to decode the LPC reflection coefficients, and performs a
step-up algorithm to transform the LPC reflection coefficients
to the LPC coefficients. The input to this subroutine is the
transmitted quantized LPC reflection coefficients 43 and the
output is the LPC coefficients 51 (Figure 2).
Prior to LPC synthesis filtering 52, the LPC coefficients
must be generated from the transmitted quantized LPC reflection
coefficients. These quantized LPC reflection coefficients must
be unpacked and decoded using the quantizer look-up tables
described in T. E. Tremain, "The Government Standard Linear
Predictive Coding Algorithm: LPC-10", Speech Technology,
April 1982. The LPC coefficients are then generated from the
decoded LPC reflection coefficients using the step-up algorithm,
a recursive algorithm which is a subset of the Durbin algorithm
described in J. Makhoul, "Linear Prediction: A Tutorial Review,"
Proc. IEEE, Vol. 63, pp. 561-580, 1975.
The recursion is as follows:
Initialization: $a_1^{(1)} = k_1$ (Eq. 26)
Recursion, for $i = 2, \ldots, P$: $a_i^{(i)} = k_i$ (Eq. 27)
$a_j^{(i)} = a_j^{(i-1)} + k_i \, a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1$ (Eq. 28)
where $k_i$ is the $i$th reflection coefficient and $a_j^{(i)}$ is the
$j$th LPC coefficient at the $i$th iteration. The order of the
transmitter LPC analysis, P, is ten.
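As a worked illustration of the step-up recursion of Eqs. 26 to 28, the following minimal Python sketch converts a list of reflection coefficients into LPC coefficients. The function name `step_up` is an assumption, and the sign convention follows Eq. 28 as printed; other references define the reflection coefficients with the opposite sign.

```python
def step_up(k):
    """Convert reflection coefficients [k_1, ..., k_P] to [a_1, ..., a_P]."""
    a = []  # coefficients of the previous iteration, a_j^(i-1)
    for i, k_i in enumerate(k, start=1):
        prev = a
        # Eq. 28: a_j^(i) = a_j^(i-1) + k_i * a_{i-j}^(i-1), for 1 <= j <= i-1
        a = [prev[j - 1] + k_i * prev[i - j - 1] for j in range(1, i)]
        # Eqs. 26 and 27: the newest coefficient equals the reflection coefficient
        a.append(k_i)
    return a
```

For example, with k = [0.5, 0.2] the second iteration gives a_1 = 0.5 + 0.2 * 0.5 = 0.6 and a_2 = 0.2.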
The PPADPCM decoding subroutine 92 reads in the bit-packed
quantized residual signal 39 and quantizer parameters 40, 41, 42
received from the transmitter and generates a decoded baseband
(downsampled) residual signal 47 (Figure 2). This subroutine 92
must perform the inverse operation of the transmitter's PPADPCM
coding. It therefore divides into three parts: unpacking,
quantizer look-up, and pitch excitation.
The PPADPCM decoding subroutine first transfers the
PPADPCM quantizer coefficients to internal RAM and scales them
using the quantizer gain parameter. The inputs to this operation
are the coefficient buffer stored in external data memory and
the quantizer gain. The output of this operation is the scaled
look-up table located in internal RAM.
This subroutine 92 next reads in packed data bytes from a
data buffer in external data memory, unpacks each byte, and
decodes the data samples using the quantizer look-up table.
The input to this operation is the bit-packed data word and the
quantizer coefficient table. The output from this operation is
the set of decoded data samples. The received data bytes are
unpacked into individual data samples by masking off each
individual data sample, which may then be decoded using a
quantizer look-up table that is identical to the one used at
the transmitter to quantize the data samples.
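The unpacking and look-up steps might be sketched as below. The actual bit layout of the frame is not given in this section, so the sketch assumes that the 3-bit codes are packed back to back, most significant bit first; `unpack_codes` and `decode_samples` are hypothetical names.

```python
def unpack_codes(packed: bytes, count: int, bits: int = 3):
    """Mask off `count` codes of `bits` bits each from an MSB-first bit stream."""
    total_bits = 8 * len(packed)
    if count * bits > total_bits:
        raise ValueError("not enough packed data for the requested sample count")
    stream = int.from_bytes(packed, "big")
    mask = (1 << bits) - 1
    codes = []
    for n in range(count):
        shift = total_bits - bits * (n + 1)   # position of the n-th code
        codes.append((stream >> shift) & mask)
    return codes


def decode_samples(codes, table, gain):
    """Scale the quantizer table by the quantizer gain, then look up each code."""
    scaled = [gain * level for level in table]  # scaled look-up table
    return [scaled[c] for c in codes]
```

Scaling the table once per frame, rather than multiplying every decoded sample, mirrors the order of operations described above.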
The PPADPCM decoding subroutine 92 then implements a
variable-delay first-order difference equation to "pitch excite"
the input data and recover the downsampled residual signal 47.
The input to this operation is the transmitted data sample, the
pitch delay parameter and the pitch predictor gain parameter.
The output from this operation is the downsampled residual
signal 47. The difference equation for this operation is:
$s_n = x_n + B \, s_{n-T}$ (Eq. 29)
where $s_n$ is the downsampled residual signal sample, $x_n$ is the
transmitted data sample, B is the pitch predictor gain, and
T is the current frame's pitch delay (period).
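A minimal sketch of the pitch-excitation recursion of Eq. 29 follows. The function name `pitch_excite` and the `history` argument (the last T or more output samples carried over from the previous frame) are assumptions for illustration.

```python
def pitch_excite(x, T, B, history=None):
    """Apply s[n] = x[n] + B * s[n - T] to the decoded samples x.

    history (hypothetical) supplies at least T earlier output samples;
    it defaults to silence.
    """
    s = list(history) if history is not None else [0.0] * T
    start = len(s)
    for sample in x:
        s.append(sample + B * s[-T])  # s[-T] is the output T samples back
    return s[start:]
```

Feeding a single unit pulse with T = 2 and B = 0.5 yields 1, 0, 0.5, 0, 0.25: the pulse recurs every T samples, attenuated by B each time, which is how the periodicity removed at the transmitter is restored.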
The spectral regeneration subroutine 93 is included in
the main program for the receiver processor. The spectral
regeneration subroutine 93 generates a full-band residual
signal 49 from the downsampled residual signal 47. The effect
is to convert a 4 kHz downsampled signal 47 to an 8 kHz
full-band signal 49.
The LPC synthesis filter subroutine 94 implements an auto-
regressive LPC synthesis filter governed by the LPC coefficients.
The inputs to this subroutine are the LPC coefficients 51 and
the regenerated full-band residual signal 49. The output from
this subroutine is the regenerated preemphasized speech data
signal sample 53. This subroutine 94 generates the speech
data signal samples 53 by filtering the residual signal 49
with a tenth-order all-pole filter. The filter is governed by
the generated LPC coefficients 51. The transfer function of
the filter is:
$H(z) = \left[ 1 - \sum_{k=1}^{P} a_k z^{-k} \right]^{-1}$ (Eq. 30)
where $a_k$ is the kth LPC coefficient. If $x_n$ represents the
residual signal sample 49 at time n and $y_n$ represents the
corresponding regenerated preemphasized speech data signal
sample 53, the filter operation can be represented by the
following difference equation:
$y_n = x_n + a_1 y_{n-1} + a_2 y_{n-2} + \cdots + a_{10} y_{n-10}$ (Eq. 31)
The simplest way to implement this equation is to place the
coefficients $a_k$ in a fixed register and to implement the
delay buffer using a shift register. The TMS32010 microcode
is optimized to perform this operation using the LTD/MPY
commands: the processor has a pipelined Multiply-Accumulate
instruction that executes in 400 ns.
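The fixed-coefficient, shift-register arrangement can be mimicked in Python as a rough model of the all-pole recursion of Eq. 31. It is not a rendering of the TMS32010 code; the function name `lpc_synthesis` and the `state` argument are assumptions.

```python
def lpc_synthesis(x, a, state=None):
    """All-pole filter: y[n] = x[n] + a[0]*y[n-1] + ... + a[P-1]*y[n-P].

    a holds the LPC coefficients a_1..a_P. state (hypothetical) holds the
    previous P outputs, most recent first; it defaults to zeros.
    """
    P = len(a)
    delay = list(state) if state is not None else [0.0] * P
    out = []
    for sample in x:
        acc = sample
        for coeff, past in zip(a, delay):   # one multiply-accumulate per tap
            acc += coeff * past
        delay = ([acc] + delay)[:P]         # shift the delay line by one sample
        out.append(acc)
    return out
```

The inner loop pairs each stored coefficient with one delayed output, which is the multiply-and-shift pattern that a combined data-move and multiply instruction performs in a single step per tap.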
The de-emphasis subroutine 95 implements a first-order
digital de-emphasis filter. The inputs to this subroutine are
the current regenerated sample 53, the previous regenerated
sample, and the preemphasis constant. The output from this
subroutine is the regenerated speech data signal sample 55.
First-order digital de-emphasis is applied to complement
the preemphasis function in the transmitter processor.
De-emphasis uses a single-delay low-pass filter. The de-emphasis
constant is also set to 0.9375. The difference equation
for the filter is:
$y_n = x_n + A \, y_{n-1}$ (Eq. 32)
where, for this filter, $x_n$ is the input sample 53, $y_n$ is the
de-emphasized output sample 55, and A is the de-emphasis constant.
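The de-emphasis recursion of Eq. 32 is short enough to show directly. In this sketch the function name `de_emphasize` and the `prev` argument (the last output of the previous frame) are assumptions; the default constant is the 0.9375 stated above.

```python
def de_emphasize(x, A=0.9375, prev=0.0):
    """First-order de-emphasis: y[n] = x[n] + A * y[n-1]."""
    out = []
    for sample in x:
        prev = sample + A * prev   # single-delay recursive low-pass
        out.append(prev)
    return out
```

If the transmitter pre-emphasis has the usual first-difference form x[n] - A * x[n-1] (an assumption here, since that filter is defined elsewhere in the document), this recursion undoes it exactly: the pre-emphasized sequence 1, 1.0625, 1.125 maps back to 1, 2, 3.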
The speech output subroutine 97 also is included in the
main program for the receiver processor. This subroutine out-
puts the regenerated speech data signal samples to a data
buffer in external data memory from which the samples are
provided.
A decision 98 that the frame has been completed concludes
the synthesis routine of Figure 7.
The calling hierarchy for the synthesis routine of Figure 7
is controlled by the main program for the receiver processor and
calls the following subroutines in the following order: CRC
check 90, LPC coefficient generation 91, PPADPCM decoding 92,
inverse filter 94 and de-emphasis 95.
Transmitter and receiver systems that are commonly
located may be included in a single digital processor.