Note: Descriptions are shown in the official language in which they were submitted.
CA 02206454 1997-05-29
PD960037 D94/156A-Ha-300197
Method and apparatus for coding digital video signals
The invention relates to a method and an
apparatus for coding digital video signals.
Prior art
For future digital television broadcasting
(DVB = Digital Video Broadcasting) and the interactive
communications services associated therewith, video
data must be substantially reduced in terms of the
volume of data by means of suitable coding devices.
This step is necessary in order to be able to transmit
more programmes via existing channels or to communicate
moving picture sequences in already existing narrow-
band transmission paths. One coding method provided for
this purpose is the MPEG2 standard (ISO/IEC 13818).
Invention
In order to obtain the highest possible
compression factor during coding, the input pictures
are combined in Groups of Pictures (GOP) in the case of
MPEG. The individual pictures are coded differently
within such a Group of Pictures. According to MPEG, a
Group always consists of one intraframe-coded frame
(I frame) as well as normally a plurality of P frames
(predicted frame) and/or B frames (bidirectionally
calculated frames), it being true here as well as in
the following text that "frame" may also be replaced by
"field".
The data rate necessary for transmitting a frame from
this Group depends on the relevant frame type (I, P or
B) as well as the current picture contents. The largest
relative volume of data within a Group of Pictures is
allotted to the I frames. They contain all the data
required for complete reconstruction in the decoder. In
contrast, P frames are predicted from I frames or a
preceding P frame, that is to say the presence of a
complete I frame in the relevant Group of Pictures is
necessary for the reconstruction of the P frames at the
receiving end. For this purpose, it is then necessary
to code only the difference from the preceding I or P
frame. B frames, on the other hand, are essentially
calculated (interpolated) from already reconstructed I
or P frames. The volume of data which must be
transmitted for a B frame is correspondingly low.
Certain picture types may also contain individual
macroblocks of other picture types, for example
intraframe-coded macroblocks may occur in P and B
pictures at picture excerpts which could otherwise be
coded only with insufficient efficiency.
Although the above-described classification of the
frames into different frame types permits a very high
coding efficiency, it imparts to the different frame
types different degrees of sensitivity to transmission
errors. Thus, transmission or reconstruction errors
within B frames remain restricted to the corresponding
B frame, whereas erroneous I and P frames can affect
the entire Group of Pictures (GOP). In the case of the
most frequently chosen MPEG2 parameter of 12 frames per
GOP, the time duration for this may last up to
approximately half a second (at 50 Hz frame frequency,
at 60 or 59.94 Hz frame frequency correspondingly
shorter) and is thus very disturbing to a viewer.
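As a rough check of the figure given above, the duration of a Group of Pictures follows from the number of frames per group and the frame rate. The sketch below assumes that the 50 Hz figure denotes the field rate of an interlaced system, that is to say 25 complete frames per second, and correspondingly 30000/1001 frames per second for 59.94 Hz; the function name is chosen purely for illustration.

```python
def gop_duration_seconds(frames_per_gop, frames_per_second):
    """Return the time span covered by one Group of Pictures."""
    return frames_per_gop / frames_per_second


# Assumption: 50 Hz and 59.94 Hz are field rates, so the frame rate is half.
duration_50hz = gop_duration_seconds(12, 25.0)
duration_5994hz = gop_duration_seconds(12, 30000 / 1001)

print(round(duration_50hz, 2))    # 0.48, i.e. approximately half a second
print(round(duration_5994hz, 2))  # 0.4, correspondingly shorter
```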
A similar effect regarding error degradation
may also occur when, during the coding of a feature
film, a change in the picture scene occurs within a
Group of Pictures. In this case, no rational basis for
the prediction of the B and P frames exists for the
current Group of Pictures; rather, the first picture of
the new scene is largely coded with intraframe
macroblocks, on account of the internal control of the
coder. Since the allocated volume of data for a P frame
is not large enough to intraframe-code large portions
of the picture with good quality, the reconstruction
result in the decoder will remain unsatisfactory. Only
in the following Group of Pictures do stable conditions
exist once again for the coding process, which permit
satisfactory reconstruction.
Previous coding processes for data reduction are based
on feeding the volume of data (picture sequences),
which originates from a film scanner, for example, to
an MPEG coder which continuously generates a data-
reduced bit stream. In this concept, scene cuts of the
original film coincide completely arbitrarily with a
frame type determined by the coder, which may lead to
the above-described error degradation within a Group of
Pictures.
The invention is based on the object of
specifying a method in which the coding process is
synchronized with a scene change that is present. This
object is achieved by means of the method specified in
Claim 1.
The invention is based on the further object of
specifying an apparatus for application of the method
according to the invention. This object is achieved by
means of the apparatus specified in Claim 6.
For this purpose, a scene detector is connected
between the picture generator (film scanner, camera,
recording device or another signal source) and the
coder. The said scene detector generates a suitable
control signal and causes the coding process, in the
event of a scene change, to begin with a new Group of
Pictures, that is to say with an I-coded picture. This
advantageously prevents a scene change from falling in
the middle of a Group of Pictures, for example, and
thus reconstruction with impaired quality from being
engendered at the decoder. This measure advantageously
requires no additional outlay in the decoder and
consequently does not lead to an increase in the
complexity and hence the costs in the end unit, that is
to say a decoder in a set-top box or in a television
set/video recorder/DVD player (Digital Video Disk).
In principle, the method according to the
invention consists in the fact that, for coding digital
video signals, in which, in each case with a defined
sequence, an intraframe-coded picture and at least one
other picture coding type for a further picture or
further pictures are used in a group of successive
pictures, pixel values which change greatly from one
picture to another picture or greatly changing picture
contents are determined in a detector and the further
coding is controlled in such a way that intraframe
coding is carried out for a picture having greatly
changing pixel values or greatly changing picture
contents, independently of the position of this picture
within the group.
Advantageous developments of the method
according to the invention emerge from the associated
dependent claims.
In principle, the inventive apparatus for
coding digital video signals, in which, in each case
with a defined sequence, an intraframe-coded picture
(I) and at least one other picture coding type (P, B)
for a further picture or further pictures are used in a
group of successive pictures, is provided with:
- a detector, which determines pixel values which
change greatly from one picture to another picture or
greatly changing picture contents;
- a coder, which is controlled by the detector in such
a way that intraframe coding is carried out for a
picture having greatly changing pixel values or greatly
changing picture contents, independently of the
position of this picture within the group.
Advantageous developments of the apparatus
according to the invention emerge from the associated
dependent claim.
Drawing
An exemplary embodiment of the invention is
described with reference to the drawing, in which:
Figure 1 shows an example of a scene-controlled MPEG
coder.
Exemplary embodiments
The video signal is drawn from a picture
generator 1, which may be a film scanner, a television
camera or any desired analog picture source (for
example a tape recording device). This signal is first of
all fed to an analog/digital converter 2 in order that
digital input data can be made available for the coder.
If the signal generator which is available is already a
digital signal source, the latter can be connected to
the circuit arrangement via input 3. The signal at
input 3 is then fed to a scene detector 10, which
comprises, for example, a frame memory 4, a subtraction
stage 5 and a threshold value decision circuit 6. The
current frame n of the signal source is available at
the input of the frame memory 4 and the preceding frame
n-1 delayed by one frame period is available at the
output. The sum of the absolute value differences
between the pixels of the two frames, for example, is
calculated by means of the subtraction stage 5. This
summation value is then fed to the threshold value
decision circuit 6. Depending on this summation value,
the threshold value decision circuit generates a
control signal for the MPEG coder 8. The characteristic
of the threshold value decision circuit is in this case
dimensioned in such a way that differences in the same
picture scene which are caused, for example, by moving
objects or by slow camera panning trigger no control
signal or a first control signal (in the case of a
relatively small summation value), but that a scene
change which is characterized by completely different
picture information leads to an unambiguous control
signal or a second control signal (in the case of a
relatively large summation value).
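The operation of the subtraction stage 5 and of the threshold value decision circuit 6 can be sketched as follows. This is a minimal illustration only: frames are represented as nested lists of luminance values, and the function names, the sample pixel values and the threshold are assumptions chosen for the example, not values taken from the description.

```python
def sum_abs_diff(frame_a, frame_b):
    """Sum of absolute pixel differences between two equally sized frames."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def is_scene_change(current, previous, threshold):
    """Threshold decision: True only for a relatively large summation value."""
    return sum_abs_diff(current, previous) > threshold


# Tiny 2x4 frames with hypothetical 8-bit luminance values.
frame_prev = [[10, 12, 11, 10], [10, 11, 12, 10]]
frame_motion = [[10, 13, 11, 10], [11, 11, 12, 10]]        # same scene, slight motion
frame_cut = [[200, 190, 180, 210], [205, 195, 185, 200]]   # completely different content

THRESHOLD = 8 * 30  # assumed: mean difference of 30 per pixel over 8 pixels

print(is_scene_change(frame_motion, frame_prev, THRESHOLD))  # False
print(is_scene_change(frame_cut, frame_prev, THRESHOLD))     # True
```

In practice the threshold would have to be dimensioned for the picture size and source material, as stated above for the characteristic of the decision circuit.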
In order that the control signal is present
contemporaneously with the scene change at the coder 8,
the frame of the signal source must be correspondingly
delayed by one frame period in the buffer 7. The coder
8 then identifies from the control signal whether the
frame present at its input belongs to a new picture
scene and, if this is the case, causes the coding
algorithm to begin with a new Group of Pictures and
therefore with an I-coded picture. The data-reduced
output signal is available at output 9.
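The interplay of frame memory 4, buffer 7 and coder 8 described above can be sketched in software as a stream that hands each frame to the coder together with its control signal. In this sketch the variable `previous` stands in for frame memory 4, and computing the flag before the frame is yielded stands in for the one-frame delay of buffer 7. The function name, the flat-list frame representation and the rule that the very first frame starts a Group of Pictures are assumptions made for illustration.

```python
def scene_controlled_stream(frames, threshold):
    """Yield (frame, new_gop) pairs in input order.

    new_gop is True when the coder should begin a new Group of Pictures,
    that is to say code this frame as an I picture.
    """
    previous = None  # plays the role of frame memory 4
    for frame in frames:
        if previous is None:
            new_gop = True  # assumed: the first frame always starts a group
        else:
            diff = sum(abs(a - b) for a, b in zip(frame, previous))
            new_gop = diff > threshold
        previous = frame
        # The flag is ready before the frame is passed on, which is what
        # the delay in buffer 7 achieves in the circuit arrangement.
        yield frame, new_gop


# Four hypothetical two-pixel frames; the scene changes at the third frame.
source = [[10, 10], [11, 10], [200, 200], [201, 199]]
flags = [new_gop for _, new_gop in scene_controlled_stream(source, threshold=50)]
print(flags)  # [True, False, True, False]
```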
In order, for example, to identify greater
camera panning as such, the scene detector 10 can also
carry out internally a global video signal prediction
in a global predictor 11. Such a global predictor is
described, for example, in EP-A-0 414 113. For this
purpose, the input signal and the output signal of
memory 4 are fed to the predictor 11. If this predictor
identifies global motion in the picture, the
subtraction stage 5, the threshold value decision
circuit 6 and/or the coder 8 are controlled in such a
way that the coder 8 does not deviate from the normal
I-picture sequence. The global motion parameters can be
forwarded by the predictor 11 to the coder 8. The
advantage in doing this is that pictures with global
motion can be coded with good efficiency as P or B
pictures and, therefore, the number of I-coded pictures
and hence the data rate do not have to be increased
unnecessarily, that is to say that detectable (global)
camera panning is not interpreted as a scene change. It
is advantageously possible for the result of this
global prediction to be included during "normal"
prediction in the coder. This enables the prediction in
the coder 8 to be simplified or improved.
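The effect of the global predictor 11 on the decision can be illustrated as follows. The sketch does not reproduce the predictor of EP-A-0 414 113; it substitutes a deliberately simple exhaustive search over horizontal shifts, and all names, the search range and the sample values are assumptions. If some shift brings the two frames into agreement, the difference is attributed to camera panning and no scene change is signalled.

```python
def shifted_mad(current, previous, shift):
    """Mean absolute difference between current[x] and previous[x - shift],
    taken over the columns where both frames overlap."""
    total = 0
    count = 0
    width = len(current[0])
    for row_c, row_p in zip(current, previous):
        for x in range(width):
            xp = x - shift
            if 0 <= xp < width:
                total += abs(row_c[x] - row_p[xp])
                count += 1
    return total / count


def classify(current, previous, threshold, max_shift=2):
    """Return ('same', 0), ('pan', shift) or ('cut', None).

    max_shift must be smaller than the frame width so that the frames
    always overlap.
    """
    if shifted_mad(current, previous, 0) <= threshold:
        return ("same", 0)
    shifts = range(-max_shift, max_shift + 1)
    best = min(shifts, key=lambda s: shifted_mad(current, previous, s))
    if shifted_mad(current, previous, best) <= threshold:
        return ("pan", best)  # global motion: keep the normal I-picture sequence
    return ("cut", None)      # no shift explains the difference: scene change


previous = [[0, 50, 100, 150, 200, 250]]
panned = [[50, 100, 150, 200, 250, 255]]   # content moved one pixel to the left
other = [[255, 0, 255, 0, 255, 0]]         # unrelated content

print(classify(panned, previous, threshold=20))  # ('pan', -1)
print(classify(other, previous, threshold=20))   # ('cut', None)
```

The shift found in the "pan" case corresponds to the global motion parameters that can be forwarded to the coder 8.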
In further exemplary embodiments of the
invention, the storage capacity for the scene detector
can be reduced. A scene change can also be identified
on the basis of fields. The memories 4 and 7 are then
field memories and the pixel values of two adjacent
fields are then correspondingly processed with one
another in the subtractor 5. The two fields may also
originate from adjacent frames.
A different or a further reduction in the
storage capacity is possible with the aid of video data
reduction upstream of the input of memory 4 and
possibly also buffer 7, which can be carried out by
means of horizontal and/or vertical low-pass filtering
of the video data in conjunction with subsampling in
the corresponding direction.
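A minimal sketch of such a data reduction, assuming a 2x2 averaging filter as the low-pass filter and subsampling by a factor of two in both directions, is given below. The filter choice, the factor and the function name are assumptions for illustration; other filters and factors are equally conceivable.

```python
def downsample_2x2(frame):
    """Average each 2x2 block (a simple low-pass filter) and keep one
    value per block (subsampling by two horizontally and vertically)."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


# Hypothetical 4x4 frame of luminance values.
frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
reduced = downsample_2x2(frame)
print(reduced)  # [[10, 20], [30, 40]]
```

With these assumed parameters the memory holds one quarter of the original number of pixel values.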
In the case of a scene change, the I coding can either
be additionally inserted into the sequence present or,
alternatively, the normal sequence can be
continued again from the scene change, in other words,
for example, every 12 pictures an I picture, the
picture with the scene change being the first of these
I pictures.
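The two alternatives can be compared with the following sketch, which marks only whether a picture is intraframe-coded ("I") or not ("-") and does not distinguish P from B pictures. The function name, the `restart` parameter and the group length of 4 used in the example are assumptions chosen to keep the output short.

```python
def intra_flags(scene_change, gop_length=12, restart=True):
    """Return one boolean per picture: True if it is coded as an I picture.

    restart=False: the I coding is additionally inserted into the sequence present.
    restart=True:  the normal sequence is continued again from the scene change.
    """
    flags = []
    position = 0  # position within the regular sequence
    for cut in scene_change:
        if cut and restart:
            position = 0  # the scene-change picture becomes the first I picture
        flags.append(cut or position == 0)
        position = (position + 1) % gop_length
    return flags


def as_text(flags):
    return "".join("I" if f else "-" for f in flags)


# Twelve pictures with a scene change at the seventh picture (index 6).
cuts = [i == 6 for i in range(12)]

print(as_text(intra_flags(cuts, gop_length=4, restart=False)))  # I---I-I-I---
print(as_text(intra_flags(cuts, gop_length=4, restart=True)))   # I---I-I---I-
```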
The invention is not just restricted to the
studio sector, but can also be employed in the consumer
sector on data media, in particular optical disks,
which contain video data coded according to the
invention. For example, the invention can be used to
improve the quality of the recording of video signals
on a digital home video recorder or DVD recorder in the
manner illustrated. In this case, the scene detector is
not to be regarded as part of the picture generator,
but rather is implemented as an additional circuit
element in the recording unit.
In the case of received video signals which are
coded with a fixed GOP length, these signals may first
of all be decoded and then encoded with a variable GOP
length for the recording. During the recording of
digital video signals which are coded according to the
invention, are provided with corresponding, variable
GOP length information and are publicly transmitted, or
in the case of the reproduction only of prerecorded
data media, the scene detector in the recording unit
can then be omitted or it is possible not to evaluate
its output signal for the recording.
The invention is not restricted to the MPEG2 or
MPEG1 coding standard. The invention can be applied to
all coding processes which perform segmentation of the
video data into groups of pictures, for example MPEG4.