Patent 3172605 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 3172605
(54) English Title: VIDEO JITTER DETECTION METHOD AND DEVICE
(54) French Title: PROCEDE ET DISPOSITIF DE DETECTION DE SCINTILLEMENT VIDEO
Status: Granted and Issued
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04N 5/14 (2006.01)
  • G06T 7/20 (2017.01)
(72) Inventors :
  • MU, CHONG (China)
  • ZHOU, XUYANG (China)
  • LIU, ERLONG (China)
  • GUO, WENZHE (China)
(73) Owners :
  • 10353744 CANADA LTD.
(71) Applicants :
  • 10353744 CANADA LTD. (Canada)
(74) Agent: HINTON, JAMES W.
(74) Associate agent:
(45) Issued: 2024-01-02
(86) PCT Filing Date: 2020-06-11
(87) Open to Public Inspection: 2020-12-24
Examination requested: 2022-09-21
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/CN2020/095667
(87) International Publication Number: CN2020095667
(85) National Entry: 2022-09-21

(30) Application Priority Data:
Application No. Country/Territory Date
201910546465.X (China) 2019-06-21

Abstracts

English Abstract

Disclosed in the present invention are a video jitter detection method and device. The method comprises: framing a video requiring detection to obtain a frame sequence; performing feature point detection on the frame sequence frame by frame to obtain a feature point of each frame, and generating a frame feature point sequence matrix; performing operation on the frame feature point sequence matrix on the basis of an optical flow tracking algorithm to obtain a motion vector of each frame; obtaining the feature value of the video requiring detection according to the motion vector of each frame; and obtaining an output signal by means of operation by using the feature value of the video requiring detection as an input signal of a detection model, and determining whether jitter occurs to the video requiring detection according to the output signal. According to the present invention, the use of feature point detection and the use of an optical flow tracking algorithm for feature points effectively solve the problem of tracking failure caused by excessive changes between two adjacent frames, and the present invention has good sensitivity and robustness when performing detection on videos captured in cases such as sudden large displacement, strong shaking, and large rotation of a camera lens.


French Abstract

La présente invention concerne un procédé et un dispositif de détection de scintillement vidéo. Le procédé comprend les étapes suivantes : mettre en trame une vidéo nécessitant une détection afin d'obtenir une séquence de trames ; effectuer une détection de point caractéristique sur la séquence de trames, trame par trame, afin d'obtenir un point caractéristique de chaque trame, et générer une matrice de séquence de points caractéristiques de trame ; effectuer une opération sur la matrice de séquence de points caractéristiques de trame sur la base d'un algorithme de suivi de flux optique afin d'obtenir un vecteur de mouvement de chaque trame ; obtenir la valeur caractéristique de la vidéo nécessitant une détection selon le vecteur de mouvement de chaque trame ; et obtenir un signal de sortie au moyen d'une opération en utilisant la valeur caractéristique de la vidéo nécessitant une détection en tant que signal d'entrée d'un modèle de détection, et déterminer si le scintillement se produit sur la vidéo nécessitant une détection selon le signal de sortie. Selon la présente invention, l'utilisation d'une détection de point caractéristique et l'utilisation d'un algorithme de suivi de flux optique pour des points caractéristiques résolvent efficacement le problème de suivi d'une défaillance provoqué par des changements excessifs entre deux trames adjacentes, et la présente invention présente une bonne sensibilité et une bonne robustesse lors de la réalisation d'une détection sur des vidéos capturées dans des cas tels qu'un déplacement important soudain, un tremblement fort, et une rotation importante d'un objectif de caméra.

Claims

Note: Claims are shown in the official language in which they were submitted.


Claims:
1. A device for detecting video jitter, comprising:
a framing processing module configured to perform a framing process on a to-be-
detected
video to obtain a frame sequence;
a feature point detecting module configured to perform feature point detection
on the frame
sequence frame by frame, obtain feature points of each frame by employing a
feature point
detecting algorithm including fused FAST (Features from Accelerated Segment
Test) and
SURF (Speeded-Up Robust Features), and generate a frame feature point sequence
matrix;
a vector calculating module configured to perform an operation on the frame
feature point
sequence matrix to obtain a motion vector of each frame based on an optical
flow tracking
algorithm;
a feature value extracting module configured to obtain a feature value of the
to-be-detected
video according to the motion vector of each frame; and
a jitter detecting module configured to take the feature value of the to-be-
detected video as
an input signal of a detection model to perform operation to obtain an output
signal, and
judge whether jitter occurs to the to-be-detected video according to the
output signal.
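The framing step recited in claim 1 can be pictured with a short Python/OpenCV sketch; this is an illustrative reading of the claim rather than the patented implementation, and the function name and use of OpenCV are assumptions. The later modules of claim 1 are sketched after the claims that describe them in more detail.

import cv2

def frame_video(path):
    """Perform a framing process on a to-be-detected video and return its frame sequence."""
    cap = cv2.VideoCapture(path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # L_i, i = 1..n
    cap.release()
    return frames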
2. The device of claim 1, wherein the device further comprises a data
preprocessing module
configured to preprocess the frame sequence.
3. The device of any one of claims 1 to 2, wherein the/a data preprocessing
module further
comprises:
a grayscale-processing unit configured to grayscale-process the frame
sequence, and obtain
a grayscale frame sequence; and
a denoising processing unit configured to denoise the grayscale frame
sequence;
wherein the feature point detecting module is configured to perform feature
point detection
on the preprocessed frame sequence frame by frame.
4. The device of any one of claims 1 to 3, wherein the feature point
detecting module is further
configured to:
employ the feature point detecting algorithm in which are the fused FAST
(Features from
Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to perform
feature point detection on the frame sequence frame by frame, and to obtain
the feature
points of each frame.
5. The device of any one of claims 1 to 4, wherein the vector calculating
module further
comprises:
an optical flow tracking unit configured to perform optical flow tracking
calculation on the
frame feature point sequence matrix of each frame, and obtain an initial
motion vector of
each frame;
an accumulation calculating unit configured to obtain a corresponding
accumulative
motion vector according to the initial motion vector;
a smoothening processing unit configured to smoothen the accumulative motion
vector,
and obtain a smoothened motion vector; and
a vector readjusting unit configured to employ the accumulative motion vector
and the
smoothened motion vector to readjust the initial motion vector of each frame,
and obtain
the motion vector of each frame.
6. The device of any one of claims 1 to 5, wherein the feature value
extracting module further
comprises:
a matrix converting unit configured to merge and convert the motion vectors of
all frames
into a matrix;
a standard deviation calculating unit configured to calculate unbiased
standard deviations
of various elements in the matrix; and

a weighting and fusing unit configured to weight and fuse the unbiased
standard deviations
of the various elements, and obtain a weighted value.
7. The device of any one of claims 1 to 6, wherein a to-be-detected video
is obtained.
8. The device of any one of claims 1 to 7, wherein a framing extraction
process is performed on
the to-be-detected video to obtain a frame sequence corresponding to the to-be-
detected video.
9. The device of any one of claims 1 to 8, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
10. The device of any one of claims 1 to 9, wherein a current frame and an
adjacent next frame is
selected from the to-be-detected video.
11. The device of any one of claims 1 to 10, wherein a current frame and a
next frame by an
interval of N frames is selected from the to-be-detected video.
12. The device of any one of claims 1 to 11, wherein corresponding feature
points are obtained
from the current frame and the adjacent next frame.
13. The device of any one of claims 1 to 12, wherein corresponding feature
points are obtained
from the current frame and the next frame by an interval of N frames.
14. The device of any one of claims 1 to 13, wherein corresponding matching is
performed
according to the feature points of the two frames to judge whether offset
(jitter) occurs
between the current frame and the adjacent next frame.
15. The device of any one of claims 1 to 14, wherein corresponding matching is
performed
according to the feature points of the two frames to judge whether offset
(jitter) occurs
between the current frame and the next frame by an interval of N frames.
16. The device of any one of claims 1 to 15, wherein a feature point detecting
algorithm is
employed to perform feature point detection on the processed frame sequence Li
(i=1, 2, 3, ...,
n) frame by frame.
17. The device of any one of claims 1 to 16, wherein a feature point detecting
algorithm is
employed to obtain feature points of each frame, that is, feature points of
each frame of image
are extracted.
18. The device of any one of claims 1 to 17, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
19. The device of any one of claims 1 to 18, wherein Zi (i=1, 2, ..., n) is expressed as:
<IMG>
wherein apq represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
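A rough sketch of how the binary matrix Zi of claim 19 could be built from fused FAST and SURF keypoints follows; the thresholds are arbitrary placeholders and SURF requires the opencv-contrib-python package, so this is only one possible reading of the claim.

import cv2
import numpy as np

def feature_point_matrix(gray):
    """Return a p x q binary matrix with a_pq = 1 at fused FAST/SURF keypoint locations."""
    fast = cv2.FastFeatureDetector_create(threshold=25)
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    keypoints = list(fast.detect(gray, None)) + list(surf.detect(gray, None))
    z = np.zeros(gray.shape, dtype=np.uint8)
    rows, cols = gray.shape
    for kp in keypoints:
        r = min(int(round(kp.pt[1])), rows - 1)
        c = min(int(round(kp.pt[0])), cols - 1)
        z[r, c] = 1  # feature point; all other entries stay 0
    return z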
20. The device of any one of claims 1 to 19, wherein the optical flow tracking
algorithm is
employed to perform optical flow tracking calculation on the frame feature
point sequence
matrix.
21. The device of any one of claims 1 to 20, wherein the optical flow tracking
algorithm is
employed to track the change of feature points in the current frame to the
next frame.
22. The device of any one of claims 1 to 21, wherein the change of a feature point sequence matrix in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
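The motion vector (dxi, dyi, dri) of claim 22 can be approximated by tracking the feature points with pyramid Lucas-Kanade optical flow and fitting a partial affine transform between the matched point sets; the estimator below is an illustrative stand-in, since the claim itself only shows the vector as an image.

import cv2
import numpy as np

def motion_vector(gray_i, gray_next, points_i):
    """points_i: float32 array of shape (N, 1, 2) of feature points in frame i."""
    points_next, status, _ = cv2.calcOpticalFlowPyrLK(gray_i, gray_next, points_i, None)
    good_i = points_i[status == 1]
    good_next = points_next[status == 1]
    m, _ = cv2.estimateAffinePartial2D(good_i, good_next)
    dx = m[0, 2]                       # Euclidean column offset
    dy = m[1, 2]                       # Euclidean row offset
    dr = np.arctan2(m[1, 0], m[0, 0])  # angle offset
    return np.array([dx, dy, dr])      # alpha_i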
23. The device of any one of claims 1 to 22, wherein an extracted feature
value at least includes
the feature value of four dimensions.
24. The device of any one of claims 1 to 23, wherein the detection model is
well trained in advance.
25. The device of any one of claims 1 to 24, wherein sample video data in a
set of selected training
data is correspondingly processed to obtain a feature value of the sample
video data.
26. The device of any one of claims 1 to 25, wherein a detection model is
trained according to the
feature value of the/a sample video data and a corresponding annotation result
of the sample
video data to obtain the final detection model.
27. The device of any one of claims 1 to 26, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as <IMG> and Km, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
<IMG>
28. The device of any one of claims 1 to 27, wherein the annotation result ym
indicates that no
jitter occurs to the video sample if ym=0.
29. The device of any one of claims 1 to 28, wherein the annotation result ym
indicates that jitter
occurs to the video sample if ym=1.
30. The device of any one of claims 1 to 29, wherein a video sample makes use
of features of at
least five dimensions.
31. The device of any one of claims 1 to 30, wherein the detection model is
selected from an SVM
(Support Vector Machine) model.
32. The device of any one of claims 1 to 31, wherein the feature value of the
to-be-detected video
is input to a well-trained SVM model to obtain an output result.
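Claims 24 to 32 describe training an SVM on annotated sample videos and then classifying the to-be-detected video from its feature value; a minimal scikit-learn sketch of that decision step, with synthetic placeholder training data, might look as follows.

import numpy as np
from sklearn.svm import SVC

# Each training row is a feature value [std_dx, std_dy, std_dr, kappa]; the
# labels y_m come from the annotation results (0 = no jitter, 1 = jitter).
X_train = np.array([[0.2, 0.3, 0.01, 1.6],
                    [4.0, 5.1, 0.20, 45.0]])
y_train = np.array([0, 1])

model = SVC(kernel="rbf")
model.fit(X_train, y_train)

feature_value = np.array([[3.8, 4.9, 0.18, 42.0]])  # from the to-be-detected video
print(model.predict(feature_value))  # 1 indicates jitter, 0 indicates no jitter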
33. The device of any one of claims 1 to 32, wherein if the/a SVM model output
result is 0, this
indicates that no jitter occurs to the to-be-detected video.
34. The device of any one of claims 1 to 33, wherein if the/a SVM model output
result is 1, this
indicates that jitter occurs to the to-be-detected video.
35. The device of any one of claims 1 to 34, wherein the use of a trainable
SVM model as a video
jitter decider enables jitter detection of videos captured in different
scenarios.
36. The device of any one of claims 1 to 35, wherein the amount of information
of the image is
greatly reduced after grayscale-processing.
37. The device of any one of claims 1 to 36, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
38. The device of any one of claims 1 to 37, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R×0.299 + G×0.587 + B×0.114
39. The device of any one of claims 1 to 38, wherein a TV denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
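The preprocessing of claims 38 and 39 (weighted grayscale conversion followed by total-variation denoising) can be sketched with NumPy and scikit-image; the Chambolle TV filter and its weight are illustrative assumptions, since the claims only require a TV denoising method based on a total variation model.

import numpy as np
from skimage.restoration import denoise_tv_chambolle

def preprocess(frame_rgb):
    """Convert an RGB frame to grayscale G_i and TV-denoise it into T_i."""
    r, g, b = frame_rgb[..., 0], frame_rgb[..., 1], frame_rgb[..., 2]
    gray = r * 0.299 + g * 0.587 + b * 0.114           # G = R*0.299 + G*0.587 + B*0.114
    return denoise_tv_chambolle(gray / 255.0, weight=0.1)  # T_i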
40. The device of any one of claims 1 to 39, wherein the/a denoised frame
sequence is a
preprocessed frame sequence to which the to-be-detected video corresponds.
41. The device of any one of claims 1 to 40, wherein the denoising method is randomly selectable.
42. The device of any one of claims 4 to 41, wherein the SURF algorithm is an
improvement
over SIFT (Scale-Invariant Feature Transform) algorithm.
43. The device of claim 42, wherein SIFT is a feature describing method with
excellent robustness
and invariant scale.
44. The device of any one of claims 42 to 43, wherein the SURF algorithm
improves over the
problems of large amount of data to be calculated, high time complexity and
long duration of
calculation inherent in the SIFT algorithm at the same time of maintaining the
advantages of
the SIFT algorithm.
45. The device of any one of claims 4 to 44, wherein SURF is more excellent in
performance in
the aspect of invariance of illumination change and perspective change,
particularly excellent
in processing severe blurs and rotations of images.
46. The device of any one of claims 4 to 45, wherein SURF is excellent in
describing local features
of images.
47. The device of any one of claims 4 to 46, wherein FAST feature detection is
a corner detection
method.
48. The device of any one of claims 4 to 47, wherein FAST feature detection
has the most
prominent advantage of its algorithm in its calculation efficiency, and the
capability to
excellently describe global features of images.
49. The device of any one of claims 4 to 48, wherein use of the feature point
detecting algorithm
in which FAST features and SURF features are fused to perform feature point
extraction gives
consideration to global features of images.

50. The device of any one of claims 4 to 49, wherein use of the feature point
detecting algorithm
in which FAST features and SURF features are fused to perform feature point
extraction fully
retains local features of images.
51. The device of any one of claims 4 to 50, wherein use of the feature point
detecting algorithm
in which FAST features and SURF features are fused to perform feature point
extraction has
small computational expense.
52. The device of any one of claims 4 to 51, wherein use of the feature point
detecting algorithm
in which FAST features and SURF features are fused to perform feature point
extraction has
strong robustness against image blurs and faint illumination.
53. The device of any one of claims 4 to 52, wherein use of the feature point
detecting algorithm
in which FAST features and SURF features are fused to perform feature point
extraction
enhances real-time property and precision of the detection.
54. The device of any one of claims 1 to 53, wherein while the optical flow
tracking calculation
is performed on the frame feature point sequence matrix, a pyramid optical
flow tracking LK
(Lucas-Kanade) algorithm is employed.
55. The device of any one of claims 1 to 54, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
56. The device of any one of claims 1 to 55, wherein use of the/a pyramid
optical flow tracking
LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure
solves the problem
of failed tracking due to unduly large change from feature points of a current
frame to feature
points of a next adjacent frame.
57. The device of any one of claims 1 to 56, wherein use of the/a pyramid
optical flow tracking
LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure
solves the problem
of failed tracking due to unduly large change from feature points of a current
frame to feature
points of a next frame by an interval of N frames.
58. The device of any one of claims 1 to 57, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
<IMG>
59. The device of any one of claims 1 to 58, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:
<IMG>
where n represents the total number of frames of the video.
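Claims 58 and 59 accumulate the per-frame motion vectors and smooth them with a sliding average window; since the exact expressions are shown only as images, the sketch below uses a plain cumulative sum and a centered moving average of radius r as stand-ins.

import numpy as np

def accumulate_and_smooth(alphas, r=15):
    """alphas: array of shape (n, 3) holding (dx, dy, dr) for each frame."""
    betas = np.cumsum(alphas, axis=0)          # accumulative motion vectors beta_i
    gammas = np.empty_like(betas)
    n = len(betas)
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        gammas[i] = betas[lo:hi].mean(axis=0)  # smoothened motion vectors gamma_i
    return betas, gammas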
60. The device of any one of claims 1 to 59, wherein the radius of the smooth window is r, and its expression is:
<IMG>
where μ indicates a parameter of the sliding window, and μ is a positive number.
61. The device of any one of claims 1 to 60, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirement.
62. The device of any one of claims 1 to 61, wherein the specific numerical value of μ is set as μ=30.
63. The device of any one of claims 1 to 62, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:
<IMG>
64. The device of any one of claims 1 to 63, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
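The readjustment expression of claim 63 is shown only as an image; a common stabilization-style correction, λi = αi + (γi − βi), is used in the sketch below purely as an assumed stand-in for it.

import numpy as np

def readjust(alphas, betas, gammas):
    """Return the readjusted motion vectors lambda_i under the assumed expression."""
    return alphas + (gammas - betas)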
65. The device of any one of claims 1 to 64, wherein the motion vectors of all
frames obtained
are merged and converted into a matrix.
66. The device of any one of claims 1 to 65, wherein the motion vector λi is converted into the form of a matrix <IMG> and unbiased standard deviations of its elements are calculated by rows, the specific calculation expression is as follows:
<IMG>
67. The device of any one of claims 1 to 66, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as <IMG> and <IMG>, in which λ̄ represents the average value of the samples.
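Claims 66 and 67 merge the readjusted motion vectors into a matrix and take the unbiased standard deviation of each row; with NumPy that corresponds to std with ddof=1, as in the sketch below (the matrix layout, rows dx/dy/dr by n frames, is an assumption consistent with claim 22).

import numpy as np

def row_unbiased_std(lambdas):
    """lambdas: array of shape (n, 3) of readjusted motion vectors (dx, dy, dr)."""
    matrix = lambdas.T                 # rows dx, dy, dr; columns are the n frames
    return matrix.std(axis=1, ddof=1)  # sigma[lambda(dx)], sigma[lambda(dy)], sigma[lambda(dr)]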
68. The device of any one of claims 1 to 67, wherein weights are assigned to the unbiased standard deviations of the various elements according to practical requirements.
69. The device of any one of claims 1 to 68, wherein the unbiased standard
deviations of the
various elements are weighted and fused according to the weights.
70. The device of any one of claims 1 to 69, wherein the weights of the
unbiased standard
deviations of the various elements can be dynamically readjusted according to
practical
requirements.
71. The device of any one of claims 1 to 70, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
<IMG>
72. The device of any one of claims 1 to 71, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K}(s)
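Claims 71 and 72 weight the three unbiased standard deviations by 3, 3 and 10, fuse them into K, and use the four values as the feature value of the to-be-detected video; the fusing expression itself is an image in the claims, so the weighted sum below is only an illustrative interpretation.

import numpy as np

def feature_value(stds, weights=(3, 3, 10)):
    """stds: the three unbiased standard deviations; returns {std_dx, std_dy, std_dr, kappa}."""
    kappa = float(np.dot(weights, stds))  # weighted, fused value
    return np.append(stds, kappa)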
73. A computer-readable storage medium for detecting video jitter, storing
therein a computer
program, wherein the computer program is configured to:
perform a framing process on a to-be-detected video to obtain a frame
sequence;
perform feature point detection on the frame sequence frame by frame, obtain
feature
points of each frame by employing a feature point detecting algorithm
including fused
FAST (Features from Accelerated Segment Test) and SURF (Speeded-Up Robust
Features), and generate a frame feature point sequence matrix;
perform an operation on the frame feature point sequence matrix to obtain a
motion vector
of each frame based on an optical flow tracking algorithm;

obtain a feature value of the to-be-detected video according to the motion
vector of each
frame; and
take the feature value of the to-be-detected video as an input signal of a
detection model to
perform operation to obtain an output signal, and judge whether jitter occurs
to the to-be-
detected video according to the output signal.
74. The computer-readable storage medium of claim 73, wherein the computer
program is further
configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and
denoise the grayscale frame sequence;
wherein the step of feature point detection on the frame sequence frame by
frame is feature
point detection on the preprocessed frame sequence frame by frame.
75. The computer-readable storage medium of any one of claims 73 to 74,
wherein the computer
program is further configured to:
employ the feature point detecting algorithm in which are the fused FAST
(Features from
Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to perform
feature point detection on the frame sequence frame by frame, and to obtain
the feature
points of each frame.
76. The computer-readable storage medium of any one of claims 73 to 75,
wherein the computer
program is further configured to:
perform optical flow tracking calculation on the frame feature point sequence
matrix of
each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial
motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion
vector; and
employ the accumulative motion vector and the smoothened motion vector to
readjust the
initial motion vector of each frame, and obtain the motion vector of each
frame.
77. The computer-readable storage medium of any one of claims 73 to 76,
wherein the computer
program is further configured to:
merge and convert the motion vectors of all frames into a matrix, and
calculate unbiased
standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and
obtain a
weighted value; and
take the unbiased standard deviations of the various elements and the weighted
value as
the feature value of the to-be-detected video.
78. The computer-readable storage medium of any one of claims 73 to 77,
wherein a to-be-
detected video is obtained.
79. The computer-readable storage medium of any one of claims 73 to 78,
wherein a framing
extraction process is performed on the to-be-detected video to obtain a frame
sequence
corresponding to the to-be-detected video.
80. The computer-readable storage medium of any one of claims 73 to 79,
wherein the frame
sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith
frame of the video,
and n represents the total number of frames of the video.
81. The computer-readable storage medium of any one of claims 73 to 80,
wherein a current frame
and an adjacent next frame is selected from the to-be-detected video.
82. The computer-readable storage medium of any one of claims 73 to 81,
wherein a current frame
and a next frame by an interval of N frames is selected from the to-be-
detected video.
83. The computer-readable storage medium of any one of claims 73 to 82,
wherein corresponding
feature points are obtained from a current frame and the adjacent next frame.
84. The computer-readable storage medium of any one of claims 73 to 83,
wherein corresponding
feature points are obtained from a current frame and the next frame by an
interval of N frames.
85. The computer-readable storage medium of any one of claims 73 to 84,
wherein corresponding
matching is performed according to the feature points of the two frames to
judge whether
offset (jitter) occurs between the current frame and the adjacent next frame.
86. The computer-readable storage medium of any one of claims 73 to 85,
wherein corresponding
matching is performed according to the feature points of the two frames to
judge whether
offset (jitter) occurs between the current frame and the next frame by an
interval of N frames.
87. The computer-readable storage medium of any one of claims 73 to 86, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
88. The computer-readable storage medium of any one of claims 73 to 87,
wherein a feature point
detecting algorithm is employed to obtain feature points of each frame, that
is, feature points
of each frame of image are extracted.
89. The computer-readable storage medium of any one of claims 73 to 88, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
90. The computer-readable storage medium of any one of claims 73 to 89, wherein Zi (i=1, 2, ..., n) is expressed as:
<IMG>
wherein apq represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
91. The computer-readable storage medium of any one of claims 73 to 90,
wherein the optical
flow tracking algorithm is employed to perform optical flow tracking
calculation on the frame
feature point sequence matrix.
92. The computer-readable storage medium of any one of claims 73 to 91,
wherein the optical
flow tracking algorithm is employed to track the change of feature points in
the current frame
to the next frame.
93. The computer-readable storage medium of any one of claims 73 to 92, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
94. The computer-readable storage medium of any one of claims 73 to 93,
wherein an extracted
feature value at least includes the feature value of four dimensions.
95. The computer-readable storage medium of any one of claims 73 to 94,
wherein the detection
model is well trained in advance.
96. The computer-readable storage medium of any one of claims 73 to 95,
wherein sample video
data in a set of selected training data is correspondingly processed to obtain
a feature value of
the sample video data.
97. The computer-readable storage medium of any one of claims 73 to 96,
wherein a detection
model is trained according to the feature value of the sample video data and a
corresponding
annotation result of the sample video data to obtain the final detection
model.
98. The computer-readable storage medium of any one of claims 73 to 97, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as <IMG> and Km, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
<IMG>
99. The computer-readable storage medium of any one of claims 73 to 98, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
100. The computer-readable storage medium of any one of claims 73 to 99, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
101. The computer-readable storage medium of any one of claims 73 to 100,
wherein a video
sample makes use of features of at least five dimensions.
102. The computer-readable storage medium of any one of claims 73 to 101,
wherein the detection
model is selected from an SVM (Support Vector Machine) model.
103. The computer-readable storage medium of any one of claims 73 to 102,
wherein the feature
value of the to-be-detected video is input to a well-trained SVM model to
obtain an output
result.

104. The computer-readable storage medium of any one of claims 73 to 103,
wherein if the/a SVM
model output result is 0, this indicates that no jitter occurs to the to-be-
detected video.
105. The computer-readable storage medium of any one of claims 73 to 104,
wherein if the/a SVM
model output result is 1, this indicates that jitter occurs to the to-be-
detected video.
106. The computer-readable storage medium of any one of claims 73 to 105,
wherein the use of a
trainable SVM model as a video jitter decider enables jitter detection of
videos captured in
different scenarios.
107. The computer-readable storage medium of any one of claims 73 to 106,
wherein the amount
of information of the image is greatly reduced after grayscale-processing.
108. The computer-readable storage medium of any one of claims 73 to 107, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
109. The computer-readable storage medium of any one of claims 73 to 108, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R×0.299 + G×0.587 + B×0.114
110. The computer-readable storage medium of any one of claims 73 to 109, wherein a TV denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
111. The computer-readable storage medium of any one of claims 73 to 110,
wherein the/a
denoised frame sequence is a preprocessed frame sequence to which the to-be-
detected video
corresponds.
112. The computer-readable storage medium of any one of claims 73 to 111,
wherein the/a TV
denoising method is randomly selectable.
113. The computer-readable storage medium of any one of claims 75 to 112,
wherein the SURF
algorithm is an improvement over SIFT (Scale-Invariant Feature Transform)
algorithm.
114. The computer-readable storage medium of claim 113, wherein SIFT is a
feature describing
method with excellent robustness and invariant scale.
115. The computer-readable storage medium of any one of claims 113 to 114,
wherein the SURF
algorithm improves over the problems of large amount of data to be calculated,
high time
complexity and long duration of calculation inherent in the SIFT algorithm at
the same time
of maintaining the advantages of the SIFT algorithm.
116. The computer-readable storage medium of any one of claims 75 to 115,
wherein SURF is
more excellent in performance in the aspect of invariance of illumination
change and
perspective change, particularly excellent in processing severe blurs and
rotations of images.
117. The computer-readable storage medium of any one of claims 75 to 116,
wherein SURF is
excellent in describing local features of images.
118. The computer-readable storage medium of any one of claims 75 to 117,
wherein FAST feature
detection is a corner detection method.
119. The computer-readable storage medium of any one of claims 75 to 118,
wherein FAST feature
detection has the most prominent advantage of its algorithm in its calculation
efficiency, and
the capability to excellently describe global features of images.
120. The computer-readable storage medium of any one of claims 75 to 119,
wherein use of the
feature point detecting algorithm in which FAST features and SURF features are
fused to
perform feature point extraction gives consideration to global features of
images.
121. The computer-readable storage medium of any one of claims 75 to 120,
wherein use of the
feature point detecting algorithm in which FAST features and SURF features are
fused to
perform feature point extraction fully retains local features of images.
122. The computer-readable storage medium of any one of claims 75 to 121,
wherein use of the
feature point detecting algorithm in which FAST features and SURF features are
fused to
perform feature point extraction has small computational expense.
123. The computer-readable storage medium of any one of claims 75 to 122,
wherein use of the
feature point detecting algorithm in which FAST features and SURF features are
fused to
perform feature point extraction has strong robustness against image blurs and
faint
illuminati on.
124. The computer-readable storage medium of any one of claims 75 to 123,
wherein use of the
feature point detecting algorithm in which FAST features and SURF features are
fused to
perform feature point extraction enhances real-time property and precision of
the detection.
125. The computer-readable storage medium of any one of claims 73 to 124, wherein while the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm is employed.
126. The computer-readable storage medium of any one of claims 73 to 125, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
127. The computer-readable storage medium of any one of claims 73 to 126,
wherein use of the/a
pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the
pyramid iterative
structure solves the problem of failed tracking due to unduly large change
from feature points
of a current frame to feature points of an adjacent next frame.
128. The computer-readable storage medium of any one of claims 73 to 127,
wherein use of the/a
pyramid optical flow tracking LK (Lucas-Kanade) algorithm that utilizes the
pyramid iterative
structure solves the problem of failed tracking due to unduly large change
from feature points
of a current frame to feature points of a next frame by an interval of N
frames.
129. The computer-readable storage medium of any one of claims 73 to 128, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
<IMG>
130. The computer-readable storage medium of any one of claims 73 to 129, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:
<IMG>
where n represents the total number of frames of the video.
131. The computer-readable storage medium of any one of claims 73 to 130, wherein the radius of the smooth window is r, and its expression is:
<IMG>
where μ indicates a parameter of the sliding window, and μ is a positive number.
132. The computer-readable storage medium of any one of claims 73 to 131, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirement.
133. The computer-readable storage medium of any one of claims 73 to 132, wherein the specific numerical value of μ is set as μ=30.
134. The computer-readable storage medium of any one of claims 73 to 133, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:
<IMG>
135. The computer-readable storage medium of any one of claims 73 to 134, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
136. The computer-readable storage medium of any one of claims 73 to 135,
wherein the motion
vectors of all frames obtained are merged and converted into a matrix.
137. The computer-readable storage medium of any one of claims 73 to 136, wherein the motion vector λi is converted into the form of a matrix <IMG> and unbiased standard deviations of its elements are calculated by rows, the specific calculation expression is as follows:
<IMG>
138. The computer-readable storage medium of any one of claims 73 to 137, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as <IMG>, in which λ̄ represents the average value of the samples.
139. The computer-readable storage medium of any one of claims 73 to 138,
wherein weights are
assigned to the unbiased standard deviations of the various elements according
to practical
requirements.
140. The computer-readable storage medium of any one of claims 73 to 139,
wherein the unbiased
standard deviations of the various elements are weighted and fused according
to the weights.
141. The computer-readable storage medium of any one of claims 73 to 140,
wherein the weights
of the unbiased standard deviations of the various elements can be dynamically
readjusted
according to practical requirements.
142. The computer-readable storage medium of any one of claims 73 to 141, wherein the weight of σ[λ(dx)] is set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the fusing expression is as follows:
<IMG>
143. The computer-readable storage medium of any one of claims 73 to 142, wherein the feature value of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K}(s)
144. The computer-readable storage medium of any one of claims 73 to 143,
wherein the computer-
readable storage medium is a read-only memory.
145. The computer-readable storage medium of any one of claims 73 to 144,
wherein the computer-
readable storage medium is a magnetic disk.
146. The computer-readable storage medium of any one of claims 73 to 145,
wherein the computer-
readable storage medium is an optical disk.
147. A computer device for detecting video jitter, comprising:
one or more processors, configured to:
perform a framing process on a to-be-detected video to obtain a frame
sequence;
perform feature point detection on the frame sequence frame by frame, obtain
feature points of each frame by employing a feature point detecting algorithm
including fused FAST (Features from Accelerated Segment Test) and SURF
(Speeded-Up Robust Features), and generate a frame feature point sequence
matrix;
perform an operation on the frame feature point sequence matrix to obtain a
motion vector of each frame based on an optical flow tracking algorithm;
obtain a feature value of the to-be-detected video according to the motion
vector of each frame; and
take the feature value of the to-be-detected video as an input signal of a detection model to perform operation to obtain an output signal, and judge whether jitter occurs to the to-be-detected video according to the output signal.
148. The computer device of claim 147, wherein the computer device is further
configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and
denoise the grayscale frame sequence;
wherein the step of feature point detection on the frame sequence frame by
frame is feature
point detection on the preprocessed frame sequence frame by frame.
149. The computer device of any one of claims 147 to 148, wherein the computer
device is further
configured to:
employ the feature point detecting algorithm in which are the fused FAST
(Features from
Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to perform
feature point detection on the frame sequence frame by frame, and to obtain
the feature
points of each frame.
150. The computer device of any one of claims 147 to 149, wherein the computer
device is further
configured to:
perform optical flow tracking calculation on the frame feature point sequence
matrix of
each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial
motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion
vector; and
employ the accumulative motion vector and the smoothened motion vector to
readjust the
initial motion vector of each frame, and obtain the motion vector of each
frame.
151. The computer device of any one of claims 147 to 150, wherein the computer
device is further
configured to:
merge and convert the motion vectors of all frames into a matrix, and
calculate unbiased
standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and
obtain a
weighted value; and
take the unbiased standard deviations of the various elements and the weighted
value as
the feature value of the to-be-detected video.
152. The computer device of any one of claims 147 to 151, wherein a to-be-
detected video is
obtained.
153. The computer device of any one of claims 147 to 152, wherein a framing
extraction process
is performed on the to-be-detected video to obtain a frame sequence
corresponding to the to-
be-detected video.
154. The computer device of any one of claims 147 to 153, wherein the frame sequence is expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total number of frames of the video.
155. The computer device of any one of claims 147 to 154, wherein a current
frame and an adjacent
next frame is selected from the to-be-detected video.
156. The computer device of any one of claims 147 to 155, wherein a current
frame and a next
frame by an interval of N frames is selected from the to-be-detected video.
157. The computer device of any one of claims 147 to 156, wherein
corresponding feature points
are obtained from the current frame and the adjacent next frame.
158. The computer device of any one of claims 147 to 157, wherein
corresponding feature points
are obtained from the current frame and the next frame by an interval of N
frames.
159. The computer device of any one of claims 147 to 158, wherein
corresponding matching is
performed according to the feature points of the two frames to judge whether
offset (jitter)
occurs between the current frame and the adjacent next frame.
160. The computer device of any one of claims 147 to 159, wherein
corresponding matching is
performed according to the feature points of the two frames to judge whether
offset (jitter)
occurs between the current frame and the next frame by an interval of N
frames.
161. The computer device of any one of claims 147 to 160, wherein a feature point detecting algorithm is employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ..., n) frame by frame.

162. The computer device of any one of claims 147 to 161, wherein a feature
point detecting
algorithm is employed to obtain feature points of each frame, that is, feature
points of each
frame of image are extracted.
163. The computer device of any one of claims 147 to 162, wherein a feature point detecting algorithm is employed to generate a frame feature point sequence matrix, which is expressed as Zi (i=1, 2, ..., n).
164. The computer device of any one of claims 147 to 163, wherein Zi (i=1, 2, ..., n) is expressed as:
<IMG>
wherein apq represents a feature point detection result at row p, column q of the ith frame matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the matrix, and q represents the number of columns of the matrix.
165. The computer device of any one of claims 147 to 164, wherein the optical
flow tracking
algorithm is employed to perform optical flow tracking calculation on the
frame feature point
sequence matrix.
166. The computer device of any one of claims 147 to 165, wherein the optical
flow tracking
algorithm is employed to track the change of feature points in the current
frame to the next
frame.
167. The computer device of any one of claims 147 to 166, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
168. The computer device of any one of claims 147 to 167, wherein an extracted
feature value at
least includes the feature value of four dimensions.
169. The computer device of any one of claims 147 to 168, wherein the
detection model is well
trained in advance.
170. The computer device of any one of claims 147 to 169, wherein sample video
data in a set of
selected training data is correspondingly processed to obtain a feature value
of the sample
video data.
171. The computer device of any one of claims 147 to 170, wherein a detection
model is trained
according to the feature value of the sample video data and a corresponding
annotation result
of the sample video data to obtain the final detection model.
172. The computer device of any one of claims 147 to 171, wherein after the motion vectors have been dimensionally converted for an mth video sample, unbiased standard deviations of various elements and their weighted and fused value are calculated and obtained, which are respectively expressed as <IMG> and Km, and the annotation result ym of the mth video sample is extracted to obtain the training sample of the mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, Km, ym}(m)
173. The computer device of any one of claims 147 to 172, wherein the annotation result ym indicates that no jitter occurs to the video sample if ym=0.
174. The computer device of any one of claims 147 to 173, wherein the annotation result ym indicates that jitter occurs to the video sample if ym=1.
175. The computer device of any one of claims 147 to 174, wherein a video
sample makes use of
features of at least five dimensions.
176. The computer device of any one of claims 147 to 175, wherein the
detection model is selected
from an SVM (Support Vector Machine) model.
177. The computer device of any one of claims 147 to 176, wherein the feature
value of the to-be-
detected video is input to a well-trained SVM model to obtain an output
result.
178. The computer device of any one of claims 147 to 177, wherein if the/a SVM
model output
result is 0, this indicates that no jitter occurs to the to-be-detected video.
179. The computer device of any one of claims 147 to 178, wherein if the/a SVM
model output
result is 1, this indicates that jitter occurs to the to-be-detected video.
180. The computer device of any one of claims 147 to 179, wherein the use of a
trainable SVM
model as a video jitter decider enables jitter detection of videos captured in
different scenarios.
181. The computer device of any one of claims 147 to 180, wherein the amount
of information of
the image is greatly reduced after grayscale-processing.
182. The computer device of any one of claims 147 to 181, wherein the frame sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
183. The computer device of any one of claims 147 to 182, wherein a grayscale frame sequence is obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:
G = R×0.299 + G×0.587 + B×0.114
184. The computer device of any one of claims 147 to 183, wherein a TV denoising method based on a total variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
185. The computer device of any one of claims 147 to 184, wherein the/a
denoised frame sequence
is a preprocessed frame sequence to which the to-be-detected video
corresponds.
186. The computer device of any one of claims 147 to 185, wherein the
denoising method is
randomly selectable.
187. The computer device of any one of claims 149 to 186, wherein the SURF
algorithm is an
improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
188. The computer device of claim 187, wherein SIFT is a feature describing
method with excellent
robustness and invariant scale.
189. The computer device of any one of claims 187 to 188, wherein the/a SURF
algorithm
improves over the problems of large amount of data to be calculated, high time
complexity
and long duration of calculation inherent in the SIFT algorithm at the same
time of
maintaining the advantages of the SIFT algorithm.
190. The computer device of any one of claims 149 to 189, wherein SURF is more
excellent in
performance in the aspect of invariance of illumination change and perspective
change,
particularly excellent in processing severe blurs and rotations of images.
191. The computer device of any one of claims 149 to 190, wherein SURF is
excellent in describing
local features of images.
192. The computer device of any one of claims 149 to 191, wherein FAST feature
detection is a
corner detection method.
193. The computer device of any one of claims 149 to 192, wherein FAST feature
detection has
the most prominent advantage of its algorithm in its calculation efficiency,
and the capability
to excellently describe global features of images.
194. The computer device of any one of claims 149 to 193, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction gives consideration to global features of images.
195. The computer device of any one of claims 149 to 194, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction fully retains local features of images.
196. The computer device of any one of claims 149 to 195, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction has small computational expense.
197. The computer device of any one of claims 149 to 196, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction has strong robustness against image blurs and faint
illumination.
198. The computer device of any one of claims 149 to 197, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction enhances real-time property and precision of the detection.
199. The computer device of any one of claims 147 to 198, wherein while the
optical flow tracking
calculation is performed on the frame feature point sequence matrix, a pyramid
optical flow
tracking LK (Lucas-Kanade) algorithm is employed.
200. The computer device of any one of claims 147 to 199, wherein the change of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents an angle offset from the ith frame to the i+1th frame.
201. The computer device of any one of claims 147 to 200, wherein use of the/a
pyramid optical
flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative
structure solves
the problem of failed tracking due to unduly large change from feature points
of a current
frame to feature points of an adjacent next frame.
202. The computer device of any one of claims 147 to 201, wherein use of the/a
pyramid optical
flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative
structure solves
the problem of failed tracking due to unduly large change from feature points
of a current
frame to feature points of a next frame by an interval of N frames.
203. The computer device of any one of claims 147 to 202, wherein accumulative integral transformation is performed on the initial motion vector αi of each frame to obtain an accumulative motion vector, expressed as βi, of each frame, in which the expression of the accumulative motion vector βi is as follows:
<IMG>
204. The computer device of any one of claims 147 to 203, wherein a sliding average window is used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:
<IMG>
where n represents the total number of frames of the video.
205. The computer device of any one of claims 147 to 204, wherein the radius of the smooth window is r, and its expression is:
<IMG>
where μ indicates a parameter of the sliding window, and μ is a positive number.
206. The computer device of any one of claims 147 to 205, wherein the specific numerical value of μ can be dynamically adjusted according to practical requirement.
207. The computer device of any one of claims 147 to 206, wherein the specific numerical value of μ is set as μ=30.
208. The computer device of any one of claims 147 to 207, wherein βi and γi are used to readjust αi to obtain a readjusted motion vector λi, whose expression is:
<IMG>
209. The computer device of any one of claims 147 to 208, wherein the readjusted motion vector λi is taken as the motion vector of each frame to participate in subsequent calculation, so that the calculation result is made more precise.
210. The computer device of any one of claims 147 to 209, wherein the motion
vectors of all frames
obtained are merged and converted into a matrix.
211. The computer device of any one of claims 147 to 210, wherein the motion vector λi is converted into the form of a matrix <IMG> and unbiased standard deviations of its elements are calculated by rows, the specific calculation expression is as follows:
<IMG>
212. The computer device of any one of claims 147 to 211, wherein the unbiased standard deviations of the various elements in the matrix are respectively expressed as <IMG> and <IMG>, in which λ̄ represents the average value of the samples.
213. The computer device of any one of claims 147 to 212, wherein weights are
assigned to the
unbiased standard deviations of the various elements according to practical
requirements.
214. The computer device of any one of claims 147 to 213, wherein the unbiased
standard
deviations of the various elements are weighted and fused according to the
weights.
215. The computer device of any one of claims 147 to 214, wherein the weights
of the unbiased
standard deviations of the various elements can be dynamically readjusted
according to
practical requirements.
216. The computer device of any one of claims 147 to 215, wherein the weight of σ[λ(dx)] is
set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10,
then the fusing expression is as follows:
<IMG>
217. The computer device of any one of claims 147 to 216, wherein the feature value of the to-be-
detected video is the unbiased standard deviations of the various elements and their weighted
value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K}(S).
218.A computer system for detecting video jitter, comprising:
a processor, configured to:
perform a framing process on a to-be-detected video to obtain a frame
sequence;
perform feature point detection on the frame sequence frame by frame, obtain
feature points of each frame by employing a feature point detecting algorithm
including fused FAST (Features from Accelerated Segment Test) and SURF
(Speeded-Up Robust Features), and generate a frame feature point sequence
matrix;
perform an operation on the frame feature point sequence matrix to obtain a
motion vector of each frame based on an optical flow tracking algorithm;
obtain a feature value of the to-be-detected video according to the motion
vector of each frame; and
take the feature value of the to-be-detected video as an input signal of a
detection model to perform operation to obtain an output signal, and judge
whether jitter occurs to the to-be-detected video according to the output
signal; and
a computer-readable storage medium.
219. The computer system of claim 218, wherein the computer system is further
configured to:
grayscale-process the frame sequence, and obtain a grayscale frame sequence;
and
denoise the grayscale frame sequence;
wherein the step of feature point detection on the frame sequence frame by
frame is feature
point detection on the preprocessed frame sequence frame by frame.
220. The computer system of any one of claims 218 to 219, wherein the computer
system is further
configured to:

employ the feature point detecting algorithm in which are the fused FAST
(Features from
Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to perform
feature point detection on the frame sequence frame by frame, and to obtain
the feature
points of each frame.
221. The computer system of any one of claims 218 to 220, wherein the computer
system is further
configured to:
perform optical flow tracking calculation on the frame feature point sequence
matrix of
each frame, and obtain an initial motion vector of each frame;
obtain a corresponding accumulative motion vector according to the initial
motion vector;
smoothen the accumulative motion vector, and obtain a smoothened motion
vector; and
employ the accumulative motion vector and the smoothened motion vector to
readjust the
initial motion vector of each frame, and obtain the motion vector of each
frame.
222. The computer system of any one of claims 218 to 221, wherein the computer
system is further
configured to:
merge and convert the motion vectors of all frames into a matrix, and
calculate unbiased
standard deviations of various elements in the matrix;
weight and fuse the unbiased standard deviations of the various elements, and
obtain a
weighted value; and
take the unbiased standard deviations of the various elements and the weighted
value as
the feature value of the to-be-detected video.
223. The computer system of any one of claims 218 to 222, wherein a to-be-
detected video is
obtained.
224. The computer system of any one of claims 218 to 223, wherein a framing
extraction process
is performed on the to-be-detected video to obtain a frame sequence
corresponding to the to-
be-detected video.
225. The computer system of any one of claims 218 to 224, wherein the frame sequence is
expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents
the total number of frames of the video.
226. The computer system of any one of claims 218 to 225, wherein a current
frame and an adjacent
next frame is selected from the to-be-detected video.
227. The computer system of any one of claims 218 to 226, wherein a current
frame and a next
frame by an interval of N frames is selected from the to-be-detected video.
228. The computer system of any one of claims 218 to 227, wherein
corresponding feature points
are obtained from the current frame and the adjacent next frame.
229. The computer system of any one of claims 218 to 228, wherein
corresponding feature points
are obtained from the current frame and the next frame by an interval of N
frames.
230. The computer system of any one of claims 218 to 229, wherein
corresponding matching is
performed according to the feature points of the two frames to judge whether
offset (jitter)
occurs between the current frame and the adjacent next frame.
231. The computer system of any one of claims 218 to 230, wherein
corresponding matching is
performed according to the feature points of the two frames to judge whether
offset (jitter)
occurs between the current frame and the next frame by an interval of N
frames.
232. The computer system of any one of claims 218 to 231, wherein a feature
point detecting
algorithm is employed to perform feature point detection on the processed
frame sequence Li (i=1, 2, 3, ..., n) frame by frame.
233. The computer system of any one of claims 218 to 232, wherein a feature
point detecting
algorithm is employed to obtain feature points of each frame, that is, feature
points of each
frame of image are extracted.
234. The computer system of any one of claims 218 to 233, wherein a feature
point detecting
algorithm is employed to generate a frame feature point sequence matrix, which
is expressed
as Zi (i=1, 2, ..., n).
235. The computer system of any one of claims 218 to 234, wherein Zi (i=1, 2, ..., n) is expressed
as:
<IMG>
where a^i_p,q represents a feature point detection result at row p, column q of the ith frame
matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the
matrix, and q represents the number of columns of the matrix.
236. The computer system of any one of claims 218 to 235, wherein the optical
flow tracking
algorithm is employed to perform optical flow tracking calculation on the
frame feature point
sequence matrix.
237. The computer system of any one of claims 218 to 236, wherein the optical
flow tracking
algorithm is employed to track the change of feature points in the current
frame to the next
frame.
238. The computer system of any one of claims 218 to 237, wherein the change
of a feature point
sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi
is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi
represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents
an angle offset from the ith frame to the i+1th frame.
239. The computer system of any one of claims 218 to 238, wherein an extracted
feature value at
least includes the feature value of four dimensions.
240. The computer system of any one of claims 218 to 239, wherein the
detection model is well
trained in advance.
241. The computer system of any one of claims 218 to 240, wherein sample video
data in a set of
selected training data is correspondingly processed to obtain a feature value
of the sample
video data.
242. The computer system of any one of claims 218 to 241, wherein a detection
model is trained
according to the feature value of the sample video data and a corresponding
annotation result
of the sample video data to obtain the final detection model.
243. The computer system of any one of claims 218 to 242, wherein after the motion vectors have
been dimensionally converted for an mth video sample, unbiased standard deviations of
various elements and their weighted and fused value are calculated and obtained, which are
respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and Km, and the
annotation result ym of the mth video sample is extracted to obtain the training sample of the
mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, Km, ym}(m).
244. The computer system of any one of claims 218 to 243, wherein the
annotation result ym
indicates that no jitter occurs to the video sample if ym=0.
245. The computer system of any one of claims 218 to 244, wherein the
annotation result ym
indicates that jitter occurs to the video sample if ym=1.
246. The computer system of any one of claims 218 to 245, wherein a video
sample makes use of
features of at least five dimensions.
247. The computer system of any one of claims 218 to 246, wherein the
detection model is selected
from an SVM (Support Vector Machine) model.
248. The computer system of any one of claims 218 to 247, wherein the feature
value of the to-be-
detected video is input to a well-trained SVM model to obtain an output
result.
249. The computer system of any one of claims 218 to 248, wherein if the SVM
model output
result is 0, this indicates that no jitter occurs to the to-be-detected video.
250. The computer system of any one of claims 218 to 249, wherein if the SVM
model output
result is 1, this indicates that jitter occurs to the to-be-detected video.
251. The computer system of any one of claims 218 to 250, wherein the use of a
trainable SVM
model as a video jitter decider enables jitter detection of videos captured in
different scenarios.
252. The computer system of any one of claims 218 to 251, wherein the amount
of information of
the image is greatly reduced after grayscale-processing.
253. The computer system of any one of claims 218 to 252, wherein the frame
sequence Li (i=1, 2, 3, ..., n) is further grayscale-processed.
254. The computer system of any one of claims 218 to 253, wherein a grayscale
frame sequence is
obtained to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression
is as follows:
G = R×0.299 + G×0.587 + B×0.114
255. The computer system of any one of claims 218 to 254, wherein a TV
denoising method based
on a total variation model is employed to denoise the grayscale frame sequence
Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).

256. The computer system of any one of claims 218 to 255, wherein the denoised
frame sequence
is a preprocessed frame sequence to which the to-be-detected video
corresponds.
257. The computer system of any one of claims 218 to 256, wherein the
denoising method is
randomly selectable.
258. The computer system of any one of claims 220 to 257, wherein the SURF
algorithm is an
improvement over SIFT (Scale-Invariant Feature Transform) algorithm.
259. The computer system of claim 258, wherein SIFT is a feature describing
method with
excellent robustness and invariant scale.
260. The computer system of any one of claims 258 to 259, wherein the SURF
algorithm improves
over the problems of large amount of data to be calculated, high time
complexity and long
duration of calculation inherent in the SIFT algorithm at the same time of
maintaining the
advantages of the SIFT algorithm.
261. The computer system of any one of claims 220 to 260, wherein SURF is more
excellent in
performance in the aspect of invariance of illumination change and perspective
change,
particularly excellent in processing severe blurs and rotations of images.
262. The computer system of any one of claims 220 to 261, wherein SURF is
excellent in
describing local features of images.
263. The computer system of any one of claims 220 to 262, wherein FAST feature
detection is a
corner detection method.
264. The computer system of any one of claims 220 to 263, wherein FAST feature
detection has
the most prominent advantage of its algorithm in its calculation efficiency,
and the capability
to excellently describe global features of images.
265. The computer system of any one of claims 220 to 264, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction gives consideration to global features of images.
266. The computer system of any one of claims 220 to 265, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction fully retains local features of images.
267. The computer system of any one of claims 220 to 266, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction has small computational expense.
268. The computer system of any one of claims 220 to 267, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction has strong robustness against image blurs and faint
illumination.
269. The computer system of any one of claims 220 to 268, wherein use of the
feature point
detecting algorithm in which FAST features and SURF features are fused to
perform feature
point extraction enhances real-time property and precision of the detection.
270. The computer system of any one of claims 218 to 269, wherein while the
optical flow tracking
calculation is performed on the frame feature point sequence matrix, a pyramid
optical flow
tracking LK (Lucas-Kanade) algorithm is employed.
271. The computer system of any one of claims 218 to 270, wherein the change of a feature point
sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi
is obtained, whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi
represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents
an angle offset from the ith frame to the i+1th frame.
272. The computer system of any one of claims 218 to 271, wherein use of the pyramid optical
flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves
the problem of failed tracking due to unduly large change from feature points of a current
frame to feature points of an adjacent next frame.
273. The computer system of any one of claims 218 to 272, wherein use of the pyramid optical
flow tracking LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure solves
the problem of failed tracking due to unduly large change from feature points of a current
frame to feature points of a next frame by an interval of N frames.
274. The computer system of any one of claims 218 to 273, wherein accumulative integral
transformation is performed on the initial motion vector αi of each frame to obtain an
accumulative motion vector, expressed as βi, of each frame, in which the expression of the
accumulative motion vector βi is as follows:
<IMG>
275. The computer system of any one of claims 218 to 274, wherein a sliding average window is
used to smoothen the motion vector βi to obtain a smoothened motion vector γi, whose
expression is:
<IMG>
where n represents the total number of frames of the video.
276. The computer system of any one of claims 218 to 275, wherein the radius of the smooth
window is r, and its expression is:
<IMG>
where μ indicates a parameter of the sliding window, and is a positive number.
277. The computer system of any one of claims 218 to 276, wherein the specific numerical value
of μ can be dynamically adjusted according to practical requirement.
278. The computer system of any one of claims 218 to 277, wherein the specific numerical value
of μ is set as μ=30.
279. The computer system of any one of claims 218 to 278, wherein βi and γi are used to
readjust αi to obtain a readjusted motion vector λi, whose expression is:
<IMG>
280. The computer system of any one of claims 218 to 279, wherein the readjusted motion vector
λi is taken as the motion vector of each frame to participate in subsequent calculation, so that
the calculation result is made more precise.
281. The computer system of any one of claims 218 to 280, wherein the motion
vectors of all
frames obtained are merged and converted into a matrix.
282. The computer system of any one of claims 218 to 281, wherein the motion vector λi is
converted into the form of a matrix <IMG>, and unbiased standard deviations
of its elements are calculated by rows, the specific calculation expression is as follows:
<IMG>

283. The computer system of any one of claims 218 to 282, wherein the unbiased standard
deviations of the various elements in the matrix are respectively expressed as <IMG>
and <IMG>, in which λ̄ represents the average value of the samples.
284. The computer system of any one of claims 218 to 283, wherein weights are
assigned to the
unbiased standard deviations of the various elements according to practical
requirements.
285. The computer system of any one of claims 218 to 284, wherein the unbiased
standard
deviations of the various elements are weighted and fused according to the
weights.
286. The computer system of any one of claims 218 to 285, wherein the weights
of the unbiased
standard deviations of the various elements can be dynamically readjusted
according to
practical requirements.
287. The computer system of any one of claims 218 to 286, wherein the weight of σ[λ(dx)] is
set as 3, the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10,
then the fusing expression is as follows:
<IMG>
288. The computer system of any one of claims 218 to 287, wherein the feature value of the to-be-
detected video is the unbiased standard deviations of the various elements and their weighted
value, which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K}(S).
289. The computer system of any one of claims 218 to 288, wherein the computer-
readable storage
medium is a read-only memory.
290. The computer system of any one of claims 218 to 289, wherein the computer-
readable storage
medium is a magnetic disk.
291. The computer system of any one of claims 218 to 290, wherein the computer-
readable storage
medium is an optical disk.
292.A method of detecting video jitter, comprising:
performing a framing process on a to-be-detected video to obtain a frame
sequence;
performing feature point detection on the frame sequence frame by frame,
obtaining feature
points of each frame by employing a feature point detecting algorithm
including fused
FAST (Features from Accelerated Segment Test) and SURF (Speeded-Up Robust
Features), and generating a frame feature point sequence matrix;
basing on an optical flow tracking algorithm to perform an operation on the
frame feature
point sequence matrix to obtain a motion vector of each frame;
obtaining a feature value of the to-be-detected video according to the motion
vector of each
frame; and
taking the feature value of the to-be-detected video as an input signal of a
detection model
to perform operation to obtain an output signal, and judging whether jitter
occurs to the to-
be-detected video according to the output signal.
293. The method of claim 292, wherein the method further includes:
grayscale-processing the frame sequence, and obtaining a grayscale frame
sequence; and
denoising the grayscale frame sequence;
the step of performing feature point detection on the frame sequence frame by
frame is
performing feature point detection on the preprocessed frame sequence frame by
frame.
294. The method of any one of claims 292 to 293, wherein the method further
includes:
employing the feature point detecting algorithm in which are the fused FAST
(Features
from Accelerated Segment Test) and the SURF (Speeded-Up Robust Features) to
perform
feature point detection on the frame sequence frame by frame, and to obtain
the feature
points of each frame.
295. The method of any one of claims 292 to 294, wherein the method further
includes:
performing optical flow tracking calculation on the frame feature point
sequence matrix of
each frame, and obtaining an initial motion vector of each frame;
obtaining a corresponding accumulative motion vector according to the initial
motion
vector;
smoothening the accumulative motion vector, and obtaining a smoothened motion
vector;
and
employing the accumulative motion vector and the smoothened motion vector to
readjust
the initial motion vector of each frame, and obtaining the motion vector of
each frame.
296. The method of any one of claims 292 to 295, wherein the method further
includes:
merging and converting the motion vectors of all frames into a matrix, and
calculating
unbiased standard deviations of various elements in the matrix;
weighting and fusing the unbiased standard deviations of the various elements,
and
obtaining a weighted value; and
taking the unbiased standard deviations of the various elements and the
weighted value as
the feature value of the to-be-detected video.
297. The method of any one of claims 292 to 296, wherein a to-be-detected
video is obtained.
298. The method of any one of claims 292 to 297, wherein a framing extraction
process is
performed on the to-be-detected video to obtain a frame sequence corresponding
to the to-be-
detected video.
299. The method of any one of claims 292 to 298, wherein the frame sequence is expressed as Li
(i=1, 2, 3, ..., n), where Li represents the ith frame of the video, and n represents the total
number of frames of the video.
300. The method of any one of claims 292 to 299, wherein a current frame and
an adjacent next
frame is selected from the to-be-detected video.
301. The method of any one of claims 292 to 300, wherein a current frame and a
next frame by an
interval of N frames is selected from the to-be-detected video.
302. The method of any one of claims 292 to 301, wherein corresponding feature
points are
obtained from the current frame and the adjacent next frame.
303. The method of any one of claims 292 to 302, wherein corresponding feature
points are
obtained from the current frame and the next frame by an interval of N frames.
304. The method of any one of claims 292 to 303, wherein corresponding
matching is performed
according to the feature points of the two frames to judge whether offset
(jitter) occurs
between the current frame and the adjacent next frame.
305. The method of any one of claims 292 to 304, wherein corresponding
matching is performed
according to the feature points of the two frames to judge whether offset
(jitter) occurs
between the current frame and the next frame by an interval of N frames.
306. The method of any one of claims 292 to 305, wherein a feature point
detecting algorithm is
employed to perform feature point detection on the processed frame sequence Li (i=1, 2, 3, ...,
n) frame by frame.
307. The method of any one of claims 292 to 306, wherein a feature point
detecting algorithm is
employed to obtain feature points of each frame, that is, feature points of
each frame of image
are extracted.
308. The method of any one of claims 292 to 307, wherein a feature point
detecting algorithm is
employed to generate a frame feature point sequence matrix, which is expressed
as Zi (i=1, 2, ..., n).
309. The method of any one of claims 292 to 308, wherein Zi (i=1, 2, ..., n) is expressed as:
<IMG>
where a^i_p,q represents a feature point detection result at row p, column q of the ith frame
matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of the
matrix, and q represents the number of columns of the matrix.
310. The method of any one of claims 292 to 309, wherein the optical flow
tracking algorithm is
employed to perform optical flow tracking calculation on the frame feature
point sequence
matrix.
311. The method of any one of claims 292 to 310, wherein the optical flow
tracking algorithm is
employed to track the change of feature points in the current frame to the
next frame.
312. The method of any one of claims 292 to 311, wherein the change of a feature point sequence
matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained,
whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi
represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents
an angle offset from the ith frame to the i+1th frame.

313. The method of any one of claims 292 to 312, wherein an extracted feature
value at least
includes the feature value of four dimensions.
314. The method of any one of claims 292 to 313, wherein the detection model
is well trained in
advance.
315. The method of any one of claims 292 to 314, wherein sample video data in
a set of selected
training data is correspondingly processed to obtain a feature value of the
sample video data.
316. The method of any one of claims 292 to 315, wherein a detection model is
trained according
to the feature value of the sample video data and a corresponding annotation
result of the
sample video data to obtain the final detection model.
317. The method of any one of claims 292 to 316, wherein after the motion vectors have been
dimensionally converted for an mth video sample, unbiased standard deviations of various
elements and their weighted and fused value are calculated and obtained, which are
respectively expressed as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and Km, and the
annotation result ym of the mth video sample is extracted to obtain the training sample of the
mth video sample, which is expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, Km, ym}(m).
318. The method of any one of claims 292 to 317, wherein the annotation result ym indicates that
no jitter occurs to the video sample if ym=0.
319. The method of any one of claims 292 to 318, wherein the annotation result ym indicates that
jitter occurs to the video sample if ym=1.
320. The method of any one of claims 292 to 319, wherein a video sample makes
use of features
of at least five dimensions.
321. The method of any one of claims 292 to 320, wherein the detection model
is selected from an
SVM (Support Vector Machine) model.
322. The method of any one of claims 292 to 321, wherein the feature value of
the to-be-detected
video is input to a well-trained SVM model to obtain an output result.
323. The method of any one of claims 292 to 322, wherein if the SVM model
output result is 0,
this indicates that no jitter occurs to the to-be-detected video.
324. The method of any one of claims 292 to 323, wherein if the SVM model
output result is 1,
this indicates that jitter occurs to the to-be-detected video.
325. The method of any one of claims 292 to 324, wherein the use of a
trainable SVM model as a
video jitter decider enables jitter detection of videos captured in different
scenarios.
326. The method of any one of claims 292 to 325, wherein the amount of
information of the image
is greatly reduced after grayscale-processing.
327. The method of any one of claims 292 to 326, wherein the frame sequence Li (i=1, 2, 3, ..., n)
is further grayscale-processed.
328. The method of any one of claims 292 to 327, wherein a grayscale frame sequence is obtained
to be expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as
follows:
G = R×0.299 + G×0.587 + B×0.114
329. The method of any one of claims 292 to 328, wherein a TV denoising method based on a total
variation model is employed to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to
obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n).
330. The method of any one of claims 292 to 329, wherein the denoised frame
sequence is a
preprocessed frame sequence to which the to-be-detected video corresponds.
331. The method of any one of claims 292 to 330, wherein the denoising method
is randomly
selectable.
332. The method of any one of claims 294 to 331, wherein the SURF algorithm is
an improvement
over SIFT (Scale-Invariant Feature Transform) algorithm.
333. The method of claim 332, wherein SIFT is a feature describing method with
excellent
robustness and invariant scale.
334. The method of any one of claims 332 to 333, wherein the SURF algorithm
improves over the
problems of large amount of data to be calculated, high time complexity and
long duration of
calculation inherent in the SIFT algorithm at the same time of maintaining the
advantages of
the SIFT algorithm.
335. The method of any one of claims 294 to 334, wherein SURF is more
excellent in performance
in the aspect of invariance of illumination change and perspective change,
particularly
excellent in processing severe blurs and rotations of images.
336. The method of any one of claims 294 to 335, wherein SURF is excellent in
describing local
features of images.
337. The method of any one of claims 294 to 336, wherein FAST feature
detection is a corner
detection method.
338. The method of any one of claims 294 to 337, wherein FAST feature
detection has the most
prominent advantage of its algorithm in its calculation efficiency, and the
capability to
excellently describe global features of images.
339. The method of any one of claims 294 to 338, wherein use of the feature
point detecting
algorithm in which FAST features and SURF features are fused to perform
feature point
extraction gives consideration to global features of images.
340. The method of any one of claims 294 to 339, wherein use of the feature
point detecting
algorithm in which FAST features and SURF features are fused to perform
feature point
extraction fully retains local features of images.
341. The method of any one of claims 294 to 340, wherein use of the feature
point detecting
algorithm in which FAST features and SURF features are fused to perform
feature point
extraction has small computational expense.
342. The method of any one of claims 294 to 341, wherein use of the feature
point detecting
algorithm in which FAST features and SURF features are fused to perform
feature point
extraction has strong robustness against image blurs and faint illumination.
343. The method of any one of claims 294 to 342, wherein use of the feature
point detecting
algorithm in which FAST features and SURF features are fused to perform
feature point
extraction enhances real-time property and precision of the detection.
344. The method of any one of claims 292 to 343, wherein while the optical
flow tracking
calculation is performed on the frame feature point sequence matrix, a pyramid
optical flow
tracking LK (Lucas-Kanade) algorithm is employed.
345. The method of any one of claims 292 to 344, wherein the change of a feature point sequence
matrix Zi in the ith frame is tracked to the next i+1th frame, and a motion vector αi is obtained,
whose expression is as follows:
<IMG>
where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi
represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents
an angle offset from the ith frame to the i+1th frame.
346. The method of any one of claims 292 to 345, wherein use of the pyramid
optical flow tracking
LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure
solves the problem
of failed tracking due to unduly large change from feature points of a current
frame to feature
points of an adjacent next frame.
347. The method of any one of claims 292 to 346, wherein use of the pyramid
optical flow tracking
LK (Lucas-Kanade) algorithm that utilizes the pyramid iterative structure
solves the problem
of failed tracking due to unduly large change from feature points of a current
frame to feature
points of a next frame by an interval of N frames.
348. The method of any one of claims 292 to 347, wherein accumulative integral transformation is
performed on the initial motion vector αi of each frame to obtain an accumulative motion
vector, expressed as βi, of each frame, in which the expression of the accumulative motion
vector βi is as follows:
<IMG>
349. The method of any one of claims 292 to 348, wherein a sliding average window is used to
smoothen the motion vector βi to obtain a smoothened motion vector γi, whose expression is:

<IMG>
where n represents the total number of frames of the video.
350. The method of any one of claims 292 to 349, wherein the radius of the smooth window is r,
and its expression is:
<IMG>
where μ indicates a parameter of the sliding window, and μ is a positive number.
351. The method of any one of claims 292 to 350, wherein the specific numerical value of μ can
be dynamically adjusted according to practical requirement.
352. The method of any one of claims 292 to 351, wherein the specific numerical value of μ is set
as μ=30.
353. The method of any one of claims 292 to 352, wherein βi and γi are used to readjust αi to
obtain a readjusted motion vector λi, whose expression is:
<IMG>
354. The method of any one of claims 292 to 353, wherein the readjusted motion vector λi is taken
as the motion vector of each frame to participate in subsequent calculation, so that the
calculation result is made more precise.
355. The method of any one of claims 292 to 354, wherein the motion vectors of
all frames obtained
are merged and converted into a matrix.
356. The method of any one of claims 292 to 355, wherein the motion vector λi is converted into
the form of a matrix <IMG>, and unbiased standard deviations of its elements
are calculated by rows, the specific calculation expression is as follows:
<IMG>
357. The method of any one of claims 292 to 356, wherein the unbiased standard deviations of the
various elements in the matrix are respectively expressed as <IMG> and <IMG>,
in which λ̄ represents the average value of the samples.
358. The method of any one of claims 292 to 357, wherein weights are assigned
to the unbiased
standard deviations of the various elements according to practical
requirements.
359. The method of any one of claims 292 to 358, wherein the unbiased standard
deviations of the
various elements are weighted and fused according to the weights.
360. The method of any one of claims 292 to 359, wherein the weights of the
unbiased standard
deviations of the various elements can be dynamically readjusted according to
practical
requirements.
361. The method of any one of claims 292 to 360, wherein the weight of σ[λ(dx)] is set as 3,
the weight of σ[λ(dy)] is set as 3, and the weight of σ[λ(dr)] is set as 10, then the
fusing expression is as follows:
<IMG>
362. The method of any one of claims 292 to 361, wherein the feature value of the to-be-detected
video is the unbiased standard deviations of the various elements and their weighted value,
which are expressed as:
{σ[λ(dx)], σ[λ(dy)], σ[λ(dr)], K}(S).
Description

Note: Descriptions are shown in the official language in which they were submitted.


VIDEO JITTER DETECTION METHOD AND DEVICE
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the field of computer vision
technology, and more
particularly to a method of and a device for detecting video jitter.
Description of Related Art
[0002] The wave of science and technology has profoundly changed the life of
everybody. Such
handheld video capturing devices as smart mobile phones, digital cameras, ILDC
and
SLR cameras with ever shrinking size and ever lowering price have become
everyday
necessities of most people, and an age of photographing by all people has
quietly come
into being. While people are enjoying the interesting and exhilarating moments
recorded
by handheld video capturing devices, irregular jitters generated in the video
due to
unstable movements of the lens caused by moving or unintentional shaking of
the
photographer make the effects of wonderful highlights as recorded to fall far
short of
expectation, and severely affect subsequent processing of the video at the
same time.
Accordingly, video jitter detection has become an indispensable, important
component of
the video processing technology.
[0003] Video jitter detection is the basis for subsequent readjustment and
processing of videos,
and researchers have conducted extensive studies on the
basis of
video analysis in the fields of video processing, video image stabilization
and computer
vision, etc. Although some researchers have proposed several methods of
detecting video
jitters, the currently available detecting algorithms are not high in
precision, as some are
not sensitive to videos captured under conditions of large displacement and
strong jitter
of lenses within a short time, some are not adapted to the detection of
rotational
movements, and some are not applicable to scenarios in which the lens moves
slowly. For
example, the following commonly employed methods of detecting video jitters
are more
or less defective in ways described below.
1. Block matching method: at present, this is the most commonly used algorithm
in
video image stabilization systems. This method divides a current frame into
blocks, each pixel in a block has the same and single motion vector, and the
optimal match is searched for in each block within a specific range of
reference
frames, to thereby estimate the global motion vector of the video sequence.
Division into blocks is usually required by the block matching method, whereby
the global motion vector is estimated according to the motion vector in each
block,
then the detection of video jitter in certain specific scenarios is inferior
in effect,
for instance, a picture is divided into four grids, in which 3 grids are
motionless,
while objects in one grid are moving. Besides, Kalman filtering is usually
required
by the block matching method to process the calculated motion vectors, while
the
computational expense is large, the real-time property is inferior, and the
scenario
of large displacement and strong jitter of the lens within a short time cannot
be
accommodated.
2. Grayscale projection method: based on the principle of consistent
distribution of
grayscales in overlapped and similar regions in an image, and making use of
regional grayscale information of adjacent video frames to seek for vector
motion
relation, this algorithm mainly consists of relevant calculation of grayscale
projection in two directions of different regions, rows and columns. The
grayscale
projection method is effective to scenarios in which only translational
jitters exist,
and cannot estimate rotational motion vectors.
SUMMARY OF THE INVENTION
[0004] In order to solve the problems pending in the state of the art,
embodiments of the present
invention provide a method of and a device for detecting video jitter, so as
to overcome
such prior-art problems as low precision of detecting algorithms, and
insensitivity to
videos captured under conditions of large displacement and strong jitter of
lenses within
a short time.
[0005] In order to solve one or more of the aforementioned technical problems,
the present
invention employs the following technical solutions.
[0006] According to one aspect, a method of detecting video jitter is
provided, and the method
comprises the following steps:
[0007] performing a framing process on a to-be-detected video to obtain a
frame sequence;
[0008] performing feature point detection on the frame sequence frame by
frame, obtaining
feature points of each frame, and generating a frame feature point sequence
matrix;
[0009] basing on an optical flow tracking algorithm to perform an operation on
the frame feature
point sequence matrix to obtain a motion vector of each frame;
[0010] obtaining a feature value of the to-be-detected video according to the
motion vector of
each frame; and
[0011] taking the feature value of the to-be-detected video as an input signal
of a detection model
to perform operation to obtain an output signal, and judging whether jitter
occurs to the
to-be-detected video according to the output signal.
[0012] Further, prior to performing feature point detection, the method
further comprises the
following steps of preprocessing the frame sequence:
[0013] grayscale-processing the frame sequence, and obtaining a grayscale
frame sequence; and
[0014] denoising the grayscale frame sequence; wherein
[0015] the step of performing feature point detection on the frame sequence
frame by frame is
performing feature point detection on the preprocessed frame sequence frame by
frame.
[0016] Further, the step of performing feature point detection on the frame
sequence frame by
frame, obtaining feature points of each frame includes:
[0017] employing a feature point detecting algorithm in which are fused FAST
features and
SURF features to perform feature point detection on the frame sequence frame
by frame,
and obtain feature points of each frame.
[0018] Further, the step of basing on an optical flow tracking algorithm to
perform an operation
on the frame feature point sequence matrix to obtain a motion vector of each
frame
includes:
[0019] performing optical flow tracking calculation on the frame feature point
sequence matrix
of each frame, and obtaining an initial motion vector of each frame;
[0020] obtaining a corresponding accumulative motion vector according to the
initial motion
vector;
[0021] smoothening the accumulative motion vector, and obtaining a smoothened
motion vector;
and
[0022] employing the accumulative motion vector and the smoothened motion
vector to readjust
the initial motion vector of each frame, and obtaining the motion vector of
each frame.
[0023] Further, the step of obtaining a feature value of the to-be-detected
video according to the
motion vector of each frame includes:
[0024] merging and converting the motion vectors of all frames into a matrix,
and calculating
unbiased standard deviations of various elements in the matrix;
[0025] weighting and fusing the unbiased standard deviations of the various
elements, and
obtaining a weighted value; and
[0026] taking the unbiased standard deviations of the various elements and the
weighted value
as the feature value of the to-be-detected video.
[0027] According to another aspect, a device for detecting video jitter is
provided, and the device
comprises:
[0028] a framing processing module, for performing a framing process on a to-
be-detected video
to obtain a frame sequence;
[0029] a feature point detecting module, for performing feature point
detection on the frame
sequence frame by frame, obtaining feature points of each frame, and
generating a frame
feature point sequence matrix;
[0030] a vector calculating module, for basing on an optical flow tracking
algorithm to perform
an operation on the frame feature point sequence matrix to obtain a motion
vector of each
frame;
[0031] a feature value extracting module, for obtaining a feature value of the
to-be-detected
video according to the motion vector of each frame; and
[0032] a jitter detecting module, for taking the feature value of the to-be-
detected video as an
input signal of a detection model to perform operation to obtain an output
signal, and
judging whether jitter occurs to the to-be-detected video according to the
output signal.
[0033] Further, the device further comprises:
[0034] a data preprocessing module, for preprocessing the frame sequence;
[0035] the data preprocessing module includes:
[0036] a grayscale-processing unit, for grayscale-processing the frame
sequence, and obtaining
a grayscale frame sequence; and
[0037] a denoising processing unit, for denoising the grayscale frame
sequence;
[0038] the feature point detecting module is employed for performing feature
point detection on
the preprocessed frame sequence frame by frame.
[0039] Further, the feature point detecting module is further employed for:
[0040] employing a feature point detecting algorithm in which are fused FAST
features and
SURF features to perform feature point detection on the frame sequence frame
by frame,
and obtain feature points of each frame.
[0041] Further, the vector calculating module includes:
[0042] an optical flow tracking unit, for performing optical flow tracking
calculation on the
frame feature point sequence matrix of each frame, and obtaining an initial
motion vector
of each frame;
[0043] an accumulation calculating unit, for obtaining a corresponding
accumulative motion
vector according to the initial motion vector;
[0044] a smoothening processing unit, for smoothening the accumulative motion
vector, and
obtaining a smoothened motion vector; and
[0045] a vector readjusting unit, for employing the accumulative motion vector
and the
smoothened motion vector to readjust the initial motion vector of each frame,
and
obtaining the motion vector of each frame.
[0046] Further, the feature value extracting module includes:
[0047] a matrix converting unit, for merging and converting the motion vectors
of all frames into
a matrix;
[0048] a standard deviation calculating unit, for calculating unbiased
standard deviations of
various elements in the matrix; and
[0049] a weighting and fusing unit, for weighting and fusing the unbiased
standard deviations of
the various elements, and obtaining a weighted value.
[0050] The technical solutions provided by the embodiments of the present
invention bring about
the following advantageous effects.
1. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, by basing on the optical flow tracking
algorithm to obtain a motion vector of each frame according to the frame
feature
point sequence matrix, the present invention effectively solves the problem of
failed tracking due to unduly large change between two adjacent frames,
exhibits
excellent toleration and adaptability in detecting jitters of videos captured
under
the condition in which the lens slowly moves, and achieves excellent
sensitivity
and robustness when videos are detected as captured under circumstances of
abrupt large displacement, strong jitter, and excessive rotation of the lens.
2. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, a feature point detecting algorithm in
which are fused FAST features and SURF features is employed, that is to say,
the
feature point extracting algorithm is so optimized that not only the image
global
feature is considered, but the local features are also retained, moreover,
computational expense is small, robustness against image blurs and faint
illumination is strong, and real-time property and precision of the detection
are
further enhanced.
3. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, features of at least four dimensions are
extracted from the to-be-detected video, and an SVM model is used as the
detection model, so that the method of detecting video jitter as provided by
the
embodiments of the present invention is more advantageous in terms of
generality,
and precision of detection is further enhanced.
[0051] Of course, implementation of any one solution according to the present
application does
not necessarily achieve all of the aforementioned advantages simultaneously.
BRIEF DESCRIPTION OF THE DRAWINGS
[0052] To more clearly explain the technical solutions in the embodiments of
the present
invention, drawings required for use in the following explanation of the
embodiments are
briefly described below. Apparently, the drawings described below are merely
directed
to some embodiments of the present invention, while it is further possible for
persons
ordinarily skilled in the art to base on these drawings to acquire other
drawings, and no
creative effort will be spent in the process.
[0053] Fig. 1 is a flowchart illustrating a method of detecting video jitter
according to an
exemplary embodiment;
[0054] Fig. 2 is a flowchart illustrating preprocessing of the frame sequence
according to an
exemplary embodiment;
[0055] Fig. 3 is a flowchart illustrating performing an operation on the frame
feature point
sequence matrix based on an optical flow tracking algorithm to obtain a motion
vector of
each frame according to an exemplary embodiment;
[0056] Fig. 4 is a flowchart illustrating obtaining a feature value of the to-
be-detected video
according to the motion vector of each frame according to an exemplary
embodiment;
and
[0057] Fig. 5 is a view schematically illustrating the structure of a device
for detecting video
jitter according to an exemplary embodiment.
DETAILED DESCRIPTION OF THE INVENTION
[0058] To make more lucid and clear the objectives, technical solutions and
advantages of the
present invention, technical solutions in the embodiments of the present
invention will be
described more clearly and completely below with reference to the accompanying
drawings in the embodiments of the present invention. Apparently, the
embodiments
described below are merely partial, rather than the entire, embodiments of the
present
invention. All other embodiments achievable by persons ordinarily skilled in
the art on
the basis of the embodiments in the present invention without creative effort
shall all fall
within the protection scope of the present invention.
[0059] Fig. 1 is a flowchart illustrating a method of detecting video jitter
according to an
exemplary embodiment, with reference to Fig. 1, the method comprises the
following
steps.
[0060] S1 - performing a framing process on a to-be-detected video to obtain a
frame sequence.
[0061] Specifically, in order to facilitate subsequent calculation so as to
detect the to-be-detected
video, after the to-be-detected video (indicated as S) has been obtained, a
framing
extraction process should be firstly performed on the to-be-detected video S
to obtain a
frame sequence corresponding to the to-be-detected video, and the frame
sequence is
expressed as Li (i=1, 2, 3, ..., n), where Li represents the ith frame of the
video, and n
represents the total number of frames of the video.
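By way of rough illustration only (not part of the claimed subject matter), the framing extraction of step S1 could be sketched in Python with OpenCV as follows; the file name is a placeholder.

    import cv2

    def extract_frames(video_path):
        """Read a video file and return its frames as a list; frames[i-1] corresponds to Li."""
        cap = cv2.VideoCapture(video_path)
        frames = []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            frames.append(frame)
        cap.release()
        return frames

    frames = extract_frames("to_be_detected.mp4")  # hypothetical path
    n = len(frames)  # total number of frames of the video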
[0062] S2 - performing feature point detection on the frame sequence frame by
frame, obtaining
feature points of each frame, and generating a frame feature point sequence
matrix.
[0063] Specifically, it is required in the detection of video jitter to select
the current frame and
the adjacent next frame (or to extract the next frame by an interval of N
frames) from the
video, corresponding feature points should be obtained from the two frames of
images,
and corresponding matching is subsequently performed according to the feature
points of
the two frames, to hence judge whether offset (jitter) occurs between the two
frames.
[0064] During specific implementation, a feature point detecting algorithm is
employed to
perform feature point detection on the processed frame sequence Li(i=1, 2, 3,
..., n) frame
by frame, feature points of each frame are obtained (i.e., feature points of
each frame of
image are extracted), and a frame feature point sequence matrix is generated,
which is
supposed to be expressed as Zi (i=1, 2, ..., n), and which can be specifically
expressed as
follows:
Z_i =
\begin{bmatrix}
a^i_{1,1} & a^i_{1,2} & \cdots & a^i_{1,q} \\
a^i_{2,1} & a^i_{2,2} & \cdots & a^i_{2,q} \\
\vdots & \vdots & \ddots & \vdots \\
a^i_{p,1} & a^i_{p,2} & \cdots & a^i_{p,q}
\end{bmatrix}
[0065] where a^i_{p,q} represents a feature point detection result at row p, column q of the ith frame
matrix, 1 is a feature point, 0 is a non-feature point, p represents the number of rows of
the matrix, and q represents the number of columns of the matrix.
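As a minimal sketch of how the binary matrix Zi might be assembled from detected keypoints (the detector shown is a plain FAST detector for illustration only, not the fused FAST/SURF detector of the preferred embodiment):

    import cv2
    import numpy as np

    def feature_point_matrix(gray_frame, detector):
        """Build the p x q matrix Zi: 1 at detected feature point locations, 0 elsewhere."""
        keypoints = detector.detect(gray_frame, None)
        z = np.zeros(gray_frame.shape[:2], dtype=np.uint8)
        for kp in keypoints:
            row = min(int(round(kp.pt[1])), z.shape[0] - 1)
            col = min(int(round(kp.pt[0])), z.shape[1] - 1)
            z[row, col] = 1  # 1 marks a feature point, 0 a non-feature point
        return z

    detector = cv2.FastFeatureDetector_create()  # illustrative choice only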
[0066] S3 - basing on an optical flow tracking algorithm to perform an
operation on the frame
feature point sequence matrix to obtain a motion vector of each frame.
[0067] Specifically, the optical flow tracking algorithm is employed to
perform optical flow
tracking calculation on the frame feature point sequence matrix, namely to
track the
change of feature points in the current frame to the next frame. For instance,
the change
of a feature point sequence matrix Zi in the ith frame is tracked to the next i+1th frame, and
a motion vector αi is obtained, whose expression is as follows:
\alpha_i = \begin{bmatrix} dx^i \\ dy^i \\ dr^i \end{bmatrix}, \qquad i = 1, 2, \ldots, n-1
[0068] where dxi represents a Euclidean column offset from the ith frame to the i+1th frame, dyi
represents a Euclidean row offset from the ith frame to the i+1th frame, and dri represents
an angle offset from the ith frame to the i+1th frame.
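One plausible way to realize this tracking step with the pyramid LK tracker is sketched below; estimating the rotation component dr via a partial affine fit between the tracked point sets is an assumption made for illustration, not a detail stated in this passage.

    import cv2
    import numpy as np

    def motion_vector(prev_gray, next_gray, prev_pts):
        """prev_pts: float32 array of shape (N, 1, 2); returns (dx, dy, dr) for one frame pair."""
        next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)
        good_prev = prev_pts[status.flatten() == 1]
        good_next = next_pts[status.flatten() == 1]
        # Fit a rotation + translation (+ uniform scale) transform between the point sets.
        m, _inliers = cv2.estimateAffinePartial2D(good_prev, good_next)
        dx = m[0, 2]                       # Euclidean column offset
        dy = m[1, 2]                       # Euclidean row offset
        dr = np.arctan2(m[1, 0], m[0, 0])  # angle offset, in radians
        return np.array([dx, dy, dr])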
[0069] S4 - obtaining a feature value of the to-be-detected video according to
the motion vector
of each frame.
[0070] Specifically, the feature value of three dimensions is usually used in
the state of the art,
whereas in the embodiments of the present invention the extracted feature
value at least
includes the feature value of four dimensions. The addition of one dimension
to the
feature value as compared with prior-art technology makes the method of
detecting video
jitter as provided by the embodiments of the present invention more
advantageous in
terms of generality, and precision of detection is further enhanced.
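A hedged sketch of one way such a four-dimensional feature value could be formed from the per-frame offsets is given below; treating the fused value K as a plain weighted sum, with the example weights 3, 3 and 10 mentioned in the claims, is an assumption.

    import numpy as np

    def feature_value(motion_vectors, weights=(3, 3, 10)):
        """motion_vectors: array of shape (n-1, 3) holding (dx, dy, dr) for each frame pair."""
        lam = np.asarray(motion_vectors, dtype=np.float64)
        # Unbiased standard deviation (ddof=1) of each offset component.
        s_dx = np.std(lam[:, 0], ddof=1)
        s_dy = np.std(lam[:, 1], ddof=1)
        s_dr = np.std(lam[:, 2], ddof=1)
        # Weighted fusion of the three deviations into a single value K (weighted sum assumed).
        k = weights[0] * s_dx + weights[1] * s_dy + weights[2] * s_dr
        return np.array([s_dx, s_dy, s_dr, k])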
[0071] S5 - taking the feature value of the to-be-detected video as an input
signal of a detection
model to perform operation to obtain an output signal, and judging whether
jitter occurs
to the to-be-detected video according to the output signal.
[0072] Specifically, the feature value of the to-be-detected video obtained in
the previous step is
taken as an input signal to be input into the detection model to perform
operation to obtain
an output signal, and the output signal is based on to judge whether jitter
occurs to the to-
be-detected video. As should be noted here, the detection model in the
embodiments of
the present invention is well trained in advance. During the specific
training, it is possible
to correspondingly process sample video data in a set of selected training
data by
employing the method in the embodiments of the present invention, to obtain a
feature
value of the sample video data. A detection model is trained according to the
feature value
of the sample video data and a corresponding annotation result of the sample
video data,
until model training is completed to obtain the final detection model.
[0073] For instance, suppose that an mth video sample in a set of jittering
video data with
annotations has undergone the processing specified in the above step to
extract to obtain
the feature value of the mth video sample. That is, the mth video sample is
firstly framing-
processed to obtain a frame sequence, feature point detection is then
performed frame by
frame on the frame sequence, feature points of each frame are obtained, a
frame feature
point sequence matrix is generated, the optical flow tracking algorithm is
thereafter based
on to perform an operation on the frame feature point sequence matrix to
obtain the
motion vector of each frame, and the feature value of the mth video sample is
finally
obtained according to the motion vector of each frame. After the motion
vectors have
been dimensionally converted, unbiased standard deviations of various elements
and their
weighted and fused value are calculated and obtained, which are respectively
expressed
as σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m and Km, and the annotation result ym (if
ym=0, this indicates that no jitter occurs to the video sample, if ym=1, this indicates that
jitter occurs to the video sample) of the mth video sample is extracted to obtain the training
sample of the mth video sample, which can be expressed as follows:
{σ[λ(dx)]m, σ[λ(dy)]m, σ[λ(dr)]m, Km, ym}(m)
[0074] The video sample makes use of features of at least five dimensions, in
comparison with
prior-art technology in which features of three dimensions are usually used
(the average
values of average values, variances, and included angles of translational
vectors of the
translation quantities of adjacent frames are usually used), generality is
more
advantageous, and precision of detection is further enhanced. In addition, as
a preferred
embodiment in the embodiments of the present invention, the detection model
can be
selected from an SVM model, that is, the feature value of the to-be-detected
video as
obtained through the previous step is input to a well-trained SVM model to
obtain an
output result. If the output result is 0, this indicates that no jitter occurs
to the to-be-
detected video, if the output result is 1, this indicates that jitter occurs
to the to-be-detected
video. The use of a trainable SVM model as a video jitter decider enables
jitter detection
of videos captured in different scenarios, and the use of this model makes the
generality
better, and the precision rate of detection higher.
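As an illustration of the SVM decider (scikit-learn is used here as a stand-in; the feature values and labels below are invented numbers for demonstration only):

    import numpy as np
    from sklearn.svm import SVC

    # One row per sample video: [sigma(dx), sigma(dy), sigma(dr), K]; labels: 0 = no jitter, 1 = jitter.
    X_train = np.array([[0.3, 0.2, 0.01, 1.6],
                        [2.5, 1.9, 0.12, 14.4]])
    y_train = np.array([0, 1])

    model = SVC(kernel="rbf")
    model.fit(X_train, y_train)

    # Detection: the feature value of the to-be-detected video is fed to the trained model.
    feature = np.array([[0.8, 0.6, 0.02, 4.4]])  # illustrative numbers only
    print("jitter" if model.predict(feature)[0] == 1 else "no jitter")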
[0075] Fig. 2 is a flowchart illustrating preprocessing of the frame sequence according to an exemplary embodiment. With reference to Fig. 2, as a preferred embodiment in the
embodiments of the present invention, prior to performing feature point
detection, the
method further comprises the following steps of preprocessing the frame
sequence.
[0076] S101 - grayscale-processing the frame sequence, and obtaining a
grayscale frame
sequence.
[0077] Specifically, since the gray space only contains luminance information and does not contain color information, the amount of information in the image is greatly reduced after grayscale-processing; accordingly, in order to reduce the amount of information participating in subsequent calculation and so facilitate that calculation, the frame sequence Li (i=1, 2, 3, ..., n) obtained in the previous step is further grayscale-processed in the embodiments of the present invention, and a grayscale frame sequence is obtained, expressed as Gi (i=1, 2, 3, ..., n), in which the grayscale conversion expression is as follows:

G = R × 0.299 + G × 0.587 + B × 0.114
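Purely as an illustrative sketch (assuming BGR frames as NumPy arrays; OpenCV's cv2.cvtColor with COLOR_BGR2GRAY applies the same weights), the conversion could look like:

```python
import numpy as np

def to_grayscale(frame_bgr: np.ndarray) -> np.ndarray:
    """Apply the weighted conversion G = 0.299 R + 0.587 G + 0.114 B to one BGR frame."""
    b, g, r = frame_bgr[..., 0], frame_bgr[..., 1], frame_bgr[..., 2]
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return gray.astype(np.uint8)
```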
[0078] S102 - denoising the grayscale frame sequence.
[0079] Specifically, in order to effectively prevent noise points (namely, non-feature points) from affecting subsequent steps and to enhance precision of detection, it is further required to denoise the grayscale frame sequence. During specific implementation, it is possible to employ a TV denoising method based on a total variation model to denoise the grayscale frame sequence Gi (i=1, 2, 3, ..., n) to obtain a denoised frame sequence expressed as Ti (i=1, 2, 3, ..., n), namely the preprocessed frame sequence to which the to-be-detected video corresponds. As should be noted here, the denoising method is freely selectable in the embodiments of the present invention, and no restriction is made thereto in this context.
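A minimal sketch of this preprocessing step, assuming scikit-image's total-variation denoiser is used as the TV method (the weight value and function names are illustrative):

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def denoise_frames(gray_frames):
    """Total-variation denoise each grayscale frame G_i, yielding the preprocessed sequence T_i."""
    denoised = []
    for g in gray_frames:
        t = denoise_tv_chambolle(g, weight=0.1)  # weight chosen only for illustration
        denoised.append((t * 255).astype(np.uint8))  # back to 8-bit grayscale
    return denoised
```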
[0080] In this case, the step of performing feature point detection on the frame sequence frame by frame is performed on the preprocessed frame sequence frame by frame.
[0081] As a preferred embodiment in the embodiments of the present invention,
the step of
performing feature point detection on the frame sequence frame by frame,
obtaining
feature points of each frame includes:
[0082] employing a feature point detecting algorithm in which are fused FAST
features and
SURF features to perform feature point detection on the frame sequence frame
by frame,
and obtain feature points of each frame.
[0083] Specifically, since the precision of the video jitter detecting
algorithm is affected by
feature point extraction and the matching technique, the performance of the
feature point
extracting algorithm will directly affect the precision of the video jitter
detecting
algorithm, so the feature point extracting algorithm is optimized in the
embodiments of
the present invention. As a preferred embodiment, a feature point detecting
algorithm in
which are fused FAST features and SURF features is employed. The SURF
algorithm is
an improvement over the SIFT algorithm. SIFT is a feature describing method with excellent robustness and scale invariance, while the SURF algorithm addresses the problems of large calculation volume, high time complexity and long calculation time inherent in the SIFT algorithm while maintaining its advantages. Moreover, SURF performs better in terms of invariance to illumination change and perspective change, is particularly good at handling severe blurs and rotations of images, and excels at describing local features of images. FAST feature detection is a kind of corner detection method; its most prominent advantages are its calculation efficiency and its capability to describe global features of images well. Therefore, use of the
feature point
detecting algorithm in which are fused FAST features and SURF features to
perform
feature point extraction not only gives consideration to global features of
images, but also
fully retains local features of images, moreover, computational expense is
small,
robustness against image blurs and faint illumination is strong, and real-time
property and
precision of the detection are further enhanced.
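One plausible reading of such a fusion is to use FAST for fast keypoint detection and SURF to describe the detected points robustly; the exact fusion strategy is not spelled out in the source. A sketch under that assumption (requires opencv-contrib-python with the non-free SURF module):

```python
import cv2
import numpy as np

def detect_fused_features(gray_frame: np.ndarray):
    """Detect corners with FAST, then attach SURF descriptors to the detected keypoints."""
    fast = cv2.FastFeatureDetector_create(threshold=20)        # threshold is illustrative
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)   # needs the non-free contrib module
    keypoints = fast.detect(gray_frame, None)
    keypoints, descriptors = surf.compute(gray_frame, keypoints)
    points = cv2.KeyPoint_convert(keypoints)  # N x 2 array of (x, y) feature coordinates
    return points, descriptors
```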
[0084] Fig. 3 is a flowchart illustrating performing an operation on the frame feature point sequence matrix based on an optical flow tracking algorithm to obtain a motion vector of each frame according to an exemplary embodiment. With reference to Fig. 3, as a preferred embodiment in the embodiments of the present invention, the step of performing an operation on the frame feature point sequence matrix based on the optical flow tracking algorithm to obtain a motion vector of each frame includes the following steps.
[0085] S301 - performing optical flow tracking calculation on the frame
feature point sequence
matrix of each frame, and obtaining an initial motion vector of each frame.
[0086] Specifically, when the optical flow tracking calculation is performed on the frame feature point sequence matrix, a pyramid optical flow tracking LK (Lucas-Kanade) algorithm can be employed. For instance, the change of a feature point sequence matrix Zi in the ith frame is tracked to the next, (i+1)th, frame, and a motion vector αi is obtained, whose expression is as follows:

$$\alpha_i = \begin{bmatrix} dx_i \\ dy_i \\ dr_i \end{bmatrix} \quad (i = 1, 2, \ldots, n)$$

[0087] where dxi represents a Euclidean column offset from the ith frame to the (i+1)th frame, dyi represents a Euclidean row offset from the ith frame to the (i+1)th frame, and dri represents an angle offset from the ith frame to the (i+1)th frame.
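A sketch of how such a per-frame motion vector could be obtained with OpenCV's pyramid LK tracker; recovering (dx, dy, dr) by fitting a partial affine (similarity) transform to the tracked points is one possible realisation and is an assumption, not necessarily the exact computation of the embodiment:

```python
import cv2
import numpy as np

def frame_motion_vector(prev_gray, next_gray, prev_pts):
    """Track feature points from frame i to frame i+1 and return alpha_i = (dx_i, dy_i, dr_i)."""
    pts = np.asarray(prev_pts, dtype=np.float32).reshape(-1, 1, 2)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, pts, None)
    good_prev = pts[status.ravel() == 1]
    good_next = next_pts[status.ravel() == 1]
    # Fit a similarity transform to the successfully tracked points.
    m, _ = cv2.estimateAffinePartial2D(good_prev, good_next)
    if m is None:  # tracking failed for this pair of frames
        return np.zeros(3)
    dx, dy = m[0, 2], m[1, 2]
    dr = np.arctan2(m[1, 0], m[0, 0])  # rotation angle between the two frames
    return np.array([dx, dy, dr])
```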
[0088] Use of the pyramid optical flow tracking LK (Lucas-Kanade) algorithm
that utilizes the
pyramid iterative structure can effectively solve the problem of failed
tracking due to
unduly large change from feature points of frame A (which is supposed to be
the current
frame) to feature points of frame B (which is supposed to be the next frame),
and lays the
foundation for the method of detecting video jitter provided by the
embodiments of the
present invention to enhance its jitter detecting sensitivity and robustness
when videos
are processed as captured under circumstances of abrupt large displacement,
strong jitter,
and excessive rotation of lenses.
[0089] S302 - obtaining a corresponding accumulative motion vector according
to the initial
motion vector.
[0090] Specifically, accumulative integral transformation is performed on the initial motion vector αi of each frame as obtained in step S301 to obtain an accumulative motion vector of each frame, expressed as βi, in which the expression of the accumulative motion vector βi is as follows:

$$\beta_i = \sum_{j=1}^{i} \alpha_j = \begin{bmatrix} \sum_{j=1}^{i} dx_j \\[4pt] \sum_{j=1}^{i} dy_j \\[4pt] \sum_{j=1}^{i} dr_j \end{bmatrix}$$
[0091] S303 - smoothening the accumulative motion vector, and obtaining a
smoothened motion
vector.
[0092] Specifically, a sliding average window is used to smoothen the motion vector βi obtained in step S302 to obtain a smoothened motion vector γi, whose expression is:

$$\gamma_i = \frac{1}{2r+1} \sum_{j=i-r}^{i+r} \beta_j$$
[0093] where n represents the total number of frames of the video; the radius
of the smooth
window is r, and its expression is:
$$r = \begin{cases} n, & n \le 20 \\[4pt] \dfrac{10\,\ln(1+\mu n)}{\ln(1+\mu)}, & n > 20 \end{cases}$$
[0094] where μ indicates a parameter of the sliding window and is a positive number; the specific numerical value of μ can be dynamically adjusted according to practical requirements; for instance, as a preferred embodiment, it can be set as μ = 30.
[0095] In the embodiments of the present invention, the sliding average window
with extremely
small computational expense is used to smoothen the motion vector, while
Kalman
filtering, which involves complicated computation, is not used for the process, whereby computational expense is further reduced and the real-time property is further enhanced, without any loss of precision.
[0096] S304 - employing the accumulative motion vector and the smoothened
motion vector to
readjust the initial motion vector of each frame, and obtaining the motion
vector of each
frame.
[0097] Specifically, βi and γi obtained in the previous steps S302 and S303 are used to readjust αi obtained in step S301, to obtain a readjusted motion vector λi, whose expression is:

$$\lambda_i = \alpha_i + (\gamma_i - \beta_i) = \begin{bmatrix} dx_i + \gamma_i(dx) - \sum_{j=1}^{i} dx_j \\[4pt] dy_i + \gamma_i(dy) - \sum_{j=1}^{i} dy_j \\[4pt] dr_i + \gamma_i(dr) - \sum_{j=1}^{i} dr_j \end{bmatrix} \quad (i = 1, 2, \ldots, n)$$

[0098] The readjusted motion vector λi as obtained is taken as the motion vector of each frame
to participate in subsequent calculation, so that the calculation result is
made more precise,
i.e., the result of video jitter detection is made more precise.
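Putting steps S302 to S304 together, a compact sketch of the accumulation, sliding-window smoothing and readjustment could be as follows (NumPy only; the boundary handling of the window and the n ≤ 20 branch of the radius are assumptions):

```python
import numpy as np

def readjust_motion_vectors(alpha, mu=30.0):
    """alpha: (n, 3) array of initial motion vectors (dx, dy, dr) per frame.
    Returns the (n, 3) readjusted motion vectors lambda_i = alpha_i + (gamma_i - beta_i)."""
    n = len(alpha)
    beta = np.cumsum(alpha, axis=0)  # S302: accumulative motion vectors
    # Radius of the smoothing window (n <= 20 branch assumed to be n).
    r = int(n if n <= 20 else 10.0 * np.log(1.0 + mu * n) / np.log(1.0 + mu))
    gamma = np.empty_like(beta)
    for i in range(n):  # S303: sliding average window over the accumulated trajectory
        lo, hi = max(0, i - r), min(n, i + r + 1)
        gamma[i] = beta[lo:hi].mean(axis=0)
    return alpha + (gamma - beta)  # S304: readjusted motion vectors
```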
[0099] Fig. 4 is a flowchart illustrating obtaining a feature value of the to-be-detected video according to the motion vector of each frame, according to an exemplary embodiment. With reference to Fig. 4, as a preferred embodiment in the embodiments of the
present
invention, the step of obtaining a feature value of the to-be-detected video
according to
the motion vector of each frame includes the following steps.
[0100] S401 - merging and converting the motion vectors of all frames into a
matrix, and
calculating unbiased standard deviations of various elements in the matrix.
[0101] Specifically, the motion vectors of all frames obtained through the previous steps are firstly merged and converted into a matrix; for instance, with respect to the motion vectors λi, they are converted into the form of a matrix [λ1 λ2 ... λn], and unbiased standard deviations of its elements are calculated by rows, the specific calculation expression being as follows:

$$\sigma = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left(\lambda_i - \bar{\lambda}\right)^2}$$

[0102] The unbiased standard deviations of the various elements in the matrix can be obtained through the above expression, and are respectively expressed as σ[λ(dx)], σ[λ(dy)] and σ[λ(dr)], in which λ̄ represents the average value of the samples.
[0103] S402 - weighting and fusing the unbiased standard deviations of the
various elements,
and obtaining a weighted value.
[0104] Specifically, weights are assigned to the unbiased standard deviations of the various elements according to practical requirements, and the unbiased standard deviations of the various elements are weighted and fused according to the weights, wherein the weights of the unbiased standard deviations of the various elements can be dynamically readjusted according to practical requirements. For instance, set the weight of σ[λ(dx)] as 3, the weight of σ[λ(dy)] as 3, and the weight of σ[λ(dr)] as 10; then the fusing expression is as follows:

$$K = 3\,\sigma[\lambda(dx)] + 3\,\sigma[\lambda(dy)] + 10\,\sigma[\lambda(dr)]$$
[0105] S403 - taking the unbiased standard deviations of the various elements
and the weighted
value as the feature value of the to-be-detected video.
[0106] Specifically, in the embodiments of the present invention, the feature value S of the to-be-detected video is the unbiased standard deviations of the various elements and their weighted value as obtained in the previous steps, which are expressed as:

$$S = \{\ \sigma[\lambda(dx)],\ \sigma[\lambda(dy)],\ \sigma[\lambda(dr)],\ K\ \}$$
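A short sketch of steps S401 to S403 under the same assumptions as the previous snippet (NumPy; ddof=1 yields the unbiased standard deviation; the weights follow the example values above):

```python
import numpy as np

def feature_value(lambda_, weights=(3.0, 3.0, 10.0)):
    """lambda_: (n, 3) matrix of readjusted motion vectors (dx, dy, dr) per frame.
    Returns the feature value S = [sigma_dx, sigma_dy, sigma_dr, K]."""
    sigmas = np.std(lambda_, axis=0, ddof=1)   # S401: unbiased standard deviations by element
    k = float(np.dot(weights, sigmas))         # S402: weighted fusion K
    return np.array([sigmas[0], sigmas[1], sigmas[2], k])  # S403: feature value of the video
```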
[0107] Fig. 5 is a view schematically illustrating the structure of a device for detecting video jitter according to an exemplary embodiment. With reference to Fig. 5, the device comprises:
[0108] a framing processing module, for performing a framing process on a to-
be-detected video
to obtain a frame sequence;
[0109] a feature point detecting module, for performing feature point
detection on the frame
sequence frame by frame, obtaining feature points of each frame, and
generating a frame
feature point sequence matrix;
[0110] a vector calculating module, for basing on an optical flow tracking
algorithm to perform
an operation on the frame feature point sequence matrix to obtain a motion
vector of each
frame;
[0111] a feature value extracting module, for obtaining a feature value of the
to-be-detected
video according to the motion vector of each frame; and
[0112] a jitter detecting module, for taking the feature value of the to-be-
detected video as an
input signal of a detection model to perform operation to obtain an output
signal, and
judging whether jitter occurs to the to-be-detected video according to the
output signal.
[0113] As a preferred embodiment in the embodiments of the present invention,
the device
further comprises:
[0114] a data preprocessing module, for preprocessing the frame sequence;
[0115] the data preprocessing module includes:
[0116] a grayscale-processing unit, for grayscale-processing the frame
sequence, and obtaining
a grayscale frame sequence; and
[0117] a denoising processing unit, for denoising the grayscale frame
sequence;
[0118] the feature point detecting module is employed for performing feature
point detection on
the preprocessed frame sequence frame by frame.
[0119] As a preferred embodiment in the embodiments of the present invention,
the feature point
detecting module is further employed for:
[0120] employing a feature point detecting algorithm in which are fused FAST
features and
SURF features to perform feature point detection on the frame sequence frame
by frame,
and obtain feature points of each frame.
[0121] As a preferred embodiment in the embodiments of the present invention,
the vector
calculating module includes:
[0122] an optical flow tracking unit, for performing optical flow tracking
calculation on the
frame feature point sequence matrix of each frame, and obtaining an initial
motion vector
of each frame;
[0123] an accumulation calculating unit, for obtaining a corresponding
accumulative motion
vector according to the initial motion vector;
[0124] a smoothening processing unit, for smoothening the accumulative motion
vector, and
obtaining a smoothened motion vector; and
[0125] a vector readjusting unit, for employing the accumulative motion vector
and the
smoothened motion vector to readjust the initial motion vector of each frame,
and
obtaining the motion vector of each frame.
[0126] As a preferred embodiment in the embodiments of the present invention,
the feature value
extracting module includes:
[0127] a matrix converting unit, for merging and converting the motion vectors
of all frames into
a matrix;
[0128] a standard deviation calculating unit, for calculating unbiased
standard deviations of
various elements in the matrix; and
[0129] a weighting and fusing unit, for weighting and fusing the unbiased
standard deviations of
the various elements, and obtaining a weighted value.
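Purely as an illustration of how these modules might be composed in code (class and method names are hypothetical; the source describes functional modules, not a concrete implementation):

```python
class VideoJitterDetector:
    """Composes the modules of the device: framing, feature point detection,
    vector calculation, feature value extraction and jitter decision."""

    def __init__(self, framing, feature_detector, vector_calculator,
                 feature_extractor, jitter_model):
        self.framing = framing                      # framing processing module
        self.feature_detector = feature_detector    # feature point detecting module
        self.vector_calculator = vector_calculator  # vector calculating module
        self.feature_extractor = feature_extractor  # feature value extracting module
        self.jitter_model = jitter_model            # jitter detecting module (e.g. a trained SVM)

    def detect(self, video_path) -> int:
        frames = self.framing(video_path)
        keypoints = [self.feature_detector(frame) for frame in frames]
        motion_vectors = self.vector_calculator(frames, keypoints)
        feature = self.feature_extractor(motion_vectors)
        return int(self.jitter_model.predict([feature])[0])  # 1 = jitter, 0 = no jitter
```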
[0130] In summary, the technical solutions provided by the embodiments of the
present invention
bring about the following advantageous effects.
1. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, by basing on the optical flow tracking
algorithm to obtain a motion vector of each frame according to the frame
feature
point sequence matrix, the present invention effectively solves the problem of
failed tracking due to unduly large change between two adjacent frames,
exhibits
excellent tolerance and adaptability in detecting jitter in videos captured
under
the condition in which the lens slowly moves, and achieves excellent
sensitivity
and robustness when videos are detected as captured under circumstances of
abrupt large displacement, strong jitter, and excessive rotation of the lens.
2. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, a feature point detecting algorithm in
which are fused FAST features and SURF features is employed, that is to say,
the
feature point extracting algorithm is so optimized that not only the image
global
feature is considered, but the local features are also retained, moreover,
computational expense is small, robustness against image blurs and faint
illumination is strong, and real-time property and precision of the detection
are
further enhanced.
3. In the method of and device for detecting video jitter provided by the
embodiments of the present invention, features of at least four dimensions are
extracted from the to-be-detected video, and an SVM model is used as the
detection model, so that the method of detecting video jitter as provided by
the
embodiments of the present invention is more advantageous in terms of
generality,
and precision of detection is further enhanced.
[0131] Of course, implementation of any one solution according to the present
application does
not necessarily achieve all of the aforementioned advantages simultaneously.
As should
be noted, when the device for detecting video jitter provided by the
aforementioned
embodiment triggers a detecting business, it is merely exemplarily described
with its
division into the aforementioned various functional modules, whereas in actual
application it is possible to base on requirements to assign the
aforementioned functions
to different functional modules for completion, that is to say, the internal
structure of the
device is divided into different functional modules to complete the entire or
partial
functions as described above. In addition, the device for detecting video
jitter provided
by the aforementioned embodiment pertains to the same inventive conception as
the
method of detecting video jitter, in other words, the device is based on the
method of
detecting video jitter; see the method embodiment for its specific
implementation
process, while no repetition will be made in this context.
[0132] As comprehensible to persons ordinarily skilled in the art, the entire
or partial steps in the
aforementioned embodiments can be completed via hardware, or via a program
instructing relevant hardware, the program can be stored in a computer-
readable storage
medium, and the storage medium can be a read-only memory, a magnetic disk or
an
optical disk, etc.
[0133] The foregoing embodiments are merely preferred embodiments of the
present invention,
and they are not to be construed as restrictive to the present invention. Any
amendment,
equivalent substitution, and improvement made within the spirit and
principle of the
present invention shall all fall within the protection scope of the present
invention.
Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status


Event History

Description Date
Letter Sent 2024-01-02
Inactive: Grant downloaded 2024-01-02
Grant by Issuance 2024-01-02
Inactive: Cover page published 2024-01-01
Pre-grant 2023-11-16
Inactive: Final fee received 2023-11-16
Letter Sent 2023-10-31
Notice of Allowance is Issued 2023-10-31
Inactive: Q2 passed 2023-10-27
Inactive: Approved for allowance (AFA) 2023-10-27
Amendment Received - Voluntary Amendment 2023-10-06
Examiner's Interview 2023-10-03
Amendment Received - Response to Examiner's Requisition 2023-08-11
Amendment Received - Voluntary Amendment 2023-08-11
Examiner's Report 2023-04-12
Inactive: Report - No QC 2023-04-11
Letter sent 2023-03-09
Advanced Examination Determined Compliant - paragraph 84(1)(a) of the Patent Rules 2023-03-09
Inactive: IPC assigned 2022-10-12
Inactive: First IPC assigned 2022-10-12
Letter Sent 2022-10-11
Inactive: Advanced examination (SO) fee processed 2022-09-21
Amendment Received - Voluntary Amendment 2022-09-21
All Requirements for Examination Determined Compliant 2022-09-21
Inactive: IPC assigned 2022-09-21
Letter sent 2022-09-21
Early Laid Open Requested 2022-09-21
Inactive: Advanced examination (SO) 2022-09-21
Amendment Received - Voluntary Amendment 2022-09-21
Priority Claim Requirements Determined Compliant 2022-09-21
Request for Priority Received 2022-09-21
National Entry Requirements Determined Compliant 2022-09-21
Application Received - PCT 2022-09-21
Request for Examination Requirements Determined Compliant 2022-09-21
Amendment Received - Voluntary Amendment 2022-09-21
Application Published (Open to Public Inspection) 2020-12-24

Abandonment History

There is no abandonment history.

Maintenance Fee

The last payment was received on 2023-12-15

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type Anniversary Year Due Date Paid Date
MF (application, 2nd anniv.) - standard 02 2022-06-13 2022-09-21
Reinstatement (national entry) 2022-09-21 2022-09-21
Advanced Examination 2022-09-21 2022-09-21
MF (application, 3rd anniv.) - standard 03 2023-06-12 2022-09-21
Request for examination - standard 2024-06-11 2022-09-21
Basic national fee - standard 2022-09-21
Final fee - standard 2023-11-16
MF (application, 4th anniv.) - standard 04 2024-06-11 2023-12-15
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
10353744 CANADA LTD.
Past Owners on Record
CHONG MU
ERLONG LIU
WENZHE GUO
XUYANG ZHOU
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description | Date (yyyy-mm-dd) | Number of pages | Size of Image (KB)
Description 2023-12-31 23 840
Abstract 2023-12-31 1 28
Drawings 2023-12-31 3 43
Claims 2023-08-10 60 3,079
Claims 2023-10-05 60 3,067
Representative drawing 2023-12-12 1 12
Cover Page 2023-12-12 1 54
Description 2022-09-20 23 841
Claims 2022-09-20 4 137
Drawings 2022-09-20 3 43
Abstract 2022-09-20 1 28
Cover Page 2023-01-15 1 54
Representative drawing 2023-01-15 1 12
Claims 2022-09-21 60 1,989
Courtesy - Acknowledgement of Request for Examination 2022-10-10 1 422
Commissioner's Notice - Application Found Allowable 2023-10-30 1 578
Amendment / response to report 2023-08-10 127 4,796
Interview Record 2023-10-02 1 20
Amendment / response to report 2023-10-05 125 4,592
Final fee 2023-11-15 3 61
Electronic Grant Certificate 2024-01-01 1 2,527
Voluntary amendment 2022-09-20 62 2,023
Correspondence 2022-09-20 7 255
National entry request 2022-09-20 2 58
Declaration of entitlement 2022-09-20 1 34
International Preliminary Report on Patentability 2022-09-20 6 211
Patent cooperation treaty (PCT) 2022-09-20 2 107
International search report 2022-09-20 2 76
Patent cooperation treaty (PCT) 2022-09-20 1 57
National entry request 2022-09-20 10 234
Courtesy - Letter Acknowledging PCT National Phase Entry 2022-09-20 2 48
Courtesy - Advanced Examination Request - Compliant (SO) 2023-03-08 1 176
Request for examination / Advanced examination (SO) 2022-09-20 2 58
Examiner requisition 2023-04-11 5 242