Patent Summary 3136990

(12) Patent Application:	(11) CA 3136990
(54) French Title:	METHODE DE DETECTION DE POINT PRINCIPAL D'UN CORPS HUMAIN, APPAREIL, DISPOSITIF INFORMATIQUE ET SUPPORT DE STOCKAGE
(54) English Title:	A HUMAN BODY KEY POINT DETECTION METHOD, APPARATUS, COMPUTER DEVICE AND STORAGE MEDIUM
Status:	Examination

Bibliographic Data

(51) International Patent Classification (IPC):	G06V 40/10 (2022.01) G06V 20/00 (2022.01)
(72) Inventors:	XU, ZHAOKUN (China) LI, YUNXI (China) JI, HUAIYUAN (China)
(73) Owners:	10353744 CANADA LTD.
(71) Applicants:	10353744 CANADA LTD. (Canada)
(74) Agent:	HINTON, JAMES W.
(74) Associate agent:
(45) Issued:
(22) Filed:	2021-10-29
(41) Open to Public Inspection:	2022-04-29
Examination requested:	2022-09-16
Availability of licence:	N/A
Dedicated to the Public:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	No

(30) Application Priority Data:

Application No.	Country/Territory	Date
202011183648.9	(China)	2020-10-29

Abstracts

English Abstract

The present invention discloses a human body key point detection method, apparatus, computer
device, and storage medium. The method comprises: detecting whether a human body is present in
the current scenario; if not, collecting a first depth image; if so, collecting a second depth
image and storing a human body detection frame; comparing the collected first depth image and
second depth image by frame difference to obtain a depth changing area map; obtaining a human
body mask code and, from it, a human body area image; obtaining a single body area image,
inputting it into a key point detection model, and outputting plural body key points; and
obtaining the confidence level of each body key point to determine the final body key points.
The present invention is a multi-resolution key point detection method based on a human body
mask code. It improves key point detection accuracy, resolves false detections on the
background, and reduces detection errors in self-service shops such as attributing an action to
the wrong person, wrong orders, and missed orders, thereby reducing economic losses and
improving user experience.

Claims

Note: Claims are shown in the official language in which they were submitted.

Claims:
1. A method for detecting key points of a human body, comprising:
detecting whether there is a human body in the current scenario; if no human body is detected,
collecting a first depth image in the current scenario and overwriting the first depth image of
the previous scenario in which no human body was detected; if a human body is detected,
collecting a second depth image in the current scenario and storing a human body detection
frame corresponding to the second depth image;
comparing frame difference between the collected first depth image and second
depth image,
obtaining a depth changing area map;
obtaining a human body mask code, obtaining a human body area image according
to the human
body mask code;
obtaining a single body area image by using the human body detection frame,
inputting a key point
detection model, outputting plural human body key points; and
obtaining confidence level of each key point of the human body and obtaining
final human body
key point according to the confidence level.
2. The method according to claim 1, wherein comparing frame difference between
the collected first depth
image and second depth image, comprising:
pre-setting a difference threshold, the difference threshold is determined by
scenario and depth
camera; and
subtracting the pixels of the first depth image and the second depth image one by one; if the
pixel difference is greater than the threshold, recording the difference; if the pixel
difference is not greater than the threshold, recording zero.
3. The method according to claim 1, wherein obtaining a human body mask code,
comprising:
pre-setting a limited threshold of human body connected domain, determining
size relationship
between each non-zero connected domain in the depth changing area map and the
limited threshold;
if the non-zero connected domain is greater than the limited threshold, determining that the
non-zero connected domain is a human body connected domain and calculating the centroid of the
human body connected domain;
if the non-zero connected domain is not greater than the limited threshold, determining that
the non-zero connected domain is not a human body connected domain and discarding this domain;
performing area growth by using each centroid as reference point to obtain
first human body area
mask code;
performing human body instance segmentation on the second depth image, combining the human body
mask code of each segmented instance to obtain second human body area mask code;
and
merging the first human body area mask code and the second human body area
mask code to obtain
the human body mask code.
4. The method according to claim 3, wherein obtaining the human body area
image through the human
body mask code, comprising:
intercepting the human body area in the second depth image by using the human
body mask code,
filtering out background of the image and obtaining the human body area image;
and
pre-processing the human body area image.
5. The method according to claim 4, wherein obtaining single human body area
image by using the human
body detection frame, comprising:
keeping center point of the human body detection frame unchanged, expanding
the range of the
human body detection frame in an equal proportion; and
intercepting the pre-processed human body area image according to the expanded
human body
detection frame, obtaining single human body area image, scaling the single
human body area
image to predetermined size.
6. The method according to claim 5, wherein inputting a key point detection
model, outputting plural human
body key points, comprising:
inputting the single human body area image into the key point detection model; and
the key point detection model outputs plural human body key point heat maps,
and each heat map
of the human body key point corresponds to a key point of the human body.
7. The method according to claim 6, wherein obtaining the confidence level of
each key point of the human
body and obtaining the final key point of the human body according to the
confidence level, comprising:
searching for the peak position of each human body key point heat map, determining that the
peak position is the detection position of the human body key point according to the human body
key point heat map, and determining that the peak value is the confidence level of the human
body key point;
pre-setting a threshold of confidence level, and determining the relationship
between the confidence
level of each human body key point and the threshold of confidence level;
if the confidence level of the human body key point is greater than the
threshold of confidence level,
outputting the coordinates and the confidence level of the human body key
point; and
if the confidence level of the human body key point is not greater than the
threshold of confidence
level, discarding the human body key point; and
obtaining the final human body key point.
8. The method according to claim 6, wherein inputting a key point detection
model, outputting plural human
body key points, comprising:
inputting the single human body area image into the key point detection model;
down-sampling the inputted single human body area image, obtaining a first
characteristic map;
linear interpolating and down-sampling the first characteristic map respectively, obtaining a
second characteristic map and a third characteristic map with different resolutions, opening
the corresponding first resolution branch, second resolution branch, and third resolution
branch, and processing each branch through the residual block;
subjecting the first resolution branch, the second resolution branch, and the third resolution
branch to a first multi-resolution cross merge, wherein each branch adds its own characteristic
map to the characteristic maps corresponding to all other branches, and after completing the
merge, each branch is processed by the residual block again;
after amplifying the characteristic maps corresponding to the first resolution
branch and the second
resolution branch by linear interpolation, performing the second multi-
resolution cross merge with
the characteristic map of the third resolution branch; and
outputting plural human body key point heat maps according to the obtained
characteristic map
after the second multi-resolution cross merge, each human body key point heat
map corresponds to
a human body key point.
9. A human body key point detection apparatus, comprising:
a detection module configured to detect whether there is a human body in the
current scenario;
a first collection module configured to, if no human body is detected, collect a first depth
image in the current scenario and overwrite the first depth image of the previous scenario in
which no human body was detected;
a second collection module configured to, if a human body is detected, collect a second depth
image in the current scenario and store a human body detection frame corresponding to the
second depth image;
a comparison module configured to compare frame difference between the
collected first depth
image and second depth image, obtaining a depth changing area map;

an interception module configured to obtain a human body mask code and obtain
a human body
area image according to the human body mask code;
a key point acquisition module configured to obtain a single body area image
by using the human
body detection frame, inputting a key point detection model, outputting plural
human body key
points; and
a judgement module configured to obtain confidence level of each key point of
the human body
and obtaining final human body key point according to the confidence level.
10. A computer readable storage medium storing a computer program, wherein the steps of the
method of any one of claims 1 to 7 are achieved when a processor executes the computer program.

Description

Note: Descriptions are shown in the official language in which they were submitted.

A HUMAN BODY KEY POINT DETECTION METHOD, APPARATUS, COMPUTER DEVICE AND
STORAGE MEDIUM
Field
[0001] The present disclosure relates to the field of computer vision
technology, particularly to a human
body key point detection method, apparatus, computer device, and storage
medium.
Background
[0002] With the continuous development of the computer vision field, human body key point
detection algorithms have attracted more and more attention. Human body key point detection is
very important for human action recognition, human body posture description, and even human
body behavior prediction. It is the cornerstone of many computer vision tasks, such as
human-to-goods interaction in self-service shops, abnormal human action detection in various
monitoring scenarios, and motion capture in movie production.
[0003] At present, human body key point detection methods based on deep learning architectures
have achieved remarkable results in the case of a single person with a simple background. They
fall mainly into two types, bottom-up and top-down. However, neither type of method currently
achieves the desired effect when the background is complex and many people are close together.
There are two main categories of problems:
[0004] 1. In actual scenarios with complex backgrounds, current algorithms are likely to
mis-detect human body key points on the background.
[0005] 2. When many people are close together, key points of the target human body are likely
to be mistakenly detected on the limbs of a nearby human body. At the same time, key points of
the other human bodies are likely to be mistakenly detected on the limbs of the target body.
Date recue / Date received 2021-10-29
Summary of the Invention
[0006] In order to solve the above-mentioned problems, the present invention provides a method
for detecting human body key points, and the method includes:
[0007] Detecting whether there is a human body in the current scenario; if no human body is
detected, collecting a first depth image in the current scenario and overwriting the first
depth image of the previous scenario in which no human body was detected; if a human body is
detected, collecting a second depth image in the current scenario and storing a human body
detection frame corresponding to the second depth image;
[0008] Comparing frame difference between the collected first depth image and
second depth image,
obtaining a depth changing area map;
[0009] Obtaining a human body mask code, obtaining a human body area image
according to the human
body mask code;
[0010] Obtaining a single body area image by using the human body detection
frame, inputting a key point
detection model, outputting plural human body key points;
[0011] Obtaining confidence level of each key point of the human body and
obtaining final human body
key point according to the confidence level.
[0012] In an implementation, comparing frame difference between the collected first depth image
and second depth image comprises:
[0013] Pre-setting a difference threshold, the difference threshold being determined by the
scenario and the depth camera;
[0014] Subtracting the pixels of the first depth image and the second depth image one by one;
if the pixel difference is greater than the threshold, recording the difference; if the pixel
difference is not greater than the threshold, recording zero.
[0015] In an implementation, obtaining a human body mask code, comprising:
[0016] Pre-setting a limited threshold of human body connected domain,
determining size relationship
between each non-zero connected domain in the depth changing area map and the
limited threshold;
[0017] If the non-zero connected domain is greater than the limited threshold, determining that
the non-zero connected domain is a human body connected domain and calculating the centroid of
the human body connected domain;
[0018] If the non-zero connected domain is not greater than the limited threshold, determining
that the non-zero connected domain is not a human body connected domain and discarding this
domain;
[0019] Performing area growth by using each centroid as reference point to
obtain first human body area
mask code;
[0020] Performing human body instance segmentation on the second depth image, combining the
human body mask code of each segmented instance to obtain second human body area mask code;
[0021] Merging the first human body area mask code and the second human body
area mask code to
obtain the human body mask code.
[0022] In an implementation, obtaining the human body area image through the
human body mask code,
comprising:
[0023] Intercepting the human body area in the second depth image by using the
human body mask code,
filtering out background of the image and obtaining the human body area image;
[0024] Pre-processing the human body area image.
[0025] In an implementation, obtaining single human body area image by using
the human body detection
frame, comprising:
[0026] Keeping center point of the human body detection frame unchanged,
expanding the range of the
human body detection frame in an equal proportion;
[0027] Intercepting the pre-processed human body area image according to the
expanded human body
detection frame, obtaining single human body area image, scaling the single
human body area image to
predetermined size.
[0028] In an implementation, inputting a key point detection model, outputting
plural human body key
points, comprising:
[0029] The single human body area image is input into the key point detection model;
[0030] The key point detection model outputs plural human body key point heat
maps, and each heat map
of the human body key point corresponds to a key point of the human body.
[0031] In an implementation, obtaining the confidence level of each key point
of the human body and
obtaining the final key point of the human body according to the confidence
level, comprising:
[0032] Searching for the peak position of each human body key point heat map, determining that
the peak position is the detection position of the human body key point according to the human
body key point heat map, and determining that the peak value is the confidence level of the
human body key point;
[0033] Pre-setting a threshold of confidence level, and determining the
relationship between the
confidence level of each human body key point and the threshold of confidence
level;
[0034] If the confidence level of the human body key point is greater than the
threshold of confidence
level, outputting the coordinates and the confidence level of the human body
key point;
[0035] If the confidence level of the human body key point is not greater than
the threshold of confidence
level, discarding the human body key point;
[0036] Obtaining the final human body key point.
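The peak search and confidence filtering in [0032] to [0036] can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `decode_keypoints`, the NumPy array layout (one heat map per key point), and the threshold value 0.5 are assumptions that do not appear in the source.

```python
import numpy as np

def decode_keypoints(heatmaps, conf_threshold=0.5):
    """Return {index: (x, y, confidence)} for heat maps whose peak exceeds the threshold.

    heatmaps: array of shape (K, H, W), one heat map per key point (assumed layout).
    conf_threshold: hypothetical value; the source only says it is pre-set.
    """
    keypoints = {}
    for k, hm in enumerate(heatmaps):
        # Peak position is taken as the detected location, peak value as the confidence.
        row, col = np.unravel_index(np.argmax(hm), hm.shape)
        conf = float(hm[row, col])
        if conf > conf_threshold:
            keypoints[k] = (int(col), int(row), conf)
        # Otherwise the key point is discarded.
    return keypoints

heatmaps = np.zeros((2, 4, 4), dtype=np.float32)
heatmaps[0, 1, 2] = 0.9   # strong peak at x=2, y=1: kept
heatmaps[1, 3, 0] = 0.2   # weak peak: discarded
result = decode_keypoints(heatmaps)
```

Only key point 0 survives in this toy input; key point 1 falls below the assumed threshold and is dropped.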
[0037] In an implementation, inputting a key point detection model, outputting
plural human body key
points, also comprising:
[0038] Inputting the single human body area image into the key point detection
model;
[0039] Down-sampling the inputted single human body area image, obtaining a
first characteristic map;
[0040] Linear interpolating and down-sampling the first characteristic map respectively,
obtaining a second characteristic map and a third characteristic map with different
resolutions, opening the corresponding first resolution branch, second resolution branch, and
third resolution branch, and processing each branch through the residual block;
[0041] The first resolution branch, the second resolution branch, and the
third resolution branch are
respectively subjected to first multi-resolution cross merge, wherein each
branch requires to add the
correspondingly characteristic map and the characteristic maps corresponding
to all other branches, and
after completing the merge, each branch is processed by the residual block
again;
[0042] After amplifying the characteristic maps corresponding to the first
resolution branch and the
second resolution branch by linear interpolation, performing the second multi-
resolution cross merge with
the characteristic map of the third resolution branch;
[0043] Outputting plural human body key point heat maps according to the
obtained characteristic map
after the second multi-resolution cross merge, each human body key point heat
map corresponds to a human
body key point.
[0044] A human body key point detection apparatus, comprising:
[0045] A detection module configured to detect whether there is a human body
in the current scenario;
[0046] A first collection module configured to, if no human body is detected, collect a first
depth image in the current scenario and overwrite the first depth image of the previous
scenario in which no human body was detected;
[0047] A second collection module configured to, if a human body is detected, collect a second
depth image in the current scenario and store a human body detection frame corresponding to the
second depth image;
[0048] A comparison module configured to compare frame difference between the
collected first depth
image and second depth image, obtaining a depth changing area map;
[0049] An interception module configured to obtain a human body mask code and
obtain a human body
area image according to the human body mask code;
[0050] A key point acquisition module configured to obtain a single body area
image by using the human
body detection frame, inputting a key point detection model, outputting plural
human body key points;
[0051] A judgement module configured to obtain confidence level of each key
point of the human body
and obtaining final human body key point according to the confidence level.
[0052] A computer device, including a memory, a processor, and a computer program stored in the
memory and run on the processor, wherein the steps of any of the above-mentioned methods are
achieved when the processor executes the computer program.
[0053] A computer readable storage medium stored with a computer program
configured to achieve the
steps of any above-mentioned methods when the processor executes the computer
program.
[0054] The human body key point detection method, device and computer storage
medium of the present
invention provide a multi-resolution human body key point detection method
based on a human body mask,
which has the following effects:
[0055] 1. The present invention combines two body mask extraction methods, one based on area
growth over the depth dynamic frame difference and one based on Yolact instance segmentation,
to obtain complete body mask information, and uses the mask to filter out the complex
background interference in the corresponding image. This solves the problem of human body key
points being incorrectly detected on a complex background and greatly improves the accuracy of
the algorithm.
[0056] 2. The present invention uses a structure of multiple resolutions and multiple parallel
network branches, so that the key point detection model takes into account both local
fine-grained information and the global information of the whole person. It can accurately
detect local fine-grained key points while attending to the connections between the key points
of the whole person, thus solving the problem of key points being mis-detected on nearby people
and improving the accuracy of the key points.
[0057] 3. The multi-resolution network branches in the human body key point detection network
structure of the present invention cross and merge with each other only twice, in the middle
and at the end of the network respectively, so that each branch network can fully analyze the
image before cross merging. This avoids the interference from other branches caused by frequent
multi-scale merging. While reducing the computation required by the network model and speeding
up operation, it also reduces the interference between the resolution branches and improves the
accuracy of the algorithm.
[0058] 4. The present invention uses an RGBD depth camera to obtain RGB color images and depth
images of the scenario. Depth information is added to the RGB color and texture information as
an additional image channel, which expands the range of information that the human body key
point detection algorithm can perceive and thereby increases the accuracy of the algorithm.
Brief Description of the Drawings
[0059] Figure 1 is a flowchart of the steps of a human body key point detection method in an
implementation;
[0060] Figure 2 is a process diagram of a human body key point detection method in an
implementation;
[0061] Figure 3 is a diagram of the detection model of a human body key point detection method
in an implementation;
[0062] Figure 4 is a structural diagram of a human body key point detection apparatus in an
implementation;
[0063] Figure 5 is a diagram of the internal structure of a computer device in an
implementation.
Detailed Description
[0064] In order to make the purposes, technical solutions, and advantages of the application
clearer, the present disclosure is further explained in detail below with reference to a
particular embodiment and the drawings. It shall be appreciated that these descriptions are
illustrative only and do not limit the scope of the disclosure.
[0065] In one implementation, as shown in Figure 1 and Figure 2, the human body key point
detection method provided by the present application includes the following steps:
[0066] S100: detecting whether there is a human body in the current scenario; if no human body
is detected, collecting a first depth image in the current scenario and overwriting the first
depth image of the previous scenario in which no human body was detected; if a human body is
detected, collecting a second depth image in the current scenario and storing a human body
detection frame corresponding to the second depth image.
[0067] In the present implementation, a front-end human body detection algorithm is first used
to detect human bodies in the current scenario and determine whether a human body is present.
[0068] Then, one of the following two branches is carried out according to the detection
result.
[0069] If no human body is detected in the current scenario, the current scenario is regarded
as a no-human-body condition: the first depth image of the current scenario is collected and
saved to file, overwriting the first depth image of the previous scenario in which no human
body was detected.
[0070] If a human body is detected in the current scenario, the current scenario is regarded as
a normal human body condition: the second depth image of the current scenario is collected and
saved to file, and the related information of the human body detection frame corresponding to
the second depth image is stored. Preferably, the second depth image is an RGBD image.
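The two branches of step S100 amount to a small piece of state that is updated on every frame. The sketch below is one possible reading; the class name `DepthFrameStore`, its attributes, and the box format are hypothetical, and the human body detector itself is assumed to exist elsewhere and to return a (possibly empty) list of detection frames.

```python
class DepthFrameStore:
    """Keeps the latest no-person depth image and the latest with-person depth image."""

    def __init__(self):
        self.background = None   # "first depth image": most recent frame with no human body
        self.current = None      # "second depth image": most recent frame with a human body
        self.boxes = []          # detection frames stored alongside the second depth image

    def update(self, depth_image, boxes):
        """Route one frame; returns True when a human body was detected."""
        if not boxes:
            # No human body: overwrite the previously stored background image.
            self.background = depth_image
            return False
        # Human body present: keep the frame and its detection frames.
        self.current = depth_image
        self.boxes = list(boxes)
        return True

store = DepthFrameStore()
store.update("empty_frame_1", [])
store.update("empty_frame_2", [])            # overwrites empty_frame_1
detected = store.update("person_frame", [(10, 20, 110, 220)])
```

Strings stand in for depth images here purely to keep the example short.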
[0071] S200: comparing frame difference between the collected first depth
image and second depth image,
obtaining a depth changing area map.
[0072] In the present implementation, the first depth image without a human body and the second
depth image with a human body, both collected in step S100, are compared by frame difference to
obtain a depth changing area map.
[0073] Specifically, for frame difference processing, the difference threshold is first pre-set
according to the characteristics of the scenario and the depth camera. The pixels of the first
depth image and the second depth image are then subtracted one by one: if the pixel difference
is greater than the threshold, the difference is recorded; if the pixel difference is not
greater than the threshold, zero is recorded.
[0074] S300: obtaining a human body mask code, obtaining a human body area
image according to the
human body mask code.
[0075] In the present implementation, the human body mask code is obtained by merging, and the
human body area image is obtained through the human body mask code.
[0076] Specifically, obtaining the human body mask code includes:
[0077] Pre-setting the limited threshold of the human body connected domain, calculating the
size of each non-zero connected domain in the depth changing area map obtained in step S200,
and comparing the size of each non-zero connected domain with the limited threshold.
[0078] If the non-zero connected domain is greater than the limited threshold, determining that
the non-zero connected domain is a human body connected domain and calculating the centroid of
the human body connected domain as follows:
$$x_c = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad y_c = \frac{1}{N}\sum_{i=1}^{N} y_i$$
[0079] Here, $(x_c, y_c)$ are the centroid coordinates of the human body connected domain, $N$
is the number of pixels contained in the human body connected domain, and $(x_i, y_i)$ are the
coordinates of the $i$-th pixel in the connected domain.
[0080] If the non-zero connected domain is not greater than the limited threshold, determining
that the non-zero connected domain is not a human body connected domain and discarding this
domain.
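The connected-domain filtering and centroid calculation of [0077] to [0080] can be sketched as follows. This is an illustrative reading only: the function name `body_centroids`, the use of 4-connectivity, and the measurement of domain size as a pixel count are assumptions not fixed by the source.

```python
from collections import deque
import numpy as np

def body_centroids(change_map, min_pixels):
    """Return centroids (x_c, y_c) of non-zero 4-connected regions larger than min_pixels."""
    h, w = change_map.shape
    seen = np.zeros((h, w), dtype=bool)
    centroids = []
    for r in range(h):
        for c in range(w):
            if change_map[r, c] == 0 or seen[r, c]:
                continue
            # Breadth-first search collects one connected domain.
            queue, pixels = deque([(r, c)]), []
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and change_map[ny, nx] != 0 and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(pixels) > min_pixels:
                # Centroid = mean of the pixel coordinates in the domain.
                ys, xs = zip(*pixels)
                centroids.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            # Smaller domains are treated as noise and discarded.
    return centroids

change_map = np.zeros((5, 5), dtype=np.int32)
change_map[0:2, 0:2] = 100   # 4-pixel domain: kept
change_map[4, 4] = 100       # 1-pixel domain: discarded
centroids = body_centroids(change_map, min_pixels=2)
```

The pure-Python search keeps the example dependency-free beyond NumPy; a production system would more likely use a library labelling routine.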
[0081] Performing area growth by using each centroid as reference point to
obtain first human body area
mask code.
[0082] Preferably, the area growth algorithm is as follows. First, a growth threshold is set
according to the accuracy and characteristics of the depth camera actually used. Then, for each
of the four nearest neighbours of the reference point (up, down, left, and right), it is
determined whether the difference between its depth value and that of the reference point is
less than the growth threshold. If the difference is less than the growth threshold, the pixel
is included in the growth area where the reference point is located; if it is not less than the
growth threshold, the pixel is not included in that growth area. Each newly added pixel of the
growth area is then set as a reference point, and the above steps are repeated until no pixel
in the growth area has a neighbouring pixel that can grow.
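A minimal sketch of the area growth described in [0082] is given below. The function name `region_grow`, the queue-based traversal order, and the sample values are assumptions. The seed is taken as an integer pixel position; rounding a fractional centroid to the nearest pixel is a further assumption the source does not address.

```python
from collections import deque
import numpy as np

def region_grow(depth, seed, growth_threshold):
    """Grow a boolean mask from seed=(row, col) over 4-neighbours with similar depth."""
    h, w = depth.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()   # current reference point
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                # Compare the neighbour's depth with the reference point's depth.
                if abs(int(depth[ny, nx]) - int(depth[y, x])) < growth_threshold:
                    mask[ny, nx] = True
                    queue.append((ny, nx))   # the new pixel becomes a reference point
    return mask

depth = np.array([[1500, 1510, 3000],
                  [1505, 1520, 3000],
                  [3000, 3000, 3000]], dtype=np.uint16)
mask = region_grow(depth, seed=(0, 0), growth_threshold=30)
```

Because each comparison is made against the current reference point rather than the original seed, the region can follow gradual depth changes across a body while still stopping at the sharp step to the floor.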
[0083] Using the Yolact deep learning human body instance segmentation network, performing
human body instance segmentation on the second depth image, extracting the body mask code of
each instance, and then merging the body mask codes of all instances to obtain the second human
body area mask code. This second human body area mask code includes the human body mask codes
of all human bodies in the second depth image.
[0084] Performing binarization on the first human body area mask code: non-zero pixels are set
to 1, and zero-valued pixels remain 0. Then, the binarized first human body area mask code and
the second human body area mask code are merged to obtain a complete human body mask code.
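The binarization and merge of [0084] might look like the sketch below. The source says only that the two mask codes are "merged"; interpreting that as a pixel-wise union is an assumption, as are the function name `merge_masks` and the sample arrays.

```python
import numpy as np

def merge_masks(first_mask, second_mask):
    """Binarize both masks (non-zero -> 1) and take their pixel-wise union."""
    first_bin = (first_mask != 0).astype(np.uint8)
    second_bin = (second_mask != 0).astype(np.uint8)
    return np.maximum(first_bin, second_bin)   # union, assumed meaning of "merged"

first = np.array([[0, 1480], [0, 0]])   # area-growth mask, may hold raw non-zero values
second = np.array([[0, 1], [1, 0]])     # instance-segmentation mask, already 0/1
merged = merge_masks(first, second)
```

A union lets each method fill in body pixels the other one missed, which matches the stated goal of a "complete" mask code.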
[0085] Furthermore, obtaining the human body area image through the human body
mask code includes:
multiplying the human body mask code obtained by the above-mentioned method
with the second depth
image, intercepting the human body area in the second depth image, and
filtering out the background part
in the image to obtain the human body area image, and pre-processing the human
body area image.
[0086] Specifically, the pre-processing includes numerically truncating the depth channel of
the human body area image to an appropriate range according to the scenario characteristics,
and then uniformly normalizing the human body area image. In an implementation, in the overhead
shooting scenario of a self-service shop, because the maximum possible shooting depth of the
overhead camera is 3500, the depth channel value is preferably truncated to the range
[0, 3500], and any value greater than 3500 is regarded as noise. In implementations for
different scenarios, the depth channel truncation range can be adjusted according to the actual
situation of the scenario.
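Masking, depth truncation, and normalization from [0085] and [0086] can be sketched together as follows. Several details are assumptions rather than statements from the source: the H x W x 4 channel order (R, G, B, depth), 8-bit colour scaled by 255, clipping out-of-range depth to the maximum instead of zeroing it, and scaling every channel to [0, 1] as the meaning of "uniformly normalizing". The function name is hypothetical.

```python
import numpy as np

def preprocess_body_image(rgbd, body_mask, max_depth=3500.0):
    """rgbd: H x W x 4 array (R, G, B, depth); body_mask: H x W array of 0/1."""
    # Multiplying by the mask zeroes out every background pixel in all channels.
    out = rgbd.astype(np.float32) * body_mask[..., None].astype(np.float32)
    # Truncate the depth channel to [0, max_depth].
    out[..., 3] = np.clip(out[..., 3], 0.0, max_depth)
    # Scale colour (assumed 8-bit) and depth to the range [0, 1].
    out[..., :3] /= 255.0
    out[..., 3] /= max_depth
    return out

rgbd = np.zeros((1, 3, 4), dtype=np.float32)
rgbd[0, 0] = [255, 0, 0, 1750]      # body pixel at half the maximum depth
rgbd[0, 1] = [255, 255, 255, 2000]  # background pixel, removed by the mask
rgbd[0, 2] = [0, 0, 0, 4000]        # body pixel with out-of-range depth
body_mask = np.array([[1, 0, 1]])
body_image = preprocess_body_image(rgbd, body_mask)
```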
[0087] S400, obtaining a single body area image by using the human body
detection frame, inputting a
key point detection model, outputting plural human body key points.
[0088] In the present implementation, the human body detection frame corresponding to the
second depth image stored in step S100 is used to process the human body area image obtained in
step S300 to obtain a single human body area image. Preferably, the single human body area
image is an RGBD four-channel image. The obtained single human body area image is input into
the key point detection model, which outputs plural human body key points.
[0089] Specifically, using the stored human body detection frame corresponding to the current
second depth image, the center point of the human body detection frame is first kept unchanged
while the range of the frame is expanded to 1.25 times the original detection range. Then,
according to the expanded range of the human body detection frame, a single human body area
image is cut out from the pre-processed human body area image and scaled to a pre-determined
size, preferably a resolution of 256*256.
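The frame expansion in [0089] reduces to simple arithmetic on the box corners. In the sketch below, the box format `(x_min, y_min, x_max, y_max)`, the reading of "1.25 times" as scaling width and height each by 1.25, and the optional clamping to the image bounds are all assumptions; the crop and resize to 256*256 are left out.

```python
def expand_box(box, scale=1.25, image_size=None):
    """Scale a box about its centre; optionally clamp it to (width, height)."""
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0   # centre stays fixed
    half_w = (x_max - x_min) * scale / 2.0
    half_h = (y_max - y_min) * scale / 2.0
    new_box = [cx - half_w, cy - half_h, cx + half_w, cy + half_h]
    if image_size is not None:
        # Clamping is an added safeguard, not something the source describes.
        width, height = image_size
        new_box = [max(0.0, new_box[0]), max(0.0, new_box[1]),
                   min(float(width), new_box[2]), min(float(height), new_box[3])]
    return tuple(new_box)

expanded = expand_box((100, 100, 200, 300))
clamped = expand_box((0, 0, 100, 100), image_size=(110, 110))
```

Expanding the frame before cropping leaves a margin around the person, so limbs that extend slightly past the detector's box are less likely to be cut off.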
[0090] The scaled single human body area image is input into a pre-trained multi-resolution key point detection model, and the model outputs plural human body key point heat maps, each of which corresponds to one human body key point. Preferably, the model outputs 10 human body key point heat maps to obtain 10 human body key points. The 10 human body key point heat maps correspond, in sequence, to the left ear key point, right ear key point, left shoulder key point, right shoulder key point, left elbow key point, right elbow key point, left wrist key point, right wrist key point, left crotch key point and right crotch key point of the human body.
[0091] In one of the implementations, the network structure of the multi-resolution key point detection model of the present invention uses the residual block and the Bottleneck residual block of the ResNet network as the basic building blocks. The network processes its input through the following steps, as shown in Figure 3:
[0092] Receiving the scaled single human body area image of size 256*256*4.
[0093] Using two consecutive 3*3 convolution layers, each with a stride of 2, to down-sample the input single human body area image to 64*64.
[0094] Inputting the 64*64 characteristic map into the Bottleneck residual block, a basic unit of the ResNet network.
[0095] Performing linear interpolation and down-sampling, respectively, on the characteristic map output by the Bottleneck block to obtain two characteristic maps with resolutions of 128*128 and 32*32, thereby opening two further network branches with different resolutions.
[0096] The three branches of different resolutions (32*32, 64*64, 128*128) are each processed by 4 standard ResNet residual blocks, during which the resolution of each branch is kept unchanged.
[0097] For the first multi-resolution cross merge, the three branches of different resolutions are completely merged: each branch adds the characteristic maps of all other branches to its own characteristic map. Before the addition, the characteristic maps of the different branches need to be scaled to the same size.
[0098] Specifically, for example, on the 128*128 resolution branch, the branch's own characteristic map has a resolution of 128*128, and the 64*64 and 32*32 characteristic maps are enlarged to 128*128 through 2x and 4x linear interpolation, respectively, and then added to the original 128*128 characteristic map. Conversely, when a characteristic map needs to be reduced, 3*3 convolutions are used for down-sampling: one 3*3 down-sampling convolution reduces the map to one-half, and two 3*3 down-sampling convolutions reduce it to one-quarter.
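To make the scaling rule for the cross merge concrete, the following sketch lists which resampling operation each branch would apply to the other branches' characteristic maps before the addition. It only plans the operations and performs no image processing. The function names are hypothetical, and the stride of 2 for each 3*3 down-sampling convolution is an assumption inferred from the "one-half" wording.

```python
from typing import Dict, List

# Branch resolutions named in the text.
BRANCHES = [128, 64, 32]


def resample_op(source: int, target: int) -> str:
    """Describe how a source-resolution map is brought to the target resolution."""
    if source == target:
        return "identity"
    if source < target:
        # Lower-resolution maps are enlarged by linear interpolation.
        return f"linear interpolation x{target // source}"
    # Higher-resolution maps are reduced; each assumed stride-2 3*3
    # convolution halves the resolution.
    steps, size = 0, source
    while size > target:
        size //= 2
        steps += 1
    return f"{steps} stride-2 3*3 convolution(s)"


def merge_plan(branches: List[int]) -> Dict[int, Dict[int, str]]:
    """For every target branch, map each source branch to its resampling op."""
    return {t: {s: resample_op(s, t) for s in branches} for t in branches}


plan = merge_plan(BRANCHES)
for target, sources in plan.items():
    for source, op in sources.items():
        print(f"{source:>3} -> {target:>3}: {op}")
```

For the 128*128 branch this reproduces the example in the text: the 64*64 map is enlarged 2x and the 32*32 map 4x before being added.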
[0099] After the cross merge is completed, each branch again passes through 4 standard ResNet residual blocks.
[0100] The 64*64 and 32*32 resolution branches are enlarged to 128*128 by 2x and 4x linear interpolation, respectively, and then added to the 128*128 branch characteristic map to achieve the second multi-resolution cross merge.
[0101] After the merge, the 128*128 characteristic map is subjected to a 1*1*10 convolution operation to obtain the heat map output corresponding to the 10 key points, and finally it is enlarged to the input size of 256*256 through linear interpolation.
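The spatial sizes quoted in the steps above can be checked with a short shape trace. This is a sketch under stated assumptions: the 3*3 convolutions are taken to use a padding of 1 (the text gives only kernel size and stride), intermediate channel counts are not given in the text and are therefore left out, and the names `conv_out` and `trace_shapes` are hypothetical.

```python
from typing import List, Tuple


def conv_out(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Standard convolution output-size formula; padding=1 is an assumption."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size: int = 256, in_channels: int = 4,
                 num_keypoints: int = 10) -> List[Tuple[str, Tuple[int, ...]]]:
    """List the spatial sizes at each stage described in the text."""
    stages: List[Tuple[str, Tuple[int, ...]]] = []
    stages.append(("input (H, W, C)", (input_size, input_size, in_channels)))
    stem = conv_out(conv_out(input_size))  # two stride-2 3*3 convolutions
    stages.append(("stem output (H, W)", (stem, stem)))
    high, mid, low = stem * 2, stem, stem // 2
    stages.append(("branch resolutions after split", (high, mid, low)))
    stages.append(("branch resolutions after first cross merge", (high, mid, low)))
    stages.append(("second merge, high-resolution branch (H, W)", (high, high)))
    stages.append(("heat maps after 1*1 convolution (H, W, K)", (high, high, num_keypoints)))
    stages.append(("heat maps after final interpolation (H, W, K)",
                   (input_size, input_size, num_keypoints)))
    return stages


stages = trace_shapes()
for name, shape in stages:
    print(f"{name}: {shape}")
```

With the defaults, the trace yields 64*64 after the stem, branches of 128, 64 and 32, and ten 256*256 heat maps at the output, matching the figures in the description.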
[0102] S500: obtaining the confidence level of each human body key point and obtaining the final human body key points according to the confidence level.
[0103] In the present implementation, the confidence level of each human body key point is obtained from the corresponding human body key point heat map, and the final human body key points are obtained according to the confidence level.
[0104] Specifically, each human body key point heat map is searched for the coordinates of its highest response point, which is the peak position. The peak position found is determined as the detection position of the human body key point corresponding to that heat map, and the peak value is the confidence level of the human body key point.
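The peak search can be sketched as a plain scan over one heat map. This is a minimal illustration: the heat map is represented as a nested list, the key point names simply follow the order given in the description, ties are resolved by keeping the first maximum encountered (an assumption), and `find_peak` is a hypothetical name.

```python
from typing import List, Tuple

# Heat map order as listed in the description.
KEYPOINT_NAMES = [
    "left ear", "right ear", "left shoulder", "right shoulder",
    "left elbow", "right elbow", "left wrist", "right wrist",
    "left crotch", "right crotch",
]


def find_peak(heat_map: List[List[float]]) -> Tuple[int, int, float]:
    """Return (x, y, confidence) of the highest response in one heat map."""
    best_x, best_y, best_value = 0, 0, heat_map[0][0]
    for y, row in enumerate(heat_map):
        for x, value in enumerate(row):
            if value > best_value:  # strict comparison keeps the first maximum
                best_x, best_y, best_value = x, y, value
    return best_x, best_y, best_value


# Toy 3x3 heat map standing in for a full-size model output.
toy_heat_map = [
    [0.0, 0.1, 0.0],
    [0.2, 0.9, 0.3],
    [0.0, 0.1, 0.0],
]
peak = find_peak(toy_heat_map)
print(KEYPOINT_NAMES[0], peak)  # left ear (1, 1, 0.9)
```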
[0105] A threshold of confidence level is pre-set, and the confidence level of each human body key point is compared with the threshold. If the confidence level of the human body key point is greater than the threshold, the coordinates and confidence level of the human body key point are output; if the confidence level of the human body key point is not greater than the threshold, the human body key point is discarded.
[0106] The human body key points that are finally output are the final human body key points.
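The threshold rule can be written as a one-line filter. The sketch below is illustrative only: the threshold value of 0.5 and the sample coordinates are made up for the example, since the text does not specify a value, and `filter_keypoints` is a hypothetical name.

```python
from typing import Dict, Tuple

# (x, y, confidence) for one detected key point.
Keypoint = Tuple[int, int, float]


def filter_keypoints(candidates: Dict[str, Keypoint], threshold: float) -> Dict[str, Keypoint]:
    """Keep key points whose confidence is strictly greater than the threshold."""
    return {name: kp for name, kp in candidates.items() if kp[2] > threshold}


# Made-up candidates; a confidence equal to the threshold is discarded.
candidates = {
    "left ear": (120, 40, 0.92),
    "right ear": (150, 42, 0.30),
    "left wrist": (60, 200, 0.50),
}
final_keypoints = filter_keypoints(candidates, threshold=0.5)
print(final_keypoints)  # {'left ear': (120, 40, 0.92)}
```

The strict comparison mirrors the wording "greater than the threshold": a key point whose confidence equals the threshold is dropped.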
[0107] Although the above-mentioned steps in the flowchart are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order in which these steps must be performed, and they can be performed in other orders. In addition, at least some of the steps in the appended drawings can include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time but can be executed at different times, and their execution order is not necessarily sequential: they can be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
[0108] In an implementation, a human body key point detection apparatus is provided, comprising: a detection module 100, a first collection module 200, a second collection module 300, a comparison module 400, an interception module 500, a key point acquisition module 600, and a judgement module 700, wherein:
[0109] A detection module 100 configured to detect whether there is a human body in the current scenario.
[0110] A first collection module 200 configured to, if no human body is detected, collect a first depth image in the current scenario and overwrite the first depth image collected when no human body was detected in the previous scenario.
[0111] A second collection module 300 configured to, if a human body is detected, collect a second depth image in the current scenario and store a human body detection frame corresponding to the second depth image.
[0112] A comparison module 400 configured to compare the frame difference between the collected first depth image and second depth image to obtain a depth changing area map.
[0113] An interception module 500 configured to obtain a human body mask code and obtain a human body area image according to the human body mask code.
[0114] A key point acquisition module 600 configured to obtain a single human body area image by using the human body detection frame, input it into a key point detection model, and output plural human body key points; and
[0115] A judgement module 700 configured to obtain the confidence level of each human body key point and obtain the final human body key points according to the confidence level.
[0116] For the specific limitations of the human body key point detection apparatus, refer to the limitations of the human body key point detection method described above, which will not be repeated here. Each module of the above human body key point detection apparatus can be implemented fully or partly by software, hardware, or a combination thereof. The above modules can be embedded in, or be independent of, the processor of a computer device, or can be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
[0117] In an implementation, a computer device is provided. The computer device can be a server, and its internal structure diagram is shown in Figure 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is configured to provide calculation and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program is executed by the processor to implement a human body key point detection method.
[0118] Those skilled in the art can understand that the structure shown in Figure 5 is only a partial structural diagram related to the solution of this application and does not constitute a limitation on the computer device to which the solution of the current application is applied. A specific computer device can include more or fewer components than shown in the figure, combine some components, or have a different arrangement of components.
[0118] In an implementation, a computer device is provided which includes a memory, a processor, and a computer program stored in the memory and runnable on the processor. The processor implements the above-mentioned human body key point detection method when executing the computer program.
[0119] Those skilled in the art can understand that all or part of the procedures of the above-mentioned methods can be performed by related hardware instructed by a computer program. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program can include the procedures of the implementations of the above-mentioned methods. Any reference to the memory, the storage, the database, or other media used in each implementation provided in the current application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
[0120] The invention discloses a human body key point detection method, apparatus, computer device and storage medium. The present invention proposes a multi-resolution key point detection method based on a human body mask code. On the basis of current key point algorithms that use RGB images, depth information is introduced to expand the amount of information that the algorithm model can perceive, which can effectively reduce the interference of complex RGB color noise on the algorithm and improve the accuracy of the key point algorithm.
[0121] Aiming at the problem that irrelevant background information interferes greatly with the human body key point algorithm, the present invention proposes a high-precision method for extracting the human body mask code. The human body mask code is used to filter out the background interference in the RGBD image, which solves the problem of key points being erroneously positioned on the background in complex backgrounds and improves the accuracy of the key point detection algorithm.
[0122] The human body key point detection model structure proposed in the present invention is different from the serial network structure commonly used in current key point detection. The present invention innovatively uses a parallel network structure, including plural parallel network branches with different resolutions, so that the network model proposed in the present invention can consider both the global information of the human body and the local fine-grained information of the limbs. This greatly reduces the situation in which key points are incorrectly located on the limbs of other people around the target human body. It also makes the key point detection of the local limbs more accurate, thereby improving the performance of the human body key point detection algorithm.
[0123] To improve the efficiency of the multi-resolution parallel network, the present invention does not frequently perform multi-scale merges like most current network models, but only performs information merges across the resolution branches in the middle and at the tail of the network. In this way, each resolution branch can fully analyze and process its own information before blending with the others, which reduces the mutual interference between branches of different resolutions and improves the efficiency and running speed of the network model, so as to meet the real-time requirements of many scenarios with only limited computing ability. At the same time, it also improves the accuracy of the algorithm.
[0124] The technical characteristics of the above-mentioned implementations can be combined arbitrarily. For conciseness, not all possible combinations of the technical characteristics in the above-mentioned implementations are described. However, as long as there is no conflict in the combinations of these technical characteristics, they shall be considered within the scope of this description.
[0125] The above-mentioned implementations are only several implementations of this disclosure. Although their description is relatively specific and detailed, it cannot be understood as a limitation of the scope of the invention patent. Evidently, those of ordinary skill in the art can make various modifications and variations to the disclosure without departing from the spirit and scope of the disclosure. Therefore, the appended claims are intended to be construed as encompassing the described embodiments and all the modifications and variations coming into the scope of the disclosure.

Representative Drawing

A single figure which represents the drawing illustrating the invention.

Administrative Status

Event History

Description	Date
Amendment Received - Voluntary Amendment	2024-03-06
Amendment Received - Response to Examiner's Requisition	2024-03-06
Examiner's Report	2023-12-08
Inactive: Report - QC failed - Minor	2023-12-07
Letter Sent	2023-02-08
Inactive: Correspondence - PAPS	2022-12-23
Request for Examination Received	2022-09-16
Request for Examination Requirements Determined Compliant	2022-09-16
All Requirements for Examination Determined Compliant	2022-09-16
Application Published (Open to Public Inspection)	2022-04-29
Inactive: Cover page published	2022-04-28
Inactive: First IPC assigned	2022-01-01
Inactive: IPC assigned	2022-01-01
Inactive: IPC assigned	2022-01-01
Letter Sent	2021-11-22
Filing Requirements Determined Compliant	2021-11-22
Inactive: First IPC assigned	2021-11-22
Inactive: IPC assigned	2021-11-22
Request for Priority Received	2021-11-18
Priority Claim Requirements Determined Compliant	2021-11-18
Application Received - Regular National	2021-10-29
Inactive: Pre-classification	2021-10-29
Inactive: QC images - Scanning	2021-10-29

Abandonment History

There is no abandonment history.

Maintenance Fee

The last payment was received on 2023-12-15

Fee History

Fee Type	Anniversary Year	Due Date	Paid Date
Application fee - standard		2021-10-29	2021-10-29
Request for examination - standard		2025-10-29	2022-09-16
MF (application, 2nd anniv.) - standard	02	2023-10-30	2023-06-15
MF (application, 3rd anniv.) - standard	03	2024-10-29	2023-12-15

Owners on Record

Current and past owners on record are shown in alphabetical order.

Current Owners on Record
10353744 CANADA LTD.

Past Owners on Record
HUAIYUAN JI
YUNXI LI
ZHAOKUN XU

Past owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Claims	2024-03-05	80	3,813
Representative Drawing	2022-03-21	1	12
Description	2021-10-28	20	791
Claims	2021-10-28	6	178
Drawings	2021-10-28	4	93
Abstract	2021-10-28	1	22
Cover Page	2022-03-21	1	48
Amendment / Response to Report	2024-03-05	176	6,046
Courtesy - Filing Certificate	2021-11-21	1	565
Courtesy - Acknowledgement of Request for Examination	2023-02-07	1	423
Examiner Requisition	2023-12-07	5	227
New Application	2021-10-28	6	218
Request for Examination	2022-09-15	9	326
Correspondence for the PAPS	2022-12-22	4	153
