Patent 3154025 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract is posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 3154025
(54) English Title: INTERACTIVE BEHAVIOR RECOGNIZING METHOD, DEVICE, COMPUTER EQUIPMENT AND STORAGE MEDIUM
(54) French Title: METHODE DE RECONNAISSANCE D'UN COMPORTEMENT D'INTERACTION, DISPOSITIF, MATERIEL INFORMATIQUE ET SUPPORT DE STOCKAGE
Status: Examination
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06V 40/20 (2022.01)
  • G06V 40/10 (2022.01)
(72) Inventors :
  • ZHUANG, XIYANG (China)
  • YU, DAIWEI (China)
  • SUN, HAO (China)
  • YANG, XIAN (China)
(73) Owners :
  • 10353744 CANADA LTD.
(71) Applicants :
  • 10353744 CANADA LTD. (Canada)
(74) Agent: HINTON, JAMES W.
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2020-06-19
(87) Open to Public Inspection: 2021-03-18
Examination requested: 2022-09-16
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/CN2020/096994
(87) International Publication Number: CN2020096994
(85) National Entry: 2022-03-10

(30) Application Priority Data:
Application No. Country/Territory Date
201910857295.7 (China) 2019-09-11

Abstracts

English Abstract

The present application relates to an interaction behavior recognition method, an apparatus, a computer device, and a storage medium. Said method comprises: acquiring an image to be detected; performing human body posture detection on said image by means of a preset detection model, so as to obtain human body posture information and hand position information, the detection model being used for performing human body posture detection; tracking a human body posture according to the human body posture information, so as to obtain human body motion trajectory information; performing object tracking on a hand position according to the hand position information, and acquiring a hand area image; performing item recognition on the hand area image by means of a preset classification recognition model, so as to obtain an item recognition result, the classification recognition model being used for performing item recognition; and according to the human body motion trajectory information and the item recognition result, obtaining a first interaction behavior recognition result. The present method can improve the recognition accuracy of interaction behaviors, and has a good transportability.


French Abstract

La présente demande concerne un procédé de reconnaissance de comportement d'interaction, un appareil, un dispositif informatique et un support de stockage. Ledit procédé comprend : l'acquisition d'une image à détecter ; la réalisation d'une détection de posture de corps humain sur ladite image au moyen d'un modèle de détection prédéfini de façon à obtenir des informations de posture de corps humain et des informations de position de main, le modèle de détection étant utilisé pour effectuer une détection de posture de corps humain ; le suivi d'une posture de corps humain en fonction des informations de posture de corps humain de façon à obtenir des informations de trajectoire de mouvement de corps humain ; la réalisation d'un suivi d'objet sur une position de main selon les informations de position de main et l'acquisition d'une image de zone de main ; la réalisation d'une reconnaissance d'élément sur l'image de zone de main au moyen d'un modèle de reconnaissance de classification prédéfini de façon à obtenir un résultat de reconnaissance d'élément, le modèle de reconnaissance de classification étant utilisé pour effectuer une reconnaissance d'élément ; et en fonction des informations de trajectoire de mouvement de corps humain et du résultat de reconnaissance d'élément, l'obtention d'un premier résultat de reconnaissance de comportement d'interaction. Le présent procédé peut améliorer la précision de reconnaissance de comportements d'interaction, et présente une bonne aptitude au transport.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS

What is claimed is:

1. An interactive behavior recognizing method, characterized in comprising:
obtaining an image to be detected;
performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;
tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;
performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and
obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

2. The method according to Claim 1, characterized in that the step of performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information includes:
preprocessing the image to be detected, and obtaining a human body image in the image to be detected; and
performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

3. The method according to Claim 2, characterized in further comprising:
obtaining human body position information according to the image to be detected; and
obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

4. The method according to Claim 3, characterized in that the step of obtaining an image to be detected includes:
obtaining the image to be detected as collected by an image collection device at a preset first shooting angle; wherein
preferably, the preset first shooting angle is an overhead angle perpendicular to the ground, and the image to be detected is of RGBD data.

5. The method according to any one of Claims 1 to 4, characterized in further comprising:
obtaining sample image data;
marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data;
performing an image enhancing process on the first marked image data, and obtaining a first training dataset; and
inputting the first training dataset to an HRNet model for training, and obtaining the detection model.

6. The method according to Claim 5, characterized in further comprising:
marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data;
performing an image enhancing process on the second marked image data, and obtaining a second training dataset; and
inputting the second training dataset to a convolutional neural network for training, and obtaining the preset classification recognition model, wherein, preferably, the convolutional neural network is a yolov3-tiny network or a vgg16 network.

7. The method according to Claim 6, characterized in that the step of obtaining sample image data includes:
obtaining an image data collected by the image collection device at a preset second shooting angle within a preset time frame; and
screening from the collected image data to obtain sample image data containing human-goods interactive behaviors, wherein, preferably, the preset second shooting angle is an overhead angle perpendicular to the ground, and the sample image data is of RGBD data.

8. An interactive behavior recognizing device, characterized in comprising:
a first obtaining module, for obtaining an image to be detected;
a first detecting module, for performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;
a tracing module, for tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;
a second detecting module, for performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and
a first interactive behavior recognizing module, for obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

9. A computer equipment, comprising a memory, a processor and a computer program stored on the memory and operable on the processor, characterized in that the method steps according to any one of Claims 1 to 7 are realized when the processor executes the computer program.

10. A computer-readable storage medium, storing a computer program thereon, characterized in that the method steps according to any one of Claims 1 to 7 are realized when the computer program is executed by a processor.

Description

Note: Descriptions are shown in the official language in which they were submitted.


INTERACTIVE BEHAVIOR RECOGNIZING METHOD, DEVICE, COMPUTER EQUIPMENT AND STORAGE MEDIUM

BACKGROUND OF THE INVENTION

Technical Field

[0001] The present application relates to an interactive behavior recognizing method, and corresponding device, computer equipment and storage medium.

Description of Related Art

[0002] With the development of science and technology, the unmanned selling technique has been increasingly highly regarded by various large-scale retailers. This technique makes use of such multiple smart recognition techniques as sensors, image analysis, and computer vision to achieve unmanned settlement of accounts. Use of the image recognition technique to sense relative positions between human beings and shelves and movements of commodities on the shelves to carry out human-goods interactive behavior recognition is an important precondition to ensuring normal settlement of consumptions by customers.

[0003] However, the currently available human-goods interactive behavior recognizing methods usually make use of templates and rule matching. The definition of templates and stipulation of rules require a great deal of manpower input, and such practice is often applicable only to the recognition of conventional human body postures; such recognition is inferior in precision, weak in transferability, and applicable merely to human-goods interactive behaviors under specific scenarios.

SUMMARY OF THE INVENTION

[0004] In view of the aforementioned technical problems, there is an urgent need to provide an interactive behavior recognizing method, and corresponding device, computer equipment and storage medium having higher recognition precision and better transferability.

[0005] There is provided an interactive behavior recognizing method that comprises:

[0006] obtaining an image to be detected;

[0007] performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;

[0008] tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;

[0009] performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and

[0010] obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0011] In one of the embodiments, the step of performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information includes:

[0012] preprocessing the image to be detected, and obtaining a human body image in the image to be detected; and

[0013] performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

[0014] In one of the embodiments, the method further comprises:

[0015] obtaining human body position information according to the image to be detected; and

[0016] obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

[0017] In one of the embodiments, the step of obtaining an image to be detected includes:

[0018] obtaining the image to be detected as collected by an image collection device at a preset first shooting angle; wherein

[0019] preferably, the preset first shooting angle is an overhead angle perpendicular to the ground, and the image to be detected is of RGBD data.

[0020] In one of the embodiments, the method further comprises:

[0021] obtaining sample image data;

[0022] marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data;

[0023] performing an image enhancing process on the first marked image data, and obtaining a first training dataset; and

[0024] inputting the first training dataset to an HRNet model for training, and obtaining the detection model.

[0025] In one of the embodiments, the method further comprises:

[0026] marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data;

[0027] performing an image enhancing process on the second marked image data, and obtaining a second training dataset; and

[0028] inputting the second training dataset to a convolutional neural network for training, and obtaining the preset classification recognition model, wherein the convolutional neural network is a yolov3-tiny network or a vgg16 network.

[0029] In one of the embodiments, the step of obtaining sample image data includes:

[0030] obtaining an image data collected by the image collection device at a preset second shooting angle within a preset time frame; and

[0031] screening from the collected image data to obtain sample image data containing human-goods interactive behaviors, wherein, preferably, the preset second shooting angle is an overhead angle perpendicular to the ground, and the sample image data is of RGBD data.

[0032] There is provided an interactive behavior recognizing device that comprises:

[0033] a first obtaining module, for obtaining an image to be detected;

[0034] a first detecting module, for performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;

[0035] a tracing module, for tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;

[0036] a second detecting module, for performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and

[0037] a first interactive behavior recognizing module, for obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0038] There is provided a computer equipment that comprises a memory, a processor and a computer program stored on the memory and operable on the processor, and the following steps are realized when the processor executes the computer program:

[0039] obtaining an image to be detected;

[0040] performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;

[0041] tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;

[0042] performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and

[0043] obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0044] There is provided a computer-readable storage medium storing a computer program thereon, and the following steps are realized when the computer program is executed by a processor:

[0045] obtaining an image to be detected;

[0046] performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;

[0047] tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;

[0048] performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and

[0049] obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0050] In the aforementioned interactive behavior recognizing method, and corresponding device, computer equipment and storage medium, interactive behavior recognition is performed on the image to be detected through the detection model and the classification recognition model, whereby only a small amount of data needs to be collected on the basis of existing models to be deployed in different stores, stronger transferability is achieved, lower deployment cost is spent, and it is made possible for the detection model to flexibly and precisely recognize interactive behaviors, and to enhance recognition precision.

BRIEF DESCRIPTION OF THE DRAWINGS

[0051] Fig. 1 is a view illustrating the application environment for an interactive behavior recognizing method in an embodiment;

[0052] Fig. 2 is a flowchart schematically illustrating an interactive behavior recognizing method in an embodiment;

[0053] Fig. 3 is a flowchart schematically illustrating an interactive behavior recognizing method in another embodiment;

[0054] Fig. 4 is a flowchart schematically illustrating the detection model training steps in an embodiment;

[0055] Fig. 5 is a flowchart schematically illustrating the classification recognition model training steps in an embodiment;

[0056] Fig. 6 is a block diagram illustrating the structure of an interactive behavior recognizing device in an embodiment; and

[0057] Fig. 7 is a view illustrating the internal structure of a computer equipment in an embodiment.

DETAILED DESCRIPTION OF THE INVENTION

[0058] To make more lucid and clear the objectives, technical solutions and advantages of the present application, the present application is described in greater detail below with reference to accompanying drawings and embodiments. As should be understood, the specific embodiments as described here are merely meant to explain the present application, rather than to restrict the present application.

[0059] The interactive behavior recognizing method provided by the present application is applicable to the application environment as shown in Fig. 1, in which terminal 102 communicates with server 104 through network. Terminal 102 can be, but is not limited to be, any of various image collection devices; moreover, terminal 102 can employ one or more depth camera(s) with shooting angle(s) perpendicular to the ground, while server 104 can be embodied as an independent server or a server cluster consisting of a plurality of servers.

[0060] In one embodiment, as shown in Fig. 2, there is provided an interactive behavior recognizing method, and the method is explained with an example of its being applied to the server in Fig. 1, to comprise the following steps.

[0061] Step 202 - obtaining an image to be detected.

[0062] The image to be detected is an interactive behavior image between a human being and an object to be detected.

[0063] In one of the embodiments, step 202 includes: the server obtains the image to be detected as collected by an image collection device at a preset first shooting angle; preferably, the preset first shooting angle is an overhead angle perpendicular to the ground or approximately perpendicular to the ground, and the image to be detected is of RGBD data.

[0064] In other words, the image to be detected is of RGBD data collected by an image collection device under an overhead angle scenario. The image collection device can be embodied as a depth camera disposed above a shelf; the first shooting angle can be not perpendicular to the ground, and can be any overhead angle close to being perpendicular insofar as the installation environment allows, so as to avoid shooting dead angles as far as possible.

[0065] The present technical solution makes use of a depth camera poised for the overhead angle to detect human-goods interactive behaviors. In comparison with the traditional camera installation mode, whereby the camera is installed at a certain included angle with respect to the ground, the present technical solution effectively evades the problem in which both the human being and the shelf are shielded due to an askance angle, and the problem in which it is more difficult to trace the hand; in actual application, image collection at an overhead angle makes it possible to better recognize the behaviors of picking up goods by different persons in turns.

[0066] Step 204 - performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection.

[0067] The detection model is a human body posture detection model that can be used to detect key points of the human skeleton.

[0068] Specifically, the server inputs a human body image to the detection model; human body posture detection is performed on the human body image in the detection model; and human body posture information and hand position information output by the detection model are obtained. The human body posture detection can be via a common skeleton line detecting method; the human body posture information as obtained is an image of human skeletal key points, and the hand position information is the specific position of a hand in the image of human skeletal key points.
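
For illustration, a minimal sketch of how hand positions could be read off such a skeleton output; the COCO-style 17-keypoint layout, the wrist indices and the fixed box size are assumptions for the sketch, not details taken from the application:

```python
import numpy as np

# COCO keypoint layout: index 9 is the left wrist, index 10 the right wrist.
LEFT_WRIST, RIGHT_WRIST = 9, 10

def hand_boxes_from_keypoints(keypoints, box_size=96, conf_thresh=0.3):
    """Derive square hand regions from detected skeleton keypoints.

    keypoints: (17, 3) array of (x, y, confidence) for one person.
    Returns a list of (x1, y1, x2, y2) boxes centred on each visible wrist.
    """
    boxes = []
    for idx in (LEFT_WRIST, RIGHT_WRIST):
        x, y, conf = keypoints[idx]
        if conf < conf_thresh:
            continue  # wrist not reliably detected in this frame
        half = box_size // 2
        boxes.append((int(x - half), int(y - half), int(x + half), int(y + half)))
    return boxes
```
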
[0069] Step 206 - tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image.

[0070] Specifically, a target tracing algorithm is employed, such as the Camshift algorithm adaptable to changes in size and shape of a moving target, to trace the motion tracks of the human body and the hand, respectively, to obtain human body motion track information, and to enlarge the hand position in the tracing process to obtain a hand area image.
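
Since Camshift is named here, a minimal OpenCV sketch of this tracing step follows; the BGR frame format, the hue-histogram back-projection and the enlargement factor are assumptions of the sketch:

```python
import cv2

def track_hand(frames, init_box):
    """Track a hand window across frames with CamShift, yielding enlarged crops."""
    x, y, w, h = init_box
    roi = frames[0][y:y + h, x:x + w]
    hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    # Hue histogram of the initial hand region, used for back-projection.
    hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    window = init_box
    for frame in frames[1:]:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        back_proj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
        # CamShift adapts the window size and orientation to the moving target.
        _, window = cv2.CamShift(back_proj, window, term)
        x, y, w, h = window
        pad = w // 2  # enlarge the tracked window to obtain the hand area image
        crop = frame[max(0, y - pad):y + h + pad, max(0, x - pad):x + w + pad]
        yield window, crop
```
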
[0071] Step 208 - performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition.

[0072] The classification recognition model is an article recognition model that can be trained by deep learning.

[0073] Specifically, the hand area image is input to the classification recognition model, and the hand area image is detected in the classification recognition model to judge whether an article is held in the hand area; in the case there is an article being held, the classification recognition model recognizes the article and outputs an article recognition result. Moreover, the classification recognition model can further make a skin color judgment on the hand area image, and timely send out an early warning against the behavior of intentional shielding of the hand by means of such articles as clothes, so as to achieve the objective of reducing goods loss.
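
One hedged sketch of such a skin color judgment: threshold the hand crop in YCrCb space and warn when the skin-pixel ratio drops below a cutoff. The colour range and the cutoff are assumptions, not values from the application:

```python
import cv2
import numpy as np

def skin_ratio(hand_img):
    """Rough skin-pixel ratio of a hand crop (BGR). A very low ratio can
    trigger an early warning that the hand is deliberately covered."""
    ycrcb = cv2.cvtColor(hand_img, cv2.COLOR_BGR2YCrCb)
    # A commonly used skin range in the Cr/Cb plane.
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    return float(np.count_nonzero(mask)) / mask.size

def hand_is_shielded(hand_img, cutoff=0.1):
    return skin_ratio(hand_img) < cutoff  # assumed warning threshold
```
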
[0074] Step 210 - obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0075] The first interactive behavior recognition result is a human-article interactive behavior recognition result.

[0076] Specifically, the human body motion track information can be used to judge behavioral actions of a human being, for example, hand stretching, bending, stooping and squatting, etc., and to judge whether any article is held in the hand. When an article is held in the hand, the article is recognized to obtain an article recognition result, whereby it is possible to judge that the human body is picking up or putting down the article, namely to analyze to obtain a human-article interactive behavior recognition result.
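
As a hedged sketch of how the two signals could be combined, assuming a hypothetical per-frame record of action labels and held items (all field names here are invented for illustration):

```python
def first_interaction_result(track_info, item_result):
    """Combine an action label derived from the motion track with the item
    recognition result into a human-article interaction event."""
    action = track_info["action"]             # e.g. "reach", "bend", "squat"
    holding_before = track_info["held_before"]
    holding_now = item_result["item"] is not None
    if action == "reach" and not holding_before and holding_now:
        return {"event": "pick_up", "item": item_result["item"]}
    if action == "reach" and holding_before and not holding_now:
        return {"event": "put_down", "item": track_info["last_item"]}
    return {"event": "none", "item": None}
```
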
[0077] In the interactive behavior recognizing method provided by the present technical solution, interactive behavior recognition is performed on the image to be detected through the detection model and the classification recognition model, and it is made possible, through model training and algorithm tuning, to automatically recognize interactive behaviors between human beings and articles, and the recognition result is made more precise; moreover, only a small amount of data needs to be collected on the basis of the current detection model and classification recognition model to be deployed in different scenarios, stronger transferability is achieved, and lower deployment cost is spent.

[0078] In one of the embodiments, as shown in Fig. 3, the method comprises the following steps.

[0079] Step 302 - obtaining an image to be detected.

[0080] Step 304 - preprocessing the image to be detected, and obtaining a human body image in the image to be detected.

[0081] Step 304 is a process to extract a human body image that is required to be used in subsequent steps from the image to be detected, and the unwanted background image is shielded out.

[0082] Specifically, the preprocessing can be background modeling; in other words, background modeling based on Gaussian mixture is performed on the image to be detected, and a background model is obtained.

[0083] A human body image in the image to be detected is obtained according to the image to be detected and the background model.
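
As a minimal sketch of Gaussian-mixture background modeling, OpenCV's MOG2 subtractor can stand in for the background model described here; the parameter values are assumptions:

```python
import cv2

# History length and variance threshold are tunable assumptions; shadow
# detection helps under store lighting.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)

def extract_human_region(frame):
    """Mask out the static background, keeping only moving (human) pixels."""
    mask = subtractor.apply(frame)
    mask = cv2.medianBlur(mask, 5)  # suppress speckle noise
    # MOG2 marks shadows as 127; threshold keeps only confident foreground.
    _, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    return cv2.bitwise_and(frame, frame, mask=mask)
```
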
[0084] Step 306 - performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

[0085] Step 308 - tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image.

[0086] Step 310 - performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition.

[0087] Step 312 - obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0088] In this embodiment, the unwanted background image is shielded out in step 304 by preprocessing the image to be detected, and only the human body image to be subsequently used is retained, whereby the volume of data to be processed in the following steps is reduced, and data processing efficiency is enhanced.

[0089] In one of the embodiments, the method further comprises:

[0090] A - obtaining human body position information according to the image to be detected; wherein

[0091] the human body position information can indicate position information in a three-dimensional world coordinate system.

[0092] Specifically, collection position information of the image to be detected in the three-dimensional world coordinate system is obtained; three-dimensional world coordinate transformation is performed according to the position information of the human body image in the image to be detected and the collection position information, and the position information of the human body in the three-dimensional world coordinate system is obtained.
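
A hedged sketch of such a transformation, assuming a calibrated pinhole depth camera with intrinsics K and camera-to-world extrinsics (R, t); these symbols are assumptions, not notation from the application:

```python
import numpy as np

def pixel_to_world(u, v, depth, K, R, t):
    """Back-project a pixel (u, v) with depth (metres) into world coordinates.

    K is the 3x3 camera intrinsic matrix; R (3x3) and t (3,) map camera
    coordinates into the world frame.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Pinhole model: recover the 3-D point in the camera frame.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    p_cam = np.array([x, y, depth])
    return R @ p_cam + t
```
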
[0093] B - obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

[0094] The shelf information includes shelf position information and information of articles in the shelf, of which the shelf position information is the three-dimensional world coordinate position where the shelf locates.

[0095] Specifically, shelf information to which the human body position corresponds is obtained according to the human body position information and the preset shelf information; an interactive behavior between the human body and the shelf is determined by tracing the three-dimensional world coordinate positions where the human body and the shelf locate, and the occurrence of a valid human-goods interactive behavior is further determined in the tracing process by recognizing whether the hand area has any commodity associated with the shelf. The valid human-goods interactive behavior here can be a behavior of a customer completing one round of picking up goods from the shelf.

[0096] The present technical solution converts out the position of a customer in the world coordinate system through three-dimensional world coordinate transformation, and the association thereof with the shelf makes it possible to recognize whether the customer has effected a valid human-goods interactive behavior. On the other hand, on the basis of the recognition of the human-goods interactive behavior in conjunction with the article recognition result, and under the premise that the shelf stock is known, it is possible to indirectly achieve counting of the existing stock by monitoring the number of valid interactions between humans and the shelf; in case of short supply, the server can timely remind shop assistants to administer the stock, whereby manual stock-taking cost is greatly reduced.
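
As an illustrative sketch of this stock-counting idea, assuming a hypothetical event feed of valid interactions (the class, field names and threshold are invented for illustration):

```python
class ShelfStock:
    """Adjust a known shelf stock on each valid interaction and raise a
    restock alert when an article runs low."""

    def __init__(self, initial_stock, low_water_mark=5):
        self.stock = dict(initial_stock)  # e.g. {"cola": 24, "chips": 12}
        self.low_water_mark = low_water_mark

    def on_valid_interaction(self, event, item):
        if event == "pick_up":
            self.stock[item] = max(0, self.stock.get(item, 0) - 1)
        elif event == "put_down":
            self.stock[item] = self.stock.get(item, 0) + 1
        if self.stock.get(item, 0) <= self.low_water_mark:
            print(f"restock alert: {item} down to {self.stock[item]}")
```
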
[0097] In one of the embodiments, as shown in Fig. 4, the method further comprises detection model training steps, which specifically include the following steps.

[0098] Step 402 - obtaining sample image data.

[0099] Specifically, an image data collected by the image collection device at a preset second shooting angle within a preset time frame is obtained, i.e., interactive behavioral image data of a certain magnitude is collected; sample image data containing human-goods interactive behaviors are screened and obtained from the collected image data. The preset second shooting angle can be an overhead angle perpendicular to the ground or approximately perpendicular to the ground, and the sample image data is of RGBD data.

[0100] Step 404 - marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data.

[0101] Specifically, the sample image data should essentially cover different human-goods interactive behaviors in actual scenarios, and it is further possible to enhance the sample data, increase the volume of the sample image data, and raise the proportion of training samples with large posture amplitudes in the interactive behavioral process, for instance, to raise the proportion of such human-goods interactive behavioral postures as bending, stooping and squatting, etc., so as to enhance the detection precision of the detection model. During the process of specific implementation, a part of the first marked image data can be taken to serve as a training dataset, while the remaining part serves as a verification dataset.

[0102] Step 406 - performing an image enhancing process on the first marked image data, and obtaining a first training dataset; during the process of specific implementation, the image enhancing process is performed on the training dataset in the first marked image data to obtain the first training dataset.

[0103] Specifically, the image enhancing process can include any one or more of the following image transforming methods: image normalization, random clipping of images, image zooming, image rollover, image affine transformation, image contrast change, image hue change, image saturation change, and adding tone interference blocks to images, etc.
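
A minimal sketch of one such image enhancing pipeline using torchvision; the specific parameter values are assumptions, and for keypoint-marked data the geometric transforms would also have to be applied to the labels:

```python
from torchvision import transforms

# One possible composition covering the transformations listed above.
augment = transforms.Compose([
    transforms.RandomResizedCrop(256),                          # random clipping + zooming
    transforms.RandomHorizontalFlip(),                          # image rollover
    transforms.RandomAffine(degrees=10, translate=(0.1, 0.1)),  # affine transformation
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2, hue=0.05),           # contrast/hue/saturation change
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],            # image normalization
                         std=[0.229, 0.224, 0.225]),
])
```
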
[0104] Step 408 - inputting the first training dataset to an HRNet model for training, and obtaining the detection model. Specifically, different network architectures of the HRNet model can be employed to train human body posture detection models; the various models obtained through training by the different network architectures are then verified and appraised through the verification dataset, and the model with the optimal effect is selected to serve as the detection model.
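
A hedged sketch of this train-then-select loop in generic PyTorch, assuming a heatmap-regression pose model; the loss, optimizer and epoch count are assumptions:

```python
import copy
import torch

def train_and_select(model, train_loader, val_loader, epochs=50, lr=1e-3):
    """Train a pose model and keep the weights that score best on the
    verification dataset."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()  # assumed heatmap regression loss
    best_loss, best_state = float("inf"), None
    for _ in range(epochs):
        model.train()
        for images, heatmaps in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(images), heatmaps)
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best_loss:  # keep the best-performing weights
            best_loss, best_state = val, copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model
```
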
[0105] In one of the embodiments, as shown in Fig. 5, the method further comprises classification recognition model training steps, which specifically include the following steps.

[0106] Step 502 - obtaining sample image data.

[0107] Step 504 - marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data.

[0108] Step 506 - performing an image enhancing process on the second marked image data, and obtaining a second training dataset.

[0109] Specifically, the image enhancing process can include any one or more of the following image transforming methods: image normalization, random clipping of images, image zooming, image rollover, image affine transformation, image contrast change, image hue change, image saturation change, and adding tone interference blocks to images, etc.

[0110] Step 508 - inputting the second training dataset to a yolov3-tiny network or a vgg16 network for training, and obtaining the preset classification recognition model.
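
For the vgg16 option, a minimal torchvision sketch of adapting the classifier head to the article categories; the weights enum and the extra "empty hand" class are assumptions:

```python
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes):
    """Fine-tune a VGG16 backbone as the article classification model;
    num_classes is the number of article categories plus an assumed
    'empty hand' class."""
    net = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
    # Replace the final fully connected layer with one sized to our categories.
    net.classifier[6] = nn.Linear(net.classifier[6].in_features, num_classes)
    return net
```
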
[0111] The present technical solution collects RGBD data through a depth camera with an angle perpendicular or approximately perpendicular to the ground, then manually sorts and collects the RGBD data containing human-goods interactive behaviors to serve as training samples, namely sample image data, employs deep learning training, and recognizes different postures of the human body with the trained model results, whereby the detection model can more flexibly and precisely recognize interactive behaviors, and possesses stronger transferability.

[0112] As should be understood, although the various steps in the flowcharts of Figs. 2-5 are sequentially displayed as indicated by arrows, these steps are not necessarily executed in the sequences indicated by arrows. Unless otherwise explicitly noted in this paper, execution of these steps is not restricted by any sequence, as these steps can also be executed in other sequences (than those indicated in the drawings). Moreover, at least partial steps in the flowcharts of Figs. 2-5 may include plural sub-steps or multi-phases; these sub-steps or phases are not necessarily completed at the same timing, but can be executed at different timings, and these sub-steps or phases are also not necessarily sequentially performed, but can be performed in turns or alternately with other steps or with at least some of the sub-steps or phases of other steps.

[0113] There is provided an interactive behavior recognizing device, as shown in Fig. 6; the device comprises a first obtaining module 602, a first detecting module 604, a tracing module 606, a second detecting module 608 and a first interactive behavior recognizing module 610, of which:

[0114] the first obtaining module 602 is employed for obtaining an image to be detected;

[0115] the first detecting module 604 is employed for performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection;

[0116] the tracing module 606 is employed for tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image;

[0117] the second detecting module 608 is employed for performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and

[0118] the first interactive behavior recognizing module 610 is employed for obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0119] In one of the embodiments, the first detecting module 604 is further employed for preprocessing the image to be detected, and obtaining a human body image in the image to be detected; and performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

[0120] In one of the embodiments, the device further comprises:

[0121] a human body position module, for obtaining human body position information according to the image to be detected; and

[0122] a second interactive behavior recognizing module, for obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

[0123] In one of the embodiments, the first obtaining module 602 is further employed for obtaining the image to be detected as collected by an image collection device at a preset first shooting angle; preferably, the preset first shooting angle is an overhead angle perpendicular to the ground, and the image to be detected is of RGBD data.

[0124] In one of the embodiments, the device further comprises:

[0125] a second obtaining module, for obtaining sample image data;

[0126] a first marking module, for marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data;

[0127] a first enhancing module, for performing an image enhancing process on the first marked image data, and obtaining a first training dataset; and

[0128] a first training module, for inputting the first training dataset to an HRNet model for training, and obtaining the detection model.

[0129] In one of the embodiments, the device further comprises:

[0130] a second marking module, for marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data;

[0131] a second enhancing module, for performing an image enhancing process on the second marked image data, and obtaining a second training dataset; and

[0132] a second training module, for inputting the second training dataset to a yolov3-tiny network or a vgg16 network for training, and obtaining the preset classification recognition model.

[0133] In one of the embodiments, the second obtaining module is further employed for obtaining an image data collected by the image collection device at a preset second shooting angle within a preset time frame; and screening from the collected image data to obtain sample image data containing human-goods interactive behaviors; preferably, the preset second shooting angle is an overhead angle perpendicular to the ground, and the sample image data is of RGBD data.

[0134] Specific definitions relevant to the interactive behavior recognizing device may be inferred from the aforementioned definitions of the interactive behavior recognizing method, while no repetition is made in this context. The various modules in the aforementioned interactive behavior recognizing device can be wholly or partly realized via software, hardware, or a combination of software with hardware. The various modules can be embedded in the form of hardware in a processor in a computer equipment or be independent of any computer equipment, and can also be stored in the form of software in a memory in a computer equipment, so as to facilitate the processor to invoke and perform operations corresponding to the aforementioned various modules.

[0135] In one embodiment, a computer equipment is provided; the computer equipment can be a server, and its internal structure can be as shown in Fig. 7. The computer equipment comprises a processor, a memory, a network interface, and a database connected to each other via a system bus. The processor of the computer equipment is employed to provide computing and controlling capabilities. The memory of the computer equipment includes a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores therein an operating system, a computer program and a database. The internal memory provides environment for the running of the operating system and the computer program in the nonvolatile storage medium. The database of the computer equipment is employed to store data. The network interface of the computer equipment is employed to connect to an external terminal via network for communication. The computer program realizes an interactive behavior recognizing method when it is executed by a processor.

[0136] As understandable to persons skilled in the art, the structure illustrated in Fig. 7 is merely a block diagram of partial structure relevant to the solution of the present application, and does not constitute any restriction to the computer equipment on which the solution of the present application is applied, as the specific computer equipment may comprise component parts that are more than or less than those illustrated in Fig. 7, or may combine certain component parts, or may have different layout of component parts.

[0137] In one embodiment, there is provided a computer equipment that comprises a memory, a processor and a computer program stored on the memory and operable on the processor, and the following steps are realized when the processor executes the computer program: obtaining an image to be detected; performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection; tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image; performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0138] In one embodiment, when the processor executes the computer program, the following steps are further realized: the step of performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information includes: preprocessing the image to be detected, and obtaining a human body image in the image to be detected; and performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

[0139] In one embodiment, when the processor executes the computer program, the following steps are further realized: obtaining human body position information according to the image to be detected; and obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

[0140] In one embodiment, when the processor executes the computer program, the following steps are further realized: the step of obtaining an image to be detected includes: obtaining the image to be detected as collected by an image collection device at a preset first shooting angle; wherein preferably, the preset first shooting angle is an overhead angle perpendicular to the ground, and the image to be detected is of RGBD data.

[0141] In one embodiment, when the processor executes the computer program, the following steps are further realized: obtaining sample image data; marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data; performing an image enhancing process on the first marked image data, and obtaining a first training dataset; and inputting the first training dataset to an HRNet model for training, and obtaining the detection model.

[0142] In one embodiment, when the processor executes the computer program, the following steps are further realized: marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data; performing an image enhancing process on the second marked image data, and obtaining a second training dataset; and inputting the second training dataset to a convolutional neural network for training, and obtaining the preset classification recognition model.

[0143] In one embodiment, when the processor executes the computer program, the following steps are further realized: the step of obtaining sample image data includes: obtaining an image data collected by the image collection device at a preset second shooting angle within a preset time frame; and screening from the collected image data to obtain sample image data containing human-goods interactive behaviors, wherein, preferably, the preset second shooting angle is an overhead angle perpendicular to the ground, and the sample image data is of RGBD data.

[0144] In one embodiment, there is provided a computer-readable storage medium storing thereon a computer program, and the following steps are realized when the computer program is executed by a processor: obtaining an image to be detected; performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information, wherein the detection model is employed to perform human body posture detection; tracing the human body posture according to the human body posture information to obtain human body motion track information, and performing a target tracing on the hand position according to the hand position information to obtain a hand area image; performing article recognition on the hand area image through a preset classification recognition model, and obtaining an article recognition result, wherein the classification recognition model is employed to perform article recognition; and obtaining a first interactive behavior recognition result according to the human body motion track information and the article recognition result.

[0145] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: the step of performing human body posture detection on the image to be detected through a preset detection model, and obtaining human body posture information and hand position information includes: preprocessing the image to be detected, and obtaining a human body image in the image to be detected; and performing human body posture detection on the human body image through the preset detection model, and obtaining the human body posture information and the hand position information.

[0146] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: obtaining human body position information according to the image to be detected; and obtaining a second interactive behavior recognition result according to the human body motion track information, the article recognition result, the human body position information and a preset shelf information, wherein the second interactive behavior recognition result is a human-goods interactive behavior recognition result.

[0147] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: the step of obtaining an image to be detected includes: obtaining the image to be detected as collected by an image collection device at a preset first shooting angle; wherein preferably, the preset first shooting angle is an overhead angle perpendicular to the ground, and the image to be detected is of RGBD data.

[0148] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: obtaining sample image data; marking key points and hand position of a human body image in the sample image data, and obtaining a first marked image data; performing an image enhancing process on the first marked image data, and obtaining a first training dataset; and inputting the first training dataset to an HRNet model for training, and obtaining the detection model.

[0149] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: marking a hand area in the sample image data and performing article category marking on an article located in the hand area, and obtaining a second marked image data; performing an image enhancing process on the second marked image data, and obtaining a second training dataset; and inputting the second training dataset to a convolutional neural network for training, and obtaining the preset classification recognition model.

[0150] In one embodiment, when the computer program is executed by a processor, the following steps are further realized: the step of obtaining sample image data includes: obtaining an image data collected by the image collection device at a preset second shooting angle within a preset time frame; and screening from the collected image data to obtain sample image data containing human-goods interactive behaviors, wherein, preferably, the preset second shooting angle is an overhead angle perpendicular to the ground, and the sample image data is of RGBD data.

[0151] As comprehensible to persons ordinarily skilled in the art, the entire or partial flows in the methods according to the aforementioned embodiments can be completed via a computer program instructing relevant hardware; the computer program can be stored in a nonvolatile computer-readable storage medium, and the computer program can include the flows as embodied in the aforementioned various methods when executed. Any reference to the memory, storage, database or other media used in the various embodiments provided by the present application can all include nonvolatile and/or volatile memory/memories. The nonvolatile memory can include a read-only memory (ROM), a programmable ROM (PROM), an electrically programmable ROM (EPROM), an electrically erasable and programmable ROM (EEPROM) or a flash memory. The volatile memory can include a random access memory (RAM) or an external cache memory. To serve as explanation rather than restriction, the RAM is obtainable in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), dual data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

[0152] Technical features of the aforementioned embodiments are randomly combinable; while all possible combinations of the technical features in the aforementioned embodiments are not exhausted for the sake of brevity, all these should be considered to fall within the scope recorded in the description as long as such combinations of the technical features are not mutually contradictory.

[0153] The foregoing embodiments are merely directed to several modes of execution of the present application, and their descriptions are relatively specific and detailed, but they should not be hence misunderstood as restrictions to the inventive patent scope. As should be pointed out, persons with ordinary skill in the art may further make various modifications and improvements without departing from the conception of the present application, and all these should pertain to the protection scope of the present application. Accordingly, the patent protection scope of the present application shall be based on the attached Claims.

Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refer to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Event History, Maintenance Fee and Payment History, should be consulted.

Event History

Description Date
Amendment Received - Voluntary Amendment 2024-04-02
Amendment Received - Response to Examiner's Requisition 2024-04-02
Examiner's Report 2023-12-18
Inactive: Report - No QC 2023-12-15
Inactive: First IPC assigned 2023-09-21
Inactive: IPC assigned 2023-09-21
Inactive: IPC assigned 2023-09-21
Letter Sent 2023-02-03
Inactive: IPC expired 2023-01-01
Inactive: IPC removed 2022-12-31
Inactive: Correspondence - PAPS 2022-12-23
Request for Examination Requirements Determined Compliant 2022-09-16
Request for Examination Received 2022-09-16
All Requirements for Examination Determined Compliant 2022-09-16
Inactive: Cover page published 2022-06-08
Letter sent 2022-05-11
Letter sent 2022-04-11
Inactive: IPC assigned 2022-04-07
Application Received - PCT 2022-04-07
Inactive: First IPC assigned 2022-04-07
Priority Claim Requirements Determined Compliant 2022-04-07
Request for Priority Received 2022-04-07
National Entry Requirements Determined Compliant 2022-03-10
Application Published (Open to Public Inspection) 2021-03-18

Abandonment History

There is no abandonment history.

Maintenance Fee

The last payment was received on 2023-12-15

Note: If the full payment has not been received on or before the date indicated, a further fee may be required, which may be one of the following:

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type Anniversary Year Due Date Paid Date
Basic national fee - standard 2022-03-10 2022-03-10
MF (application, 2nd anniv.) - standard 02 2022-06-20 2022-03-10
Request for examination - standard 2024-06-19 2022-09-16
MF (application, 3rd anniv.) - standard 03 2023-06-19 2022-12-15
MF (application, 4th anniv.) - standard 04 2024-06-19 2023-12-15
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
10353744 CANADA LTD.
Past Owners on Record
DAIWEI YU
HAO SUN
XIAN YANG
XIYANG ZHUANG
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Claims 2024-04-01 39 1,958
Claims 2022-03-09 4 138
Description 2022-03-09 25 1,091
Abstract 2022-03-09 1 24
Drawings 2022-03-09 4 138
Cover Page 2022-06-07 1 59
Representative drawing 2022-06-07 1 21
Amendment / response to report 2024-04-01 91 6,949
Courtesy - Letter Acknowledging PCT National Phase Entry 2022-04-10 1 589
Courtesy - Letter Acknowledging PCT National Phase Entry 2022-05-10 1 591
Courtesy - Acknowledgement of Request for Examination 2023-02-02 1 423
Examiner requisition 2023-12-17 4 198
National entry request 2022-03-09 14 1,317
International search report 2022-03-09 5 176
Amendment - Abstract 2022-03-09 2 117
Request for examination 2022-09-15 8 296
Correspondence for the PAPS 2022-12-22 4 149