Patent 3112512 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract is posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 3112512
(54) English Title: SYSTEMS AND METHODS TO CHECK-IN SHOPPERS IN A CASHIER-LESS STORE
(54) French Title: SYSTEMES ET PROCEDES DE VERIFICATION D'ACHETEURS DANS UN MAGASIN SANS CAISSIER
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06T 7/70 (2017.01)
  • H04L 67/04 (2022.01)
  • H04L 67/131 (2022.01)
  • H04L 67/306 (2022.01)
  • G06F 3/14 (2006.01)
  • G06T 7/20 (2017.01)
  • G06Q 30/06 (2012.01)
  • G06K 9/00 (2006.01)
  • G06N 3/04 (2006.01)
  • G06N 3/08 (2006.01)
  • H04L 29/08 (2006.01)
(72) Inventors :
  • FISHER, JORDAN E. (United States of America)
  • GREEN, WARREN (United States of America)
  • FISCHETTI, DANIEL L. (United States of America)
(73) Owners :
  • STANDARD COGNITION, CORP. (United States of America)
(71) Applicants :
  • STANDARD COGNITION, CORP. (United States of America)
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2019-07-25
(87) Open to Public Inspection: 2020-01-30
Examination requested: 2021-07-05
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2019/043526
(87) International Publication Number: WO2020/023801
(85) National Entry: 2021-03-11

(30) Application Priority Data:
Application No. Country/Territory Date
62/703,785 United States of America 2018-07-26
16/255,573 United States of America 2019-01-23

Abstracts

English Abstract

Systems and techniques are provided for linking subjects in an area of real space with user accounts. The user accounts are linked with client applications executable on mobile computing devices. A plurality of cameras are disposed above the area. The cameras in the plurality of cameras produce respective sequences of images in corresponding fields of view in the real space. A processing system is coupled to the plurality of cameras. The processing system includes logic to determine locations of subjects represented in the images. The processing system further includes logic to match the identified subjects with user accounts by identifying locations of the mobile computing devices executing client applications in the area of real space and matching locations of the mobile computing devices with locations of the subjects.


French Abstract

L'invention concerne des systèmes et des techniques permettant de relier des sujets dans une zone d'un espace réel à des comptes utilisateurs. Les comptes utilisateurs sont liés à des applications clients exécutables sur des dispositifs informatiques mobiles. Une pluralité de caméras sont disposées au-dessus de la zone. Les caméras de la pluralité de caméras produisent des séquences respectives d'images de champs de vision correspondants dans l'espace réel. Un système de traitement est couplé à la pluralité de caméras. Le système de traitement comprend une logique pour déterminer les emplacements des sujets représentés dans les images. Le système de traitement comprend également une logique permettant d'apparier les sujets identifiés avec les comptes utilisateurs en identifiant les emplacements des dispositifs informatiques mobiles exécutant des applications client dans la zone d'espace réel et en appariant les emplacements correspondants des dispositifs informatiques mobiles avec les emplacements des sujets.

Claims

Note: Claims are shown in the official language in which they were submitted.

CLAIMS
1. A system for linking subjects in an area of real space with user
accounts, the user accounts being linked
with client applications executable on mobile computing devices, comprising:
a processing system configured to receive a plurality of sequences of images
of corresponding
fields of view in the real space, the processing system including logic to
determine locations of identified subjects
represented in the images, logic to match the identified subjects with user
accounts by identifying locations of
mobile devices executing client applications in the area of real space, and
matching locations of the mobile devices
with locations of the subjects.
2. The system of claim 1, wherein the mobile computing devices emit signals
usable to indicate locations of
the mobile computing devices in the area of real space, and the logic to match
the identified subjects with user
accounts identifies locations of mobile computing devices using the emitted
signals.
3. The system of claim 1, wherein the logic to match the identified
subjects with user accounts operates
without use of personal identifying biometric information associated with the
user accounts.
4. The system of claim 1, wherein the client applications on the mobile
computing devices cause display of
semaphore images on the mobile computing devices in the area of real space,
and the logic to match the identified
subjects with user accounts identifies locations of mobile computing devices using an image recognition engine
that determines locations of the mobile computing devices displaying semaphore
images.
5. The system of claim 4, including a set of semaphore images accessible to
the processing system, and
wherein the processing system includes logic to accept login communications
from a client application on a mobile
computing device identifying a user account before matching the user account
to an identified subject in the area of
real space, and after accepting login communications sends a selected
semaphore image from the set of semaphore
images to the client application on the mobile device.
6. The system of claim 1, wherein the client applications on the mobile
computing devices transmit location
data to the processing system, and the logic to match the identified subjects
with user accounts uses the location data
transmitted from the mobile computing devices.
7. The system of claim 6, wherein the logic to match the identified
subjects with user accounts includes logic
that uses the location data transmitted from the mobile computing device from
a plurality of locations over a time
interval in the area of real space, the logic further including:
determining that all other mobile computing devices transmitting location
information of
unmatched user accounts are separated from the mobile computing device by a
predetermined distance;
determining a closest unmatched identified subject to the mobile computing
device; and
matching the identified unmatched subject with the user account of the client
application
executing on the mobile computing device.
8. The system of claim 1, wherein the client applications on the mobile
computing devices transmit
accelerometer data to the processing system, and the logic to match the
identified subjects with user accounts uses
the accelerometer data transmitted from the mobile computing devices.
9. The system of claim 1, wherein the mobile computing devices emit signals
usable to indicate locations of
the mobile computing devices in the area of real space, and the logic to match
the identified subjects with user
accounts includes a trained network to identify locations of mobile computing
devices in the area of real space based
on the signals emitted by the mobile computing devices.
10. The system of claim 1, further including log data structures including
a list of inventory items for the
identified subjects, the processing system including logic to associate the
log data structure for the matched
identified subject to the user account for the identified subject.
11. A method of linking subjects in an area of real space with user
accounts, the user accounts being linked
with client applications executable on mobile computing devices, the method
including:
receiving a plurality of sequences of images of corresponding fields of view
in the real space;
determining locations of identified subjects represented in the sequences of
images;
matching the identified subjects with user accounts by identifying locations
of mobile computing devices
executing client applications in the area of real space; and
matching locations of the mobile computing devices with locations of the
subjects.
12. The method of claim 11, wherein:
the mobile devices emit signals usable to indicate locations of the mobile
computing devices in the area of
real space; and
matching the identified subjects with user accounts further includes
identifying locations of mobile
computing devices using the emitted signals.
13. The method of claim 11, wherein matching the identified subjects with
user accounts is performed without
use of personal identifying biometric information associated with the user
accounts.
14. The method of claim 11, wherein the client applications on the mobile
computing devices cause display of
semaphore images on the mobile computing devices in the area of real space,
and matching the identified subjects
with user accounts further includes identifying locations of mobile computing
devices using an image recognition
engine that determines locations of the mobile computing devices displaying
semaphore images.
15. The method of claim 14, further including accepting login
communications from a client application on a
mobile computing device identifying a user account before matching the user
account to an identified subject in the
area of real space, and after accepting login communications sending a
selected semaphore image from a set of
semaphore images to the client application on the mobile computing device.
16. The method of claim 11, wherein matching the identified subjects with
user accounts further includes using
location data transmitted from the mobile computing devices.
17. The method of claim 16, further including matching the identified
subjects with user accounts using the
location data transmitted from the mobile computing device from a plurality of
locations over a time interval in the
area of real space, including:
determining that all other mobile computing devices transmitting location
information of unmatched user
accounts are separated from the mobile computing device by a predetermined
distance;
determining a closest unmatched identified subject to the mobile computing
device; and
matching the identified unmatched subject with the user account of the client
application
executing on the mobile computing device.
18. The method of claim 11, wherein matching the identified subjects with
user accounts further includes using
accelerometer data transmitted from the mobile computing devices.
19. The method of claim 11, wherein matching the identified subjects with
user accounts further includes a
trained network identifying locations of mobile computing devices in the area
of real space based on signals emitted
by the mobile computing devices.
20. The method of claim 11, further including associating a log data
structure that comprises a list of inventory
items for the matched identified subject to the user account for the
identified subject.
21. A non-transitory computer readable storage medium impressed with
computer program instructions to link
subjects in an area of real space with user accounts, the user accounts being
linked with client applications
executable on mobile computing devices, the instructions, when executed on a
processor, implement a method
comprising:
receiving a plurality of sequences of images of corresponding fields of view
in the real space;
determining locations of identified subjects represented in the sequences of
images;
matching the identified subjects with user accounts by identifying locations
of mobile computing devices
executing client applications in the area of real space; and
matching locations of the mobile computing devices with locations of the
subjects.
22. The non-transitory computer readable storage medium of claim 21,
wherein the mobile computing devices
emit signals usable to indicate locations of the mobile computing devices in
the area of real space, and the matching
the identified subjects with user accounts further including identifying
locations of mobile computing devices using
the emitted signals.
23. The non-transitory computer readable storage medium of claim 21,
wherein the matching the identified
subjects with user accounts is performed without use of personal identifying
biometric information associated with
the user accounts.
24. The non-transitory computer readable storage medium of claim 21,
wherein the client applications on the
mobile computing devices cause display of semaphore images on the mobile
computing devices in the area of real
space, and the matching the identified subjects with user accounts further
includes identifying locations of mobile
computing devices using an image recognition engine that determines locations
of the mobile computing devices
displaying semaphore images.
25. The non-transitory computer readable storage medium of claim 24,
implementing the method further
comprising:
accepting login communications from a client application on a mobile computing
device identifying a user
account before matching the user account to an identified subject in the area
of real space, and after accepting login
communications sending a selected semaphore image from a set of semaphore
images to the client application on
the mobile computing device.
26. The non-transitory computer readable storage medium of claim 21,
wherein matching the identified
subjects with user accounts further includes using location data transmitted
from the mobile computing devices.
27. The non-transitory computer readable storage medium of claim 26,
implementing the method further
comprising:
matching the identified subjects with user accounts using the location data
transmitted from the mobile
device from a plurality of locations over a time interval in the area of real
space, including:
determining that all other mobile computing devices transmitting location
information of
unmatched user accounts are separated from the mobile computing device by a
predetermined distance;
determining a closest unmatched identified subject to the mobile device; and
matching the identified unmatched subject with the user account of the client
application
executing on the mobile computing device.
28. The non-transitory computer readable storage medium of claim 21,
wherein matching the identified
subjects with user accounts further includes using accelerometer data
transmitted from the mobile computing
devices.
29. The non-transitory computer readable storage medium of claim 21,
wherein matching the identified
subjects with user accounts further includes a trained network identifying
locations of mobile computing devices in
the area of real space based on signals emitted by the mobile computing
devices.
30. The non-transitory computer readable storage medium of claim 21,
further including associating a log data
structure that comprises a list of inventory items for the matched identified
subject to the user account for the
identified subject.

Description

Note: Descriptions are shown in the official language in which they were submitted.

SYSTEMS AND METHODS TO CHECK-IN SHOPPERS IN A
CASHIER-LESS STORE
PRIORITY APPLICATION
[0001] This application claims the benefit of U.S. Provisional Patent
Application No. 62/703,785 (Atty.
Docket No. STCG 1006-1), filed 26 July 2018, and of U.S. Non-Provisional
Application No. 16/255,573 (Atty.
Docket No. STCG 1009-1), filed 23 January 2019, which is a continuation-in-
part of U.S. Patent Application No.
15/945,473 (Atty. Docket No. STCG 1005-1) filed 04 April 2018, which is a
continuation-in-part of U.S. Patent
Application No. 15/907,112 (Atty. Docket No. STCG 1002-1) filed 27 February
2018 (now U.S. Patent No.
10,133,933, issued 20 November 2018), which is a continuation-in-part of U.S.
Patent Application No. 15/847,796
(Atty. Docket No. STCG 1001-1), filed 19 December 2017 (now U.S. Patent No.
10,055,853, issued 21 August 2018),
which claims benefit of U.S. Provisional Patent Application No. 62/542,077
(Atty. Docket No. STCG 1000-1), filed
07 August 2017, which applications are incorporated herein by reference.
BACKGROUND
Field
[0002] The present invention relates to systems that link subjects in an
area of real space with user
accounts linked with client applications executing on mobile computing
devices.
Description of Related Art
[0003] Identifying subjects within an area of real space, such as people in a shopping store, and uniquely associating the identified subjects with real people or with authenticated accounts associated with responsible parties, can present many technical challenges. For example, consider such an image processing system deployed in a shopping store with multiple customers moving in aisles between the shelves and open spaces within the shopping store. Customers take items from shelves and put them in their respective shopping carts or baskets. Customers may also put items back on the shelf if they do not want them. Though the system may identify a subject in the images, and the items the subject takes, the system must accurately identify an authentic user account responsible for the items taken by that subject.
[0004] In some systems, facial recognition, or other biometric
recognition technique, might be used to
identify the subjects in the images, and link them with accounts. This
approach, however, requires access by the
image processing system to databases storing the personal identifying
biometric information, linked with the
accounts. This is undesirable from a security and privacy standpoint in many
settings.
[0005] It is desirable to provide a system that can more effectively and
automatically link a subject in an
area of real space to a user known to the system for providing services to the
subject. Also, it is desirable to provide
image processing systems by which images of large spaces are used to identify
subjects without requiring personal
identifying biometric information of the subjects.
SUMMARY
[0006] A system, and method for operating a system, are provided for
linking subjects, such as persons in
an area of real space, with user accounts. The system can use image processing
to identify subjects in the area of real
space without requiring personal identifying biometric information. The user
accounts are linked with client
applications executable on mobile computing devices. This function of linking
identified subjects to user accounts
by image and signal processing presents a complex problem of computer
engineering, relating to the type of image
and signal data to be processed, what processing of the image and signal data
to perform, and how to determine
actions from the image and signal data with high reliability.
[0007] A system and method are provided for linking subjects in an area
of real space with user accounts.
The user accounts are linked with client applications executable on mobile
computing devices. A plurality of
cameras or other sensors produce respective sequences of images in
corresponding fields of view in the real space.
Using these sequences of images, a system and method are described for
determining locations of identified subjects
represented in the images and matching the identified subjects with user
accounts by identifying locations of mobile
devices executing client applications in the area of real space and matching
locations of the mobile devices with
locations of the subjects.
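For illustration only, the following minimal Python sketch shows the basic idea of matching tracked subjects to user accounts by comparing locations. The names (Subject, MobileDevice, match_subjects_to_accounts) and the simple nearest-distance rule are hypothetical assumptions, not the disclosed implementation, which also relies on semaphore images, accelerometer data and trained networks as described below.

```python
from dataclasses import dataclass
from math import dist

@dataclass
class Subject:
    subject_id: int          # anonymous tracking identifier ("subject_id")
    location: tuple          # (x, y) position on the store floor

@dataclass
class MobileDevice:
    account_id: str          # user account linked to the client application
    location: tuple          # device location inferred from its emitted signals

def match_subjects_to_accounts(subjects, devices, max_distance=1.0):
    """Pair each device (and its user account) with the nearest unmatched subject."""
    matches = {}
    unmatched = list(subjects)
    for device in devices:
        # find the closest subject that has not been linked yet
        candidate = min(unmatched, key=lambda s: dist(s.location, device.location),
                        default=None)
        if candidate and dist(candidate.location, device.location) <= max_distance:
            matches[candidate.subject_id] = device.account_id
            unmatched.remove(candidate)
    return matches

# two tracked subjects, one device that has checked in with a user account
subjects = [Subject(1, (2.0, 3.0)), Subject(2, (8.5, 1.0))]
devices = [MobileDevice("acct-42", (2.1, 2.9))]
print(match_subjects_to_accounts(subjects, devices))   # {1: 'acct-42'}
```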
[0008] In one embodiment described herein, the mobile devices emit
signals usable to indicate locations
of the mobile devices in the area of real space. The system matches the
identified subjects with user accounts by
identifying locations of mobile devices using the emitted signals.
[0009] In one embodiment, the signals emitted by the mobile devices
comprise images. In a described
embodiment, the client applications on the mobile devices cause display of
semaphore images, which can be as
simple as a particular color, on the mobile devices in the area of real space.
The system matches the identified
subjects with user accounts by identifying locations of mobile devices by
using an image recognition engine that
determines locations of the mobile devices displaying semaphore images. The
system includes a set of semaphore
images. The system accepts login communications from a client application on a
mobile device identifying a user
account before matching the user account to an identified subject in the area
of real space. After accepting login
communications, the system sends a selected semaphore image from the set of
semaphore images to the client
application on the mobile device. The system sets a status of the selected
semaphore image as assigned. The system
receives a displayed image of the selected semaphore image, recognizes the
displayed image and matches the
recognized image with the assigned images from the set of semaphore images.
The system matches a location of the
mobile device displaying the recognized semaphore image located in the area of
real space with a not yet linked
identified subject. The system, after matching the user account to the
identified subject, sets the status of the
recognized semaphore image as available.
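The assignment lifecycle of semaphore images can be sketched as follows. This is a hedged illustration with hypothetical names (SemaphoreImages, assign, release); it only shows the "available"/"assigned" status handling described above.

```python
import random

class SemaphoreImages:
    """Minimal sketch of the semaphore image pool with 'available'/'assigned' status."""
    def __init__(self, image_ids):
        self.status = {image_id: "available" for image_id in image_ids}
        self.assigned_to = {}                      # image_id -> user account

    def assign(self, account_id):
        # pick an available image, mark it assigned and remember the account
        available = [i for i, s in self.status.items() if s == "available"]
        image_id = random.choice(available)
        self.status[image_id] = "assigned"
        self.assigned_to[image_id] = account_id
        return image_id                            # sent to the client application for display

    def release(self, image_id):
        # once the account is matched to a subject, free the image for reuse
        self.status[image_id] = "available"
        return self.assigned_to.pop(image_id)

pool = SemaphoreImages(["red_rectangle", "pink_elephant", "blue_triangle"])
shown = pool.assign("acct-42")          # after accepting login communications
# ... image recognition engine locates a device displaying `shown` and links the subject ...
account = pool.release(shown)           # status returns to "available"
```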
[0010] In one embodiment, the signals emitted by the mobile devices
comprise radio signals indicating a
service location of the mobile device. The system receives location data
transmitted by the client applications on the
mobile devices. The system matches the identified subjects with user accounts
using the location data transmitted
from the mobile devices. The system uses the location data transmitted from
the mobile device from a plurality of
locations over a time interval in the area of real space to match the
identified subjects with user accounts. This
matching the identified unmatched subject with the user account of the client
application executing on the mobile
device includes determining that all other mobile devices transmitting
location information of unmatched user
accounts are separated from the mobile device by a predetermined distance and
determining a closest unmatched
identified subject to the mobile device.
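A minimal sketch of this location-based rule follows, assuming simple (x, y) floor coordinates and a hypothetical min_separation threshold standing in for the predetermined distance; the function and field names are illustrative only.

```python
from math import dist

def match_by_service_location(device, other_devices, unmatched_subjects, min_separation=3.0):
    """Hedged sketch of the location-based rule described above.

    A device is matched only when every other device reporting location data for an
    unmatched account is at least `min_separation` away; the closest unmatched subject
    is then linked to the device's user account."""
    # condition 1: all other unmatched devices are separated by a predetermined distance
    if any(dist(device["location"], other["location"]) < min_separation
           for other in other_devices):
        return None                                    # ambiguous, try again later
    # condition 2: pick the closest unmatched identified subject
    closest = min(unmatched_subjects, key=lambda s: dist(s["location"], device["location"]))
    return (closest["subject_id"], device["account_id"])

device = {"account_id": "acct-42", "location": (2.0, 3.0)}
others = [{"account_id": "acct-7", "location": (9.0, 9.0)}]
subjects = [{"subject_id": 1, "location": (2.2, 3.1)}, {"subject_id": 2, "location": (7.0, 8.0)}]
print(match_by_service_location(device, others, subjects))   # (1, 'acct-42')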
[0011] In one embodiment, the signals emitted by the mobile devices
comprise radio signals indicating
acceleration and orientation of the mobile device. In one embodiment, such
acceleration data is generated by
an accelerometer of the mobile computing device. In another embodiment, in
addition to the accelerometer data,
direction data from a compass on the mobile device is also received by the
processing system. The system receives
the accelerometer data from the client applications on the mobile devices. The
system matches the identified
subjects with user accounts using the accelerometer data transmitted from the
mobile device. In this embodiment,
the system uses the accelerometer data transmitted from the mobile device from
a plurality of locations over a time
interval in the area of real space and the derivative of data indicating the
locations of identified subjects over the time
interval in the area of real space to match the identified subjects with user
accounts.
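The use of accelerometer data can be sketched as comparing a motion profile reported by the device with motion profiles derived from the tracked locations of subjects over the same time interval. The profile comparison below is a hypothetical illustration, not the disclosed computation.

```python
def speeds_from_locations(locations, dt=1.0):
    """Approximate speed of a tracked subject from successive (x, y) locations."""
    return [((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5 / dt
            for (x1, y1), (x2, y2) in zip(locations, locations[1:])]

def match_by_motion(device_speeds, subjects):
    """Pick the subject whose speed profile over the time interval best matches the
    speed profile reported by the mobile device (e.g. integrated accelerometer data)."""
    def mismatch(subject):
        subject_speeds = speeds_from_locations(subject["locations"])
        return sum(abs(a - b) for a, b in zip(device_speeds, subject_speeds))
    best = min(subjects, key=mismatch)
    return best["subject_id"]

subjects = [
    {"subject_id": 1, "locations": [(0, 0), (1, 0), (2, 0), (3, 0)]},   # walking steadily
    {"subject_id": 2, "locations": [(5, 5), (5, 5), (5, 5), (5, 5)]},   # standing still
]
device_speeds = [1.0, 1.0, 1.0]      # derived from the device's accelerometer over the interval
print(match_by_motion(device_speeds, subjects))   # 1
```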
[0012] In one embodiment, the system matches the identified subjects with
user accounts using a trained
network to identify locations of mobile devices in the area of real space
based on the signals emitted by the mobile
devices. In such an embodiment, the signals emitted by the mobile devices
include location data and accelerometer
data.
[0013] In one embodiment, the system includes log data structures
including a list of inventory items for
the identified subjects. The system associates the log data structure for the
matched identified subject to the user
account for the identified subject.
[0014] In one embodiment, the system processes a payment for the list of
inventory items for the
identified subject from a payment method identified in the user account linked
to the identified subject.
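A minimal sketch of associating the log data structure with the matched user account and charging the account's preferred payment method; the in-memory dictionaries stand in for the subject and user account databases and are purely illustrative.

```python
# Hypothetical in-memory stores standing in for the subject and user account databases
shopping_logs = {1: ["milk", "bread", "coffee"]}          # subject_id -> list of inventory items
user_accounts = {"acct-42": {"payment_method": "card-on-file"}}
matches = {1: "acct-42"}                                   # output of the matching engine

def check_out(subject_id):
    """Associate the subject's log data structure with the matched user account and
    process payment with the account's preferred payment method (sketch only)."""
    account_id = matches[subject_id]
    items = shopping_logs[subject_id]
    user_accounts[account_id].setdefault("purchases", []).append(items)
    return f"charged {len(items)} items to {account_id} via {user_accounts[account_id]['payment_method']}"

print(check_out(1))
```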
[0015] In one embodiment, the system matches the identified subjects with
user accounts without use of
personal identifying biometric information associated with the user accounts.
[0016] Methods and computer program products which can be executed by
computer systems are also
described herein.
[0017] Other aspects and advantages of the present invention can be seen
on review of the drawings, the
detailed description and the claims, which follow.
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] Fig. 1 illustrates an architectural level schematic of a system in
which a matching engine links
subjects identified by a subject tracking engine to user accounts linked with
client applications executing on mobile
devices.
[0019] Fig. 2 is a side view of an aisle in a shopping store illustrating
a subject with a mobile computing
device and a camera arrangement.
[0020] Fig. 3 is a top view of the aisle of Fig. 2 in a shopping store
illustrating the subject with the mobile
computing device and the camera arrangement.
[0021] Fig. 4 shows an example data structure for storing joints
information of subjects.
[0022] Fig. 5 shows an example data structure for storing a subject
including the information of
associated joints.
[0023] Fig. 6 is a flowchart showing process steps for matching an
identified subject to a user account
using a semaphore image displayed on a mobile computing device.
[0024] Fig. 7 is a flowchart showing process steps for matching an
identified subject to a user account
using service location of a mobile computing device.
[0025] Fig. 8 is a flowchart showing process steps for matching an
identified subject to a user account
using velocity of subjects and a mobile computing device.
[0026] Fig. 9A is a flowchart showing a first part of process steps for
matching an identified subject to a
user account using a network ensemble.
[0027] Fig. 9B is a flowchart showing a second part of process steps for
matching an identified subject to
a user account using a network ensemble.
[0028] Fig. 9C is a flowchart showing a third part of process steps for
matching an identified subject to a
user account using a network ensemble.
[0029] Fig. 10 is an example architecture in which the four techniques
presented in Figs. 6 to 9C are
applied in an area of real space to reliably match an identified subject to a
user account.
[0030] Fig. 11 is a camera and computer hardware arrangement configured
for hosting the matching
engine of Fig. 1.
DETAILED DESCRIPTION
[0031] The following description is presented to enable any person
skilled in the art to make and use the
invention, and is provided in the context of a particular application and its
requirements. Various modifications to
the disclosed embodiments will be readily apparent to those skilled in the
art, and the general principles defined
herein may be applied to other embodiments and applications without departing
from the spirit and scope of the
present invention. Thus, the present invention is not intended to be limited
to the embodiments shown but is to be
accorded the widest scope consistent with the principles and features
disclosed herein.
System Overview
[0032] A system and various implementations of the subject technology are
described with reference to
Figs. 1-11. The system and processes are described with reference to Fig. 1,
an architectural level schematic of a
system in accordance with an implementation. Because Fig. 1 is an
architectural diagram, certain details are omitted
to improve the clarity of the description.
[0033] The discussion of Fig. 1 is organized as follows. First, the
elements of the system are described,
followed by their interconnections. Then, the use of the elements in the
system is described in greater detail.
[0034] Fig. 1 provides a block diagram level illustration of a system
100. The system 100 includes
cameras 114, network nodes hosting image recognition engines 112a, 112b, and
112n, a subject tracking engine 110
deployed in a network node 102 (or nodes) on the network, mobile computing
devices 118a, 118b, 118m
(collectively referred to as mobile computing devices 120), a training database
130, a subject database 140, a user
account database 150, an image database 160, a matching engine 170 deployed in
a network node or nodes (also
known as a processing platform) 103, and a communication network or networks
181. The network nodes can host
only one image recognition engine, or several image recognition engines. The
system can also include an inventory
database and other supporting data.
[0035] As used herein, a network node is an addressable hardware device
or virtual device that is attached
to a network, and is capable of sending, receiving, or forwarding information
over a communications channel to or
from other network nodes. Examples of electronic devices which can be deployed
as hardware network nodes
include all varieties of computers, workstations, laptop computers, handheld
computers, and smartphones. Network
nodes can be implemented in a cloud-based server system. More than one virtual
device configured as a network
node can be implemented using a single physical device.
[0036] For the sake of clarity, only three network nodes hosting image
recognition engines are shown in
the system 100. However, any number of network nodes hosting image recognition
engines can be connected to the
subject tracking engine 110 through the network(s) 181. Similarly, three
mobile computing devices are shown in the
system 100. However, any number of mobile computing devices can be connected
to the network node 103 hosting
the matching engine 170 through the network(s) 181. Also, an image recognition
engine, a subject tracking engine, a
matching engine and other processing engines described herein can execute
using more than one network node in a
distributed architecture.
[0037] The interconnection of the elements of system 100 will now be
described. Network(s) 181 couples
the network nodes 101a, 101b, and 101n, respectively, hosting image
recognition engines 112a, 112b, and 112n, the
network node 102 hosting the subject tracking engine 110, the mobile computing
devices 118a, 118b, and 118m, the
training database 130, the subject database 140, the user account database
150, the image database 160, and the
network node 103 hosting the matching engine 170. Cameras 114 are connected to
the subject tracking engine 110
through network nodes hosting image recognition engines 112a, 112b, and 112n.
In one embodiment, the cameras
114 are installed in a shopping store such that sets of cameras 114 (two or
more) with overlapping fields of view are
positioned over each aisle to capture images of real space in the store. In
Fig. 1, two cameras are arranged over aisle
116a, two cameras are arranged over aisle 116b, and three cameras are arranged
over aisle 116n. The cameras 114
are installed over aisles with overlapping fields of view. In such an
embodiment, the cameras are configured with
the goal that customers moving in the aisles of the shopping store are present
in the field of view of two or more
cameras at any moment in time.
[0038] Cameras 114 can be synchronized in time with each other, so that
images are captured at the same
time, or close in time, and at the same image capture rate. The cameras 114
can send respective continuous streams
of images at a predetermined rate to network nodes hosting image recognition
engines 112a-112n. Images captured
in all the cameras covering an area of real space at the same time, or close
in time, are synchronized in the sense that
the synchronized images can be identified in the processing engines as
representing different views of subjects
having fixed positions in the real space. For example, in one embodiment, the
cameras send image frames at the
rate of 30 frames per second (fps) to respective network nodes hosting image
recognition engines 112a-112n. Each
frame has a timestamp, identity of the camera (abbreviated as "camera_id"),
and a frame identity (abbreviated as
"frame_id") along with the image data. Other embodiments of the technology
disclosed can use different types of
sensors such as infrared or RF image sensors, ultrasound sensors, thermal
sensors, Lidars, etc., to generate this data.
Multiple types of sensors can be used, including for example ultrasound or RF
sensors in addition to the cameras
114 that generate RGB color output. Multiple sensors can be synchronized in
time with each other, so that frames
are captured by the sensors at the same time, or close in time, and at the
same frame capture rate. In all of the
embodiments described herein sensors other than cameras, or sensors of
multiple types, can be used to produce the
sequences of images utilized.
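A minimal sketch of the per-frame metadata and of grouping frames captured at the same time, or close in time, across cameras; the Frame dataclass and the bucketing tolerance are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """Metadata carried with every image sent to an image recognition engine."""
    camera_id: str        # identity of the source camera
    frame_id: int         # per-camera frame counter
    timestamp: float      # capture time, used to group synchronized frames
    image: bytes          # raw RGB image data (or another sensor's output)

def group_synchronized(frames, tolerance=1 / 30):
    """Group frames captured close in time across cameras, so they can be treated
    as different views of subjects having fixed positions in the real space."""
    groups = {}
    for frame in frames:
        key = round(frame.timestamp / tolerance)    # bucket by capture instant
        groups.setdefault(key, []).append(frame)
    return list(groups.values())

frames = [Frame("cam-01", 17, 12.000, b""), Frame("cam-02", 17, 12.010, b"")]
print([[f.camera_id for f in g] for g in group_synchronized(frames)])   # [['cam-01', 'cam-02']]
```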
[0039] Cameras installed over an aisle are connected to respective image
recognition engines. For
example, in Fig. 1, the two cameras installed over the aisle 116a are
connected to the network node 101a hosting an
image recognition engine 112a. Likewise, the two cameras installed over aisle
116b are connected to the network
node 101b hosting an image recognition engine 112b. Each image recognition
engine 112a-112n hosted in a
network node or nodes 101a-101n separately processes the image frames
received from one camera each in the
illustrated example.
[0040] In one embodiment, each image recognition engine 112a, 112b, and
112n is implemented as a
deep learning algorithm such as a convolutional neural network (abbreviated
CNN). In such an embodiment, the
CNN is trained using a training database 130. In an embodiment described
herein, image recognition of subjects in
the real space is based on identifying and grouping joints recognizable in the
images, where the groups of joints can
be attributed to an individual subject. For this joints-based analysis, the
training database 130 has a large collection
of images for each of the different types of joints for subjects. In the
example embodiment of a shopping store, the
subjects are the customers moving in the aisles between the shelves. In an
example embodiment, during training of
the CNN, the system 100 is referred to as a "training system." After training
the CNN using the training database
130, the CNN is switched to production mode to process images of customers in
the shopping store in real time.
[0041] In an example embodiment, during production, the system 100 is
referred to as a runtime system
(also referred to as an inference system). The CNN in each image recognition
engine produces arrays of joints data
structures for images in its respective stream of images. In an embodiment as
described herein, an array of joints
data structures is produced for each processed image, so that each image
recognition engine 112a-112n produces an
output stream of arrays of joints data structures. These arrays of joints data
structures from cameras having
overlapping fields of view are further processed to form groups of joints, and
to identify such groups of joints as
subjects. These groups of joints may not uniquely identify the individual in
the image, or an authentic user account
for the individual in the image, but can be used to track a subject in the
area. The subjects can be identified and
tracked by the system using an identifier "subject_id" during their presence
in the area of real space.
[0042] For example, when a customer enters a shopping store, the system identifies the customer using joints analysis as described above and assigns a "subject_id". This identifier is, however, not linked to the real world identity of the subject, such as a user account, name, driver's license number, email address, mailing address, credit card number or bank account number, nor to biometric identifiers such as fingerprints, facial recognition, hand geometry, retina scans, iris scans or voice recognition. Therefore, the
identified subject is anonymous. Details of an example technology for subject
identification and tracking are
presented in United States Patent No. 10,055,853, issued 21 August 2018,
titled, "Subject Identification and
Tracking Using Image Recognition Engine" which is incorporated herein by
reference as if fully set forth herein.
[0043] The subject tracking engine 110, hosted on the network node 102,
receives, in this example,
continuous streams of arrays of joints data structures for the subjects from
image recognition engines 112a-112n.
The subject tracking engine 110 processes the arrays of joints data structures
and translates the coordinates of the
elements in the arrays of joints data structures corresponding to images in
different sequences into candidate joints
having coordinates in the real space. For each set of synchronized images, the
combination of candidate joints
identified throughout the real space can be considered, for the purposes of
analogy, to be like a galaxy of candidate
joints. For each succeeding point in time, movement of the candidate joints is
recorded so that the galaxy changes
over time. The output of the subject tracking engine 110 is stored in the
subject database 140.
[0044] The subject tracking engine 110 uses logic to identify groups or
sets of candidate joints having
coordinates in real space as subjects in the real space. For the purposes of
analogy, each set of candidate points is
like a constellation of candidate joints at each point in time. The
constellations of candidate joints can move over
time.
[0045] In an example embodiment, the logic to identify sets of candidate
joints comprises heuristic
functions based on physical relationships amongst joints of subjects in real
space. These heuristic functions are used
to identify sets of candidate joints as subjects. The sets of candidate joints
comprise individual candidate joints that
have relationships according to the heuristic parameters with other individual
candidate joints and subsets of
candidate joints in a given set that has been identified, or can be
identified, as an individual subject.
[0046] In the example of a shopping store, as the customer completes
shopping and moves out of the
store, the system processes payment of items bought by the customer. In a
cashier-less store, the system has to link
the customer with a "user account" containing a preferred payment method
provided by the customer.
[0047] As described above, the "identified subject" is anonymous because
information about the joints
and relationships among the joints is not stored as biometric identifying
information linked to an individual or to a
user account.
[0048] The system includes a matching engine 170 (hosted on the network
node 103) to process signals
received from mobile computing devices 120 (carried by the subjects) to match
the identified subjects with user
accounts. The matching can be performed by identifying locations of mobile
devices executing client applications in
the area of real space (e.g., the shopping store) and matching locations of
mobile devices with locations of subjects,
without use of personal identifying biometric information from the images.
[0049] The actual communication path to the network node 103 hosting the
matching engine 170 through
the network 181 can be point-to-point over public and/or private networks. The
communications can occur over a
variety of networks 181, e.g., private networks, VPN, MPLS circuit, or
Internet, and can use appropriate application
programming interfaces (APIs) and data interchange formats, e.g.,
Representational State Transfer (REST),
JavaScript™ Object Notation (JSON), Extensible Markup Language (XML), Simple Object Access Protocol (SOAP), Java™ Message Service (JMS), and/or Java Platform Module System. All
of the communications can be
encrypted. The communication is generally over a network such as a LAN (local
area network), WAN (wide area
network), telephone network (Public Switched Telephone Network (PSTN), Session
Initiation Protocol (SIP),
wireless network, point-to-point network, star network, token ring network,
hub network, Internet, inclusive of the
mobile Internet, via protocols such as EDGE, 3G, 4G LTE, Wi-Fi, and WiMAX.
Additionally, a variety of
authorization and authentication techniques, such as username/password, Open
Authorization (OAuth), Kerberos,
SecureID, digital certificates and more, can be used to secure the
communications.
[0050] The technology disclosed herein can be implemented in the context of any computer-implemented system including a database system, a multi-tenant environment, or a relational database implementation like an Oracle™ compatible database implementation, an IBM DB2 Enterprise Server™ compatible relational database implementation, a MySQL™ or PostgreSQL™ compatible relational database implementation or a Microsoft SQL Server™ compatible relational database implementation or a NoSQL™ non-relational database implementation such as a Vampire™ compatible non-relational database implementation, an Apache Cassandra™ compatible non-relational database implementation, a BigTable™ compatible non-relational database implementation or an HBase™ or DynamoDB™ compatible non-relational database implementation. In addition, the technology disclosed can be implemented using different programming models like MapReduce™, bulk synchronous programming, MPI primitives, etc. or different scalable batch and stream management systems like Apache Storm™, Apache Spark™, Apache Kafka™, Apache Flink™, Truviso™, Amazon Elasticsearch Service™, Amazon Web Services™ (AWS), IBM Info-Sphere™, Borealis™, and Yahoo! S4™.
Camera Arrangement
[0051] The cameras 114 are arranged to track multi-joint subjects (or
entities) in a three-dimensional
(abbreviated as 3D) real space. In the example embodiment of the shopping
store, the real space can include the area
of the shopping store where items for sale are stacked in shelves. A point in
the real space can be represented by an
(x, y, z) coordinate system. Each point in the area of real space for which
the system is deployed is covered by the
fields of view of two or more cameras 114.
[0052] In a shopping store, the shelves and other inventory display
structures can be arranged in a variety
of manners, such as along the walls of the shopping store, or in rows forming
aisles or a combination of the two
arrangements. Fig. 2 shows an arrangement of shelves, forming an aisle 116a,
viewed from one end of the aisle
116a. Two cameras, camera A 206 and camera B 208 are positioned over the aisle
116a at a predetermined distance
from a roof 230 and a floor 220 of the shopping store above the inventory
display structures, such as shelves. The
cameras 114 comprise cameras disposed over and having fields of view
encompassing respective parts of the
inventory display structures and floor area in the real space. The coordinates
in real space of members of a set of
candidate joints, identified as a subject, identify locations of the subject
in the floor area. In Fig. 2, a subject 240 is
holding the mobile computing device 118a and standing on the floor 220 in the
aisle 116a. The mobile computing
device can send and receive signals through the wireless network(s) 181. In
one example, the mobile computing
devices 120 communicate through a wireless network using for example a Wi-Fi
protocol, or other wireless
protocols like Bluetooth, ultra-wideband, and ZigBee, through wireless access
points (WAP) 250 and 252.
[0053] In the example embodiment of the shopping store, the real space
can include all of the floor 220 in
the shopping store from which inventory can be accessed. Cameras 114 are
placed and oriented such that areas of
the floor 220 and shelves can be seen by at least two cameras. The cameras 114
also cover at least part of the shelves
202 and 204 and floor space in front of the shelves 202 and 204. Camera angles
are selected to include both steep, straight-down perspectives and angled perspectives that give more full-body
images of the customers. In one
example embodiment, the cameras 114 are configured at an eight (8) foot height
or higher throughout the shopping
store.
[0054] In Fig. 2, the cameras 206 and 208 have overlapping fields of
view, covering the space between a
shelf A 202 and a shelf B 204 with overlapping fields of view 216 and 218,
respectively. A location in the real space
is represented as a (x, y, z) point of the real space coordinate system. "x"
and "y" represent positions on a two-
dimensional (2D) plane which can be the floor 220 of the shopping store. The
value "z" is the height of the point
above the 2D plane at floor 220 in one configuration.
[0055] Fig. 3 illustrates the aisle 116a viewed from the top of Fig. 2,
further showing an example
arrangement of the positions of cameras 206 and 208 over the aisle 116a. The
cameras 206 and 208 are positioned
closer to opposite ends of the aisle 116a. The camera A 206 is positioned at a
predetermined distance from the shelf
A 202 and the camera B 208 is positioned at a predetermined distance from the
shelf B 204. In another embodiment,
in which more than two cameras are positioned over an aisle, the cameras are
positioned at equal distances from
each other. In such an embodiment, two cameras are positioned close to the
opposite ends and a third camera is
positioned in the middle of the aisle. It is understood that a number of
different camera arrangements are possible.
Joints Data Structure
[0056] The image recognition engines 112a-112n receive the sequences of
images from cameras 114 and
process images to generate corresponding arrays of joints data structures. In
one embodiment, the image recognition
engines 112a-112n identify one of the 19 possible joints of each subject at
each element of the image. The possible
joints can be grouped in two categories: foot joints and non-foot joints. The
19th type of joint classification is for all
non-joint features of the subject (i.e. elements of the image not classified
as a joint).
Foot Joints:
Ankle joint (left and right)
Non-foot Joints:
Neck
Nose
Eyes (left and right)
Ears (left and right)
Shoulders (left and right)
Elbows (left and right)
Wrists (left and right)
Hip (left and right)
Knees (left and right)
Not a joint
[0057] An array of joints data structures for a particular image
classifies elements of the particular image
by joint type, time of the particular image, and the coordinates of the
elements in the particular image. In one
embodiment, the image recognition engines 112a-112n are convolutional neural
networks (CNN), the joint type is
one of the 19 types of joints of the subjects, the time of the particular
image is the timestamp of the image generated
by the source camera 114 for the particular image, and the coordinates (x, y)
identify the position of the element on
a 2D image plane.
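For illustration, the 19 joint classes named above can be written as a lookup table. Only the left-ankle and right-ankle numbers (1 and 2) are stated explicitly in the text, so the remaining numbering below is an assumption made purely for the sketch.

```python
# Sketch of the 19 joint classes; numbering beyond 1 and 2 is assumed for illustration.
JOINT_TYPES = {
    1: "left ankle", 2: "right ankle",
    3: "neck", 4: "nose",
    5: "left eye", 6: "right eye",
    7: "left ear", 8: "right ear",
    9: "left shoulder", 10: "right shoulder",
    11: "left elbow", 12: "right elbow",
    13: "left wrist", 14: "right wrist",
    15: "left hip", 16: "right hip",
    17: "left knee", 18: "right knee",
    19: "not a joint",        # all non-joint elements of the image
}
FOOT_JOINTS = {1, 2}
NON_FOOT_JOINTS = set(range(3, 19))
```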
[0058] The output of the CNN is a matrix of confidence arrays for each
image per camera. The matrix of
confidence arrays is transformed into an array of joints data structures. A
joints data structure 400 as shown in Fig. 4
is used to store the information of each joint. The joints data structure 400
identifies x and y positions of the element
in the particular image in the 2D image space of the camera from which the
image is received. A joint number
identifies the type of joint identified. For example, in one embodiment, the
values range from 1 to 19. A value of 1
indicates that the joint is a left ankle, a value of 2 indicates the joint is
a right ankle and so on. The type of joint is
selected using the confidence array for that element in the output matrix of
CNN. For example, in one embodiment,
if the value corresponding to the left-ankle joint is highest in the
confidence array for that image element, then the
value of the joint number is "1".
[0059] A confidence number indicates the degree of confidence of the CNN
in predicting that joint. If the
value of the confidence number is high, it means the CNN is confident in its
prediction. An integer-Id is assigned to the
joints data structure to uniquely identify it. Following the above mapping,
the output matrix of confidence arrays per
image is converted into an array of joints data structures for each image. In
one embodiment, the joints analysis
includes performing a combination of k-nearest neighbors, mixture of
Gaussians, and various image morphology
transformations on each input image. The result comprises arrays of joints
data structures which can be stored in the
form of a bit mask in a ring buffer that maps image numbers to bit masks at
each moment in time.
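A minimal sketch of converting a per-element confidence array from the CNN output into a joints data structure by selecting the most confident joint type; the field names follow Fig. 4 loosely and the helper names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class JointsDataStructure:
    """One entry of the per-image array of joints data structures (cf. Fig. 4)."""
    integer_id: int          # unique identifier of this entry
    x: int                   # position of the element in the 2D image plane
    y: int
    joint_number: int        # 1..19, chosen from the confidence array
    confidence: float        # CNN confidence for the chosen joint type

def joints_from_confidence(confidence_arrays):
    """Convert per-element confidence arrays into an array of joints data structures,
    picking the most confident joint type for each element."""
    joints = []
    for integer_id, (x, y, scores) in enumerate(confidence_arrays):
        joint_number = max(range(len(scores)), key=lambda k: scores[k]) + 1
        joints.append(JointsDataStructure(integer_id, x, y, joint_number,
                                           scores[joint_number - 1]))
    return joints

# One element where the left-ankle score (index 0, joint number 1) is highest
sample = [(120, 340, [0.9] + [0.005] * 18)]
print(joints_from_confidence(sample)[0].joint_number)   # 1
```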
Subject Tracking Engine
[0060] The tracking engine 110 is configured to receive arrays of joints
data structures generated by the
image recognition engines 112a-112n corresponding to images in sequences of
images from cameras having
overlapping fields of view. The arrays of joints data structures per image are
sent by image recognition engines
112a-112n to the tracking engine 110 via the network(s) 181. The tracking
engine 110 translates the coordinates of
the elements in the arrays of joints data structures corresponding to images
in different sequences into candidate
joints having coordinates in the real space. The tracking engine 110 comprises
logic to identify sets of candidate
joints having coordinates in real space (constellations of joints) as subjects
in the real space. In one embodiment, the
tracking engine 110 accumulates arrays of joints data structures from the
image recognition engines for all the
cameras at a given moment in time and stores this information as a dictionary
in the subject database 140, to be used
for identifying a constellation of candidate joints. The dictionary can be
arranged in the form of key-value pairs,
where keys are camera ids and values are arrays of joints data structures from
the camera. In such an embodiment,
this dictionary is used in heuristics-based analysis to determine candidate
joints and for assignment of joints to
subjects. In such an embodiment, a high-level input, processing and output of
the tracking engine 110 is illustrated
in table 1. Details of the logic applied by the subject tracking engine 110 to
create subjects by combining candidate
joints and track movement of subjects in the area of real space are presented
in United States Patent No. 10,055,853,
issued 21 August 2018, titled, "Subject Identification and Tracking Using
Image Recognition Engine" which is
incorporated herein by reference.
Table 1: Inputs, processing and outputs from subject tracking engine 110 in an example embodiment.
Inputs: Arrays of joints data structures per image, and for each joints data structure a unique ID, a confidence number, a joint number, and an (x, y) position in image space.
Processing: Create a joints dictionary; reproject joint positions in the fields of view of cameras with overlapping fields of view to candidate joints.
Output: List of identified subjects in the real space at a moment in time.
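A minimal sketch of the joints dictionary described above, keyed by camera id with arrays of joints data structures as values; the helper name and the simplified joint records are illustrative assumptions.

```python
from collections import defaultdict

def build_joints_dictionary(per_camera_outputs):
    """Accumulate arrays of joints data structures from all cameras at a given moment
    into the key-value dictionary used for heuristics-based candidate-joint analysis:
    keys are camera ids, values are that camera's array of joints data structures."""
    joints_dictionary = defaultdict(list)
    for camera_id, joints_array in per_camera_outputs:
        joints_dictionary[camera_id].extend(joints_array)
    return dict(joints_dictionary)

snapshot = [
    ("cam-01", [{"joint_number": 1, "x": 120, "y": 340}]),
    ("cam-02", [{"joint_number": 1, "x": 482, "y": 356}]),
]
print(build_joints_dictionary(snapshot).keys())   # dict_keys(['cam-01', 'cam-02'])
```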
Subject Data Structure
[0061] The subject tracking engine 110 uses heuristics to connect joints of subjects identified by the
image recognition engines 112a-112n. In doing so, the subject tracking engine
110 creates new subjects and updates
the locations of existing subjects by updating their respective joint
locations. The subject tracking engine 110 uses
triangulation techniques to project the locations of joints from 2D space
coordinates (x, y) to 3D real space
coordinates (x, y, z). Fig. 5 shows the subject data structure 500 used to
store the subject. The subject data structure
500 stores the subject related data as a key-value dictionary. The key is a
frame_number and the value is another
key-value dictionary where the key is the camera_id and the value is a list of 18
joints (of the subject) with their locations in
the real space. The subject data is stored in the subject database 140. Every
new subject is also assigned a unique
identifier that is used to access the subject's data in the subject database
140.
[0062] In one embodiment, the system identifies joints of a subject and
creates a skeleton of the subject.
The skeleton is projected into the real space indicating the position and
orientation of the subject in the real space.
This is also referred to as "pose estimation" in the field of machine vision.
In one embodiment, the system displays
orientations and positions of subjects in the real space on a graphical user
interface (GUI). In one embodiment, the
image analysis is anonymous, i.e., a unique identifier assigned to a subject
created through joints analysis does not reveal the personal identity of the subject, as described above.
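A minimal sketch of the subject data structure of Fig. 5 as a nested key-value dictionary; the helper functions are hypothetical and the joint list is abbreviated for brevity.

```python
def new_subject(subject_id):
    """Sketch of the subject data structure of Fig. 5: a key-value dictionary keyed by
    frame_number, whose values map camera_id to the list of the subject's 18 joints
    with their (x, y, z) locations in real space."""
    return {"subject_id": subject_id, "frames": {}}

def update_subject(subject, frame_number, camera_id, joints_3d):
    # record the subject's joint locations observed in this frame from this camera
    subject["frames"].setdefault(frame_number, {})[camera_id] = joints_3d

subject = new_subject(7)
update_subject(subject, frame_number=1017, camera_id="cam-01",
               joints_3d=[("left ankle", (2.1, 3.0, 0.1)), ("right ankle", (2.3, 3.0, 0.1))])
print(subject["frames"][1017]["cam-01"][0])   # ('left ankle', (2.1, 3.0, 0.1))
```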
Matching Engine
[0063] The matching engine 170 includes logic to match the identified
subjects with their respective user
accounts by identifying locations of mobile devices (carried by the identified
subjects) that are executing client
applications in the area of real space. In one embodiment, the matching engine
uses multiple techniques,
independently or in combination, to match the identified subjects with the
user accounts. The system can be
implemented without maintaining biometric identifying information about users,
so that biometric information about
account holders is not exposed to security and privacy concerns raised by
distribution of such information.
[0064] In one embodiment, a customer logs in to the system using a client
application executing on a
personal mobile computing device upon entering the shopping store, identifying
an authentic user account to be
associated with the client application on the mobile device. The system then
sends a "semaphore" image selected
from the set of unassigned semaphore images in the image database 160 to the
client application executing on the
mobile device. The semaphore image is unique to the client application in the
shopping store as the same image is
not freed for use with another client application in the store until the
system has matched the user account to an
identified subject. After that matching, the semaphore image becomes available
for use again. The client application
causes the mobile device to display the semaphore image, which display of the
semaphore image is a signal emitted
by the mobile device to be detected by the system. The matching engine 170
uses the image recognition engines
112a-n or a separate image recognition engine (not shown in Fig. 1) to
recognize the semaphore image and
determine the location of the mobile computing device displaying the semaphore
in the shopping store. The
matching engine 170 matches the location of the mobile computing device to a
location of an identified subject. The
matching engine 170 then links the identified subject (stored in the subject
database 140) to the user account (stored
in the user account database 150) linked to the client application for the
duration in which the subject is present in
the shopping store. No biometric identifying information is used for matching
the identified subject with the user
account, and none is stored in support of this process. That is, there is no
information in the sequences of images
used to compare with stored biometric information for the purposes of matching
the identified subjects with user
accounts in support of this process.
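A minimal sketch of the final linking step once the device displaying the assigned semaphore image has been located; the dictionaries stand in for the subject database 140 and the active subject-to-account links, and the nearest-subject rule is an illustrative assumption. No biometric information appears anywhere in the sketch.

```python
from math import dist

# Hypothetical stand-ins for the subject database (140) and the active links
unlinked_subjects = {7: (2.0, 3.0), 8: (9.0, 4.0)}        # subject_id -> floor location
active_links = {}                                          # subject_id -> account_id

def link_account(account_id, semaphore_location):
    """Once the image recognition engine locates the device displaying the assigned
    semaphore image, link the nearest not-yet-linked subject to the user account for
    the duration of the store visit."""
    subject_id = min(unlinked_subjects,
                     key=lambda s: dist(unlinked_subjects[s], semaphore_location))
    active_links[subject_id] = account_id
    del unlinked_subjects[subject_id]
    return subject_id

print(link_account("acct-42", (2.1, 2.9)))   # 7
```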
[0065] In other embodiments, the matching engine 170 uses other signals
in the alternative or in
combination from the mobile computing devices 120 to link the identified
subjects to user accounts. Examples of
such signals include a service location signal identifying the position of the
mobile computing device in the area of
the real space, speed and orientation of the mobile computing device obtained
from the accelerometer and compass
of the mobile computing device, etc.
[0066] In some embodiments, though embodiments are provided that do not
maintain any biometric
information about account holders, the system can use biometric information to
assist matching a not-yet-linked
identified subject to a user account. For example, in one embodiment, the
system stores "hair color" of the customer
in his or her user account record. During the matching process, the system might use, for example, the hair color of subjects as an additional input to disambiguate and match the subject to a user account. If the user has red hair and there is only one subject with red hair in the area of real space or in close proximity to the mobile computing device, then the system might select the subject with red hair to match the user account.
[0067] The flowcharts in Figs. 6 to 9C present process steps of four
techniques usable alone or in
combination by the matching engine 170.
Semaphore Images
[0068] Fig. 6 is a flowchart 600 presenting process steps for a first
technique for matching identified
subjects in the area of real space with their respective user accounts. In the
example of a shopping store, the subjects
are customers (or shoppers) moving in the store in aisles between shelves and
other open spaces. The process starts
at step 602. As a subject enters the area of real space, the subject opens a
client application on a mobile computing
device and attempts to login. The system verifies the user credentials at step
604 (for example, by querying the user
account database 150) and accepts login communication from the client
application to associate an authenticated
user account with the mobile computing device. The system determines that the
user account of the client
application is not yet linked to an identified subject. The system sends a
semaphore image to the client application
for display on the mobile computing device at step 606. Examples of semaphore
images include various shapes of
solid colors such as a red rectangle or a pink elephant, etc. A variety of
images can be used as semaphores,
preferably suited for high confidence recognition by the image recognition
engine. Each semaphore image can have
a unique identifier. The processing system includes logic to accept login
communications from a client application
on a mobile device identifying a user account before matching the user account
to an identified subject in the area of
real space, and after accepting login communications sends a selected
semaphore image from the set of semaphore
images to the client application on the mobile device.
[0069] In one embodiment, the system selects an available semaphore image
from the image database 160
for sending to the client application. After sending the semaphore image to
the client application, the system
changes a status of the semaphore image in the image database 160 to
"assigned" so that this image is not assigned
to any other client application. The status of the image remains as "assigned"
until the process to match the
identified subject to the mobile computing device is complete. After matching
is complete, the status can be changed
to "available." This allows for rotating use of a small set of semaphores in a
given system, simplifying the image
recognition problem.
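As a rough illustration of this rotating assignment, the following Python sketch keeps a small pool of semaphore image identifiers and flips their status between "available" and "assigned." The class and method names (SemaphoreImagePool, assign, release) are illustrative and not taken from the specification; they assume a simple in-memory store rather than the image database 160.

```python
# Hedged sketch of semaphore image rotation using an in-memory pool.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SemaphoreImagePool:
    # image_id -> status, where status is "available" or "assigned"
    status: dict = field(default_factory=dict)

    def add_image(self, image_id: str) -> None:
        self.status[image_id] = "available"

    def assign(self, client_app_id: str) -> Optional[str]:
        """Pick any available semaphore image and mark it as assigned."""
        for image_id, state in self.status.items():
            if state == "available":
                self.status[image_id] = "assigned"
                return image_id
        return None  # all semaphore images are currently in use

    def release(self, image_id: str) -> None:
        """Called after the user account is matched to an identified subject."""
        self.status[image_id] = "available"

pool = SemaphoreImagePool()
for image_id in ("red_rectangle", "pink_elephant"):
    pool.add_image(image_id)
assigned = pool.assign(client_app_id="client-42")
# ... after the matching engine links the subject to the user account ...
pool.release(assigned)
```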
[0070] The client application receives the semaphore image and displays
it on the mobile computing
device. In one embodiment, the client application also increases the
brightness of the display to increase the image
visibility. The image is captured by one or more cameras 114 and sent to an
image processing engine, referred to as
WhatCNN. The system uses WhatCNN at step 608 to recognize the semaphore images
displayed on the mobile
computing device. In one embodiment, WhatCNN is a convolutional neural network
trained to process the specified
bounding boxes in the images to generate a classification of hands of the
identified subjects. One trained WhatCNN
processes image frames from one camera. In the example embodiment of the
shopping store, for each hand joint in
each image frame, the WhatCNN identifies whether the hand joint is empty. The
WhatCNN also identifies a
semaphore image identifier (in the image database 160) or an SKU (stock
keeping unit) number of the inventory
item in the hand joint, a confidence value indicating the item in the hand
joint is a non-SKU item (i.e. it does not
belong to the shopping store inventory) and a context of the hand joint
location in the image frame.
[0071] As mentioned above, two or more cameras with overlapping fields of
view capture images of
subjects in real space. Joints of a single subject can appear in image frames
of multiple cameras in a respective
image channel. A WhatCNN model per camera identifies semaphore images
(displayed on mobile computing
devices) in hands (represented by hand joints) of subjects. A coordination
logic combines the outputs of WhatCNN
models into a consolidated data structure listing identifiers of semaphore
images in left hand (referred to as
left_hand_classid) and right hand (right_hand_classid) of identified subjects
(step 610). The system stores this
information in a dictionary mapping subject_id to left_hand_classid and
right_hand_classid along with a timestamp,
including locations of the joints in real space. The details of WhatCNN are
presented in United States Patent
Application No. 15/907,112, filed 27 February 2018, titled, "Item Put and Take
Detection Using Image
Recognition" which is incorporated herein by reference as if fully set forth
herein.
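The consolidated data structure described above can be pictured with the following hedged sketch, which assumes the per-camera WhatCNN outputs are already available as small dictionaries carrying subject_id, hand, class_id, confidence, and location fields; these field names are assumptions for illustration, not details from the specification.

```python
# Sketch of the coordination logic that merges per-camera WhatCNN outputs into
# a single dictionary keyed by subject_id.
import time

def consolidate_hand_classes(per_camera_outputs):
    """Keep the highest-confidence class identifier per hand for each subject."""
    consolidated = {}
    best_confidence = {}  # (subject_id, hand) -> confidence of the class kept so far
    for camera_id, detections in per_camera_outputs.items():
        for det in detections:
            subject_id, hand = det["subject_id"], det["hand"]  # hand is "left" or "right"
            entry = consolidated.setdefault(subject_id, {
                "left_hand_classid": None,
                "right_hand_classid": None,
                "timestamp": time.time(),
                "joint_locations": {},
            })
            if det["confidence"] > best_confidence.get((subject_id, hand), float("-inf")):
                best_confidence[(subject_id, hand)] = det["confidence"]
                entry[f"{hand}_hand_classid"] = det["class_id"]
                entry["joint_locations"][hand] = det["location"]
    return consolidated
```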
[0072] At step 612, the system checks if the semaphore image sent to the
client application is recognized
by the WhatCNN by iterating the output of the WhatCNN models for both hands of
all identified subjects. If the
semaphore image is not recognized, the system sends a reminder at a step 614
to the client application to display the
semaphore image on the mobile computing device and repeats process steps 608
to 612. Otherwise, if the semaphore
image is recognized by WhatCNN, the system matches a user_account (from the
user account database 150)
associated with the client application to subject_id (from the subject
database 140) of the identified subject holding
the mobile computing device (step 616). In one embodiment, the system
maintains this mapping (subject_id-
user_account) until the subject is present in the area of real space. The
process ends at step 618.
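A minimal sketch of steps 608 to 616 follows, assuming the consolidated hand classifications are available as a dictionary keyed by subject_id; the helper name match_by_semaphore and its arguments are illustrative only.

```python
# Sketch: scan hand classifications for the semaphore image assigned to a client
# application and, if found, link the subject holding it to the user account.
def match_by_semaphore(assigned_image_id, hand_classes, user_account_id, matches):
    """hand_classes: subject_id -> {"left_hand_classid": ..., "right_hand_classid": ...}"""
    for subject_id, hands in hand_classes.items():
        if assigned_image_id in (hands.get("left_hand_classid"),
                                 hands.get("right_hand_classid")):
            # Maintain the subject_id -> user_account mapping while the subject
            # remains in the area of real space (step 616).
            matches[subject_id] = user_account_id
            return subject_id
    # Not recognized yet: the caller would send a reminder to re-display the image (step 614).
    return None
```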
Service Location
[0073] The flowchart 700 in Fig. 7 presents process steps for a second
technique for matching identified
subjects with user accounts. This technique uses radio signals emitted by the
mobile devices indicating location of
the mobile devices. The process starts at step 702, where the system accepts login
communication from a client application
on a mobile computing device as described above in step 604 to link an
authenticated user account to the mobile
computing device. At step 706, the system receives service location
information from the mobile devices in the area
of real space at regular intervals. In one embodiment, latitude and longitude
coordinates of the mobile computing
device emitted from a global positioning system (GPS) receiver of the mobile
computing device are used by the
system to determine the location. In one embodiment, the service location of
the mobile computing device obtained
from GPS coordinates has an accuracy of between 1 and 3 meters. In another embodiment, the service location of a mobile computing device obtained from GPS coordinates has an accuracy of between 1 and 5 meters.
[0074] Other techniques can be used in combination with the above
technique or independently to
determine the service location of the mobile computing device. Examples of
such techniques include using signal
strengths from different wireless access points (WAP) such as 250 and 252
shown in Figs. 2 and 3 as an indication
of how far the mobile computing device is from respective access points. The
system then uses known locations of
wireless access points (WAP) 250 and 252 to triangulate and determine the
position of the mobile computing device
in the area of real space. Other types of signals (such as Bluetooth, ultra-
wideband, and ZigBee) emitted by the
mobile computing devices can also be used to determine a service location of
the mobile computing device.
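The signal-strength approach can be illustrated with the following sketch, which assumes a log-distance path-loss model to turn received signal strength into an approximate range and then solves a least-squares trilateration over known access point coordinates. The constants, function names, and example values are assumptions for illustration, not values from the specification.

```python
# Sketch of estimating a device's service location from Wi-Fi signal strengths.
import numpy as np

def rssi_to_distance(rssi_dbm, tx_power=-40.0, path_loss_exponent=2.5):
    """Convert a received signal strength to an approximate distance in meters."""
    return 10 ** ((tx_power - rssi_dbm) / (10 * path_loss_exponent))

def trilaterate(ap_positions, distances):
    """Least-squares position estimate from three or more access points.
    ap_positions: (n, 2) array of known AP coordinates; distances: length-n array."""
    p = np.asarray(ap_positions, dtype=float)
    d = np.asarray(distances, dtype=float)
    # Linearize by subtracting the equation of the last AP from the others.
    A = 2 * (p[:-1] - p[-1])
    b = (d[-1] ** 2 - d[:-1] ** 2) + np.sum(p[:-1] ** 2, axis=1) - np.sum(p[-1] ** 2)
    estimate, *_ = np.linalg.lstsq(A, b, rcond=None)
    return estimate  # (x, y) in the store's coordinate system

# Example: two WAPs like 250 and 252 plus a third for a unique 2D solution.
aps = [(0.0, 0.0), (10.0, 0.0), (0.0, 8.0)]
dists = [rssi_to_distance(r) for r in (-55.0, -63.0, -60.0)]
print(trilaterate(aps, dists))
```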
[0075] The system monitors the service locations of mobile devices with
client applications that are not
yet linked to an identified subject at step 708 at regular intervals such as
every second. At step 708, the system
determines the distance of a mobile computing device with an unmatched user
account from all other mobile
computing devices with unmatched user accounts. The system compares this
distance with a pre-determined
threshold distance "d" such as 3 meters. If the mobile computing device is
away from all other mobile devices with
unmatched user accounts by at least "d" distance (step 710), the system
determines a nearest not yet linked subject
to the mobile computing device (step 714). The location of the identified
subject is obtained from the output of the
JointsCNN at step 712. In one embodiment, the location of the subject obtained from the JointsCNN is more accurate
than the service location of the mobile computing device. At step 616, the
system performs the same process as
described above in flowchart 600 to match the subject_id of the identified
subject with the user_account of the client
application. The process ends at a step 718.
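A hedged sketch of steps 708 to 714 follows, assuming device and subject positions are available as 2D points in the store's coordinate system and that the threshold d is 3 meters; the helper name and data layout are illustrative.

```python
# Sketch of matching an isolated unmatched device to the nearest unmatched subject.
import math

def match_by_service_location(device_locations, subject_locations, d=3.0):
    """device_locations / subject_locations: dicts of id -> (x, y) for devices with
    unmatched user accounts and for not yet linked identified subjects."""
    matches = {}
    for dev_id, dev_pos in device_locations.items():
        others = [p for other_id, p in device_locations.items() if other_id != dev_id]
        # Step 710: the device must be at least d away from every other unmatched device.
        if all(math.dist(dev_pos, p) >= d for p in others) and subject_locations:
            # Step 714: pick the nearest not yet linked subject (JointsCNN location).
            nearest = min(subject_locations,
                          key=lambda s: math.dist(dev_pos, subject_locations[s]))
            matches[dev_id] = nearest
    return matches
```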
[0076] No biometric identifying information is used for matching the
identified subject with the user
account, and none is stored in support of this process. That is, there is no
information in the sequences of images
used to compare with stored biometric information for the purposes of matching
the identified subjects with user
account in support of this process. Thus, this logic to match the identified
subjects with user accounts operates
without use of personal identifying biometric information associated with the
user accounts.
Speed and Orientation
[0077] The flowchart 800 in Fig. 8 presents process steps for a third
technique for matching identified
subjects with user accounts. This technique uses signals emitted by an
accelerometer of the mobile computing
devices to match identified subjects with client applications. The process
starts at step 802. At step 604, the system accepts login communication from the client application as described above in the first and second techniques.
At step 806, the system receives signals emitted from the mobile computing
devices carrying data from
accelerometers on the mobile computing devices in the area of real space,
which can be sent at regular intervals. At
a step 808, the system calculates an average velocity of all mobile computing
devices with unmatched user accounts.
[0078] The accelerometers provide acceleration of mobile computing
devices along the three axes (x, y,
z). In one embodiment, the velocity is calculated by taking the acceleration values at small time intervals (e.g., every 10 milliseconds) to calculate the current velocity at time "t", i.e., v_t = v_0 + a*t, where v_0 is the initial velocity. In one embodiment, v_0 is initialized to "0" and subsequently, at every time step t+1, v_t becomes v_0. The velocities along the
three axes are then combined to determine an overall velocity of the mobile
computing device at time "t." Finally at
step 808, the system calculates moving averages of velocities of all mobile
computing devices over a larger period
of time such as 3 seconds which is long enough for the walking gait of an
average person, or over longer periods of
time.
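The velocity computation can be sketched as below, assuming accelerometer samples arrive every 10 milliseconds and that v_0 is initialized to zero; the window length and function name are illustrative choices.

```python
# Sketch: integrate acceleration over small time steps (v_t = v_0 + a*t) per axis,
# take the vector magnitude, then smooth with a moving average over ~3 seconds.
import numpy as np

def device_speeds(accel_samples, dt=0.01, window_seconds=3.0):
    """accel_samples: (n, 3) array of (ax, ay, az) readings taken every dt seconds.
    Returns a smoothed speed series for the device."""
    accel = np.asarray(accel_samples, dtype=float)
    velocities = np.cumsum(accel * dt, axis=0)        # v_t = v_0 + a*t with v_0 = 0
    speeds = np.linalg.norm(velocities, axis=1)       # combine the three axes
    window = max(1, int(window_seconds / dt))
    kernel = np.ones(window) / window
    return np.convolve(speeds, kernel, mode="valid")  # moving average over ~3 s
```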
[0079] At step 810, the system calculates Euclidean distance (also
referred to as L2 norm) between
velocities of all pairs of mobile computing devices with unmatched client
applications to not yet linked identified
subjects. The velocities of subjects are derived from changes in positions of
their joints with respect to time,
obtained from joints analysis and stored in respective subject data structures
500 with timestamps. In one
embodiment, a location of center of mass of each subject is determined using
the joints analysis. The velocity, or
other derivative, of the center of mass location data of the subject is used
for comparison with velocities of mobile
computing devices. For each subject_id-user_account pair, if the value of the
Euclidean distance between their
respective velocities is less than a threshold_0, a score_counter for the
subject_id-user_account pair is incremented.
The above process is performed at regular time intervals, thus updating the
score_counter for each subject_id-
user_account pair.
[0080] At regular time intervals (e.g., every one second), the system
compares the score_counter values
for pairs of every unmatched user account with every not yet linked identified
subject (step 812). If the highest score
is greater than threshold_1 (step 814), the system calculates the difference
between the highest score and the second
highest score (for pair of same user account with a different subject) at step
816. If the difference is greater than
threshold_2, the system selects the mapping of user_account to the identified
subject at step 818 and follows the
same process as described above in step 616. The process ends at a step 820.
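A rough sketch of the scoring in steps 810 to 818 follows. The velocity vectors, dictionary layout, and the particular values of threshold_0, threshold_1, and threshold_2 are placeholders; as described in the following paragraph, the threshold values would in practice be chosen using labeled training data.

```python
# Sketch of velocity-comparison scoring between unmatched accounts and subjects.
import numpy as np
from collections import defaultdict

def update_scores(device_velocities, subject_velocities, scores, threshold_0=0.2):
    """Increment score_counter for each (subject_id, user_account) pair whose
    velocities are close in the L2 sense at this time step."""
    for account, v_dev in device_velocities.items():
        for subject, v_sub in subject_velocities.items():
            if np.linalg.norm(np.asarray(v_dev) - np.asarray(v_sub)) < threshold_0:
                scores[(subject, account)] += 1
    return scores

def select_match(scores, account, threshold_1=10, threshold_2=3):
    """Accept the best-scoring subject for an account only if its score clears
    threshold_1 and beats the runner-up by more than threshold_2."""
    candidates = sorted(((s, pair[0]) for pair, s in scores.items() if pair[1] == account),
                        reverse=True)
    if not candidates or candidates[0][0] <= threshold_1:
        return None
    second_best = candidates[1][0] if len(candidates) > 1 else 0
    if candidates[0][0] - second_best > threshold_2:
        return candidates[0][1]
    return None

scores = update_scores({"acct_1": (0.9, 0.1)}, {"subj_7": (1.0, 0.0)}, defaultdict(int))
```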
[0081] In another embodiment, when JointsCNN recognizes a hand holding a
mobile computing device,
the velocity of the hand (of the identified subject) holding the mobile
computing device is used in above process
instead of using the velocity of the center of mass of the subject. This
improves performance of the matching
algorithm. To determine values of the thresholds (threshold_0, threshold_1, threshold_2), the system uses training
data with labels assigned to the images. During training, various combinations
of the threshold values are used and
the output of the algorithm is matched with ground truth labels of images to
determine its performance. The values
of thresholds that result in best overall assignment accuracy are selected for
use in production (or inference).
[0082] No biometric identifying information is used for matching the
identified subject with the user
account, and none is stored in support of this process. That is, there is no
information in the sequences of images
used to compare with stored biometric information for the purposes of matching
the identified subjects with user
accounts in support of this process. Thus, this logic to match the identified
subjects with user accounts operates
without use of personal identifying biometric information associated with the
user accounts.
Network Ensemble
[0083] A network ensemble is a learning paradigm where many networks are
jointly used to solve a
problem. Ensembles typically improve the prediction accuracy obtained from a
single classifier by a factor that
validates the effort and cost associated with learning multiple models. In the
fourth technique to match user accounts
to not yet linked identified subjects, the second and third techniques
presented above are jointly used in an ensemble
(or network ensemble). To use the two techniques in an ensemble, relevant
features are extracted from application of
the two techniques. Figs. 9A-9C present process steps (in a flowchart 900) for
extracting features, training the
ensemble and using the trained ensemble to predict match of a user account to
a not yet linked identified subject.
[0084] Fig. 9A presents the process steps for generating features using
the second technique that uses
service location of mobile computing devices. The process starts at step 902.
At a step 904, a Count_X for the second technique is calculated, indicating the number of times the service location of a mobile computing device with an unmatched user account is X meters away from all other mobile computing devices with unmatched user accounts. At step 906, Count_X values of all tuples of subject_id-user_account pairs are stored by the system for use by the ensemble. In one embodiment, multiple values of X are used, e.g., 1m, 2m, 3m,
4m, 5m (steps 908 and 910). For
each value of X, the count is stored as a dictionary that maps tuples of
subject_id-user_account to count score,
which is an integer. In the example where 5 values of X are used, five such
dictionaries are created at step 912. The
process ends at step 914.
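The Count_X feature extraction can be sketched as follows, assuming the system logs, at each interval, the positions of devices with unmatched user accounts and the subject each device would currently pair with; that data layout is an assumption for illustration, not a detail from the specification.

```python
# Sketch of building one Count_X dictionary per X value (steps 904-912).
import math
from collections import defaultdict

def count_x_features(position_log, x_values=(1, 2, 3, 4, 5)):
    """position_log: list of snapshots, each a dict with
       'devices': {user_account: (x, y)} and 'nearest_subject': {user_account: subject_id}.
    Returns one dictionary per X value mapping (subject_id, user_account) -> count."""
    features = {x: defaultdict(int) for x in x_values}
    for snapshot in position_log:
        devices = snapshot["devices"]
        for account, pos in devices.items():
            others = [p for a, p in devices.items() if a != account]
            min_sep = min((math.dist(pos, p) for p in others), default=float("inf"))
            subject = snapshot["nearest_subject"].get(account)
            if subject is None:
                continue
            for x in x_values:
                if min_sep >= x:  # device is X meters away from all other unmatched devices
                    features[x][(subject, account)] += 1
    return features
```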
[0085] Fig. 9B presents the process steps for generating features using
the third technique that uses
velocities of mobile computing devices. The process starts at step 920. At a
step 922, a Count_Y for the third technique is determined, equal to the score_counter values indicating the number of times the Euclidean distance between velocities for a particular subject_id-user_account pair is below threshold_0. At a step 924, Count_Y values of all tuples of subject_id-user_account pairs are stored by the system for use by the
ensemble. In one embodiment,
multiple values of threshold_0 are used, e.g., five different values (steps 926 and 928). For each value of threshold_0, the Count_Y is stored as a dictionary that maps tuples of
subject_id-user_account to count score, which
is an integer. In the example where 5 values of threshold_0 are used, five such
dictionaries are created at step 930. The
process ends at step 932.
[0086] The features from the second and third techniques are then used to
create a labeled training data
set and used to train the network ensemble. To collect such a data set,
multiple subjects (shoppers) walk in an area
of real space such as a shopping store. The images of these subjects are
collected using cameras 114 at regular time
intervals. Human labelers review the images and assign correct identifiers
(subject_id and user_account) to the
images in the training data. The process is described in a flowchart 900
presented in Fig. 9C. The process starts at a
step 940. At a step 942, features in the form of Count_X and Count_Y
dictionaries obtained from second and third
techniques are compared with corresponding true labels assigned by the human
labelers on the images to identify
correct matches (true) and incorrect matches (false) of subject_id and
user_account.
[0087] Since there are only two categories of outcome for each mapping of subject_id and user_account, true or false, a binary classifier is trained using this training data set (step 944). Commonly used methods for binary classification include decision trees, random forests, neural networks, gradient boosting, support vector machines, etc. A
trained binary classifier is used to categorize new probabilistic observations
as true or false. The trained binary
classifier is used in production (or inference) by giving as input Count_X and
Count_Y dictionaries for subject_id-
user_account tuples. The trained binary classifier classifies each tuple as
true or false at a step 946. The process ends
at a step 948.
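As one concrete, non-authoritative example, the binary classifier could be trained with scikit-learn's RandomForestClassifier, one of the commonly used methods listed above. The feature layout, which simply concatenates the Count_X and Count_Y scores for each subject_id-user_account tuple, is an assumption for illustration.

```python
# Sketch of training and using the ensemble's binary classifier (steps 942-946).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def feature_vector(pair, count_x_dicts, count_y_dicts):
    """Concatenate the Count_X and Count_Y scores for one (subject_id, user_account) tuple."""
    return [d.get(pair, 0) for d in count_x_dicts] + [d.get(pair, 0) for d in count_y_dicts]

def train_pair_classifier(labeled_pairs, count_x_dicts, count_y_dicts):
    """labeled_pairs: list of ((subject_id, user_account), is_correct_match) from human labelers."""
    X = np.array([feature_vector(p, count_x_dicts, count_y_dicts) for p, _ in labeled_pairs])
    y = np.array([label for _, label in labeled_pairs])
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    return clf

# At inference, classify a candidate tuple as a true or false match:
# clf.predict([feature_vector(("subj_7", "acct_1"), count_x_dicts, count_y_dicts)])
```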
[0088] If there is an unmatched mobile computing device in the area of
real space after application of the
above four techniques, the system sends a notification to the mobile computing
device to open the client application.
If the user accepts the notification, the client application will display a
semaphore image as described in the first
technique. The system will then follow the steps in the first technique to
check-in the shopper (match subject_id to
user_account). If the customer does not respond to the notification, the
system will send a notification to an
employee in the shopping store indicating the location of the unmatched
customer. The employee can then walk to
the customer, ask him to open the client application on his mobile computing
device to check-in to the system using
a semaphore image.
[0089] No biometric identifying information is used for matching the
identified subject with the user
account, and none is stored in support of this process. That is, there is no
information in the sequences of images
used to compare with stored biometric information for the purposes of matching
the identified subjects with user
accounts in support of this process. Thus, this logic to match the identified
subjects with user accounts operates
without use of personal identifying biometric information associated with the
user accounts.
Architecture
[0090] An example architecture of a system in which the four techniques
presented above are applied to
match a user_account to a not yet linked subject in an area of real space is
presented in Fig. 10. Because Fig. 10 is
an architectural diagram, certain details are omitted to improve the clarity
of description. The system presented in
Fig. 10 receives image frames from a plurality of cameras 114. As described
above, in one embodiment, the cameras
114 can be synchronized in time with each other, so that images are captured
at the same time, or close in time, and
at the same image capture rate. Images captured in all the cameras covering an
area of real space at the same time, or
close in time, are synchronized in the sense that the synchronized images can
be identified in the processing engines
as representing different views at a moment in time of subjects having fixed
positions in the real space. The images
are stored in a circular buffer of image frames per camera 1002.
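A minimal sketch of such a per-camera circular buffer is shown below, assuming frames are tagged with a capture timestamp so that views from different cameras taken at (nearly) the same moment can be pulled together; the buffer size and tolerance are illustrative choices, not values from the specification.

```python
# Sketch of a per-camera circular buffer of synchronized image frames.
from collections import deque

class FrameBuffer:
    """Keeps the most recent frames for one camera; old frames are dropped."""
    def __init__(self, max_frames=300):  # e.g., roughly 10 seconds at 30 FPS
        self.frames = deque(maxlen=max_frames)

    def push(self, timestamp, image):
        self.frames.append((timestamp, image))

    def frame_at(self, timestamp, tolerance=0.02):
        """Return the frame closest in time, so views from different cameras
        captured at (nearly) the same moment can be processed together."""
        best = min(self.frames, key=lambda f: abs(f[0] - timestamp), default=None)
        if best is not None and abs(best[0] - timestamp) <= tolerance:
            return best
        return None
```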
[0091] A "subject identification" subsystem 1004 (also referred to as
first image processors) processes
image frames received from cameras 114 to identify and track subjects in the
real space. The first image processors
include subject image recognition engines such as the JointsCNN above.
[0092] A "semantic diffing" subsystem 1006 (also referred to as second
image processors) includes
background image recognition engines, which receive corresponding sequences of
images from the plurality of
cameras and recognize semantically significant differences in the background
(i.e. inventory display structures like
shelves) as they relate to puts and takes of inventory items, for example,
over time in the images from each camera.
The second image processors receive output of the subject identification
subsystem 1004 and image frames from
cameras 114 as input. Details of "semantic diffing" subsystem are presented in
United States Patent Application No.
15/945,466, filed 04 April 2018, titled, "Predicting Inventory Events using
Semantic Diffing," and United States
Patent Application No. 15/945,473, filed 04 April 2018, titled, "Predicting
Inventory Events using
Foreground/Background Processing," both of which are incorporated herein by
reference as if fully set forth herein.
The second image processors process identified background changes to make a
first set of detections of takes of
inventory items by identified subjects and of puts of inventory items on
inventory display structures by identified
subjects. The first set of detections are also referred to as background
detections of puts and takes of inventory
items. In the example of a shopping store, the first detections identify
inventory items taken from the shelves or put
on the shelves by customers or employees of the store. The semantic diffing
subsystem includes the logic to
associate identified background changes with identified subjects.
[0093] A "region proposals" subsystem 1008 (also referred to as third
image processors) includes
foreground image recognition engines, receives corresponding sequences of
images from the plurality of cameras
114, and recognizes semantically significant objects in the foreground (i.e.
shoppers, their hands and inventory
items) as they relate to puts and takes of inventory items, for example, over
time in the images from each camera.
The region proposals subsystem 1008 also receives output of the subject
identification subsystem 1004. The third
image processors process sequences of images from cameras 114 to identify and
classify foreground changes
represented in the images in the corresponding sequences of images. The third
image processors process identified
foreground changes to make a second set of detections of takes of inventory
items by identified subjects and of puts
of inventory items on inventory display structures by identified subjects. The
second set of detections are also
referred to as foreground detection of puts and takes of inventory items. In
the example of a shopping store, the
second set of detections identifies takes of inventory items and puts of
inventory items on inventory display
structures by customers and employees of the store. The details of a region
proposal subsystem are presented in
United States Patent Application No. 15/907,112, filed 27 February 2018,
titled, "Item Put and Take Detection
Using Image Recognition" which is incorporated herein by reference as if fully
set forth herein.
[0094] The system described in Fig. 10 includes a selection logic 1010 to
process the first and second sets
of detections to generate log data structures including lists of inventory
items for identified subjects. For a take or
put in the real space, the selection logic 1010 selects the output from either
the semantic diffing subsystem 1006 or
the region proposals subsystem 1008. In one embodiment, the selection logic
1010 uses a confidence score
generated by the semantic diffing subsystem for the first set of detections
and a confidence score generated by the
region proposals subsystem for a second set of detections to make the
selection. The output of the subsystem with a
higher confidence score for a particular detection is selected and used to
generate a log data structure 1012 (also
referred to as a shopping cart data structure) including a list of inventory
items (and their quantities) associated with
identified subjects.
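The selection logic can be pictured with the following sketch, which assumes the two subsystems report detections that can be paired by a shared event identifier and that each detection carries a confidence score; the field names and pairing scheme are assumptions for illustration, not details from the specification.

```python
# Sketch of selection logic 1010: for each put or take, keep whichever subsystem
# reported it with higher confidence and update the subject's log (shopping cart).
def select_detections(background_detections, foreground_detections, carts):
    """Each detection: dict with 'event_id', 'subject_id', 'sku', 'quantity',
    'event' ('put' or 'take'), and 'confidence'."""
    background = {d["event_id"]: d for d in background_detections}
    foreground = {d["event_id"]: d for d in foreground_detections}
    for event_id in set(background) | set(foreground):
        candidates = [d for d in (background.get(event_id), foreground.get(event_id)) if d]
        best = max(candidates, key=lambda d: d["confidence"])
        cart = carts.setdefault(best["subject_id"], {})
        delta = best["quantity"] if best["event"] == "take" else -best["quantity"]
        cart[best["sku"]] = cart.get(best["sku"], 0) + delta
    return carts
```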
[0095] To process a payment for the items in the log data structure 1012,
the system in Fig. 10 applies the
four techniques for matching the identified subject (associated with the log
data) to a user_account which includes a
payment method such as credit card or bank account information. In one
embodiment, the four techniques are
applied sequentially as shown in the figure. If the process steps in flowchart
600 for the first technique produces a
match between the subject and the user account then this information is used
by a payment processor 1036 to charge
the customer for the inventory items in the log data structure. Otherwise
(step 1028), the process steps presented in
flowchart 700 for the second technique are followed and the user account is
used by the payment processor 1036. If
the second technique is unable to match the user account with a subject (1030)
then the process steps presented in
flowchart 800 for the third technique are followed. If the third technique is
unable to match the user account with a
subject (1032) then the process steps in flowchart 900 for the fourth
technique are followed to match the user
account with a subject.
[0096] If the fourth technique is unable to match the user account with a
subject (1034), the system sends
a notification to the mobile computing device to open the client application
and follow the steps presented in the
flowchart 600 for the first technique. If the customer does not respond to the
notification, the system will send a
notification to an employee in the shopping store indicating the location of
the unmatched customer. The employee
can then walk to the customer, ask him to open the client application on his
mobile computing device to check-in to
the system using a semaphore image (step 1040). It is understood that in other
embodiments of the architecture
presented in Fig. 10, fewer than four techniques can be used to match the user
accounts to not yet linked identified
subjects.
Network Configuration
[0097] Fig. 11 presents an architecture of a network hosting the matching
engine 170 which is hosted on
the network node 103. The system includes a plurality of network nodes 103,
101a-101n, and 102 in the illustrated
embodiment. In such an embodiment, the network nodes are also referred to as
processing platforms. Processing
platforms (network nodes) 103, 101a-101n, and 102 and cameras 1112, 1114,
1116, ... 1118 are connected to
network(s) 1181.
[0098] Fig. 11 shows a plurality of cameras 1112, 1114, 1116, ... 1118
connected to the network(s). A
large number of cameras can be deployed in particular systems. In one
embodiment, the cameras 1112 to 1118 are
connected to the network(s) 1181 using Ethernet-based connectors 1122, 1124,
1126, and 1128, respectively. In
such an embodiment, the Ethernet-based connectors have a data transfer speed
of 1 gigabit per second, also referred
to as Gigabit Ethernet. It is understood that in other embodiments, cameras
114 are connected to the network using
other types of network connections which can have a faster or slower data
transfer rate than Gigabit Ethernet. Also,
in alternative embodiments, a set of cameras can be connected directly to each
processing platform, and the
processing platforms can be coupled to a network.
[0099] Storage subsystem 1130 stores the basic programming and data
constructs that provide the
functionality of certain embodiments of the present invention. For example,
the various modules implementing the
functionality of the matching engine 170 may be stored in storage subsystem
1130. The storage subsystem 1130 is
an example of a computer readable memory comprising a non-transitory data
storage medium, having computer
instructions stored in the memory executable by a computer to perform all or
any combination of the data processing
and image processing functions described herein, including logic to link
subjects in an area of real space with a user
account, to determine locations of identified subjects represented in the
images, match the identified subjects with
user accounts by identifying locations of mobile computing devices executing
client applications in the area of real
space by processes as described herein. In other examples, the computer
instructions can be stored in other types of
memory, including portable memory, that comprise a non-transitory data storage
medium or media, readable by a
computer.
[0100] These software modules are generally executed by a processor
subsystem 1150. A host memory
subsystem 1132 typically includes a number of memories including a main random
access memory (RAM) 1134 for
storage of instructions and data during program execution and a read-only
memory (ROM) 1136 in which fixed
instructions are stored. In one embodiment, the RAM 1134 is used as a buffer
for storing subject_id-user_account
tuples matched by the matching engine 170.
[0101] A file storage subsystem 1140 provides persistent storage for
program and data files. In an
example embodiment, the storage subsystem 1140 includes four 120 Gigabyte (GB)
solid state disks (SSD) in a
RAID 0 (redundant array of independent disks) arrangement identified by a
numeral 1142. In the example
embodiment, user account data in the user account database 150 and image data
in the image database 160 which is
not in RAM is stored in RAID 0. In the example embodiment, the hard disk drive
(HDD) 1146 is slower in access
speed than the RAID 0 1142 storage. The solid state disk (SSD) 1144 contains
the operating system and related files
for the matching engine 170.
[0102] In an example configuration, three cameras 1112, 1114, and 1116,
are connected to the processing
platform (network node) 103. Each camera has a dedicated graphics processing unit: GPU 1 1162, GPU 2 1164, and GPU 3 1166, to process images sent by the camera. It is understood that fewer
than or more than three cameras can
be connected per processing platform. Accordingly, fewer or more GPUs are
configured in the network node so that
each camera has a dedicated GPU for processing the image frames received from
the camera. The processor
subsystem 1150, the storage subsystem 1130 and the GPUs 1162, 1164, and 1166
communicate using the bus
subsystem 1154.
[0103] A network interface subsystem 1170 is connected to the bus
subsystem 1154 forming part of the
processing platform (network node) 103. Network interface subsystem 1170
provides an interface to outside
networks, including an interface to corresponding interface devices in other
computer systems. The network
interface subsystem 1170 allows the processing platform to communicate over
the network either by using cables (or
wires) or wirelessly. The wireless radio signals 1175 emitted by the mobile
computing devices 120 in the area of
real space are received (via the wireless access points) by the network
interface subsystem 1170 for processing by
the matching engine 170. A number of peripheral devices such as user interface
output devices and user interface
input devices are also connected to the bus subsystem 1154 forming part of the
processing platform (network node)
103. These subsystems and devices are intentionally not shown in Fig. 11 to
improve the clarity of the description.
Although bus subsystem 1154 is shown schematically as a single bus,
alternative embodiments of the bus subsystem
may use multiple busses.
[0104] In one embodiment, the cameras 114 can be implemented using Chameleon3 1.3 MP Color USB3 Vision (Sony ICX445) cameras, having a resolution of 1288 x 964, a frame rate of 30 FPS, and 1.3 MegaPixels per image, with a varifocal lens having a working distance (mm) of 300 - ∞ and a field of view, with a 1/3" sensor, of 98.2° - 23.8°.
Particular Implementations
[0105] In various embodiments, the system for linking subjects in an area
of real space with user accounts
described above also includes one or more of the following features.
[0106] The system includes a plurality of cameras, cameras in the
plurality of cameras producing
respective sequences of images in corresponding fields of view in the real
space. The processing system is coupled
to the plurality of cameras, the processing system includes logic to determine
locations of identified subjects
represented in the images. The system matches the identified subjects with
user accounts by identifying locations of
mobile devices executing client applications in the area of real space, and
matches locations of the mobile devices
with locations of the subjects.
[0107] In one embodiment, the signals emitted by the mobile computing devices comprise images.
[0108] In one embodiment, the signals emitted by the mobile computing
devices comprise radio signals.
[0109] In one embodiment, the system includes a set of semaphore images
accessible to the processing
system. The processing system includes logic to accept login communications
from a client application on a mobile
computing device identifying a user account before matching the user account
to an identified subject in the area of
real space, and after accepting login communications the system sends a
selected semaphore image from the set of
semaphore images to the client application on the mobile device.
[0110] In one such embodiment, the processing system sets a status of the
selected semaphore image as
assigned. The processing system receives a displayed image of the selected
semaphore image. The processing
system recognizes the displayed image and matches the recognized semaphore
image with the assigned images from
the set of semaphore images. The processing system matches a location of the
mobile computing device displaying
the recognized semaphore image located in the area of real space with a not
yet linked identified subject. The
processing system, after matching the user account to the identified subject,
sets the status of the recognized
semaphore image as available.
[0111] In one embodiment, the client applications on the mobile computing
devices transmit
accelerometer data to the processing system, and the system matches the
identified subjects with user accounts using
the accelerometer data transmitted from the mobile computing devices.
[0112] In one such embodiment, the logic to match the identified subjects
with user accounts includes
logic that uses the accelerometer data transmitted from the mobile computing
device from a plurality of locations
over a time interval in the area of real space and a derivative of data
indicating the locations of identified subjects
over the time interval in the area of real space.
[0113] In one embodiment, the signals emitted by the mobile computing
devices include location data
and accelerometer data.
[0114] In one embodiment, the signals emitted by the mobile computing
devices comprise images.
[0115] In one embodiment, the signals emitted by the mobile computing
devices comprise radio signals.
[0116] A method of linking subjects in an area of real space with user
accounts is disclosed. The user accounts are linked with client applications executable on mobile computing devices. The method includes using a plurality of cameras to produce respective sequences of
images in corresponding fields of view in
the real space. Then the method includes determining locations of identified
subjects represented in the images. The
method includes matching the identified subjects with user accounts by
identifying locations of mobile computing
devices executing client applications in the area of real space. Finally, the
method includes matching locations of the
mobile computing devices with locations of the subjects.
[0117] In one embodiment, the method also includes setting a status of
the selected semaphore image as
assigned, receiving a displayed image of the selected semaphore image,
recognizing the displayed semaphore image
and matching the recognized image with the assigned images from the set of
semaphore images. The method
includes, matching a location of the mobile computing device displaying the
recognized semaphore image located in
the area of real space with a not yet linked identified subject. Finally, the method includes, after matching the user account to the identified subject, setting the status of the recognized semaphore image as available.
[0118] In one embodiment, matching the identified subjects with user
accounts further includes using the
accelerometer data transmitted from the mobile computing device from a
plurality of locations over a time interval
in the area of real space, and a derivative of data indicating the locations of identified subjects over the time interval in the area of real space.
[0119] In one embodiment, the signals emitted by the mobile computing
devices include location data
and accelerometer data.
[0120] In one embodiment, the signals emitted by the mobile computing
devices comprise images.
[0121] In one embodiment, the signals emitted by the mobile computing
devices comprise radio signals.
[0122] A non-transitory computer readable storage medium impressed with
computer program
instructions to link subjects in an area of real space with user accounts is
disclosed. The user accounts are linked
with client applications executable on mobile computing devices, the
instructions, when executed on a processor,
implement a method. The method includes using a plurality of cameras to
produce respective sequences of images in
corresponding fields of view in the real space. The method includes
determining locations of identified subjects
represented in the images. The method includes matching the identified
subjects with user accounts by identifying
locations of mobile computing devices executing client applications in the
area of real space. Finally, the method
includes matching locations of the mobile computing devices with locations of
the subjects.
[0123] In one embodiment, the non-transitory computer readable storage
medium implements the method
further comprising the following steps. The method includes setting a status
of the selected semaphore image as
assigned, receiving a displayed image of the selected semaphore image,
recognizing the displayed semaphore image
and matching the recognized image with the assigned images from the set of
semaphore images. The method
includes matching a location of the mobile computing device displaying the
recognized semaphore image located in
the area of real space with a not yet linked identified subject. Finally, after matching the user account to the identified subject, the method includes setting the status of the recognized semaphore image as available.
[0124] In one embodiment, the non-transitory computer readable storage
medium implements the method
including matching the identified subjects with user accounts by using the
accelerometer data transmitted from the
mobile computing device from a plurality of locations over a time interval in
the area of real space and a derivative
of data indicating the locations of identified subjects over the time interval
in the area of real space.
[0125] In one embodiment, the signals emitted by the mobile computing
devices include location data
and accelerometer data.
[0126] Any data structures and code described or referenced above are
stored according to many
implementations in computer readable memory, which comprises a non-transitory
computer-readable storage
medium, which may be any device or medium that can store code and/or data for
use by a computer system. This
includes, but is not limited to, volatile memory, non-volatile memory,
application-specific integrated circuits
(ASICs), field-programmable gate arrays (FPGAs), magnetic and optical storage
devices such as disk drives,
magnetic tape, CDs (compact discs), DVDs (digital versatile discs or digital
video discs), or other media capable of
storing computer-readable media now known or later developed.
The preceding description is presented to enable the making and use of the
technology disclosed. Various
modifications to the disclosed implementations will be apparent, and the
general principles defined herein may be
applied to other implementations and applications without departing from the
spirit and scope of the technology
disclosed. Thus, the technology disclosed is not intended to be limited to the
implementations shown, but is to be
accorded the widest scope consistent with the principles and features
disclosed herein. The scope of the technology
disclosed is defined by the appended claims.

Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History, should be consulted.

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2019-07-25
(87) PCT Publication Date 2020-01-30
(85) National Entry 2021-03-11
Examination Requested 2021-07-05
Dead Application 2024-03-28

Abandonment History

Abandonment Date Reason Reinstatement Date
2023-03-28 R86(2) - Failure to Respond
2024-01-25 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Reinstatement of rights 2021-03-11 $204.00 2021-03-11
Application Fee 2021-03-11 $408.00 2021-03-11
Request for Examination 2024-07-25 $816.00 2021-07-05
Maintenance Fee - Application - New Act 2 2021-07-26 $100.00 2021-07-12
Maintenance Fee - Application - New Act 3 2022-07-25 $100.00 2022-07-15
Extension of Time 2023-01-30 $210.51 2023-01-30
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
STANDARD COGNITION, CORP.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD.



Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Abstract 2021-03-11 2 87
Claims 2021-03-11 4 224
Drawings 2021-03-11 13 223
Description 2021-03-11 21 1,587
Representative Drawing 2021-03-11 1 23
Patent Cooperation Treaty (PCT) 2021-03-11 5 205
International Preliminary Report Received 2021-03-11 5 183
International Search Report 2021-03-11 2 102
Declaration 2021-03-11 2 97
National Entry Request 2021-03-11 9 257
Cover Page 2021-03-31 1 51
Request for Examination 2021-07-05 3 81
Examiner Requisition 2022-09-28 6 288
Extension of Time 2023-01-30 4 153
Acknowledgement of Extension of Time 2023-02-07 2 221