Patent 3191888 Summary

(12) Patent Application: (11) CA 3191888
(54) English Title: SYSTEMS AND METHODS FOR PRIVATE AUTHENTICATION WITH HELPER NETWORKS
(54) French Title: SYSTEMES ET PROCEDES D'AUTHENTIFICATION PRIVEE AVEC DES RESEAUX AUXILIAIRES
Status: Compliant
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06F 21/00 (2013.01)
  • G06N 20/00 (2019.01)
(72) Inventors :
  • STREIT, SCOTT EDWARD (United States of America)
(73) Owners :
  • PRIVATE IDENTITY LLC (United States of America)
(71) Applicants :
  • PRIVATE IDENTITY LLC (United States of America)
(74) Agent: SMART & BIGGAR LP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2021-08-12
(87) Open to Public Inspection: 2022-02-17
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2021/045745
(87) International Publication Number: WO2022/036097
(85) National Entry: 2023-02-14

(30) Application Priority Data:
Application No. Country/Territory Date
16/993,596 United States of America 2020-08-14
17/155,890 United States of America 2021-01-22
17/183,950 United States of America 2021-02-24
17/398,555 United States of America 2021-08-10

Abstracts

English Abstract

Helper neural networks can play a role in augmenting authentication services that are based on neural network architectures. For example, helper networks are configured to operate as a gateway on identification information used to identify users, enroll users, and/or construct authentication models (e.g., embedding and/or prediction networks). Assuming that both good and bad identification information samples are taken as part of identification information capture, the helper networks operate to filter out bad identification information prior to training, which prevents, for example, identification information that is valid but poorly captured from impacting identification, training, and/or prediction using various neural networks. Additionally, helper networks can also identify and prevent presentation attacks or submission of spoofed identification information as part of processing and/or validation.


French Abstract

La présente invention concerne un réseau neuronal auxiliaire pouvant jouer un rôle dans l'augmentation de services d'authentification qui sont basés sur des architectures de réseau neuronal. Par exemple, des réseaux auxiliaires sont configurés pour fonctionner en tant que passerelle sur des informations d'identification utilisées pour identifier des utilisateurs, inscrire des utilisateurs et/ou construire des modèles d'authentification (par exemple, des réseaux d'incorporation et/ou de prédiction). En supposant que des échantillons d'informations de bonne et de mauvaise identification sont pris en tant que partie de capture d'informations d'identification, les réseaux auxiliaires fonctionnent pour filtrer les informations de mauvaise identification avant l'apprentissage, ce qui empêche, par exemple, que des informations d'identification qui sont valides mais mal capturées aient un impact sur l'identification, l'apprentissage et/ou la prédiction en utilisant divers réseaux neuronaux. De plus, des réseaux auxiliaires peuvent également identifier et empêcher des attaques de présentation ou la soumission d'informations d'identification mystifiées en tant que partie de traitement et/ou de validation.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS
1. A system for managing privacy-enabled identification or authentication, the system comprising:
at least one processor operatively connected to a memory;
an identification data gateway, executed by the at least one processor, configured to filter invalid identification information from subsequent verification, enrollment, identification, or authentication functions, the identification data gateway comprising at least:
a first pre-trained validation helper network associated with identification information of a first type, wherein the first pre-trained validation helper network is configured to:
evaluate an identification instance of the first type, responsive to input of the identification instance of the first type to the first pre-trained validation helper network, wherein the first pre-trained validation helper network is pre-trained on evaluation criteria that is independent of a subject of the identification instance seeking to be verified, enrolled, identified, or authenticated;
responsive to a determination that the identification instance meets the evaluation criteria, validate the identification instance for use in subsequent verification, enrollment, identification, or authentication;
responsive to a determination that the identification instance fails the evaluation criteria, reject the unknown information instance for use in subsequent verification, enrollment, identification, or authentication; and
generate at least a binary evaluation of the identification information instance based on the determination of the evaluation criteria, wherein the at least the binary evaluation includes generation of an output probability by the first pre-trained validation helper network that the identification instance is valid or invalid.

2. The system of claim 1, wherein the identification data gateway is configured to filter bad audio data from use in subsequent processing.

3. The system of claim 2, wherein the identification data gateway is configured to accept audio data input and validate the audio input for use in transcription.
4. The system of claim 1, wherein the first pre-trained validation helper network is trained on presence data, and configured to determine the presence of a target to be evaluated.

5. The system of claim 3, wherein the first pre-trained validation helper network is configured to validate the presence data independent of the subject seeking to be enrolled, identified, or authenticated.

6. The system of claim 1, wherein the authentication data gateway further comprises a plurality of validation helper networks each associated with a respective type of identification information, wherein each of the plurality of validation helper networks generate a binary evaluation of respective identification inputs to establish validity, wherein at least a plurality of the validation helper networks are configured to validate respective identification information independent of the subject seeking to be enrolled, identified, or authenticated.

7. The system of claim 1, wherein the first pre-trained validation helper network is configured to process an image as identification information, and output a probability that the subject is wearing a mask.

8. The system of claim 7, wherein the first pre-trained validation helper network is configured to determine the mask is being worn properly by the subject.

9. The system of claim 7, wherein the first pre-trained validation helper network is configured to determine the mask is being worn properly by the subject irrespective of the subject to be identified.

10. The system of claim 1, wherein the first pre-trained validation helper network is configured to process location associated input as identification information, and output a probability that the location associated input is invalid.
11. The system of claim 1, wherein the identification data gateway further comprises a first pre-trained geometry helper network configured to:
process identification information of the first type,
accept as input unencrypted identification information of the first type, and
output processed identification information of the first type.

12. The system of claim 11, wherein the first pre-trained validation helper network is paired with the geometry helper network, and further configured to:
accept the output of the geometry helper neural network, and
validate the input identification information of the first type or reject the identification information of the first type.

13. The system of claim 1, wherein the first pre-trained validation helper network is configured to process an image input as identification information, and output a probability that the image input is a presentation attack.

14. The system of claim 5, wherein the first pre-trained validation helper network is configured to process a video input as identification information, and output a probability that the video input is a presentation attack.
15. A computer implemented method for managing privacy-enabled identification or authentication, the method comprising:
filtering, by at least one processor, invalid identification information from subsequent verification, enrollment, identification, or authentication functions, wherein the act of filtering includes:
executing, by at least one processor, a first pre-trained validation helper network associated with identification information of a first type;
evaluating, by the first pre-trained validation helper network, an identification instance of the first type, responsive to input of the identification instance of the first type to the first pre-trained validation helper network, wherein the first pre-trained validation helper network is pre-trained on evaluation criteria that is independent of a subject of the identification instance seeking to be verified, enrolled, identified, or authenticated;
validating, by the at least one processor, the identification instance for use in subsequent verification, enrollment, identification, or authentication, in response to determining that the identification instance meets the evaluation criteria;
rejecting, by the at least one processor, the unknown information instance for use in subsequent verification, enrollment, identification, or authentication responsive to determining that the identification instance fails the evaluation criteria; and
generating, by the at least one processor, at least a binary evaluation of the identification instance based on the determination of the evaluation criteria, wherein the at least the binary evaluation includes generation of an output probability by the first pre-trained validation helper network that the identification instance is valid or invalid.
16. The method of claim 15, wherein the act of filtering includes an act of filtering bad audio data from use in subsequent processing.

17. The method of claim 16, wherein the method further comprises accepting audio data input and validating the audio input for use in transcription.

18. The method of claim 15, wherein the first pre-trained validation helper network is trained on presence data, and the method further comprises determining the presence of a valid target to be evaluated.

19. The method of claim 18, wherein the method further comprises validating the presence data independent of the subject seeking to be verified, enrolled, identified, or authenticated.

20. The method of claim 15, wherein the method further comprises:
executing a plurality of validation helper networks each associated with a respective type of identification information, wherein each of the plurality of validation helper networks generates at least a binary evaluation of respective identification inputs to establish validity; and
validating respective identification information independent of the subject seeking to be verified, enrolled, identified, or authenticated.

21. The method of claim 15, wherein the first pre-trained validation helper network is configured to process an image as identification information, and the method further comprises an act of outputting a probability that the subject is wearing a mask.

22. The method of claim 21, wherein the method further comprises determining by the first pre-trained validation helper network that the mask is being worn properly by the subject.

23. The method of claim 21, wherein the method further comprises determining by the first pre-trained validation helper network that the mask is being worn properly by the subject irrespective of the subject to be identified.

24. The method of claim 15, wherein the method further comprises processing a location associated input as identification information by the first pre-trained validation helper network and generating by the first pre-trained validation helper network a probability that the location associated input is invalid.

Description

Note: Descriptions are shown in the official language in which they were submitted.


SYSTEMS AND METHODS FOR PRIVATE AUTHENTICATION WITH HELPER NETWORKS
BACKGROUND
Various conventional approaches exist that attempt to implement authentication and/or identification in the context of machine learning. Some conventional approaches have developed optimizations to improve the training and predictive accuracy of the machine learning models. For example, a number of solutions use procedural programming to prepare data for processing by machine learning models. In one example, procedural programming can be used to process user images (e.g., face images) to crop or align images around user faces, to improve the image data used to train machine learning models to recognize the users. A number of approaches exist to filter training data sets to improve the training of respective machine learning models based on procedural programming or rules.
SUMMARY
The inventors have realized that there is still a need to utilize the power of machine learning models as gateways or filters on data being used for subsequent machine learning based recognition, whether in authentication settings or identification settings. A similar need exists in the context of procedural recognition and other processing tasks, and machine learning models can be used as gateways or filters on data being used for any subsequent operation, including, for example, procedural based or other recognition tasks, whether in authentication or identification settings. According to some aspects, using machine learning to filter data or remove bad data instances enables any subsequent operation to be performed more effectively and/or with reduced error over many conventional approaches. For example, recognition operations (e.g., identity, authentication, and/or enrollment, etc.) can be improved by validating the data used, and/or identifying invalid data before further processing occurs. It is further realized that approaches to filter data based on procedural programming fail to achieve the level of filtering required, and further fail to provide a good balance between processing requirements and accuracy.
According to various aspects, provided are authentication systems that are configured to leverage machine learning approaches in the context of pre-processing data for use in subsequent tasks, for example, recognition tasks (including, e.g., recognition by machine learning models that support identification and/or authentication). The inventors have further realized that, unlike prior solutions, it is possible to create lightweight models (e.g., small file size models) that provide sufficient accuracy (e.g., >90%) in identifying features or states of input identification/authentication data to serve as a gateway for further processing. For example, the system can implement a plurality of helper networks configured to process incoming identification data (e.g., biometrics, behavioral, passive, active, etc.) and exclude data instances that would not improve identification/authentication. For example, a helper network can be trained on identification data to ensure that "good" data improves the ability to distinguish between targets to be identified or expands the circumstances (e.g., poor lighting conditions, noisy environment, bad image capture, etc.) in which subsequent operations can identify or authenticate a target. Stated broadly, various embodiments validate the data used for subsequent processing, eliminating, for example, poor data instances, malicious data instances, etc.
In further example, the helper network can be trained to identify "bad" data which, if used, would result in a reduction in the ability to recognize a target. To illustrate, an image of a first target that is too blurry may resemble an image of another target. If used in a recognition data set, the result could be a reduction in the ability to distinguish between the first target and another target, because an image of the first target inappropriately bears a closer resemblance to another target than to the first. Various instances of the helper networks are configured to identify and validate good data for use in recognition tasks, and to identify and, for example, discard bad data that would reduce the ability to perform a recognition task.
According to some embodiments, the helper networks validate submitted identification information as good or bad data and filter the bad data from use in subsequent operations, for example, identification, authentication, enrollment, training, and in some examples, prediction.
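As a non-limiting illustration of the gateway behavior described above, the following Python sketch filters a pool of identification samples with a single validation helper network; the `validation_net` callable and the 0.5 acceptance threshold are illustrative assumptions rather than elements of any claimed system.

```python
from typing import Callable, Iterable, List

def filter_identification_data(
    samples: Iterable,
    validation_net: Callable[[object], float],
    threshold: float = 0.5,
) -> List:
    """Keep only samples the validation helper network scores as valid.

    `validation_net` is assumed to return P(valid) for one sample; any
    sample at or below `threshold` is treated as bad data and dropped
    before enrollment, training, or prediction.
    """
    return [s for s in samples if validation_net(s) > threshold]
```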
In further embodiments, helper networks can be implemented in an authentication system and operate as a gateway for embedding neural networks, where the embedding neural networks are configured to extract encrypted features from authentication information. The helper network can also operate as a gateway for prediction models that predict matches between input and enrolled authentication information. In other examples, the helper networks can be configured to filter identification data for any recognition task (e.g., identification, authentication, enrollment, etc.), which can be based in machine learning approaches, procedural programming approaches, etc.
According to various aspects, embedding machine learning models are used to generate encrypted embeddings from input plaintext identification information. The embedding machine learning models can be tailored to respective authentication modalities, and similarly, helper networks can be configured to process specific authentication inputs or authentication modalities and validate the same before they are used in subsequent models. An authentication modality can be associated with the sensor/system used to capture the authentication information (e.g., image capture for face, iris, or fingerprint, audio capture for voice, etc.), and may be further limited based on the type of information being analyzed within a data capture (e.g., face, iris, fingerprint, voice, behavior, etc.). Broadly stated, authentication modality refers to the capability in the first instance to identify a subject to confirm an assertion of identity and/or to authenticate the subject to adjudicate identity and/or authorization based on a common set of identity information. In one example, an authentication modality can collect facial images to train a neural network on a common authentication data input. In another example, speech inputs or more generally audio inputs can be processed by a first network, where another physical biometric input (e.g., face, iris, etc.) can be processed by another network trained on the different authentication modality. In further example, image captures for user faces can be processed as a different modality from image capture for iris identification and/or fingerprint identification. Other authentication modalities can include behavioral identification information (e.g., speech pattern, movement patterns (e.g., angle of carrying mobile device, etc.), timing of activity, location of activity, etc.), passive identification information capture, active identification information capture, among other options.
According to another aspect, helper networks, also referred to as pre-processing neural networks and/or validation networks, are configured to operate as a gateway on identification information used to identify and/or authenticate entities. Assuming that both good and bad identification information samples are taken as part of information capture, the helper networks operate to filter out bad information, for example, prior to training, which prevents, for example, information that is valid but poorly captured from impacting training or prediction using various neural networks. Additionally, helper networks can also identify and prevent presentation attacks or submission of spoofed authentication. In various embodiments, filtering bad identification information samples can be used to improve machine learning identification, enrollment, and/or authentication operations as well as procedural based identification, enrollment, and/or authentication operations.
According to various aspects, training of machine learning models typically involves expansion and generation of variants of training data. These operations increase the size of the training data pool and improve the accuracy of the trained model. However, the inventors have realized that including bad data in such expanded training data sets compromises accuracy. Worse, capturing and expanding bad instances of data can multiply the detrimental effect. According to various embodiments, data validation by helper networks identifies and eliminates data that would reduce identification or authentication accuracy (i.e., bad data). Unexpectedly, the helper networks are also able to identify bad data in this context that is undetected by human perception. This allows various embodiments to yield capability that cannot naturally be produced in a procedural programming context, where a programmer is attempting to code human based analysis (limited by human perception) of identification data.
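The ordering matters: validation should run before training-set expansion, since augmenting a bad capture multiplies its detrimental effect. A minimal non-limiting sketch, with `validation_net` and `augment` as assumed callables:

```python
def build_training_set(raw_samples, validation_net, augment, threshold=0.5):
    """Validate first, then expand the training pool."""
    training_set = []
    for sample in raw_samples:
        if validation_net(sample) <= threshold:
            continue  # bad data never reaches augmentation
        training_set.append(sample)
        training_set.extend(augment(sample))  # e.g., crops, rotations, noise
    return training_set
```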
In further aspects, the authentication system can be configured to leverage a plurality of helper neural networks (e.g., a plurality of neural networks (e.g., deep neural networks (e.g., DNNs))), where sets of helper networks can be trained to acquire and transform biometric values or types of biometrics to improve biometric capture, increase accuracy, reduce training time for embedding and/or classification networks, and eliminate vulnerabilities (e.g., liveness checking and validation), and further sets of helper networks can be used to validate any type or modality of identification input. In further example, data is validated if it improves the accuracy or capability of recognition operations (e.g., improves feature embedding models, prediction models, distance evaluations, etc.). In some embodiments, by only using validated data, downstream recognition tasks can be improved over conventional approaches.
According to one aspect, an authentication system for privacy-enabled authentication is provided. The system comprises at least one processor operatively connected to a memory; an authentication data gateway, executed by the at least one processor, configured to filter invalid identification information, the authentication data gateway comprising at least a first pre-trained geometry helper network configured to process identification information of a first type, accept as input unencrypted identification information of the first type, and output processed identification information of the first type; and a first pre-trained validation helper network associated with the geometry helper network configured to process identification information of the first type, accept the output of the geometry helper neural network, and validate the input identification information of the first type or reject the identification information of the first type.
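A minimal sketch of the paired networks recited in this aspect, assuming the geometry and validation networks are supplied as callables and that an illustrative 0.5 threshold separates valid from invalid:

```python
class AuthenticationDataGateway:
    """A geometry helper pre-processes an unencrypted input of a first
    type; a paired validation helper then validates or rejects it."""

    def __init__(self, geometry_net, validation_net, threshold=0.5):
        self.geometry_net = geometry_net      # raw input -> processed input
        self.validation_net = validation_net  # processed input -> P(valid)
        self.threshold = threshold

    def process(self, identification_input):
        processed = self.geometry_net(identification_input)
        if self.validation_net(processed) > self.threshold:
            return processed  # validated for subsequent operations
        return None           # rejected: filtered from subsequent use
```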
According to one embodiment, the authentication data gateway is configured to filter bad authentication data from training data sets used to build embedding network models. According to one embodiment, the first pre-trained validation helper network is trained on evaluation criteria independent of the subject seeking to be enrolled or authenticated. According to one embodiment, the authentication data gateway further comprises at least a second geometry helper network and a second validation helper network pair configured to process and validate identification information of a second type. According to one embodiment, the authentication data gateway further comprises a plurality of validation helper networks each associated with a respective type of identification information, wherein each of the plurality of validation helper networks generate a binary evaluation of respective authentication inputs to establish validity. According to one embodiment, the first pre-trained validation helper network is configured to process an image input as identification information, and output a probability that the image input is invalid. According to one embodiment, the first pre-trained validation helper network is configured to process an image input as identification information, and output a probability that the image input is a presentation attack. According to one embodiment, the first pre-trained validation helper network is configured to process a video input as identification information and output a probability that the video input is invalid. According to one embodiment, the first pre-trained validation helper network is configured to process a video input as identification information and output a probability that the video input is a presentation attack.
According to one aspect, an authentication system for privacy-enabled authentication is provided. The system comprises at least one processor operatively connected to a memory; an authentication data gateway, executed by the at least one processor, configured to filter invalid identification information, the authentication data gateway comprising at least a merged validation network associated with a first type of identification information, the merged validation network configured to process identification information of the first type and output a probability that the identification information of the first type is valid for use in enrolling a user for subsequent identification or a probability that the identification information is invalid.
According to one embodiment, the merged validation network is configured to test a plurality of binary characteristics of the identification information input. According to one embodiment, the output probability is based at least in part on a state determined for the plurality of binary characteristics. According to one embodiment, the merged validation network is configured to determine if an identification information input is based on a presentation attack. According to one embodiment, the merged validation network is configured to determine if an identification information input improves training set entropy.
According to one aspect, a computer implemented method for privacy-enabled authentication is provided. The method comprises filtering, by at least one processor, invalid identification information; executing, by the at least one processor, a first pre-trained geometry helper network; accepting, by the first pre-trained geometry helper network, unencrypted identification information of the first type as input; generating processed identification information of the first type; executing, by the at least one processor, a first pre-trained validation helper network; accepting the output of the geometry helper neural network; and validating the input identification information of the first type or rejecting the identification information of the first type.
According to one embodiment, the method further comprises filtering bad authentication data from training data sets used to build embedding network models. According to one embodiment, the method further comprises training the first pre-trained validation helper network on evaluation criteria independent of the subject seeking to be enrolled or authenticated. According to one embodiment, the method further comprises executing at least a second geometry helper network and a second validation helper network pair configured to process and validate identification information of a second type. According to one embodiment, the method further comprises executing a plurality of validation helper networks each associated with a respective type of identification information, and generating a binary evaluation of respective authentication inputs by respective ones of the plurality of validation helper networks to establish validity. According to one embodiment, the method further comprises processing, by the first pre-trained validation helper network, an image input as identification information, and outputting a probability that the image input is invalid.
According to one embodiment, the method further comprises processing an image input as identification information, and generating a probability that the image input is a presentation attack, by the first pre-trained validation helper network. According to one embodiment, the method further comprises processing, by the first pre-trained validation helper network, a video input as identification information; and generating, by the first pre-trained validation helper network, a probability that the video input is invalid. According to one embodiment, the method further comprises processing, by the first pre-trained validation helper network, a video input as identification information, and generating, by the first pre-trained validation helper network, a probability that the video input is a presentation attack.
According to one aspect, a computer implemented method for privacy-enabled authentication is provided. The method comprises executing, by at least one processor, a merged validation network associated with a first type of identification information; processing, by the merged validation network, identification information of the first type; and generating, by the merged validation network, a probability that the identification information of the first type is valid for use in enrolling a user for subsequent identification or a probability that the identification information is invalid. According to one embodiment, the method further comprises testing, by the merged validation network, a plurality of binary characteristics of the identification information input. According to one embodiment, generating the probability is based at least in part on a state determined for the plurality of binary characteristics. According to one embodiment, the method further comprises determining, by the merged validation network, if an identification information input is based on a presentation attack. According to one embodiment, the method further comprises determining if an identification information input improves training set entropy.
According to one aspect, a system for managing privacy-enabled identification or authentication is provided. The system comprises at least one processor operatively connected to a memory; an identification data gateway, executed by the at least one processor, configured to filter invalid identification information from subsequent verification, enrollment, identification, or authentication functions, the identification data gateway comprising at least a first pre-trained validation helper network associated with identification information of a first type, wherein the first pre-trained validation helper network is configured to evaluate an identification instance of the first type, responsive to input of the identification instance of the first type to the first pre-trained validation helper network, wherein the first pre-trained validation helper network is pre-trained on evaluation criteria that is independent of a subject of the identification instance seeking to be enrolled, identified, or authenticated, responsive to a determination that the identification instance meets the evaluation criteria, validate the identification instance for use in subsequent verification, enrollment, identification, or authentication, responsive to a determination that the identification instance fails the evaluation criteria, reject the unknown information instance for use in subsequent verification, enrollment, identification, or authentication, and generate at least a binary evaluation of the identification information instance based on the determination of the evaluation criteria, wherein the at least the binary evaluation includes generation of an output probability by the first pre-trained validation helper network that the identification instance is valid or invalid.
According to one embodiment, the identification data gateway is configured to filter bad audio data from use in subsequent processing. According to one embodiment, the identification data gateway is configured to accept audio data input and validate the audio input for use in transcription. According to one embodiment, the first pre-trained validation helper network is trained on presence data, and configured to determine the presence of a target to be evaluated. According to one embodiment, the first pre-trained validation helper network is configured to validate the presence data independent of the subject seeking to be enrolled, identified, or authenticated. According to one embodiment, the authentication data gateway further comprises a plurality of validation helper networks each associated with a respective type of identification information, wherein each of the plurality of validation helper networks generate a binary evaluation of respective identification inputs to establish validity, wherein at least a plurality of the validation helper networks are configured to validate respective identification information independent of the subject seeking to be enrolled, identified, or authenticated. According to one embodiment, the first pre-trained validation helper network is configured to process an image as identification information, and output a probability that the subject is wearing a mask. According to one embodiment, the first pre-trained validation helper network is configured to determine the mask is being worn properly by the subject. According to one embodiment, the first pre-trained validation helper network is configured to determine the mask is being worn properly by the subject irrespective of the subject to be identified. According to one embodiment, the first pre-trained validation helper network is configured to process location associated input as identification information, and output a probability that the location associated input is invalid.
According to one aspect, a computer implemented method for managing privacy-enabled identification or authentication is provided. The method comprises filtering, by at least one processor, invalid identification information from subsequent verification, enrollment, identification, or authentication functions, wherein the act of filtering includes executing, by the at least one processor, a first pre-trained validation helper network associated with identification information of a first type; evaluating, by the first pre-trained validation helper network, an identification instance of the first type, responsive to input of the identification instance of the first type to the first pre-trained validation helper network, wherein the first pre-trained validation helper network is pre-trained on evaluation criteria that is independent of a subject of the identification instance seeking to be verified, enrolled, identified, or authenticated; validating, by the at least one processor, the identification instance for use in subsequent verification, enrollment, identification, or authentication, in response to determining that the identification instance meets the evaluation criteria; rejecting, by the at least one processor, the unknown information instance for use in subsequent verification, enrollment, identification, or authentication responsive to determining that the identification instance fails the evaluation criteria; and generating, by the at least one processor, at least a binary evaluation of the identification instance based on the determination of the evaluation criteria, wherein the at least the binary evaluation includes generation of an output probability by the first pre-trained validation helper network that the identification instance is valid or invalid. According to one embodiment, the act of filtering includes an act of filtering bad audio data from use in subsequent processing. According to one embodiment, the method further comprises accepting audio data input and validating the audio input for use in transcription. According to one embodiment, the first pre-trained validation helper network is trained on presence data, and the method further comprises determining the presence of a valid target to be evaluated. According to one embodiment, the method further comprises validating the presence data independent of the subject seeking to be verified, enrolled, identified, or authenticated. According to one embodiment, the method further comprises executing a plurality of validation helper networks each associated with a respective type of identification information, wherein each of the plurality of validation helper networks generates at least a binary evaluation of respective identification inputs to establish validity; and validating respective identification information independent of the subject seeking to be verified, enrolled, identified, or authenticated.
According to one embodiment, the first pre-trained validation helper network is configured to process an image as identification information, and the method further comprises an act of outputting a probability that the subject is wearing a mask. According to one embodiment, the method further comprises determining by the first pre-trained validation helper network that the mask is being worn properly by the subject. According to one embodiment, the method further comprises determining by the first pre-trained validation helper network that the mask is being worn properly by the subject irrespective of the subject to be identified. According to one embodiment, the method further comprises processing a location associated input as identification information by the first pre-trained validation helper network and generating by the first pre-trained validation helper network a probability that the location associated input is invalid.
Still other aspects, examples, and advantages of these exemplary aspects and examples are discussed in detail below. Moreover, it is to be understood that both the foregoing information and the following detailed description are merely illustrative examples of various aspects and examples, and are intended to provide an overview or framework for understanding the nature and character of the claimed aspects and examples. Any example disclosed herein may be combined with any other example in any manner consistent with at least one of the objects, aims, and needs disclosed herein, and references to "an example," "some examples," "an alternate example," "various examples," "one example," "at least one example," "this and other examples" or the like are not necessarily mutually exclusive and are intended to indicate that a particular feature, structure, or characteristic described in connection with the example may be included in at least one example. The appearances of such terms herein are not necessarily all referring to the same example.
BRIEF DESCRIPTION OF DRAWINGS
Various aspects of at least one embodiment are discussed below with reference to the accompanying figures, which are not intended to be drawn to scale. The figures are included to provide an illustration and a further understanding of the various aspects and embodiments and are incorporated in and constitute a part of this specification, but are not intended as a definition of the limits of any particular embodiment. The drawings, together with the remainder of the specification, serve to explain principles and operations of the described and claimed aspects and embodiments. In the figures, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every figure. In the figures:
FIG. 1 is a block diagram of a helper network implementation, according to one embodiment;
FIG. 2 is a block diagram of example helper networks for processing respective authentication inputs, according to one embodiment;
FIG. 3 illustrates example multiclass and binary helper network models, according to some embodiments;
FIG. 4 illustrates example processing for detecting presentation attacks, according to some embodiments;
FIG. 5 illustrates an example process flow for voice processing, according to some embodiments;
FIG. 6 illustrates an example process flow for facial image processing, according to some embodiments;
FIG. 7 illustrates an example process flow for fingerprint processing, according to some embodiments;
FIG. 8 is a block diagram of an example authentication system, according to one embodiment;
FIG. 9 is an example process flow for processing authentication information, according to one embodiment;
FIG. 10 is an example process flow for processing authentication information, according to one embodiment;
FIG. 11 is an example process flow for processing authentication information, according to one embodiment;
FIG. 12 is a block diagram of a special purpose computer system on which the disclosed functions can be implemented;
FIG. 13 is an example process flow for classifying biometric information, according to one embodiment;
FIG. 14 is an example process flow for authentication with secured biometric data, according to one embodiment;
FIG. 15 is an example process flow for one to many matching execution, according to one embodiment;
FIG. 16 is a block diagram of an embodiment of a privacy-enabled biometric system, according to one embodiment;
FIGs. 17-20 are diagrams of embodiments of a fully connected neural network for classification;
FIGs. 21-24 illustrate example processing steps and example outputs during identification, according to one embodiment;
FIG. 25 is a block diagram of an embodiment of a privacy-enabled biometric system with liveness validation, according to one embodiment;
FIG. 26A-B is a table showing comparative considerations of example implementations, according to various embodiments;
FIG. 27 is an example process for determining identity and liveness, according to one embodiment;
FIG. 28 is an example process for determining identity and liveness, according to one embodiment;
FIG. 29 is an example process flow for validating an output of a classification network, according to one embodiment; and
FIGs. 30-31 illustrate execution timing during operation with accuracy percentages for the respective examples.
DETAILED DESCRIPTION
According to some embodiments, validation and generation of identification information can be supported by execution of various helper networks. According to one embodiment, these specially configured helper networks can be architected based on the type of identification information/credential to be processed, or more generally based on an authentication modality being processed. Various embodiments describe example functions with respect to authentication and authentication systems. The nomenclature "authentication system" is used for illustration, and in various embodiments describes systems that perform identification operations that employ helper networks in the context of identifying an entity or subject, and the disclosed operations should be understood to encompass data validation in the context of identification. The described examples and embodiments can also be used for authentication, where identification is a first step, and adjudication of the identity and/or permissions for the entity is required or desired.
In various embodiments, the system can execute a plurality of helper networks that are configured to filter inputs (including, for example, inputs to training models) that are later used in authentication or identification. For example, geometry helper networks can be executed to facilitate analysis of features within authentication information, by identifying salient features and, for example, providing location information. In various embodiments, examples are described to process authentication information, and are not intended to limit the operations on the input to authentication assertions, but rather include operations that include identification, and identification with authentication.
According to one embodiment, validation helper networks are configured to determine that an identification sample is a good identification and/or authentication sample. For example, only identification samples that improve accuracy or expand recognition can be validated. The validation network can, for example, identify that a face image is too blurry for use, the image of the user has been taken in poor lighting conditions, the imaged face is too far away from the capture device, the imaged face is obscured, the imaged face is too near to the capture device, the imaged face is out of focus, the imaged face is looking away from the camera, among other options. In various examples, the helper networks are pre-trained using bad identification samples. For example, the bad identification samples are identified as samples that reduce the entropy of the resulting data set. To illustrate, if a blurry image of a first user is used to create encrypted features, the resulting encrypted features will then match on more encrypted features, which may include matches reflecting source identification information not of the first user; this is an example of reduced identification entropy. In another example, the helper networks are pre-trained on bad identification samples that reduce or hamper the execution or efficiency of subsequent processing.
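For illustration only, a lightweight binary validation helper of the kind described above could be pre-trained as follows; the CNN layer sizes and the use of PyTorch are assumptions for this sketch, not the patent's models.

```python
import torch
import torch.nn as nn

# Small CNN producing a logit for P(valid); sizes are placeholders.
validation_net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1),
)

def training_step(images, labels, optimizer):
    """One pre-training step on labeled samples: label 1.0 for captures
    that preserve identification entropy, 0.0 for bad captures."""
    optimizer.zero_grad()
    logits = validation_net(images).squeeze(1)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```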
In further example, various state determinations can be used to identify data instances that reduce the effectiveness of recognition operations and then exclude such bad identification information (e.g., a face image from an identification data set). Stated more generally, the validation helper networks are configured to weed out bad identification data and prevent bad data from impacting subsequent operations, including, for example, training of machine learning models for various identification and/or authentication scenarios or other subsequent processing scenarios. In further embodiments, the validation helper networks can be configured to validate data instances whose use and/or incorporation into a body of identification data will result in improvement in recognition circumstances and/or processing accuracy. In some examples, the validation helper networks are trained to identify identification data instances that improve identification entropy.
In further examples, some helper networks include a face plus mask helper network tailored to operate on identification instances of facial images where the identification target is wearing a mask, a mask on/off detection helper network, an eyeglasses on/off detection helper network, a fingerprint validation network, an eye geometry helper network, an eyes open/closed detection helper network, training data helper networks, an eye validation helper network, etc. In various embodiments, the helper networks are configured to: improve processing of identification credentials, for example, to eliminate noise in processed credentials; and ensure valid credentials are captured, including, for example, quality processing to ensure proper credentials are captured. In further embodiments, various helper networks can be configured to establish liveness of a data capture, for example, based on liveness validation (e.g., confirming a submitted identification credential is not a spoofed credential submission), among other options.
Fig. 1 is a block diagram of an authentication system 100. According to various embodiments, the authentication system 100 can accept a variety of identification inputs (e.g., 101) and produce filtered identification data (e.g., at 120) for use in identification/enrollment/authentication functions (e.g., 130). For example, the authentication system 100 can be configured to accept various biometric inputs: 101A including images of a user's face, 101B including images of a user's fingerprint, 101C including captures of the user's voice, among other options (e.g., as shown by the three dots appearing under the various inputs). Various embodiments can be configured to operate on the various inputs shown, or subsets of those instances. According to some embodiments, the authentication system can be configured with an authentication gateway 102. The authentication gateway may include a plurality of helper networks, each tailored to process a respective identification input. For example, a helper network can be tailored specifically to deal with facial recognition images and/or video for identifying a user face. Different types of helper networks can be tailored to specific functions, including, for example, geometry helper networks (e.g., 104) that are configured to identify characteristics within an identification/authentication input and/or positional information within the input that can be used for validation and/or creation of embeddings (e.g., encrypted feature vectors produced by an embedding network, discussed below).
In various embodiments, geometry helper networks can be configured to support analysis by validation helper networks (e.g., 106), although in other embodiments, validation helper networks are configured to operate on input data without requiring the output or analysis of geometry helper networks. In yet other embodiments, some validation networks can receive information from geometry helper networks while other helper networks operate independently and ultimately deliver an assessment of the validity of an identification/authentication instance. In the context of image inputs, the validation helper network can determine that the submitted image is too blurry, off-center, skewed, taken in poor lighting conditions, among other options, that lead to a determination of a bad instance.
In some embodiments, the various helper networks can include processing helper networks configured to manage inputs that are not readily adaptable to geometric analysis. In some examples, the processing helper networks (e.g., 108) can also be loosely described as geometry helper networks; the two classifications are not mutually exclusive, and are described herein to facilitate understanding and to illustrate potential applications without limitation. According to one example, processing helper networks can take input audio information and isolate singular voices within the audio sample. In one example, a processing helper network can be configured for voice input segmentation and configured to acquire voice samples of various time windows across an audio input (e.g., multiple samples of 10 ms may be captured from one second of input). The processing helper networks can take audio input and include pulse code modulation (PCM) transformation that down-samples the audio time segments to a multiple of the frequency range (e.g., two times the frequency range). In further example, PCM can be coupled with fast Fourier transforms to convert the audio signal from the time domain to a frequency domain.
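As a concrete non-limiting sketch of the voice processing just described, the function below slices an audio capture into short windows and converts each to the frequency domain; the 16 kHz sample rate and 10 ms window length are illustrative assumptions.

```python
import numpy as np

def audio_to_spectra(signal: np.ndarray, sample_rate: int = 16000,
                     window_ms: float = 10.0) -> np.ndarray:
    """Slice audio into short time windows and apply an FFT per window,
    moving each segment from the time domain to the frequency domain."""
    window = int(sample_rate * window_ms / 1000)   # samples per 10 ms slice
    n_windows = len(signal) // window
    frames = signal[: n_windows * window].reshape(n_windows, window)
    return np.abs(np.fft.rfft(frames, axis=1))     # magnitude spectra
```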
In some embodiments, a series of helper networks can be merged into a singular neural network (e.g., 110) that performs the operations of all the neural networks that have been merged. For example, geometry helper networks can be merged with validation helper networks, and the merged network can be configured to provide an output associated with validity of the identification/authentication data input.
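One way to realize such a merged network, sketched here in PyTorch with arbitrary layer sizes as assumptions, is a shared trunk feeding both a geometry head and a validity head:

```python
import torch
import torch.nn as nn

class MergedHelperNetwork(nn.Module):
    """Shared trunk with a geometry head (landmark coordinates) and a
    validation head (probability the input is valid); illustrative only."""

    def __init__(self, in_features: int = 128 * 128, n_landmarks: int = 5):
        super().__init__()
        self.trunk = nn.Sequential(nn.Flatten(),
                                   nn.Linear(in_features, 256), nn.ReLU())
        self.geometry_head = nn.Linear(256, n_landmarks * 2)  # (x, y) pairs
        self.validity_head = nn.Sequential(nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, x: torch.Tensor):
        features = self.trunk(x)
        return self.geometry_head(features), self.validity_head(features)
```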
Regardless of whether a plurality of helper networks is used, or a merged network is used, or even combinations thereof, the authentication data gateway 102 produces a set of filtered authentication data (e.g., 120) that has pruned bad authentication instances from the data set. Shown in Fig. 1 is communication of the filtered authentication data 120 for use in identification, enrollment, and/or authentication services at 130. In some embodiments, an authentication system can include components for performing identification of entities, enrollment of users, and components for authenticating enrolled users. Filtered data can be used for any of the preceding example operations. In some examples, filtering of training data can be prioritized, and an authentication system does not need to filter authentication inputs when performing a specific request for authentication against enrolled data. In some other embodiments, an authentication system can provide data gateway operations and pass the filtered data onto other systems that may be used to identify, enroll, and/or authenticate users. Other implementations can provide data gateway operations, identification operations, enrollment operations, and/or authentication operations as part of a single system or as part of a distributed system with multiple participants. Some embodiments can use helper network validation or invalidation determinations to request that an identification target re-submit identification information, among other options.
In other embodiments, the operation of the helper networks shown can be used in the context of identification. The helper networks are used to ensure valid data capture that can then be used in identifying an individual or entity based on acquired information. Broadly stated, the geometry and/or processing helper networks operate to find identification data in an input, which is communicated to respective validation helper networks to ensure a valid submission has been presented. One example of an identification setting versus an authentication setting can include airport security and identification of passengers. According to various embodiments, identification is the goal in such an example, and authentication (e.g., additional functions for role gathering and adjudication) is not necessary once a passenger has been identified. Conversely, the system may be tasked with authenticating a pilot (e.g., identification of the pilot, determining role information for the pilot, and adjudication) when seeking to access a plane or plane flight control systems.
Fig. 2 is a block diagram of authentication system 200 executing a variety of
example
helper networks. The respective helper networks are configured to process
(e.g., at 220)
respective identification credential input (e.g., biometric input (e.g., 251
face image, 252 face
-15-

CA 03191888 2023-02-14
WO 2022/036097
PCT/US2021/045745
image with mask, 253 fingerprint capture, 254, voice capture, among other
input options and
corresponding helper networks, shown by three dots)) and filter bad
credentials (e.g., at 230)
from being used in subsequent recognition tasks, for example, incorporation
into embedding
generation networks (e.g., at 240). Descriptions of various functions, operations, embedding network architecture, and uses of generated embeddings for identification, authentication, and/or for training classification networks, among other examples, are provided in co-pending U.S. Application 16/832,014, filed on March 27, 2020, titled "SYSTEMS AND METHODS FOR PRIVACY-ENABLED BIOMETRIC PROCESSING" (the '014 Application), incorporated herein in its entirety.
Various embodiments of an authentication system can be configured to process
and
filter authentication data using helper networks, where the filtered data is
made available for
subsequent use by, for example, the embedding networks described in the '014
application.
Stated broadly, embedding networks can be executed to accept authentication inputs in a plain-text or unencrypted form and transform the input into an encoded representation. In one example, embedding networks are configured to transform an authentication input into a geometrically measurable one-way encoding of that input (e.g., a one-way homomorphic encryption). Use of such encodings preserves the secrecy of underlying authentication data, while providing embeddings that can be evaluated/classified in an encoded space. The inventors have realized that improvements in data enrollment using helper networks result in improved accuracy for embedding networks and resulting authentication operations.
Returning to Fig. 2, the respective biometric inputs (e.g., 251-254) are
captured and
used as input in a processing stage (e.g., 220) configured to confirm or
identify relevant or
interesting characteristics within the respective biometric input. For
example, respective
helper networks (e.g., 202-208) are configured to process input biometric
information and
establish characteristics for analysis based on the input data. In one
example, the geometric
helper network 202 can be configured to process an input face image and return
coordinates
for characteristic features within the image (e.g., eyes, nose, mouth, ears,
etc.). Another
geometric helper network (e.g., 204) can be configured to analyze facial
images where the
user is wearing a mask. The output of these geometric helper networks can be
processed by
similar validation helper networks configured to validate (e.g., at 230).
Other geometric
helper networks include a fingerprint geometric helper network 206 and a voice geometric helper network 208.
According to one embodiment, the fingerprint helper network 206 can be
configured
to align, crop, and/or identify fingerprint characteristics within an image.
For example, the
helper network 206 can identify position information for ridges and whorls and
other
characteristics that would be analyzed in a fingerprint image. The outputs of
helper network
206 can then be processed by a validation network (e.g., 212) to filter any
bad inputs.
Likewise, the voice geometric helper network 208 is configured to capture
characteristics
from an audio sample and communicate processed samples to a validation network
(e.g.,
214). Processing by the voice geometric helper network can include pulse code modulation (PCM) and fast Fourier transformation of audio samples, which are then validated as good or bad
samples by, for
example, validation network 214.
According to various embodiments, the validation networks are configured to
protect
the embedding neural networks shown in phase 240. For example, if a poor image is allowed into the embedding network 215, the poor image will disturb the distance measurements on the output of the embedding network and the embedding model 215 itself.
Incorporation of
bad data can compromise the entire network, which results in false positives
and false
negatives for subsequent authentications.
Returning to the validation phase (e.g., 230), a plurality of validation
networks is
configured to determine if an authentication input is valid for use or not.
For example, a face
validation helper network can be configured to determine if an input image was
taken with
the camera too far away from the subject or too close to the subject, where
either condition is
used to identify the bad credential and exclude it from use. In other
examples, face validation
helper networks can also determine if an image is too blurry, if an image is
spoofed (e.g., a
photo of a user is presented rather than a capture of the user directly), if
video input used for
submitting facial information is spoofed rather than presented by the actual
user, if the user or
subject is wearing a mask or not, among other options.
In various embodiments, the validation networks are architected based on a deep neural network model and each can return the probability, score, or value that determines if
an input is valid or bad. In further embodiments, the helper network can
return state
information, including whether a user is wearing a mask or not. In some
examples, a
determination that a user is wearing a mask may cause an authentication system
to exclude
the identification information from use, and in other examples, the
authentication system can
use the state determination (wearing mask) to select a respective embedding DNN (e.g., 216, an embedding network trained on images with users wearing masks).
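As an illustration of such state-driven selection (the function names and threshold values below are hypothetical, not part of the disclosure), the routing logic might resemble:

```python
def route_by_mask_state(mask_score, networks, on=0.6, off=0.4):
    # Scores above `on` select the face+mask embedding network (e.g., 216),
    # scores below `off` select the plain face network (e.g., 215), and
    # intermediate scores are treated as inconclusive (request resubmission).
    if mask_score >= on:
        return networks["face_mask"]
    if mask_score <= off:
        return networks["face"]
    return None
```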
In further example, an authentication system can include a fingerprint
validation
helper network (e.g., 212) that is configured to determine if a fingerprint
capture includes
enough ridges or characteristics to provide good analysis. In addition, fingerprint helper networks can also determine liveness, confirming that neither spoofed video nor a spoofed image is the source of the submission.
Additional embodiments can include voice validation helper networks configured to determine if too many voices are present in an input, if no sound is present in an input, or if too much external noise is present in an input, among other options.
Once an input is validated, the input can undergo further processing, including identification, authentication, enrollment, etc. For example, the input can be
processed by a
respective embedding network in stage 240. For example, a face embedding DNN
215 can
process user face images. In further example, a face with mask embedding
network 216 can
process images of users wearing masks. Other examples include a fingerprint
embedding
DNN 217 for processing fingerprint images and voice embedding DNN 218 for
processing
audio inputs.
In various embodiments, the output of stage 240 is an embedding or feature
vector
representative of the input but in an encoded form. For example, the embedding
networks
can generate encrypted feature vectors or other one-way encoded
representations that are
geometrically measurable for comparison. In one example, an embedding network
can
accept an unencrypted input and produce encrypted feature vectors that are a
homomorphic
one-way encryption of the input.
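For illustration, a geometrically measurable comparison over such encoded vectors can be sketched as a cosine-distance test; the 128-dimension size and the 0.4 threshold below are illustrative assumptions only.

```python
import numpy as np

def cosine_distance(a, b):
    # Cosine distance between two embeddings (one-way encoded vectors).
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Enrolled and candidate embeddings would come from an embedding network;
# the comparison never requires access to the underlying plain-text input.
enrolled = np.random.rand(128)    # placeholder vector for illustration
candidate = np.random.rand(128)   # placeholder vector for illustration
is_match = cosine_distance(enrolled, candidate) < 0.4
```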
Fig. 3 is a block diagram illustrating various example helper networks,
according to
various embodiments. According to one embodiment, an authentication system can
execute a
variety of different helper networks architected on a variety of models. For
example, a group
of helper networks can be configured to establish one of a pair of states.
Stated broadly, the
helper networks configured to establish one of a pair of states responsive to
input can be
referred to as binary models. For example, a respective binary helper network
is configured
to determine if an input is associated with the first or second state. In an
identification or
authentication setting, a variety of helper networks can be configured to
process images for
facial recognition (e.g., 360) using a plurality of binary or other models.
According to some embodiments, face processing helper networks can include
evaluations of whether or not an image is too blurry to use in the context of identification,
authentication, and/or training. In another example, a face helper network can
be configured
to determine if there are not enough landmarks in an input image for facial
recognition or in
the alternative if there are enough landmarks (e.g., 362). Further embodiments
include any
combination of the prior helper networks and may also include helper networks
configured to
determine if the user is wearing a mask or not, if the user is wearing glasses
or not, if the
user's eyes are closed or not, if an image of the user was taken too far from
or too close to the
camera or image source (e.g., see 361-368), among other options.
Other helper networks may be used in conjunction with different embodiments to determine a state of an authentication input, which may involve more than
binary state
conditions. In further embodiments, other authentication modalities can be
processed by
different helper networks. According to one embodiment, a fingerprint helper
network can be
configured to accept an image input of a user's fingerprint and process that
image to
determine if a valid authentication instance has been presented (e.g., 370).
For example, the
fingerprint validation network can be configured to accept an image input and
determine a
state output specifying if not enough fingerprint landmarks (e.g., ridges) are
present for
authentication, or alternatively that enough fingerprint ridges are present (e.g., 371). In another example, a fingerprint validation network can be configured to determine if a fingerprint image is too blurry to use (e.g., 372). In further example, the fingerprint validation network can also be configured to determine if a fingerprint image is too close to or too far from the image source that captured it (e.g., 373). Similar to face validation, a fingerprint validation network can also be configured to identify submissions that are spoofed video (e.g., 374) or spoofed images (e.g., 375).
According to some embodiments, validation models can be configured to score an authentication input, and based on evaluation of the score a respective state can be determined. For example, a validation helper network can produce a probability score as an output. Scores above a threshold can be classified as one state, with scores below the threshold being another. In some examples, intermediate values or probability
scores can be
excluded or assigned an inconclusive state.
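A minimal sketch of this score-to-state mapping, assuming illustrative thresholds of 0.4 and 0.6, might be:

```python
def score_to_state(score, low=0.4, high=0.6):
    # Map a validation network's probability score to a state; scores in
    # the intermediate band are excluded as inconclusive.
    if score >= high:
        return "valid"
    if score <= low:
        return "invalid"
    return "inconclusive"
```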
Further embodiments are configured to execute helper networks to process
additional
authentication modalities. According to one embodiment, an authentication
system can
include voice validation helper networks (e.g., 380) configured to accept an audio input and output a probability of validity. In one example, a voice helper network is
configured to
determine if too many voices are present in a sample (e.g., 381). In another
example, a voice
validation network can be configured to determine if no sound is present in an
audio sample
(e.g., 382). Further examples include voice validation networks configured to
determine if too
much external noise is present in an audio sample for proper validation (e.g.,
383).
According to some embodiments, audio spoof detection can use an induced audio
signal. Such an induced audio signal can be an audible tone or frequency and
may also
include a signal outside human hearing. Various patterns and/or randomized
sounds can be
triggered to aid in presentation attack detection. Various validation networks
can be
configured to identify the induced audio signal as part of authentication
input collection to
confirm live authentication input.
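For illustration, detection of an induced probe tone in captured audio can be sketched with a simple spectral check; the 19 kHz tone, tolerance, and energy ratio below are illustrative assumptions, not disclosed values.

```python
import numpy as np

def tone_present(samples, rate, tone_hz, tolerance_hz=25.0, min_ratio=5.0):
    # Compare spectral energy near tone_hz against the mean spectrum to
    # confirm the induced signal was present during capture.
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    band = (freqs > tone_hz - tolerance_hz) & (freqs < tone_hz + tolerance_hz)
    return spectrum[band].max() > min_ratio * spectrum.mean()

# Example: verify a 19 kHz probe tone (near the edge of human hearing).
rate = 44100
t = np.arange(rate) / rate
captured = 0.1 * np.sin(2 * np.pi * 19000 * t) + 0.01 * np.random.randn(rate)
assert tone_present(captured, rate, 19000)
```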
Shown at 310 are examples of multiclass models that can be based on
combinations
and/or collections of various binary or other state models. For example, a
face validation
model can incorporate a variety of operations to output a collective
determination on validity
based on the underlying state determinations. In one example, the face
validation network
(e.g., 320) can analyze an image of a user face to determine if any of the
following
characteristics make the image a bad authentication input: image is too far or
too close, image
is too blurry, image is spoofed, video spoof produced the input, the user is
wearing a mask,
the user's eyes are open or closed, the user is or is not wearing eyeglasses,
etc. (e.g., 321). In
other embodiments, any combination of the foregoing conditions can be tested
and as few as
two of the foregoing options can be tested to determine the validity. In still
other
embodiments, different numbers of conditions can be used to determine if an
authentication
input is valid.
According to other embodiments, different multiclass models can be applied to
different authentication inputs. For example, at 330 shown is a fingerprint
validation model
that can test a number of conditions to determine validity. In one example, a
fingerprint
validation network (e.g., 331) is configured to test if enough ridges are
present, if the input is
a video spoof, if the input is an image spoof, if the image is too blurry, and
if the image was
captured too far or too close to an image source, among other options.
According to one embodiment, a voice validation network (e.g., 340) is
configured to
validate an audio input as a good authentication instance. In another example,
the voice
validation network can be configured to determine if there are too many voices
present, no
sound present, if too much external noise is present in an audio input, among
other options
(e.g., 341). In addition, the voice validation network can also include
operations to determine
liveness. In one example, an authentication system can induce an audio tone,
sound, or
frequency that should be detected by a validation network in order to
determine that an
authentication input is live and not spoofed. Certain time sequences or
patterns may be
induced, as well as random audio sequences and/or patterns.
Fig. 4 is a block diagram illustrating operations performed by validation
helper
networks configured to determine liveness. Fig. 4 illustrates various
considerations for
implementing validation networks to detect input spoofing according to some
embodiments.
The illustrated examples of helper networks (e.g., 408, 458) are trained by creating a multitude of spoofed input images in a variety of lighting conditions and backgrounds. The spoofed images are received at 454 and transformed into an augmented image format that limits the effects of lighting, subject skin color, and facial contour. The augmented image format can include, for example, an HSL image format. Various considerations for color harmonization are discussed in "Color Harmonization," by D. Cohen-Or et al., published 2006 by the Association for Computing Machinery, Inc. Other augmentation/homogenization formats could be used including, for example, the LAB color space or the contrast-limited adaptive histogram equalization ("CLAHE") method for light normalization.
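A minimal sketch of such light normalization, assuming OpenCV is available (the CLAHE parameters are illustrative), might be:

```python
import cv2

def normalize_lighting(bgr_image):
    # Convert to HLS, then apply contrast-limited adaptive histogram
    # equalization (CLAHE) to the lightness channel only.
    hls = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HLS)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    hls[:, :, 1] = clahe.apply(hls[:, :, 1])  # channel 1 is lightness
    return hls
```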
Once a variety of spoofed images are produced and the lighting conditions normalized, various additional spoofed instances can be created with multiple alignments, croppings, and zooms (e.g., in and out) to yield a body of approximately two million approved images. The validation network is trained on the images and its determinations tested. After each training pass, only the false positives and false negatives are retained in the training set. In some example executions, the initial two million images are reduced to about 100,000. The validation network is retrained on the remaining samples. In further embodiments, retraining can be executed repeatedly until no false positives or false negatives remain. A similar training process can be used in the context of spoofed video inputs. A video liveness validation network can be trained similarly on false positives and false negatives until the network identifies all valid inputs without false positives or false negatives.
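The iterative scheme just described can be sketched as follows; the `fit`/`predict_label` calls stand in for whatever training and inference interface is actually used and are hypothetical names.

```python
def train_until_clean(model, dataset, max_rounds=10):
    # After each round, keep only the false positives and false negatives
    # and retrain on them until none remain (or a round limit is reached).
    working_set = dataset
    for _ in range(max_rounds):
        model.fit(working_set)
        errors = [(x, y) for (x, y) in working_set
                  if model.predict_label(x) != y]
        if not errors:
            break
        working_set = errors
    return model
```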
Once trained, processing follows a similar approach with any authentication
input.
Shown are two pathways, one for video spoof inputs and one for image spoof inputs (e.g., 402 and 452, respectively). The spoofed data is received at 404/454 and the data is transformed into the HSL format at 406/456, which is processed by respective validation networks (e.g., 408/458, which can be, for example, pre-trained helper validation deep neural networks). In
response to the input of potentially spoofed authentication data, the
validation networks
408/458 output respective scores 410/460, and based on the respective scores
an
authentication system can determine if an authentication input is valid or
simply a replay or
spoof of a valid authentication input.
Unlike some conventional systems that can use machine learning approaches to
cluster images before processing, the validation networks are trained on
universal
characteristics that apply to all authentication inputs, and each
determination of validity
establishes that a singular authentication instance is valid or not. In
various embodiments, the
validation network is trained on characteristics within the data set that are independent of the subject to be identified, authenticated, and/or enrolled. With the training as
described above,
various embodiments provide helper networks that are capable of presentation
attack
detection (e.g., spoofed submission of a valid image). Clustering of similar
images, as done
in some conventional approaches, is not expected to solve this issue, and the
likely result of
such an approach would include introduction of spoofed images into such
clusters, which
ultimately will result in incorporation into and successful attacks on
resulting authentication
models.
Shown in Fig. 5 are various embodiments of helper networks configured to
analyze
voice input and determine if a valid authentication input has been submitted.
According to
some embodiments, voice helper networks can be configured to determine if too
many voices
are present in an authentication instance, if no sound is present, and/or if
external noise is too
loud, among other options to validate that a good authentication instance has
been provided.
Various sets of training data can be used to train respective voice helper
networks (e.g., voice
training data with multiple voices, training data with no voice data, training
data with external
noise, etc.).
According to one embodiment, voice validation helper networks are trained to
identify various states to determine if an authentication instance is valid
for use in
authentication. The helper networks can be trained on various audio inputs. In
one example, a
body of audio inputs are captured that are clean and valid (e.g., capture of
known valid users'
voices). The initial audio data is mixed and/or modified with external noises that affect how good the samples are as authentication sources. For example, to determine
impact of the
noise, an output of a voice embedding network can be used to evaluate a cosine
distance
between various audio inputs. Where the introduction of external noise impacts
the cosine
distance evaluation, those instances are useful in establishing a training
data set for
identifying valid/invalid audio instances.
According to one embodiment, a set of 500 clean samples is captured and used to mix with external noises (e.g., 500 external noises evaluated for impact on
cosine distance).
The 500 initial samples are expanded and mixed with external voices until a
large number of
audio samples are available for training. In one example, helper networks can
be trained on
over eight million audio samples. Once trained, the results produced by the
helper networks
are tested to determine how well the helper networks identified valid data.
False-positive
results and false negative results are then used for subsequent training
operations. According
to one embodiment, millions of samples can be reduced to hundreds of thousands
of false
positives and false negatives. In various example executions, human perception is incapable of determining a difference between the spoofed audio and a valid instance once the training data has been reduced to the level of ~100K instances; however, the trained model is able to distinguish between such audio samples.
In some implementations, false positives and false negatives are used
repeatedly to
train the model until the model is able to execute with no false positives or
false negatives.
Once that result is achieved, or a result substantially close to it (e.g., less than 1-5% false positives/false negatives remain), the voice validation model is trained and ready for use.
According to one example, an authentication system can use any number of voice
validation
helper networks that are pre-trained to detect spoofed audio instances.
Returning to Fig. 5, three example pre-trained voice helper networks (e.g.,
DNNs) are
illustrated. In the first block illustrated, each helper network is configured to detect a state: at 502, too many voices; at 522, no sound present; and/or at 542, too much external noise. The respective helper networks receive audio for processing (e.g., 504, 524, 544). According to various embodiments, PCM is executed on the received audio (e.g., 506, 526, 546). The result is transformed into the frequency domain (e.g., 508, 528, 548, Fourier transform). The
respective outputs are evaluated by pre-trained helper DNNs at 510, 530, and
550. The
respective helper networks are configured to output scores associated with
their state
evaluation. For example, the respective networks output scores at 512, 532,
and 552. The
scores can be used to determine if the audio input is valid for use in
authentication. For
example, the output value can reflect a probability an instance is valid or
invalid. In one
implementation, values above a threshold are deemed invalid and vice versa. In
further
example, some ranges for probable matching can be determined to be
inconclusive.
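By way of illustration, the PCM-to-frequency-domain path of Fig. 5 might be sketched as follows; the 16 kHz rate and 10 ms window are illustrative assumptions, and `helper_dnn` is a hypothetical pre-trained scoring network.

```python
import numpy as np

def audio_to_spectrum_features(pcm_samples, rate=16000, window_ms=10):
    # Split PCM samples into short windows and transform each window to
    # the frequency domain for scoring by a pre-trained helper DNN.
    window = int(rate * window_ms / 1000)
    frames = [pcm_samples[i:i + window]
              for i in range(0, len(pcm_samples) - window + 1, window)]
    return np.stack([np.abs(np.fft.rfft(f)) for f in frames])

# features = audio_to_spectrum_features(samples)
# score = helper_dnn(features)   # hypothetical pre-trained network
```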
According to some embodiments, the various states described above (e.g., too
many
voices, no sound, external noise issues, among other options) can be tested
via a merged
network that incorporates the illustrated pre-trained helper networks into a
single neural
network, and the output represents a collective evaluation of validity of an
audio input.
Fig. 6 illustrates a variety of helper networks configured to evaluate facial
images and
output a scoring for determining validity. In the first column shown in Fig.
6, the state being
tested is specified. For example, at 604 some of the states that respective
helper networks can
test are illustrated. Various embodiments include tests for whether an image is too blurry, does not contain enough landmarks, images a user with a mask on or off, images a user with glasses on or off, images the user with eyes closed or open, or images a face too far from or too close to an image source or camera, etc. According to some embodiments,
processing by the
helper networks proceeds at column 608 where the respective helper networks
receive image
data that is processed into normalized image data at 612 (e.g., processed into
an HSL image).
At column 616, the respective helper networks evaluate respective HSL images
and at
column 620 output a score used to determine validity based on the evaluated
state specified in
column 604.
According to various embodiments, face validation helper networks are trained based on an initial set of valid input images which are taken in a variety of lighting conditions and backgrounds so that each lighting condition has multiple backgrounds and each
background
has multiple lighting conditions. A large training set is beneficial according
to some
embodiments. In some examples, 500,000 images can be used to establish the variety of
lighting conditions and backgrounds. The initial set of images can then be
normalized to
produce HSL images. Other processes can be used to normalize the training set
of images.
The resulting images are manipulated to generate an expanded set of training
images. For
example, a variety of alignments and/or cropping of the images can be
executed. In other
examples, and in addition or in the alternative, a variety of zoom operations
(e.g., in and out)
can be applied to the images. As part of expanding the training set, the
images can be
integrated with defects, including adding bad lighting, occlusions, simulated light beams over a facial image, eliminated landmarks on faces, images that are too far from or too close to an image source, and/or introduced blurring, among other options. The initial body of training images can be expanded significantly; for example, a set of 500,000 images can be expanded into 2 million images for a training set.
Once the training set is prepared, the helper network is trained against the data to recognize valid authentication inputs. The results produced by the helper network are evaluated. Based on the results evaluation, any false positives and any false negatives are used for further training of the model. According to one example execution, about one hundred thousand images that are false positives or false negatives remain after the first attempt. Training can be repeated until no new false positives or false negatives remain, using the remaining false results to retrain. In other examples, once a sufficient level of accuracy is achieved (e.g., greater than 95%), training can be considered complete. According to
some
embodiments, facial validation helper networks are architected on a deep
neural network
model that can identify any of a number of states associated with a facial
image, and further
can be used to determine if the image is valid for use in authentication.
Shown in Fig. 7 is a similar approach for executing helper networks on
fingerprint
images, according to some embodiments. In the first column at 702, specified
is a state being
tested by a respective helper network. For example, a validation helper
network can
determine if not enough fingerprint ridges are available, if an image is too blurry, or if a fingerprint image is too far from or too close to an image source, among other options. At column
708, image data is received, and at column 714, the received image data is
transformed into
HSL image format. The HSL image is reduced to a grayscale image at column 720.
The
result is analyzed by respective helper networks (e.g., input to pre-trained
helper DNNs) at
726. Once analyzed, the respective networks output a score used to determine
validity of the
authentication instance (e.g., at column 732).
Similar to the approach discussed with respect to Fig. 6, fingerprint image
data can be
captured in multiple lighting conditions and with multiple backgrounds to
produce training
data sets used to define the helper network models. Once a body of images is
produced, the
images are transformed into HSL images and then into grayscale. A variety of
alignments,
crops, and zooms (e.g., in and out) are applied to the body of images. In addition, operations are executed on various ones of the training images to introduce defects. For example, bad lighting conditions can be added, as well as occlusions, introduction of light beams into images, removal of landmarks from the image, as well as using images where the fingerprint image is too far from and/or too close to an image source. Other example images can
include
blurry fingerprint captures or introduction of blur into training data images.
According to
some embodiments, an initial body of 500,000 images can be expanded into a
body of 2
million images to train the model.
According to one embodiment, once the expanded set of images is created, a helper
network model can be trained on the body of images to identify valid
authentication inputs.
Initially, the output determination of the helper network yields false positives and false negatives. Any resulting false positives and false negatives are used to continue training of the helper network. In one example execution, an initial set of two million images yields approximately 100,000 false positives and/or false negatives when the helper network's results are evaluated. The helper network model is retrained based on the
remaining images
and tested to identify any further false-positives and/or false negatives. The
approach can be
repeated to refine the model until no false positives or false negatives are
identified. In other
embodiments, an authentication system can use a threshold level of accuracy to
determine a
model is fully trained for use (e.g., greater than 90% accuracy, greater than 95% accuracy,
among other options).
Once respective helper networks are trained on their expanded data sets and
iterated
until no false positives or false negatives are output, an authentication
system can execute the
pre-trained helper network to determine the validity of any authentication
input and filter bad
inputs from use in training authentication models (e.g., embedding generation
networks).
Further helper network embodiments include a transcription helper network. For example, some embodiments include one or more helper networks configured to accept an audio input and evaluate whether the audio sample is of suitable quality to use in subsequent processing. In some examples, subsequent processing includes identification and/or authentication settings. In other examples, the transcription helper network (and any helper network described) can be used in other subsequent processing. In one example,
the
transcription helper network is configured to evaluate input audio and
generate a
determination that the audio sample is of suitable quality to forward for a
voice transcription.
In some embodiments, the transcription network can be trained as described
with
respect to the audio and/or voice networks herein. In further example, the transcription network can be trained to identify transcribable audio by defining a training set of good audio and bad audio. Training can be iterative as described herein. For example, bad data and false positives can be used to iteratively train a transcription helper network until no further false results are left. The resulting network can then be used on any new audio input to
evaluate whether
the input is transcribable. In some settings, an indication that the audio
input is not
transcribable can end the analysis.
Further embodiments can include a helper network trained to verify presence of a target. For example, similar in effect to a CAPTCHA check, the helper network can work on its own to identify the presence of a human being or other entity. In some embodiments, the presence verification can be configured to operate without a requirement for determining identity, and can provide a determination of whether a face is a human face. In further examples, the presence network can also determine if the information submitter is "live" - not an image or video spoof. In still other examples, the helper networks can be configured to determine liveness in the context of a submitter who is wearing a face mask (e.g., face+mask network), a submitter who is wearing a human facsimile mask, and in the context of fingerprint submission. For example, a fingerprint validation network can be trained on a variety of valid fingerprint submission inputs and a variety of invalid submissions. Various
approaches for generating invalid face submission instances are described
herein and can be
extended to the fingerprint instance.
According to various embodiments, helper networks can be configured to provide a CAPTCHA-type service. For example, ones or combinations of helper networks can be used to verify a human subject is seeking identification, authentication, verification, etc. In further embodiments, one or more helper networks can be executed for detecting and differentiating input provided by a human or machine. In an example environment, the system and associated helper networks can be used primarily in Internet applications for verifying that data originating from a source is from a human, and not from an unauthorized computer program/software agent/robot. The following helper networks can be used alone and/or in any combination to identify human versus computer actors (see also the sketch after the list below):
1. Camera input analysis networks: determine valid identification input (e.g., biometric of user's face (therefore is not a robot))
   a. Video spoofing DNN - protects against video presentation attack (PAD)
   b. Image spoofing DNN - protects against image presentation attack (PAD)
   c. Geometry DNN - finds valid face input (e.g., face biometric) in image
   d. Blurry image DNN - makes sure face input in image is not too blurry
2. Microphone input analysis networks: determine valid biometric of user's voice (therefore is not a robot)
   a. Voice spoofing DNN - protects against deepfake or recorded audio attack
   b. Validation DNN - finds valid human voice
   c. Random sentence (optional) - displays a random sentence, then uses an automatic speech recognition (ASR) DNN to convert speech to text to ensure the human said the requested words.
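By way of illustration only, the listed networks might be combined into a single human-versus-robot decision as sketched below; the `nets` mapping and the 0.5 threshold are hypothetical stand-ins for pre-trained scoring networks.

```python
def is_human_actor(image, audio, nets, threshold=0.5):
    # Each entry of `nets` is a scoring function returning a probability.
    checks = [
        nets["video_spoof"](image) < threshold,    # not a video replay
        nets["image_spoof"](image) < threshold,    # not a printed photo
        nets["face_geometry"](image) > threshold,  # a valid face was found
        nets["blur"](image) < threshold,           # image sharp enough
        nets["voice_spoof"](audio) < threshold,    # not deepfake/recorded
        nets["voice_valid"](audio) > threshold,    # a live human voice
    ]
    return all(checks)
```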
Various embodiments for captcha operation relate to electronic systems for
detecting
and differentiating input provided by humans and machines. These systems are
used
primarily in Internet applications for verifying that data originating from a
source is from a
human, and not from an unauthorized computer program/software agent/robot.
According to
one embodiment, a method of validating a source of image data input to a
computing system
is provided. The method comprises: receiving one or more images, processing
the images
using helper networks to ascertain the validity, and generating a
determination of whether the
face images originated from a machine or a human. A second embodiment concerns
a method
of validating a source of audio data input to a computing system comprising:
receiving a speech utterance from a microphone in which the user (optionally) read aloud a randomly selected
challenge text; processing the speech audio with helper networks to ascertain
the validity, and
generating a determination of whether the audio originated from a machine or a human.
Further embodiments can include a step of granting or denying access to data and/or a data processing device based on the results of the CAPTCHA-like function, including a signup for an email account or a blog posting. In others, the step of granting or denying access to an advertisement based on the determination is performed. Other embodiments perform a separate automated visual challenge test so that both visual processing and articulation processing are considered in one or more of the determinations.
The access is preferably used for one or more of the following processing
contexts: a)
establishing an online account; and/or b) accessing an online account; and/or
c) establishing a
universal online ID; and/or d) accessing a universal online ID; and/or e)
sending email;
and/or f) accessing email; and/or g) posting on a message board; and/or h)
posting on a web
log; and/or i) posting on a social network site page; j) buying or selling on
an auction site;
and/or k) posting a recommendation for an item/service; and/or l) selecting an electronic ad.
In some embodiments, the various helper networks described are intended to
operate
independently of other processing and/or functions. For example, the helper
networks can be
configured to determine if face information or fingerprint information is
suitable for
continued processing. In an identification/authentication context, the attempt
to identify
and/or authenticate may terminate upon identification of an unsuitable input
(e.g., bad
collection, spoof, etc.). In other processing contexts, the helper network can
also stop
subsequent processing or require resubmission.
Other embodiments can include one or more stand-alone helper network functionalities and/or integrate the one or more helper networks into a processing flow.
In other embodiments, helper network embodiments can be configured to determine if a person (e.g., a doctor entering a hospital) is wearing a mask or wearing a mask in the correct way. In some settings, the helper network and its determination can be used to prevent or allow entry (which can also be coupled with identity and/or authentication protocols). For example, the system can be connected to a physical controller that is configured to only allow entry if a mask is on and/or being worn properly. In various embodiments, the mask helper network is configured to validate a mask on/off state, and can also be configured to validate a mask worn properly/improperly state, irrespective of a subject to be identified.
In further embodiments, a helper network can be trained on location
information and
validate that a current geolocation of a requesting device is not blacklisted.
In some
examples, the location helper networks are trained on location information
inputs that are known to be valid as well as location information inputs that are known to be
invalid (e.g.,
as described herein with respect to various helper networks). The trained
network can then
validate location information captured at the time of an identification
function request.
Still other embodiments can include helper networks that validate
accelerometer
information captured from a device (e.g., a device requesting an
identification function, a
device associated with an identification function request, etc.). Helper
networks can be
trained on accelerometer information that reflects valid position information
(e.g., normal or
range of angles for known valid requests) and/or invalid position information
(e.g., angles or
ranges of angles for invalid requests). In one example, a helper network is
configured to
access and process accelerometer information to determine the angle at which the user is holding the phone, which can be used by the system to assert/validate liveness and/or
identity. Further
embodiments can include helper networks trained on and configured to validate temperature information to ensure the user/device is where the user/device asserts they are. Implicit in such location assertions, for example, is that it will not be 0 degrees in California during the summer. Various embodiments are configured to employ weather data to help with the
determination of validity. As discussed with respect to various examples,
validity
determinations can be made independent of a subject to be identified and
various helper
networks are configured to validate submitted data before it is used for
identification
.. functions.
According to one embodiment, liveness helper networks can be trained on and configured to test if a person is live (not a spoof) using a microphone. The system can employ a spoken random liveness sentence to make sure the person making the request is active (alive). If the user's spoken words match the requested words (above a predetermined threshold), the system can then establish a liveness dimension.
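A minimal sketch of the spoken-sentence match, assuming a simple word-overlap measure and an illustrative 0.8 threshold in place of the disclosed "predetermined threshold," might be:

```python
def liveness_sentence_match(requested_text, asr_transcript, threshold=0.8):
    # Compare ASR output against the requested random sentence by word
    # overlap; a ratio above the threshold establishes the liveness dimension.
    want = requested_text.lower().split()
    got = set(asr_transcript.lower().split())
    matched = sum(1 for word in want if word in got)
    return matched / max(len(want), 1) >= threshold
```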
Fig. 8 is a block diagram of an example embodiment of an authentication system 1400 employing private biometrics with supporting helper networks. As shown in Fig. 8, the system can be configured to accept various authentication credentials in plain-text or unencrypted form (e.g., 1401) and process the unencrypted authentication credentials (e.g., via an authentication credential processing component 1402) to ensure the input is valid and good for authentication. For example, a plurality of helper networks can process authentication input to determine validity before the inputs are processed by embedding neural networks (e.g., 1425) into one-way homomorphic
representations of the same, wherein the one-way homomorphic representations
can be
analyzed by a classification component (e.g., 1418) to determine if submitted
credentials
matched enrolled credentials (e.g., return known for match or unknown at
1450), for
example, with a neural network trained on encrypted feature vectors produced
by the
embedding networks. Evaluations of matches can be validated, for example, with a validation component 1420 that is configured to provide validation functions once match or unknown results are determined. In further embodiments, the classification component can operate by itself, and in others, as part of a classification subsystem 1416 that can
also include various
validation functions to confirm matches or unknown results.
Various embodiments include architectures that separate authentication
credential
processing (e.g., 1402) from operations of the classification subsystem (e.g.,
1416), and other
embodiments can provide either or both operations as a service-based
architecture for
authentication on private encryptions of authentication credentials.
The various functions, processes, and/or algorithms that can be executed by
the
authentication credential processing component 1402 are discussed throughout,
and the
various functions, processes, and/or algorithms that can be executed by the
classification
subsystem 1416 are also described with respect to the '014 Application. Fig. 8
is included to
provide some examples of helper networks and support functionality and/or
algorithms that
can be incorporated in the various examples, embodiments, and aspects
disclosed herein. The
following descriptions focus on the helper network functions to provide
illustration, but are
not limited to the examples discussed with Fig. 8.
For example, credential processing can include various helper networks (e.g.,
face
1404, face and mask 1406, fingerprint 1408, eyeglasses 1410, eye geometry
1412, and the
"..." at 1414, and the preceding networks can each be associated with a
validation network
configured to determine the validity of the submitted/processed authentication
instance. In
some examples, geometry or processing networks (e.g., 1404 & 1408) are
configured to
identify relevant characteristics in respective authentication input (e.g., position of eyes in a face image, position of ridges in a fingerprint image, respectively, etc.). The
output of such
networks is then validated by a validation network trained on that type of
authentication
input. The "..." at 1414 illustrates the option of including additional helper
networks, and/or
processing functions, where any number or combination of helper networks can be used with various embodiments disclosed herein.
According to some embodiments, the helper networks can be based on similar
neural
network architectures, including, for example, TensorFlow models that are
lightweight in size
and processing requirements. In further examples, the helper networks can be
configured to
execute as part of a web-based client that incorporates pre-trained neural
networks to acquire,
validate, align, reduce noise, transform, test, and once validated to
communicate validated
data to embedding networks to produce, for example, one-way encrypted input
authentication
credentials. Unlike many conventional approaches, the lightweight helper
networks can be
universally employed by conventional browsers without expensive hardware or on-
device
training. In further example, the helper networks are configured to operate
with millisecond
response time on commercially available processing power. This is in contrast
to many
conventional approaches that require specialized hardware and/or on-device
training, and that still fail to provide millisecond response time.
According to some embodiments, various helper networks can be based on deep
neural network architectures, and in further examples, can employ "you only look once" ("YOLO") architectures. In further embodiments, the helper networks are
configured to be
sized in the range of 10kB to 100kB, and are configured to process
authentication credentials
in < 10 ms with accuracies > 99%. The data footprint of these helper networks demonstrates
improved capability over a variety of systems that provide authentication
based on complex,
bulky, and size intensive neural network architectures.
According to one aspect, each authentication credential modality requires an associated helper DNN; for example, for each biometric type one or more tailored helper
networks can be instantiated to handle that biometric type. In one example, a
face helper
network and a fingerprint helper network (e.g., 1404 and 1408) can be
configured to identify
specific landmarks, boundaries, and/or other features appearing in input
authentication
credentials (e.g., face and fingerprint images respectively). Additional
helper networks can
include face and fingerprint validation models configured to determine that
the submitted
authentication credential is valid. Testing for validity can include
determining that a
submitted authentication credential is a good training data instance. In
various embodiments,
trained validation models are tailored during training so that validated
outputs improve the
entropy of the training data set, either expanding the circumstances in which
trained models
will authenticate correctly or refining the trained model to better
distinguish between
authentication classes and/or unknown results. In one example, distance metrics can be used to evaluate outputs of an embedding model. For example, valid instances improve the distance measure between dissimilar instances as well as the identification of similar instances, and the validation networks can be trained to achieve this property.
In the context of image data, a validation helper network can identify if appropriate lighting and clarity are present. Other helper networks can provide processing
of image data
prior to validation, for example, to support crop and align functions
performed on the
authentication credentials prior to communication to an embedding network for transforming them into one-way encryptions.
Other options include: helper networks configured to determine if an input
credential
includes an eyes open/eyes closed state, which can be used for passive liveness in face
recognition settings, among other options; helper networks configured to
determine an
eyeglasses on or eyeglasses off state within an input credential. The
difference in eyeglass
state can be used by the system to prevent false negatives in face
recognition. Further options
include data augmentation helper networks for various authentication
credential modalities
that are configured to increase the entropy of the enrollment set, for
example, based on
increasing the volume and robustness of the training data set.
In the voice biometric acquisition space, helper networks (e.g., helper DNNs) can be configured to isolate singular voices, and voice geometry helper networks can be trained to isolate single voices in audio data. In another example, helper network processing can include voice input segmentation to acquire voice samples using a sliding time window (e.g., 10 ms) across, for example, one second of input. In some embodiments, processing of voice data includes a pulse code modulation transformation that down-samples each time segment to 2x the frequency range, which may be coupled with fast Fourier transforms to convert the signal from the time domain to the frequency domain.
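For illustration, this segmentation and down-sampling might be sketched as follows, assuming SciPy is available; the 4 kHz frequency range and window/stride values are illustrative assumptions.

```python
import numpy as np
from scipy.signal import resample

def segment_voice(samples, rate, window_ms=10, stride_ms=10,
                  freq_range_hz=4000, total_ms=1000):
    # Slide a 10 ms window across one second of input and down-sample each
    # segment to 2x the frequency range of interest (the Nyquist rate).
    samples = np.asarray(samples)[: int(rate * total_ms / 1000)]
    win = int(rate * window_ms / 1000)
    hop = int(rate * stride_ms / 1000)
    target_len = int(2 * freq_range_hz * window_ms / 1000)
    return [resample(samples[i:i + win], target_len)
            for i in range(0, len(samples) - win + 1, hop)]
```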
Various embodiments can use any one or more and/or any combination of the following helper networks and/or associated functions. In one embodiment, the system can
include a helper network that includes a face geometry detection DNN. The face
geometry
DNN can be configured to support locating face(s) and associated
characteristics in an image
by transforming each image into geometric primitives and measuring the
relative position,
width, and other parameters of eyes, mouth(s), nose(s), and chin(s).
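As an illustration only, a face-geometry helper of this kind can be approximated with an off-the-shelf landmark detector standing in for the described DNN; the sketch below uses MediaPipe's FaceMesh purely as a stand-in.

```python
import cv2
import mediapipe as mp

def face_landmarks_xy(bgr_image):
    # Return (x, y) pixel coordinates for detected facial landmarks, or
    # None when no face is found (an invalid input for later stages).
    h, w = bgr_image.shape[:2]
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True,
                                         max_num_faces=1) as mesh:
        result = mesh.process(cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB))
    if not result.multi_face_landmarks:
        return None
    points = result.multi_face_landmarks[0].landmark
    return [(int(p.x * w), int(p.y * h)) for p in points]
```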
Facial recognition functions can be similar to fingerprint recognition
functions
executed by fingerprint helper networks as both networks process similar
modalities (e.g.,
image data and identification of structures within the image data to build an
authentication
representation). According to one embodiment, a helper network can include a
fingerprint
geometry detection DNN configured to accurately locate finger(s) in an image,
and analysis
can include transforming each image into geometric primitives to measure
each finger's
relative position, width, and other parameters. In one example, helper
networks that process
image data can be configured to identify relevant structures in the image and
return positional
information in the image (e.g., X and Y coordinates), video frame, and/or
video stream
submitted for processing of the relevant structures. In one example, geometry
networks
process image credentials and their output can be used in validating the
authentication
instance or rejecting the instance as invalid.
In another embodiment, a helper network can include a face validation DNN configured to validate face input images (e.g., front-looking face images). In
various
embodiments, the validation DNN is configured to validate any one or more or
any
combination of the following: a valid input image was received, the submitted image data has forward-facing face images, the image includes features consistent
with a facial
image (e.g., facial characteristics are present, and/or present in sufficient
volume, etc.);
lighting is sufficient; boundaries within image are consistent with facial
images, etc.
Similarly, a helper network can include a fingerprint validation DNN
configured to
validate fingerprint input images. Such validation networks can be configured
to return a
validation score used to determine if an image is valid for further
processing. In one
example, the validation networks can return a score in the range of 0 to 100, where 100
is a perfect image, although other scoring systems and/or ranges can be used.
In further embodiments, a helper network can include one or more image state
detection neural networks. The image state neural networks can be configured
to detect
various states (e.g., binary image conditions (e.g., face mask on/face mask
off, eye blink
yes/eye blink no, etc.)) or other more complex state values. The state values
can be used in
authentication credential processing. In one example, the system can employ an
image state
value to select an embedding generation neural network or to select a neural
network to
process an input authentication credential, among other options. In one
example, a detection
helper network can include a face mask detection DNN configured to determine
if image data
includes an entity wearing a face mask.
In further example, the system can also execute face mask detection algorithms
to
determine if a subject is wearing a mask. Stated broadly, masks used during
enrollment
lower subsequent prediction performance. In some embodiments, the face + mask
on/off
detection DNN accepts a face input image (e.g., a forward-looking facial
image) and returns a
value 0 to 100, where 0 is mask off and 100 is mask on. Various thresholds can
be applied to
a range of values to establish an on/off state.
In one example, a web client can include a URL parameter for enrollment and
prediction (e.g., "maskCheck=true"), and based on the output (e.g., state =
Mask On) can
communicate real-time instructions to the user to remove the mask. In other
examples, the
system can be set to automatically select a face + mask embedding DNN tailored
to process
images with face and masks. In various embodiments, the face + mask embedding
DNN is a
specialized pre-trained neural network configured to process user image data
where the user
to be authenticated is wearing a mask. A corresponding classification network
can be trained
on such data (e.g., one-way encryptions of image data where users are in
masks), and once
trained to predict matches on user's wearing masks.
In another embodiment, a helper network can be configured to determine a state
of
image data where a user is or is not wearing glasses. In one example, a
detection helper
network can include an eyeglasses detection DNN configured to determine if
image data
includes an entity wearing eyeglasses. In further example, the system can also execute an eyeglass helper network to determine if a subject is wearing eyeglasses. In one example, the system can execute an eyeglass detection algorithm to determine if a subject is wearing eyeglasses before allowing enrollment. Stated broadly, eyeglasses used during
enrollment can
lower subsequent prediction performance. In some embodiments, the eyeglasses
on/off
detection DNN accepts a front-view face input image and returns a value from 0 to 100, where 0 is
eyeglasses off and 100 is eyeglasses on. In some embodiments, various
thresholds can be
applied to a range of values to establish an on/off state. For example, values
above 60 can be
assigned to an on state with values below 40 assigned to an off state (or, for
example, above
50/below 50). Intermediate values can be deemed inconclusive or in other
embodiments the
complete range between 0 to 100 can be assigned to either state.
Various authentication systems can test if a user is wearing glasses. For
example, a
web client can include a URL parameter for enrollment and prediction (e.g.,
"eyeGlassCheck=true"), and based on the output (e.g., state = Glasses On) can
communicate
real-time instructions to the user to remove the glasses. In other
embodiments,
generation/classification networks can be trained on image data of a user with
glasses and the
associated networks can be selected based on processing images of users with
glasses and
predicting on encrypted representations of the same.
In another embodiment, a helper network can include an eye geometry detection
DNN. The detection DNN is configured to locate eye(s) in an image by
transforming a front
facing facial image into geometric primitives and measuring relative position
of the
geometric primitives. In one example, the DNN is configured to return
positional
information (e.g., x, y coordinates) of eyes in an image, video frame or video
stream.
In one embodiment, a helper network can include an eyes open/closed detection
DNN. For example, a real-time determination that an entity seeking
authentication is
blinking provides real-time passive facial liveness confirmation. Determining
that a user is
actually submitting their authentication information at the time of the
authentication request
prevents spoofing attacks (e.g., holding up an image of an authentic user). In
various
examples, the system can include algorithms to test liveness and mitigate the
risk of a photo
or video spoofing attack during unattended operation. In one example, the eye
open detection
DNN receives an input image of an eye and outputs a validation score between 0
and 100,
where 0 is eyes closed and 100 is eyes open. Various thresholds can be applied
to a range of
values to establish an eye open/closed state as discussed herein.
According to one embodiment, the authentication system prevents a user/entity
from
proceeding until the detection of a pair of eye-open/eye-closed events. In one
example, the
web client can be configured with a URL parameter "faceLiveness=true" that
allows the
system to require an eye-blink check. The parameter can be used to change
operation of
blinking testing and/or default settings. In further examples, rates of
blinking can be
established and linked to users as behavioral characteristics to validate.
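For illustration, the URL parameters described above might be combined in a single web client request as sketched below; the host and path are hypothetical.

```python
from urllib.parse import urlencode

params = {"maskCheck": "true", "eyeGlassCheck": "true", "faceLiveness": "true"}
url = "https://auth.example.com/predict?" + urlencode(params)
# e.g., https://auth.example.com/predict?maskCheck=true&eyeGlassCheck=true&faceLiveness=true
```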
In some embodiments, helper networks can be configured to augment
authentication
credential data. For example, a helper network can include facial and
fingerprint
augmentation DNNs that are used as part of training validation networks. In
various
embodiments, data augmentation via helper networks is configured to generalize
the
enrollment of authentication information, improve accuracy and performance
during
subsequent prediction, and allow the classification component and/or subsystem
to handle
real-world conditions. Stated generally, enrollment can be defined on the
system to require a
certain number of instances to achieve a level of accuracy while balancing
performance. For
example, the system can require >50 instances of an authentication credential
(e.g., >50
biometric input images) to maintain accuracy and performance. The system can
be
configured to execute algorithms to augment valid credential inputs to reach
or exceed 50
instances. For example, a set of images can be expanded to 50 or more
instances that can also
be broadened to add boundary conditions to generalize the enrollment. The
broadening can
include any one or more and/or any combination of: enhanced image rotations,
flips, color and
lighting homogenizations, among other options. Each instance of an
augmentation can be
tested to require improvement in the distance metric comparison (Euclidean distance or cosine similarity), and also be required not to surpass class
boundaries. For
example, the system can be configured to execute algorithms to remove any
authentication
credentials (e.g., images) that exceed class boundaries. Once filtered, the
remaining images
challenge the distance metric boundaries without surpassing them.
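One possible shape of the outlier-removal step is sketched below; the embedding function producing the vectors and the class-boundary radius are assumptions for illustration only:

    import numpy as np

    # Hypothetical sketch: given embeddings of augmented credential instances,
    # drop any instance whose distance from the class centroid exceeds a
    # boundary radius, so the remaining samples challenge but do not surpass
    # the class boundary.
    def filter_augmented(embeddings: np.ndarray, boundary_radius: float) -> np.ndarray:
        centroid = embeddings.mean(axis=0)
        distances = np.linalg.norm(embeddings - centroid, axis=1)  # Euclidean
        return embeddings[distances <= boundary_radius]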
In the example of image data used to authenticate, if only one image is
available for
enrollment, the system is configured to augment the facial input image >50
(e.g., 60, 70, 80,
etc.) times, remove any outliers, and then enroll the user. According to one
embodiment, the
web client is configured to capture 8 images, morph each image, for example, 9
times,
remove any outliers and then enroll the user. As discussed, the system can be
configured to
require a baseline number of instances for enrollment. For example, enrollment
can require
>50 augmented biometric input images to maintain the health, accuracy, and
performance of
the recognition operations. In various embodiments, the system accepts
biometric input
image(s), morphs and homogenizes the lighting and contrast once, and discards
the original
images once encrypted representations are produced.
It is realized that there is no intrinsic requirement to morph images for
prediction.
Thus, some embodiments are configured to morph/augment images only during
enrollment.
In other embodiments, the system can also be configured to homogenize images
submitted
for prediction (e.g., via HSL transforms, etc.). In some examples, homogenized
images used
during prediction can increase system performance when compared to non-
homogenized
images. According to some examples, image homogenization can be executed based
on
convenience libraries (e.g., in Python and JavaScript). According to some
embodiments,
during prediction the web client is configured to capture three images, morph
and
homogenize the lighting and contrast once, and then discard the original
images once
encrypted representations are generated.
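As a rough illustration of such lighting homogenization, the following OpenCV sketch equalizes the lightness channel of an HLS (HSL-type) representation; the specific color space and equalization method are assumptions, not details fixed by the embodiments:

    import cv2

    # Hypothetical sketch: homogenize lighting by equalizing the lightness
    # channel in HLS space, then converting back to BGR.
    def homogenize_lighting(bgr_image):
        h, l, s = cv2.split(cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HLS))
        l = cv2.equalizeHist(l)          # flatten the lightness histogram
        return cv2.cvtColor(cv2.merge((h, l, s)), cv2.COLOR_HLS2BGR)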
In various embodiments, helper networks can be configured to support
transformation
of authentication credentials into encrypted representations by pre-trained
neural networks
(e.g., referred to as embedding networks or generation networks). The
embedding networks
can be tailored to specific authentication credential input. According to one
embodiment, the
system includes face, face + mask, and fingerprint embedding neural networks,
among others.
The respective embedding networks are configured to transform the input image into distance-measurable one-way homomorphic encryptions (e.g., an embedding, or vector encryption), which can be a two-dimensional positional array of 128 floating-point numbers.
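Because the encryptions are distance measurable, comparison reduces to a vector distance; a minimal sketch follows, with the array flattened to one dimension and the match threshold an assumed value:

    import numpy as np

    # Hypothetical sketch: compare two distance-measurable embeddings
    # (e.g., 128 floating-point values) by cosine similarity.
    def is_match(a: np.ndarray, b: np.ndarray, threshold: float = 0.8) -> bool:
        cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
        return cos >= threshold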
In various implementations, face, face + mask, and fingerprint embedding
neural
networks maintain full accuracy through real-world boundary conditions. Real
world
conditions have been tested to include poor lighting; inconsistent camera
positioning;
expression; image rotation of up to 22.5°; variable distance; focus impacted
by blur and
movement; occlusions of 20-30% including facial hair, glasses, scars, makeup,
colored lenses
and filters, and abrasions; and B/W and grayscale images. In various
embodiments, the
embedding neural networks are architected on the MobileNetV2 architecture and
are
configured to output a one-way encrypted payload in <100ms.
In various embodiments, voice input can include additional processing. For
example,
the system can be configured to execute voice input segmentation that
generalizes the
enrollment data, improves accuracy and performance during prediction, and
allows the
system to handle real-world conditions. In various embodiments, the system is
configured to
require >50 10ms voice samples, to establish a desired level of accuracy and
performance. In
one example, the system is configured to capture voice instances based on a
sliding 10ms
window that can be captured across one second of voice input, which enables
the system to
reach or exceed 50 samples.
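That windowing could be sketched as follows, assuming mono PCM samples at an illustrative 16 kHz rate; the rate and stride are assumptions, not system requirements:

    import numpy as np

    # Hypothetical sketch: slide a 10 ms window across one second of audio,
    # yielding on the order of 100 voice samples (well above 50).
    def sliding_windows(audio: np.ndarray, sample_rate: int = 16000,
                        window_ms: int = 10, stride_ms: int = 10):
        win = int(sample_rate * window_ms / 1000)   # samples per 10 ms window
        hop = int(sample_rate * stride_ms / 1000)
        return [audio[i:i + win] for i in range(0, len(audio) - win + 1, hop)]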
In some embodiments, the system is configured to execute pulse code modulation
to
reduce the input to two times the frequency range, and PCM enables the system
to use the
smallest possible Fourier transform without computational loss. In other
embodiments, the
system is configured to execute a voice fast Fourier transform (FFT), which
transforms the pulse
code modulated audio signal from the time domain to a representation in the
frequency
domain. According to some examples, the transform output is a 2-dimensional
array of
frequencies that can be input to a voice embedding DNN. For example, the
system can
include a voice embedding network that is configured to accept input of one 2-
dimensional
array of frequencies and transform the input to a 4kB, 2-dimensional
positional array of 128
floating-point numbers (e.g., cosine-measurable embedding and/or 1-way vector
encryption),
and then deletes the original biometric.
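The time-domain to frequency-domain step might be sketched as below; the use of a real FFT and the stacking of magnitudes into a 2-dimensional array are illustrative assumptions:

    import numpy as np

    # Hypothetical sketch: transform windowed PCM audio into a 2-D array of
    # frequency magnitudes suitable as input to a voice embedding DNN.
    def to_frequency_array(windows):
        # windows: list of equal-length 1-D PCM sample arrays
        return np.stack([np.abs(np.fft.rfft(w)) for w in windows])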
According to various embodiments, the web client can be configured to acquire
authentication credentials (e.g., biometrics) at the edge with or without a
network. For
example, the web client can be configured to automatically switch to a local
mode after
detection of loss of network. According to some embodiments, the web client
can support
offline operation ("local mode") using Edge computing. In one example, the
device in local
mode authenticates a user using face and fingerprint recognition, and can do
so in 10ms with
intermittent or no Internet connection as long as the user authenticates at
least once to the
device while online. In some embodiments, the device is configured to store
the user's
embeddings and/or encrypted feature vectors locally using a web storage API
during the
prediction.
Fig. 9 illustrates an example process flow 1500 for facial recognition
according to one
embodiment. At 1502 facial image data is processed by a face geometry neural
network
using a probe. As part of execution of 1502, the neural network operates to
transform the
input data into geometric primitives and uses the geometric primitives to
locate facial
structures including, for example, eyes, mouth, nose, chin, and other relevant
facial
structures. Based on the analysis of the geometric primitives positional
information can be
output as part of 1502, and the positional information can be used in
subsequent processing
steps. For example, process 1500 can continue 1504 with processing via a face
validation
neural network. The processing at 1504 can include validating that the image data includes facial structures and information, and may employ the position information developed at 1502.
In a further example, processing and validation in 1502-1504 can include
operations to align an
input image on facial features and can include additional operations to crop
an input image
around relevant facial features (e.g., using position information). Process
1500 continues at
1506 with processing by an eyes open/closed neural network. The neural network
is
configured to detect whether facial input data includes transitions between
eyes open and
closed states, which is indicative of a live person or more specifically a
blinking person
during use of the authentication functions. According to some embodiments,
detection of
blinking can be used to validate "liveness" of authentication information
submission (e.g., not
spoofed submission).
According to some embodiments, the process flow 1500 can also include
operations
to detect whether the user is wearing glasses. For example, at 1508, submitted
user data can
be processed to determine if a submitted image includes the user wearing
eyeglasses or not.
In one example, an image capture is processed through a neural network (e.g.,
eyeglasses
on/off neural network) to determine if the image data includes the user
wearing eyeglasses or
not. The system can be configured to respond to the determination in a variety
of ways. In
one example if eyeglasses are detected a user may be requested to re-image
their face for
authentication. In other examples, the system can be configured to use
different neural
networks to process the image data. For example, a first neural network can be configured to process image data in which users are wearing glasses and a second, different neural network to process image data of users (e.g., even the same user) when not wearing glasses. The state
determination glasses on/off can be used to select between such networks.
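Such selection might be sketched as a simple dispatch on the helper's state determination; the network objects and state labels here are placeholders:

    # Hypothetical sketch: route an image to a glasses-specific embedding
    # network based on the eyeglasses on/off state determination.
    def select_network(glasses_state: str, networks: dict):
        # networks: e.g., {"on": glasses_model, "off": no_glasses_model}
        if glasses_state not in networks:
            raise ValueError("inconclusive state: request re-capture")
        return networks[glasses_state]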
In some embodiments, process 1500 can include data augmentation operations.
For
example, at 1510, data augmentation can be executed to flip and rotate
acquired images,
and/or morph acquired images to achieve a system defined requisite number of
image
samples. Various embodiments are configured to confirm and validate input
authentication
information prior to performing data expansion operations (e.g., 1510).
Ensuring valid data
and filtering bad data ensures the accuracy of any resulting enrollment. In
another example at
1510, data augmentation neural networks can be employed to homogenize lighting
conditions
for submitted image data.
According to
various embodiments, multiple techniques can be used to augment and/or
homogenize the
lighting for a subject image. In one example, two homogenization techniques
are used to
update the image data.
As shown in process flow 1500, a number of steps can be executed prior to
creation of
encrypted feature vectors/embeddings that are one-way encrypted
representations of
submitted authentication inputs. In other embodiments, the processing can be
omitted and/or
executed in fewer steps and such process flows can be reduced to functions for
creation of
one-way encryptions of authentication credentials by an embedding network
(e.g., at 1512).
In still other embodiments, processing to validate authentication inputs can
be executed to
improve enrollment and subsequent authentication can be handled by other
processes and/or
systems.
According to various embodiments, the process 1500 includes steps 1502 through 1510, which can be performed by various helper networks that improve the data
provided for
enrollment and creation of one-way encryptions of submitted authentication
information that
are derived to be measurable in their encrypted form. For example, the
operations performed
at 1502 through 1510 can improve the data input to an embedding network that
is configured
to take a plain text input and produce a one-way encrypted output of the
authentication
information. As shown in the process flow 1500, once an encrypted
representation of an
authentication input is produced, the original authentication credential
(e.g., original
biometric) can be deleted at 1514.
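Taken together, steps 1502 through 1514 might be sketched as the following pipeline; each callable stands in for a corresponding helper or embedding network and is not a real API:

    # Hypothetical sketch of the flow in Fig. 9: validate with helper
    # networks, embed to a one-way encrypted representation, then discard
    # the plaintext credential.
    def enroll_image(image, helpers, embed):
        # helpers: iterable of validation callables returning True/False
        # embed: embedding network producing a one-way encrypted vector
        if not all(check(image) for check in helpers):
            return None              # bad capture filtered out before training
        vector = embed(image)        # distance-measurable one-way encryption
        del image                    # original biometric is not retained
        return vector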
Fig. 10 is an example process flow 1600 for biometric acquisition of a
fingerprint. At
1602, image data captured by a probe is transformed into geometric primitives
based on input
to a fingerprint geometry neural network (e.g., a fingerprint geometry DNN).
The neural
network can be configured to transform image data into geometric primitives
and locate
fingerprints within the image data based on analysis of the geometric
primitives, relative
spacing, boundaries, structures, etc. In some embodiments, output of the
fingerprint geometry
DNN can include positional information for fingerprints and/or characteristics
within the
image data.
In step 1604, submitted data can be processed to determine validity. For
example, the
image data can be input into a fingerprint validation neural network at 1604.
In one example,
the fingerprint validation neural network can be architected as a DNN. The
neural network
can be configured to validate a proper fingerprint capture exists in the image
data (e.g., based
on analysis of the image data by the neural network and/or geometric
primitives produced by
the fingerprint geometry neural network). In further embodiments the
fingerprint validation
neural network can also be configured to determine the validity of the
submitted fingerprint
data. For example, the validity helper network can be configured to determine
that a live
sample (and not spoofed) is being presented, as well as validating the input
as a good
authentication data source.
Similar to process 1500, process 1600 includes operations to augment data
submission. Data augmentation (e.g., 1606) can be executed as part of
enrollment to ensure a
threshold number of data instances are provided during enrollment. In various
embodiments,
process flow 1600 is configured to validate authentication inputs to ensure
good inputs are
augmented for training further models.
In further examples, data augmentation can also be used during prediction
operations.
In one example, data augmentation during prediction can be limited to
homogenizing light
conditions for submitted image data (e.g., face image, fingerprint image,
other image, etc.).
According to one embodiment, fingerprint image data is manipulated to improve
the image
data and/or create additional instances as part of data augmentation steps.
Manipulation can
include image flips, rotations, skews, offsets, cropping, among other options.
Operations
executed during data augmentation can also include homogenization of the
lighting
conditions for an input image (e.g., transform into HSL). Various lighting
homogenization
functions can be executed on the image data. In one example, the system is
configured to
execute at least two homogenization techniques to standardize lighting
conditions. According
to some embodiments, the operations of 1606 can also include conversion of the
image to a
grayscale image.
Steps 1602 through 1606 can be executed to improve and/or prepare fingerprint
image
data for enrollment by a fingerprint embedding neural network (e.g., at 1608).
The fingerprint
embedding neural network is configured to generate one-way distance
measurable encrypted
representations of input authentication credentials. For example, the
fingerprint embedding
neural network can be architected as a deep neural network. The fingerprint
embedding DNN
can be configured to create one-way homomorphic encryptions of input
fingerprint data.
Once the encrypted representations are produced, the encrypted representations
can be used
in subsequent operations (e.g., classification and/or prediction), and the
process flow 1600
can include a step (e.g., 1610) to delete any original authentication
credential information,
including any original biometric.
Fig. 11 is an example process flow 1700 for acquisition of vocal
authentication
credentials. According to one embodiment, process 1700 can begin based on
transformation
of voice data captured by a probe at 1702. According to one example, input
voice data is
transformed based on voice pulse code modulation (PCM). Processing of the
audio data can
include capturing samples of time segments from the audio information. In one
example,
silence is removed from the audio information and PCM is executed against one
second
samples from the remaining audio data. In other embodiments, different sample
sizes can be
used to achieve a minimum number of authentication instances for enrollment
and/or
prediction. According to some embodiments, the PCM operation is configured to
down
sample the audio information to two times the frequency range. In other
embodiments
different down sampling frequencies can be used. Once PCM is complete at 1702,
process
1700 continues at 1704 with a Fourier transformation of the PCM signal from
the time
domain to the frequency domain. According to some embodiments, a voice fast
Fourier
transformation operation is executed at 1704 to produce the frequency domain
output.
Process 1700 continues at 1706, where the frequency domain output of 1704 can
be
input into a voice embedding neural network. According to some embodiments,
the voice
embedding neural network can include or be based on a deep neural network
architecture. As
discussed herein, the embedding neural network is configured to produce a one-
way
encryption of input authentication information. In this example, the voice
embedding DNN is
configured to generate an encrypted representation of audio/voice data that is
geometrically
measurable (e.g., cosine measurable). Once the encrypted representation is
generated, any
original authentication information can be deleted at 1708. For example, once
the voice
embedding DNN produces its encryption, the original audio input can be deleted
to preserve
privacy.
Modifications and variations of the discussed embodiments will be apparent to
those
of ordinary skill in the art and all such modifications and variations are
included within the
scope of the appended claims. For example, while many examples and embodiments
are
discussed above with respect to a user or person, and
identification/authentication of same, it
is realized that the system can identify and/or authenticate any item or thing
or entity for
which image capture is possible (e.g., family pet, heirloom, necklace, ring,
landscape, etc.) or
other type of digital capture is possible (e.g., ambient noise in a location,
song, singing,
specific gestures by an individual, sign language movements, words in sign
language, etc.).
Once digitally captured the object of identification/authentication can be
processed by a first
generation/embedding network, whose output is used to train a second
classification network,
enabling identification of the object in both distance measure and
classification settings on
fully encrypted identifying information. In further aspects, the
authentication systems (e.g.,
embedding and classification networks) are protected by various helper
networks that process
and validate authentication data as good or bad sources of data. Filtering of
bad data sources
protects subsequent embedding models and yields authentication systems that
are more
accurate and flexible than conventional approaches.
An illustrative computer system on which the discussed functions, algorithms,
and/or
neural networks can be implemented is shown by way of computer system 1200, FIG.
12,
which may be used in connection with any of the embodiments of the disclosure
provided
herein. The computer system 1200 may include one or more processors 1210 and
one or
more articles of manufacture that comprise non-transitory computer-readable
storage media
(e.g., memory 1220 and one or more non-volatile storage media 1230). The
processor 1210
may control writing data to and reading data from the memory 1220 and the non-
volatile
storage device 1230 in any suitable manner. To perform any of the
functionality described
herein, the processor 1210 may execute one or more processor-executable
instructions stored
in one or more non-transitory computer-readable storage media (e.g., the
memory 1220),
which may serve as non-transitory computer-readable storage media storing
processor-
executable instructions for execution by the processor 1210.
Private Biometric Implementation (Figs. 1-5d)
Various embodiments are discussed below for enrolling users with private
biometrics
and prediction on the same. Various embodiments describe some considerations
broadly and
provide illustrative examples for implementation of private biometrics.
These examples and
embodiments can be used with liveness verification of the respective private
biometrics as
discussed above. Further embodiments can include and/or be coupled with
various helper
networks to facilitate authentication information acquisition, validation,
and/or enrollment of
the same, and establish a fully private implementation for identification and
authentication.
Fig. 13 is an example process flow 2100 for enrolling in a privacy-enabled
biometric
system (e.g., Fig. 3, 304 described in greater detail below or Fig. 7, 704
below). Process 2100
begins with acquisition of unencrypted biometric data at 2102. The unencrypted
biometric data
(e.g., plaintext, reference biometric, etc.) can be directly captured on a
user device, received
from an acquisition device, or communicated from stored biometric information.
In one
example, a user takes a photo of themselves on their mobile device for
enrollment. Pre-
processing steps can be executed on the biometric information at 2104. For
example, given a
photo of a user, pre-processing can include cropping the image to significant
portions (e.g.,
around the face or facial features). Various examples exist of photo
processing options that
can take a reference image and identify facial areas automatically.
In another example, the end user can be provided a user interface that
displays a
reference area, and the user is instructed to position their face from an
existing image into the
designated area. Alternatively, when the user takes a photo, the identified
area can direct the
user to focus on their face so that it appears within the highlighted area. In
other options, the
system can analyze other types of images to identify areas of interest (e.g.,
iris scans, hand
images, fingerprint, etc.) and crop images accordingly. In yet other options,
samples of voice
recordings can be used to select data of the highest quality (e.g., lowest
background noise), or
can be processed to eliminate interference from the acquired biometric (e.g.,
filter out
background noise).
Having a given biometric, the process 2100 continues with generation of
additional
training biometrics at 2106. For example, a number of additional images can be
generated
from an acquired facial image. In one example, an additional twenty-five
images are created
to form a training set of images. In some examples, as few as three or even one image can be
used but with the tradeoff of reduced accuracy. In other examples, as many as
forty training
images may be created or acquired. The training set is used to provide for
variation of the
initial biometric information, and the specific number of additional training
points can be
tailored to a desired accuracy (see, e.g., Tables I-VIII below, which provide example implementations and test results).
Other embodiments can omit generation of additional training biometrics.
Various
ranges of training set production can be used in different embodiments (e.g.,
any set of images
from two to one thousand). For an image set, the training group can include
images of different
lighting, capture angle, positioning, etc. For audio-based biometrics,
different background
noises can be introduced, different words can be used, different samples from
the same vocal
biometric can be used in the training set, among other options. Various
embodiments of the
system are configured to handle multiple different biometric inputs including
even health
profiles that are based at least in part on health readings from health
sensors (e.g., heart rate,
blood pressure, EEG signals, body mass scans, genome, etc.), and can, in some
examples,
include behavioral biometric capture/processing. According to various
embodiments,
biometric information includes Initial Biometric Values (IBV), a set of
plaintext values
(pictures, voice, SSN, driver's license number, etc.) that together define a
person.
At 2108, feature vectors are generated from the initial biometric information
(e.g., one
or more plain text values that identify an individual). Feature vectors are
generated based on
all available biometric information, which can include a set of training
biometrics generated
from the initial unencrypted biometric information received on an individual
or individuals.
According to one embodiment, the IBV is used in enrollment, for example in
process 2100.
The set of IBVs are processed into a set of initial biometric vectors (e.g.,
encrypted feature
vectors) which are used downstream in a subsequent neural network.
In one implementation, users are directed to a website to input multiple data
points for
biometric information (e.g., multiple pictures including facial images), which
can occur in
conjunction with personally identifiable information ("PII"). The system
and/or execution of
process 2100 can include tying the PII to encryptions of the biometric as
discussed below.
In one embodiment, a convolutional deep neural network is executed to process
the
unencrypted biometric information and transform it into feature vector(s)
which have a
property of being one-way encrypted cipher text. The neural network is applied
(2108) to
compute a one-way homomorphic encryption of the biometric, resulting in
feature vectors
(e.g., at 2110). These outputs can be computed from an original biometric
using the neural
network but the values are one-way in that the neural network cannot then be
used to regenerate
the original biometrics from the outputs.
Various embodiments employ networks that take as input a plaintext input and
return
Euclidean measurable output. One such implementation is FaceNet which takes in
any image
of a face and returns 128 floating-point numbers as the feature vector. The
neural network is
fairly open ended, where various implementations are configured to return a
distance or
Euclidean measurable feature vector that maps to the input. This feature
vector is nearly
impossible to use to recreate the original input biometric and is therefore
considered a one-way
encryption.
Various embodiments are configured to accept the feature vector(s) produced by
a first
neural network and use it as input to a new neural network (e.g., a second
classifying neural
network). According to one example, the new neural network has additional
properties. This
neural network is specially configured to enable incremental training (e.g.,
on new users and/or
new feature vectors) and configured to distinguish between a known person and
an unknown
person. In one example, a fully connected neural network with 2 hidden layers
and a "hinge"
loss function is used to process input feature vectors and return a known
person identifier (e.g.,
person label or class) or indicate that the processed biometric feature
vectors are not mapped
to a known person. For example, the hinge loss function outputs one or more
negative values
if the feature vector is unknown. In other examples, the output of the second
neural network
is an array of values, wherein the values and their positions in the array
determine a match to
a person or identification label.
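One way such a classifier could be sketched is shown below (Keras-style, two hidden layers, a categorical hinge loss, and linear outputs so that values for unknowns can go negative); the layer sizes and optimizer are illustrative assumptions:

    import tensorflow as tf

    # Hypothetical sketch: fully connected classifier with 2 hidden layers
    # and a hinge loss; linear outputs permit all-negative values for
    # unknown persons.
    def build_classifier(input_dim: int = 128, num_classes: int = 100):
        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(input_dim,)),
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dense(256, activation="relu"),
            tf.keras.layers.Dense(num_classes, activation="linear"),
        ])
        model.compile(optimizer="adam", loss=tf.keras.losses.CategoricalHinge())
        return model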
Various embodiments use different machine learning models for capturing
feature
vectors in the first network. According to various embodiments, the feature
vector capture is
accomplished via a pre-trained neural network (including, for example, a
convolutional neural
network) where the output is distance measurable (e.g., Euclidean measurable).
In some
examples, this can include models having a softmax layer as part of the model,
and capture of
feature vectors can occur preceding such layers. Feature vectors can be
extracted from the pre-
trained neural network by capturing results from the layers that are Euclidean
measurable. In
some examples, the softmax layer or categorical distribution layer is the
final layer of the
model, and feature vectors can be extracted from the n-1 layer (e.g., the
immediately preceding
layer). In other examples, the feature vectors can be extracted from the model
in layers
preceding the last layer. Some implementations may offer the feature vector as
the last layer.
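In a Keras-style model this capture can be sketched as follows; indexing the n-1 layer assumes the softmax is the final layer, which is an assumption rather than a requirement:

    import tensorflow as tf

    # Hypothetical sketch: expose the n-1 layer (immediately preceding the
    # softmax) of a pre-trained model so its distance-measurable activations
    # can be captured as feature vectors.
    def feature_extractor(model: tf.keras.Model) -> tf.keras.Model:
        return tf.keras.Model(inputs=model.input, outputs=model.layers[-2].output)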
In some embodiments, an optional step can be executed as part of process 2100
(not
shown). The optional step can be executed as a branch or fork in process 2100
so that
authentication of a user can immediately follow enrollment of a new user or
authentication
information. In one example, a first phase of enrollment can be executed to
generate encrypted
feature vectors. The system can use the generated encrypted feature vectors
directly for
subsequent authentication. For example, distance measures can be applied
to determine a
distance between enrolled encrypted feature vectors and a newly generated
encrypted feature
vector. Where the distance is within a threshold, the user can be
authenticated or an
authentication signal returned. In various embodiments, this optional
authentication approach
can be used while a classification network is being trained on encrypted
feature vectors in the
following steps.
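A sketch of that optional distance check follows, using Euclidean distance against the enrolled vectors; the threshold is an assumed value:

    import numpy as np

    # Hypothetical sketch: authenticate by the distance between a newly
    # generated encrypted feature vector and the user's enrolled vectors.
    def authenticate(new_vec: np.ndarray, enrolled: np.ndarray,
                     threshold: float = 0.9) -> bool:
        # enrolled: (n, d) array of the user's enrolled encrypted vectors
        dists = np.linalg.norm(enrolled - new_vec, axis=1)
        return bool(dists.min() <= threshold)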
The resulting feature vectors are bound to a specific user classification at
2112. For
example, deep learning is executed at 2112 on the feature vectors based on a
fully connected
neural network (e.g., a second neural network, an example classifier network).
The execution
is run against all the biometric data (i.e., feature vectors from the initial
biometric and training
biometric data) to create the classification information. According to one
example, a fully
connected neural network having two hidden layers is employed for
classification of the
biometric data. In another example, a fully connected network with no hidden
layers can be
used for the classification. However, the use of the fully connected network
with two hidden
layers generated better accuracy in classification in some example executions
(see, e.g., Tables I-VIII described in greater detail below). According to one embodiment,
process 2100 can be
executed to receive an original biometric (e.g., at 2102) generate feature
vectors (e.g., 2110),
and apply an FCNN classifier to return a label for identification at 2112
(e.g., output #people).
In further embodiments, step 2112 can also include filtering operations
executed on the
encrypted feature vectors before binding the vectors to a label via training
the second network.
For example, encrypted feature vectors can be analyzed to determine if they
are within a certain
distance of each other. Where the generated feature vectors are too far apart,
they can be
rejected for enrollment (i.e., not used to train the classifier network). In
other examples, the
system is configured to request additional biometric samples, and re-evaluate
the distance
threshold until satisfied. In still other examples, the system rejects the
encrypted biometrics
and requests new submissions to enroll.
Process 2100 continues with discarding any unencrypted biometric data at 2114.
In
one example, an application on the user's phone is configured to enable
enrollment of captured
biometric information and configured to delete the original biometric
information once
processed (e.g., at 2114). In other embodiments, a server system can process
received
biometric information and delete the original biometric information once
processed.
According to some aspects, only requiring that original biometric information
exists for a short
period during processing or enrollment significantly improves the security of
the system over
conventional approaches. For example, systems that persistently store or
employ original
biometric data become a source of vulnerability. Unlike a password that can be
reset, a
compromised biometric remains compromised, virtually forever.
Returning to process 2100, at 2116 the resulting cipher text (e.g., feature
vectors)
biometric is stored. In one example, the encrypted biometric can be stored
locally on a user
device. In other examples, the generated encrypted biometric can be stored on
a server, in the
cloud, a dedicated data store, or any combination thereof. In one example, the
encrypted
biometrics and classification are stored for use in subsequent matching or
searching. For
instance, new biometric information can be processed to determine if the new
biometric
information matches any classifications. The match (depending on a probability
threshold) can
then be used for authentication or validation.
In cases where a single match is executed, the neural network model employed
at 2112
can be optimized for one to one matching. For example, the neural network can
be trained on
the individual expected to use a mobile phone (assuming no other authorized
individuals for
the device). In some examples, the neural network model can include training
allocation to
accommodate incremental training of the model on acquired feature vectors over
time. Various
embodiments, discussed in greater detail below, incorporate incremental training
operations for
the neural network to permit additional people and to incorporate newly
acquired feature
vectors.
In other embodiments, an optimized neural network model (e.g., FCNN) can be
used
for a primary user of a device, for example, stored locally, and remote
authentication can use a
data store and one to many models (e.g., if the first model returns unknown).
Other
embodiments may provide the one to many models locally as well. In some
instances, the
authentication scenario (e.g., primary user or not) can be used by the system
to dynamically
select a neural network model for matching, and thereby provide additional
options for
processing efficiency.
Fig. 14 illustrates an example process 2200 for authentication with secured
biometric
data. Process 2200 begins with acquisition of multiple unencrypted biometrics
for analysis at
2202. In one example, the privacy-enabled biometric system is configured to
require at least
three biometric identifiers (e.g., as plaintext data, reference biometric, or
similar identifiers).
If for example, an authentication session is initiated, the process can be
executed so that it only
continues to the subsequent steps if a sufficient number of biometric samples
are taken, given,
and/or acquired. The number of required biometric samples can vary, and the process can proceed with as few as one.
Similar to process 2100, the acquired biometrics can be pre-processed at
2204 (e.g.,
images cropped to facial features, voice sampled, iris scans cropped to
relevant portions, etc.).
Once pre-processing is executed the biometric information is transformed into
a one-way
homomorphic encryption of the biometric information to acquire the feature
vectors for the
biometrics under analysis (e.g., at 2206). Similar to process 2100, the
feature vectors can be
acquired using any pre-trained neural network that outputs distance measurable
encrypted
feature vectors (e.g., Euclidean measurable feature vectors, homomorphic
encrypted feature
vectors, among other options). In one example, this includes a pre-trained
neural network that
incorporates a softmax layer. However, other examples do not require the pre-
trained neural
network to include a softmax layer, only that it output Euclidean measurable
feature vectors.
In one example, the feature vectors can be obtained in the layer preceding the
softmax layer as
part of step 2206.
In various embodiments, authentication can be executed based on comparing
distances
between enrolled encrypted biometrics and subsequently created encrypted
biometrics. In
further embodiments, this is executed as a first phase of authentication. Once
a classifying
network is trained on the encrypted biometrics a second phase of
authentication can be used,
and authentication determinations made via 2208.
According to some embodiments, the phases of authentication can be executed
together
and even simultaneously. In one example, an enrolled user will be
authenticated using the
classifier network (e.g., second phase), and a new user will be authenticated
by comparing
distances between encrypted biometrics (e.g., first phase). As discussed, the
new user will
eventually be authenticated using a classifier network trained on the new
user's encrypted
biometric information, once the classifier network is ready.
At 2208, a prediction (e.g., a via deep learning neural network) is executed
to determine
if there is a match for the person associated with the analyzed biometrics. As
discussed above
with respect to process 2100, the prediction can be executed as a fully
connected neural network
having two hidden layers (during enrollment the neural network is configured
to identify input
feature vectors as (previously enrolled) individuals or unknown, and an
unknown individual
(not previously enrolled) can be added via incremental training or full
retraining of the model).
In other examples, a fully connected neural network having no hidden layers
can be used.
Examples of neural networks are described in greater detail below (e.g., Figs. 17-20 illustrate an example neural network). Other embodiments of the neural network can be used in process 2200. According to some embodiments, the neural network operates as a classifier during enrollment to map feature vectors to identifications, and operates as a predictor to identify a known person or an unknown. In some embodiments, different neural
networks can
be tailored to different types of biometrics, and facial images processed by
one, while voice
biometrics are processed by another.
According to some embodiments, process 2200 is described agnostic to submitter
security. In other words, process 2200 relies on front end application
configuration to ensure
submitted biometrics are captured from the person trying to authenticate. As
process 2200 is
agnostic to submitter security, the process can be executed in local and
remote settings in the
same manner. However, according to some implementations the execution relies
on the native
application or additional functionality in an application to ensure an
acquired biometric
represents the user to be authenticated or matched.
Fig. 15 illustrates an example process flow 2250 showing additional details
for a one-to-many matching execution (also referred to as prediction). According to one
embodiment,
process 2250 begins with acquisition of feature vectors (e.g., step 2206 of
Fig. 14 or 2110 of
Fig. 13). At 2254, the acquired feature vectors are matched against existing
classifications via
a deep learning neural network. In one example, the deep learning neural
network has been
trained during enrollment on a set of individuals. The acquired feature
vectors will be
processed by the trained deep learning network to predict if the input is a
match to a known individual or does not match and returns unknown. In one example, the deep
learning network
is a fully connected neural network ("FCNN"). In other embodiments, different
network
models are used for the second neural network.
According to one embodiment, the FCNN outputs an array of values. These
values,
based on their position and the value itself, determine the label or unknown.
According to one
embodiment, returned from a one-to-many case is a series of probabilities associated with the match. Assuming five people in the trained data, the output layer showing
probability of match
by person: [0.1, 0.9, 0.3, 0.2, 0.1] yields a match on Person 2 based on a
threshold set for the
classifier (e.g., > .5). In another run, the output layer: [0.1, 0.6, 0.3,
0.8, 0.1] yields a match on
Person 2 & Person 4 (e.g., using the same threshold).
However, where two results exceed the match threshold, the process and/or
system is
configured to select the maximum value and yield a (probabilistic) match on Person 4. In another example, the output layer [0.1, 0.2, 0.3, 0.2, 0.1] shows no match to a known person, hence an UNKNOWN person, as no values exceed the threshold. Interestingly, this may
result in
adding the person into the list of authorized people (e.g., via enrollment
discussed above), or
this may result in the person being denied access or privileges on an
application. According to
various embodiments, process 2250 is executed to determine if the person is
known or not. The
functions that result can be dictated by the application that requests
identification of the analyzed biometrics.
For an UNKNOWN person, i.e. a person never trained to the deep learning
enrollment
and prediction neural network, an output layer of an UNKNOWN person looks like
[-0.7, -1.7,
-6.0, -4.3]. In this case, the hinge loss function has guaranteed that the
vector output is all
negative. This is the case of an UNKNOWN person. In various embodiments, the
deep
learning neural network must have the capability to determine if a person is
UNKNOWN.
Other solutions that appear viable, for example, support vector machine
("SVM") solutions
break when considering the UNKNOWN case. In one example, the issue is
scalability. An
SVM implementation cannot scale in the many-to-many matching space, becoming increasingly unworkable until the model simply cannot be used to return a match in any time
deemed
functional (e.g., 100 person matching cannot return a result in less than 20
minutes). According
to various embodiments, the deep learning neural network (e.g., an enrollment
& prediction
neural network) is configured to train and predict in polynomial time.
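Interpreting the output layer as described above can be sketched as follows; the 0.5 threshold mirrors the example and is not fixed by the embodiments:

    import numpy as np

    # Hypothetical sketch: interpret a classifier output array as a person
    # label or UNKNOWN, per the threshold/argmax behavior described above.
    def interpret(output: np.ndarray, threshold: float = 0.5):
        if np.all(output < 0):          # hinge loss drives unknowns negative
            return "UNKNOWN"
        if not (output >= threshold).any():
            return "UNKNOWN"
        return int(np.argmax(output))   # maximum wins if several exceed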
Step 2256 can be executed to vote on matching. According to one embodiment,
multiple images or biometrics are processed to identify a match. In an example
where three
images are processed the FCNN is configured to generate an identification on
each and use
each match as a vote for an individual's identification. Once a majority is
reached (e.g., at least
two votes for person A) the system returns as output identification of person
A. In other
instances, for example, where there is a possibility that an unknown person may result, voting
can be used to facilitate determination of the match or no match. In one
example, each result
that exceeds the threshold probability can count as one vote, and the final
tally of votes (e.g.,
often 4 out of 5) is used to establish the match. In some implementations, an
unknown class
may be trained in the model; in the examples above a sixth number would
appear with a
probability of matching the unknown model. In other embodiments, the unknown
class is not
used, and matching is made or not against known persons. Where a sufficient
match does not
result, the submitted biometric information is unknown.
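The voting pass over several probe submissions might be sketched as below; the majority rule follows the example above, and the names are illustrative:

    from collections import Counter

    # Hypothetical sketch: majority vote across per-image predictions
    # (each prediction is a person label or "UNKNOWN").
    def vote(predictions, majority: int):
        label, count = Counter(predictions).most_common(1)[0]
        return label if count >= majority else "UNKNOWN"

    # e.g., vote(["A", "A", "UNKNOWN"], majority=2) returns "A"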
Responsive to matching on newly acquired biometric information, process 2250
can
include an optional step 2258 for retraining of the classification model. In
one example, a
threshold is set such that step 2258 tests if a threshold match has been
exceeded, and if yes, the
deep learning neural network (e.g., classifier & prediction network) is
retrained to include the
new feature vectors being analyzed. According to some embodiments, retraining
to include
newer feature vectors permits biometrics that change over time (e.g., weight
loss, weight gain,
aging or other events that alter biometric information, haircuts, among other
options).
Fig. 16 is a block diagram of an example privacy-enabled biometric system
2304.
According to some embodiments, the system can be installed on a mobile device
or called from
a mobile device (e.g., on a remote server or cloud based resource) to return
an authenticated or
not signal. In various embodiments, system 2304 can execute any of the
preceding processes.
For example, system 2304 can enroll users (e.g., via process 2100), identify
enrolled users (e.g.,
process 2200), and search for matches to users (e.g., process 2250).
According to various embodiments, system 2304 can accept, create or receive
original
biometric information (e.g., input 2302). The input 2302 can include images of
people, images
of faces, thumbprint scans, voice recordings, sensor data, etc. A biometric
processing
component (e.g., 2308) can be configured to crop received images, sample voice
biometrics,
etc., to focus the biometric information on distinguishable features (e.g.,
automatically crop
image around face). Various forms of pre-processing can be executed on the
received
biometrics, designed to limit the biometric information to important features.
In some
embodiments, the pre-processing (e.g., via 2308) is not executed or available.
In other
embodiments, only biometrics that meet quality standards are passed on for
further processing.
Processed biometrics can be used to generate additional training data, for
example, to
enroll a new user. A training generation component 2310 can be configured to
generate new
biometrics for a user. For example, the training generation component can be
configured to
create new images of the user's face having different lighting, different
capture angles, etc., in
order to build a training set of biometrics. In one example, the system includes
a training threshold
specifying how many training samples to generate from a given or received
biometric. In
another example, the system and/or training generation component 2310 is
configured to build
twenty-five additional images from a picture of a user's face. Other numbers
of training images,
or voice samples, etc., can be used.
The system is configured to generate feature vectors from the biometrics
(e.g., process
images from input and generated training images). In some examples, the system
2304 can
include a feature vector component 2312 configured to generate the feature
vectors. According
to one embodiment, component 2312 executes a convolutional neural network
("CNN"), where
the CNN includes a layer which generates Euclidean measurable output. The
feature vector
component 2312 is configured to extract the feature vectors from the layers
preceding the
softmax layer (including for example, the n-1 layer). As discussed above,
various neural
networks can be used to define feature vectors tailored to an analyzed
biometric (e.g., voice,
image, health data, etc.), where an output of or with the model is Euclidean
measurable. Some
examples of these neural networks include models having a softmax layer. Other
embodiments
use a model that does not include a softmax layer to generate Euclidean
measurable vectors.
Various embodiments of the system and/or feature vector component are
configured to
generate and capture feature vectors for the processed biometrics in the layer
or layers preceding
the softmax layer.
According to another embodiment, the feature vectors from the feature vector
component 2312 or system 2304 are used by the classifier component 2314 to
bind a user to a
classification (i.e., mapping biometrics to a matchable/searchable
identity). According to one
embodiment, the deep learning neural network (e.g., enrollment and prediction
network) is
executed as an FCNN trained on enrollment data. In one example, the FCNN
generates an
output identifying a person or indicating an UNKNOWN individual (e.g., at
2306). Other
examples use neural networks that are not fully connected.
According to various embodiments, the deep learning neural network (e.g.,
which can
be an FCNN) must differentiate between known persons and the UNKNOWN. In some
examples, this can be implemented as a sigmoid function in the last layer that
outputs
probability of class matching based on newly input biometrics or showing
failure to match.
Other examples achieve matching based on a hinge loss function.
In further embodiments, the system 2304 and/or classifier component 2314 are
configured to generate a probability to establish when a sufficiently close
match is found. In
some implementations, an unknown person is determined based on negative return
values. In
other embodiments, multiple matches can be developed and voting can also be
used to increase
accuracy in matching.
Various implementations of the system have the capacity to use this approach
for more
than one set of input. The approach itself is biometric agnostic. Various
embodiments employ
feature vectors that are distance measurable and/or Euclidean measurable,
which is generated
using the first neural network. In some instances, different neural networks
are configured to
process different types of biometrics. Using that approach the encrypted
feature vector
generating neural network may be swapped for or use a different neural network
in conjunction
with others where each is capable of creating a distance and/or Euclidean
measurable feature
vector based on the respective biometric. Similarly, the system may enroll in
two or more
biometric types (e.g., use two or more vector generating networks) and predict
on the feature
vectors generated for both (or more) types of biometrics using both neural
networks for
processing respective biometric type simultaneously. In one embodiment,
feature vectors from
each type of biometric can likewise be processed in respective deep learning
networks
configured to predict matches based on feature vector inputs or return
unknown. The
simultaneous results (e.g., one from each biometric type) may be used to
identify using a voting
scheme, or may perform better by firing both predictions simultaneously.
According to further embodiments, the system can be configured to incorporate
new
identification classes responsive to receiving new biometric information. In
one embodiment,
the system 2304 includes a retraining component configured to monitor a number
of new
biometrics (e.g., per user/identification class or by total number of new
biometrics) and
automatically trigger a re-enrollment with the new feature vectors derived
from the new
biometric information (e.g., produced by 2312). In other embodiments, the
system can be
configured to trigger re-enrollment on new feature vectors based on time or
time period
elapsing.
The system 2304 and/or retraining component 2316 can be configured to store
feature
vectors as they are processed, and retain those feature vectors for retraining
(including for
example feature vectors that are unknown to retrain an unknown class in some
examples).
Various embodiments of the system are configured to incrementally retrain the
model on
system assigned numbers of newly received biometrics. Further, once a system
set number of
incremental retrainings have occurred, the system is further configured to
complete a full retrain
of the model. The variables for incremental retraining and full retraining can
be set on the
system via an administrative function. Some defaults include incremental
retrain every 3, 4, 5,
6 identifications, and full retrain every 3, 4, 5, 6, 7, 8, 9, 10 incremental
retrains. Additionally,
this requirement may be met by using calendar time, such as retraining once a
year. These
operations can be performed on offline (e.g., locked) copies of the model, and
once complete
the offline copy can be made live.
Additionally, the system 2304 and/or retraining component 2316 is configured
to
update the existing classification model with new users/identification
classes. According to
various embodiments, the system builds a classification model for an initial
number of users,
which can be based on an expected initial enrollment. The model is generated
with empty or
unallocated spaces to accommodate new users. For example, a fifty user base is
generated as
a one hundred user model. This over allocation in the model enables
incremental training to
be executed on the classification model. When a new user is added, the system and/or retraining component 2316 is configured to incrementally retrain the classification model, ultimately saving significant computation time over conventional retraining executions. Once
the over allocation is exhausted (e.g., 100 total identification classes) a
full retrain with an
additional over allocation can be made (e.g., fully retrain the 100 classes to
a model with 150
classes). In other embodiments, an incremental retrain process can be executed
to add
additional unallocated slots.
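Over-allocation can be sketched as simply sizing the classifier's output layer beyond current enrollment; the factor of two mirrors the fifty-to-one-hundred example and is illustrative only:

    # Hypothetical sketch: over-allocate output classes so new users can be
    # added by incremental retraining rather than a full retrain
    # (e.g., 50 enrolled users in a 100-class model).
    def allocated_classes(enrolled_users: int, over_allocation: float = 2.0) -> int:
        return int(enrolled_users * over_allocation)

A classifier such as the build_classifier sketch above could then be created with num_classes=allocated_classes(50), leaving fifty unallocated slots for incremental training.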
Even with the reduced time retraining, the system can be configured to operate
with
multiple copies of the classification model. One copy may be live and used
for authentication
or identification. A second copy may be an updated version that is taken
offline (e.g., locked
from access) to accomplish retraining while permitting identification
operations to continue
with a live model. Once retraining is accomplished, the updated model can be
made live and
the other model locked and updated as well. Multiple instances of both live
and locked models
can be used to increase concurrency.
According to some embodiments, the system 2300 can receive encrypted feature
vectors instead of original biometrics and processing original biometrics can
occur on different
systems; in these cases system 2300 may not include, for example, 2308, 2310,
2312, and
instead receive feature vectors from other systems, components or processes.
Figs. 17-20 illustrate example embodiments of a classifier network. The
embodiments
show a fully connected neural network for classifying feature vectors for
training and for
prediction. Other embodiments implement different neural networks, including
for example,
neural networks that are not fully connected. Each of the networks accepts
distance and/or
Euclidean measurable feature vectors and returns a label or unknown result for
prediction or
binds the feature vectors to a label during training.
Figs. 21-24 illustrate examples of processing that can be performed on input
biometrics
(e.g., facial image) using a neural network. Encrypted feature vectors can be
extracted from
such neural networks and used by a classifier (e.g., Figs. 17-20) during
training or prediction
operations. According to various embodiments, the system implements a first
pre-trained
neural network for generating distance and/or Euclidean measurable feature
vectors that are
used as inputs for a second classification neural network. In other
embodiments, other neural
networks are used to process biometrics in the first instance. In still other
examples, multiple
neural networks can be used to generate Euclidean measurable feature vectors
from
unencrypted biometric inputs; each may feed the feature vectors to a respective
classifier. In
some examples, each generator neural network can be tailored to a respective
classifier neural
network, where each pair (or multiples of each) is configured to process a
biometric data type
(e.g., facial image, iris images, voice, health data, etc.).
User Interface Examples
According to some embodiments, the user interface screens can include visual
representations showing operation of helper network functions or operations to
support helper
network functions. For example, an eye blink status can be displayed in the
user interface
showing a lockout condition that prevents further operation until a threshold
number of eye
blinks are detected. In other examples, the user interface can display a
detected mask status, a
detected glasses status, among other options. Depending on system
configuration, the detected
status can prevent advancement or authentication until remedial action is
taken (remove mask, remove glasses, etc.). In other embodiments, the system can use detected
statuses to select
further authentication steps (e.g., tailor selection of embedding networks and
associated
classification networks, among other options).
Implementation Examples
The following example instantiations are provided to illustrate various
aspects of
privacy-enabled biometric systems and processes. The examples are provided to
illustrate
various implementation details and provide illustration of execution options
as well as
efficiency metrics. Any of the details discussed in the examples can be used
in conjunction
with various embodiments.
It is realized that conventional biometric solutions have security
vulnerability and
efficiency/scalability issues. Apple, Samsung, Google and MasterCard have each
launched
biometric security solutions that share at least three technical limitations.
These solutions are
(1) unable to search biometrics in polynomial time; (2) do not one-way encrypt
the reference
biometric; and (3) require significant computing resources for confidentiality
and matching.
Modern biometric security solutions are unable to scale (e.g., Apple Face ID™ authenticates only one user) because they are unable to search biometrics in polynomial time. In fact, the current "exhaustive search" technique requires significant computing resources to perform a linear scan of an entire biometric datastore to successfully one-to-one record match each reference biometric and each new input record; this is a result of inherent variations in the biometric instances of a single individual.
Similarly, conventional solutions are unable to one-way encrypt the reference
biometric
because exhaustive search (as described above) requires a decryption key and a
decryption to
plaintext in the application layer for every attempted match. This limitation
results in an
unacceptable risk in privacy (anyone can view a biometric) and authentication
(anyone can use
the stolen biometric). And, once compromised, a biometric -- unlike a password
-- cannot be
reset.
Finally, modern solutions require the biometric to return to plaintext in
order to match
since the encrypted form is not Euclidean measurable. It is possible to choose
to make a
biometric two-way encrypted and return to plaintext -- but this requires
extensive key
management and, since a two-way encrypted biometric is not Euclidean
measurable, it also
returns the solution to linear scan limitations.
Various embodiments of the privacy-enabled biometric system and/or methods
provide
enhancement over conventional implementation (e.g., in security, scalability,
and/or
management functions). Various embodiments enable scalability (e.g., via
"encrypted search")
and fully encrypt the reference biometric (e.g., "encrypted match"). The system is configured to provide an "identity" that is no longer tied independently to each application, and further enables a single, global "Identity Trust Store" that can service any identity request for any application.
Various operations are enabled by various embodiments, including, for example:
- Encrypted Match: using the techniques described herein, a deep neural
network
("DNN") is used to process a reference biometric to compute a one-way,
homomorphic
encryption of the biometric before transmitting or storing any data. This
allows for
computations and comparisons on cipher texts without decryption, and ensures
that
only the distance and/or Euclidean measurable, homomorphic encrypted biometric
is
available to execute subsequent matches in the encrypted space. The plaintext
data can
then be discarded and the resultant homomorphic encryption is then transmitted
and
stored in a datastore.
- Encrypted Search: using the techniques described herein, encrypted search is done in polynomial time according to various embodiments. This allows comparisons of biometrics, yielding values that indicate the "closeness" of two biometrics to one another in the encrypted space (e.g., a biometric to a reference biometric), while at the same time providing the highest level of privacy. A minimal sketch of an encrypted match under these assumptions appears below.
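For illustration only, a minimal sketch of an encrypted match under these assumptions, where embed is a pre-trained DNN that maps a plaintext biometric to a one-way, distance measurable feature vector (the function names and the threshold of 1.0 are hypothetical):

import numpy as np

def encrypted_match(embed, probe_biometric, reference_embedding, threshold=1.0):
    """Compare a probe against a stored one-way embedding without decryption."""
    probe_embedding = embed(probe_biometric)  # plaintext can be discarded here
    # Matching happens entirely in the encrypted, distance-measurable space.
    distance = np.linalg.norm(probe_embedding - reference_embedding)
    return distance <= threshold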
Various examples detail implementation of one-to-many identification using,
for
example, the N-1 layer of a deep neural network. The various techniques are
biometric
agnostic, allowing the same approach irrespective of the biometric or the
biometric type. Each
biometric (face, voice, IRIS, etc.) can be processed with a different, fully
trained, neural
network to create the biometric feature vector.
According to some aspects, an issue with current biometric schemes is that they require a mechanism for: (1) acquiring the biometric, (2) plaintext biometric match, (3) encrypting the biometric, (4) performing a Euclidean measurable match, and (5) searching using the second neural network prediction call. To execute steps 1 through 5 for every biometric is time consuming, error prone, and frequently nearly impossible to do before the biometric becomes deprecated. One goal with various embodiments is to develop schemes, techniques, and technologies that allow the system to work with biometrics in a privacy-protected and polynomial-time based way that is also biometric agnostic. Various embodiments employ machine learning to solve the issues with (2)-(5).
According to various embodiments, it is assumed there is little or no control over devices such as cameras or sensors that acquire the biometrics to be analyzed (thus arriving as plaintext). According to various embodiments, if that data is encrypted immediately and the system only processes the biometric information as cipher text, the system provides the maximum practical level of privacy. According to another aspect, a one-way encryption of the biometric, meaning that given cipher text, there is no mechanism to get to the original plaintext, reduces/eliminates the complexity of key management of various conventional approaches. Many one-way encryption algorithms exist, such as MD5 and SHA-512; however, these algorithms are not homomorphic because they are not Euclidean measurable. Various embodiments discussed herein enable a general purpose solution that produces biometric cipher text that is Euclidean measurable using a neural network. Applying a classifying algorithm to the resulting feature vectors enables one-to-many identification. In various examples, this maximizes privacy and runs between O(1) and O(log n) time.
As discussed above, some capture devices can encrypt the biometric via a one-way encryption and provide feature vectors directly to the system. This enables some embodiments to forgo biometric processing components, training generation components, and feature vector generation components, or alternatively to not use these elements for already encrypted feature vectors.
Example Execution and Accuracy
In some executions, the system is evaluated on different numbers of images per person to establish ranges of operating parameters and thresholds. For example, in the experimental execution the num-epochs parameter establishes the number of iterations, which can be varied on the system (e.g., between embodiments, between examples, and between executions, among other options). The LFW dataset is taken from the known labeled faces in the wild data set. "Eleven people" is a custom set of images, and faces94 is taken from the known faces94 source. For our examples, the epochs are the number of new images that are morphed from the original images. So if the epochs are 25, and we have 10 enrollment images, then we train with 250 images. The morphing of the images changes the lighting, angles, and the like to increase the accuracy in training.
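As an illustrative sketch of the morphing step described above (brightness scaling stands in for the lighting and angle changes; the ranges and function name are assumptions):

import numpy as np

def morph_enrollment_image(image, num_epochs=25, rng=None):
    """Produce num_epochs morphed training variants of one enrollment image."""
    rng = rng or np.random.default_rng(0)
    variants = []
    for _ in range(num_epochs):
        brightness = rng.uniform(0.7, 1.3)             # vary lighting
        morphed = np.clip(image.astype(float) * brightness, 0, 255)
        variants.append(morphed.astype(image.dtype))
    return variants

# 10 enrollment images x 25 epochs -> 250 training images, as in the example.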
TABLE I
(fully connected neural network model with 2 hidden layers + output sigmoid layer):
Input => [100, 50] => num people (train for 100 people given 50 individuals to identify). Other embodiments improve over these accuracies for the UNKNOWN.

Dataset             Training Set  Test Set  UNKNOWN PERSON Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                   Accuracy in Test Set  Accuracy in UNKNOWN PERSON Set
LFW dataset         70%           30%       11 people           1304                 257                            min_images_per_person = 10, num-epochs = 25  98.90%                86.40%
LFW dataset         70%           30%       11 people           2226                 257                            min_images_per_person = 3, num-epochs = 25   93.90%                87.20%
11 people from LFW  70%           30%       Copy 2 people       77                   4                              min_images_per_person = 2, num-epochs = 25   100.00%               50.00%
faces94 data set    70%           30%       11 people           916                  257                            min_images_per_person = 2, num-epochs = 25   98.00%                79.40%
TABLE II
(0 hidden layers & output linear with decision f(x); decision at 0.5 value)
Improves accuracy for the UNKNOWN case, but other implementations achieve higher accuracy.

Dataset             Training Set  Test Set  UNKNOWN PERSON Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                                 Accuracy in Test Set  Accuracy in UNKNOWN PERSON Set
LFW dataset         70%           30%       11 people           1304                 257                            min_images_per_person = 10, num-epochs = 25                88.80%                91.10%
LFW dataset         70%           30%       11 people           2225                 257                            min_images_per_person = 3, num-epochs = 25                 96.60%                97.70%
11 people from LFW  70%           30%       Copy 2 people       77                   4                              min_images_per_person = 2, num-epochs = 25                 98.70%                50.00%
faces94 dataset     70%           30%       11 people           915                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 0.5  99.10%                82.10%
faces94 dataset     70%           30%       11 people           918                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 1.0  98.30%                95.70%
TABLE III (FCNN with 1 hidden layer (500 nodes) + output linear with decision)

Dataset             Training Set  Test Set  UNKNOWN PERSON Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                                 Accuracy in Test Set  Accuracy in UNKNOWN PERSON Set
LFW dataset         70%           30%       11 people           1304                 257                            min_images_per_person = 10, num-epochs = 25                99.30%                92.20%
LFW dataset         70%           30%       11 people           2226                 257                            min_images_per_person = 3, num-epochs = 25                 97.50%                97.70%
11 people from LFW  70%           30%       Copy 2 people       77                   4                              min_images_per_person = 2, num-epochs = 25                 --                    --
faces94 dataset     70%           30%       11 people           918                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 0.5  99.20%                92.60%
faces94 dataset     70%           30%       11 people           918                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 1.0  --                    --
TABLE IV (FCNN 2 hidden layers (500, 2*num people) + output linear, decisions f(x))

Dataset             Training Set  Test Set  UNKNOWN PERSON Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                               Accuracy in Test Set  Accuracy in UNKNOWN PERSON Set
LFW data set        70%           30%       11 people           1304                 257                            min_images_per_person = 10, num-epochs = 25              98.30%                97.70%
LFW data set        70%           30%       11 people           2226                 257                            min_images_per_person = 3, num-epochs = 25, cut-off = 0  98.50%                98.10%
11 people from LFW  70%           30%       Copy 2 people       77                   4                              min_images_per_person = 2, num-epochs = 25               --                    --
faces94 data set    70%           30%       11 people           918                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 0  98.60%                93.80%
In various embodiments, the neural network model is generated initially to accommodate incremental additions of new individuals to identify (e.g., 2*num people is an example of a model initially trained for 100 people given an initial 50 individuals of biometric information). The multiple, or training headroom provided, can be tailored to the specific implementation. For example, where additions to the identifiable users are anticipated to be small, additional incremental training options can include any number within ranges of 1% to 200%. In other embodiments, larger percentages can be implemented as well.
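A minimal sketch of a classifier sized with such headroom, assuming a Keras-style stack following the [100, 50] example above (the embedding dimension, loss, and function name are assumptions):

import tensorflow as tf

def build_classifier(embedding_dim=128, initial_people=50, headroom=2.0):
    """FCNN with initial_people * headroom output slots so new users can be
    added by incremental training without changing the network structure."""
    num_slots = int(initial_people * headroom)  # e.g., 100 slots for 50 people
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(100, activation='relu', input_shape=(embedding_dim,)),
        tf.keras.layers.Dense(50, activation='relu'),
        tf.keras.layers.Dense(num_slots, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    return model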
TABLE V (FCNN: 2 hidden layers (500, 2*num people) + output linear, decisions f(x), and voting, where the model is trained on 2x the number of class identifiers for incremental training)

Dataset             Training Set  Test Set  UNKNOWN PERSON Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                               Accuracy in Test Set (vote)  Accuracy in UNKNOWN PERSON Set (vote)  Accuracy in UNKNOWN PERSON Set = faces94 11 people (vote)
LFW dataset         70%           30%       11 people           1304                 257                            min_images_per_person = 10, num-epochs = 25              98.20% (100.00%)             98.80% (100.00%)                       88.40% (90.80%)
LFW dataset         70%           30%       11 people           2226                 257                            min_images_per_person = 3, num-epochs = 25, cut-off = 0  98.10% (98.60%)              98.40% (100.00%)                       93.60% (95.40%)
11 people from LFW  70%           30%       Copy 2 people       77                   4                              min_images_per_person = 2, num-epochs = 25               --                           --                                     --
11 people dataset   70%           30%       faces94             918                  257                            min_images_per_person = 2, num-epochs = 25, cut-off = 0  --                           --                                     --
According to one embodiment, the system can be implemented as a REST compliant API that can be integrated and/or called by various programs, applications, systems, system components, etc., and can be requested locally or remotely.
In one example, the privacy-enabled biometric API includes the following
specifications:
- Preparing data: this function takes the images & labels and saves them into the local directory.

def add_training_data(list_of_images, list_of_labels):
    """
    @params list_of_images: the list of images
    @params list_of_labels: the list of corresponding labels
    """

- Training model: each label (person/individual) can include at least 2 images. In some examples, if the person does not have the minimum, that person will be ignored.

def train():
    """Train the classification model on the saved images and labels."""

- Prediction:

def predict(list_of_images):
    """
    @params list_of_images: the list of images of the same person
    @return label: a person name or "UNKNOWN PERSON"
    """
Further embodiments can be configured to handle new people (e.g., labels or classes in the model) in multiple ways. In one example, the current model can be retrained every time a certain number (e.g., a threshold number) of new people are introduced. In this example, the benefit is improved accuracy: the system can guarantee a level of accuracy even with new people. There exists a trade-off in that full retraining is a slow, time-consuming, and computation-heavy process. This can be mitigated with live and offline copies of the model so the retraining occurs offline and the newly retrained model is swapped for the live version. In one example, training time exceeded 20 minutes, and with more data the training time increases.
According to another example, the model is initialized with slots for new people. The expanded model is configured to support incremental training (e.g., the network structure is not changed when adding new people). In this example, the time to add new people is significantly reduced (even over other embodiments of the privacy-enabled biometric system). It is realized that there may be some reduction in accuracy with incremental training, and as more and more people are added the model can trend towards overfitting on the new people, i.e., become less accurate with old people. However, various implementations have been tested to operate at the same accuracy even under incremental retraining.
Yet another embodiment implements both incremental retraining and full retraining at a threshold level (e.g., build the initial model with a multiple of the people as needed, e.g., 2 times: 100 labels for an initial 50 people, 50 labels for an initial 25 people, etc.). Once the number of people reaches the upper bound (or approaches the upper bound) the system can be configured to execute a full retrain on the model, while building in the additional slots for new users. In one example, given 100 labels in the model with 50 initial people (50 unallocated), when 50 new people have been added the system will execute a full retrain for 150 labels and the now 100 actual people. This provides for 50 additional users and incremental retraining before a full retrain is executed.
Stated generally, the system in various embodiments is configured to retrain the whole network from the beginning for every N people. For example, with training data for 100 people: step 1, train the network with N = 1000 slots, assigning 100 people and reserving 900 for incremental training; train incrementally with new people until 1000 people is reached; at 1000 people, execute a full retrain. Full retrain: train the network with 2N = 2000 slots, now having 1000 slots reserved for incremental training; train incrementally with new people until 2000 people is reached; and repeat the full retrain with open allocations when the limit is reached. A minimal sketch of this schedule follows.
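The sketch below mirrors the numbers in the example above; the actual incremental and full retraining calls are elided, and the helper name is hypothetical:

def needs_full_retrain(num_people, allocated_slots):
    """Full retrain triggers when the enrolled population reaches the slots."""
    return num_people >= allocated_slots

allocated, enrolled = 1000, 100   # N = 1000 slots, 100 people assigned
while enrolled < 2000:
    enrolled += 1                 # a new person enrolls; train incrementally
    if needs_full_retrain(enrolled, allocated):
        allocated *= 2            # full retrain with 2N slots (e.g., 2000)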
An example implementation of the API includes the following code:
drop database if exists trueid;
create database trueid;
grant all on trueid.* to trueid@'localhost' identified by 'trueid';

drop table if exists feature;
drop table if exists image;
drop table if exists PII;
drop table if exists subject;

CREATE TABLE subject
(
    id INT PRIMARY KEY AUTO_INCREMENT,
    when_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE PII
(
    id INT PRIMARY KEY AUTO_INCREMENT,
    subject_id INT,
    tag VARCHAR(254),
    value VARCHAR(254)
);

CREATE TABLE image
(
    id INT PRIMARY KEY AUTO_INCREMENT,
    subject_id INT,
    image_name VARCHAR(254),
    is_train BOOLEAN,
    when_created TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE feature
(
    id INT PRIMARY KEY AUTO_INCREMENT,
    image_id INT NOT NULL,
    feature_order INT NOT NULL,
    feature_value DECIMAL(32,24) NOT NULL
);

ALTER TABLE image ADD CONSTRAINT fk_subject_id FOREIGN KEY (subject_id) REFERENCES subject(id);
ALTER TABLE PII ADD CONSTRAINT fk_subject_id_pii FOREIGN KEY (subject_id) REFERENCES subject(id);
ALTER TABLE feature ADD CONSTRAINT fk_image_id FOREIGN KEY (image_id) REFERENCES image(id);

CREATE INDEX piisubjectid ON PII(subject_id);
CREATE INDEX imagesubjectid ON image(subject_id);
CREATE INDEX imagesubjectidimage ON image(subject_id, image_name);
CREATE INDEX featureimageid ON feature(image_id);
API Execution Example:
- Push the known LFW feature embeddings to the biometric feature database.
- Simulate the incremental training process:
  num_seed = 50    # build the model network; the first num_seed people are trained fully
  num_window = 50  # for every num_window people: build the model network, and people are trained fully
  num_step = 1     # train incrementally for every num_step new people
  num_eval = 10    # evaluate the model every num_eval people
- Build the model network with #class = 100. Train from the beginning (#epochs = 100) with the first 50 people. The remaining 50 classes are reserved for incremental training.
  i) Incremental training for the 51st person. Train the previous model with all 51 people (#epochs = 20).
  ii) Incremental training for the 52nd person. Train the previous model with all 52 people (#epochs = 20).
  iii) Continue ...
- (Self or automatic monitoring can be executed by various embodiments to ensure accuracy over time; alert flags can be produced if deviation or excessive inaccuracy is detected. Alternatively or in conjunction, full retraining can be executed responsive to excess inaccuracy and the fully retrained model evaluated to determine if accuracy issues are resolved; if so, the full retrain threshold can be automatically adjusted.) Evaluate the accuracy of the previous model (e.g., at every 10 steps), optionally recording the training time for every step.
- Achieve incremental training for maximum allocation (e.g., the 100th person). Full train of the previous model with all 100 people (e.g., #epochs = 20).
- Build the model network with #class = 150. Train from the beginning (e.g., #epochs = 100) with the first 100 people. The remaining 50 classes are reserved for incremental training.
  i) Incremental training for the 101st person. Train the previous model with all 101 people (#epochs = 20).
  ii) Continue ...
- Build the model network with #class = 200. Train from the beginning (e.g., #epochs = 100) with the first 150 people. The remaining 50 classes are reserved for incremental training.
  i) Incremental training for the 151st person. Train the previous model with all 151 people (#epochs = 20).
  ii) Continue ...
Refactor problem:
According to various embodiments, it is realized that incremental training can trigger concurrency problems, e.g., a multi-thread problem with the same model; thus the system can be configured to avoid retraining incrementally at the same time for two different people (data can be lost if retraining occurs concurrently). In one example, the system implements a lock or a semaphore to resolve this. In another example, multiple models can be running simultaneously, and reconciliation can be executed between the models in stages. In further examples, the system can include monitoring models to ensure only one retrain is executed on multiple live models, and in yet others use locks on the models to ensure singular updates via incremental retrain. Reconciliation can be executed after an update between models. In further examples, the system can cache feature vectors for subsequent access in the reconciliation. A minimal locking sketch follows.
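A minimal sketch of the lock-based option, assuming a model exposing a Keras/scikit-learn style fit method (names are hypothetical):

import threading

retrain_lock = threading.Lock()   # one incremental retrain at a time

def incremental_retrain(model, vectors, labels):
    """Serialize incremental retrains so two new people are never trained
    into the same model concurrently (which could lose data)."""
    with retrain_lock:
        model.fit(vectors, labels)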
According to some embodiments, the system design resolves a data pipeline problem: in some examples, the data pipeline supports running only one time due to queue and thread characteristics. Other embodiments avoid this issue by extracting the embeddings. In examples that do not include that functionality, the system can still run multiple times without issue based on saving the embeddings to file and loading the embeddings from file. This approach can be used, for example, where the extracted embedding is unavailable via other approaches. Various embodiments can employ different options for operating with embeddings: when providing a value to a TensorFlow graph there are several options, including a feed dict (a speed trade-off for easier access) and a queue (faster via multi-threads, but can only run one time, as the queue is ended after it is looped). A file-based sketch follows.
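A minimal sketch of the save/load approach (the file layout and function names are assumptions):

import numpy as np

def save_embeddings(path, embeddings, labels):
    # Persist extracted embeddings so training can be re-run multiple times
    # without re-driving the one-shot queue-based pipeline.
    np.savez(path, embeddings=embeddings, labels=labels)

def load_embeddings(path):
    data = np.load(path)
    return data["embeddings"], data["labels"]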
Table VI (Fig. 30) & TABLE VII (Fig. 31) show execution timing during
operation and
accuracy percentages for the respective example.
TABLE VIII shows summary information for additional executions.

Dataset      Training Set  Test Set  UNKNOWN PERSON Set  #people in Training Set  #images in Test Set  #images in UNKNOWN PERSON Set  Parameters                                                Accuracy in Test Set (vote)
LFW dataset  70%           30%       11 people           158                      1304                 257                            min_images_per_person = 10, num-epochs = 25, cut-off = 0  98.70% (100.00%)
LFW dataset  70%           30%       11 people           901                      2226                 257                            min_images_per_person = 3, num-epochs = 25, cut-off = 0   93.80% (95.42%)
According to one embodiment, the system can be described broadly to include any one or more or any combination of the following elements and associated functions:
- Preprocessing: where the system takes in an unprocessed biometric, which can include cropping and aligning, and either continues processing or returns that the biometric cannot be processed.
- Neural network 1: pre-trained. Takes in unencrypted biometrics. Returns biometric feature vectors that are one-way encrypted and distance and/or Euclidean measurable. Regardless of the biometric type being processed, NN1 generates Euclidean measurable encrypted feature vectors. In various embodiments, the system can instantiate multiple NN1(s) for individual credentials, where each NN or group of NNs is tailored to a different authentication credential.
- Distance evaluation of NN1 output for a phase of authentication and/or to filter output of NN1: as discussed above, a first phase of authentication can use encrypted feature vectors to determine a distance and authenticate or not based on being within a threshold distance. Similarly, during enrollment the generated feature vectors can be evaluated to ensure they are within a threshold distance, and otherwise require new biometric samples.
- Neural network 2: not pre-trained. It is a deep learning neural network that does classification. It includes incremental training, takes a set of (label, feature vector) pairs as input, and returns nothing during training; the trained network is used for matching or prediction on newly input biometric information. It does prediction, which takes a feature vector as input and returns an array of values. These values, based on their position and the value itself, determine the label or unknown.
- Voting functions can be executed with neural network 2, e.g., during prediction.
- The system may have more than one neural network 1 for different biometrics. Each would generate Euclidean measurable encrypted feature vectors based on unencrypted input.
- The system may have multiple neural network 2(s), one for each biometric type.
According to further aspects, the system achieves significant improvements in accuracy of identification based at least in part on bounded enrollment of encrypted feature vectors over conventional approaches. For example, at any point when encrypted feature vectors are created for enrollment (e.g., captured by a device and processed by a generation network, or built from captures to expand the enrollment pool and processed by a generation network), those encrypted feature vectors are analyzed to determine that they are similar enough to each other to use for a valid enrollment. In some embodiments, the system evaluates the produced encryptions and tests whether any encrypted feature vectors have a Euclidean distance of greater than 1 from each other (e.g., other thresholds can be used). If so, those values are discarded. If a minimum number of values is not met, the entire enrollment can be deemed a failure, and new inputs requested, processed, and validated prior to training a respective classification network. Stated broadly, the bounded enrollment thresholds can be established based, at least in part, on what threshold is being used to determine a measurement (e.g., two encrypted feature vectors) is the same as another. Constraining training inputs to the classification network so that all the inputs are within a boundary close to the identification threshold ensures that the resulting classification network is stable and accurate. In some examples, even singular outliers can destabilize an entire network and significantly reduce accuracy.
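For illustration, a minimal sketch of such a bounded enrollment filter (the threshold of 1.0 follows the example above; min_vectors and the function name are assumptions):

import numpy as np
from itertools import combinations

def validate_enrollment(vectors, max_distance=1.0, min_vectors=3):
    """Discard enrollment vectors whose pairwise Euclidean distance exceeds
    max_distance; fail the enrollment if too few vectors survive."""
    keep = set(range(len(vectors)))
    for i, j in combinations(range(len(vectors)), 2):
        if np.linalg.norm(vectors[i] - vectors[j]) > max_distance:
            keep.discard(i)   # outliers are discarded, not averaged in
            keep.discard(j)
    kept = [vectors[i] for i in sorted(keep)]
    if len(kept) < min_vectors:
        raise ValueError("enrollment failed: request new biometric inputs")
    return kept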
Fig. 25 is a block diagram of an example privacy-enabled biometric system 2504
with
liveness validation. According to some embodiments, the system can be
installed on a mobile
device or called from a mobile device (e.g., on a remote server or cloud based
resource) to
return an authenticated or not signal. In further embodiments, the system can
include a web
based client or application that provides fully private authentication
services. In various
embodiments, system 2504 can execute any of the following processes. For
example, system
2504 can enroll users (e.g., via process 2100), identify enrolled users (e.g.,
process 2200) and/or
include multiple enrollment phases (e.g., distance metric evaluation and fully
encrypted
input/evaluation), and search for matches to users (e.g., process 2250). In
various
embodiments, system 2504 includes multiple pairs of neural networks, and any
associated
number of helper networks to provide improved data sets used in later identification/authentication, including, for example, with the paired neural
networks. In some
embodiments, each pair includes a processing/generating neural network for
accepting an
unencrypted authentication credential (e.g., biometric input (e.g., images or
voice, etc.),
behavioral input (e.g., health data, gesture tracking, eye movement, etc.) and
processing to
generate an encrypted embedding or encrypted feature vector. Each pair of
networks can also
include a classification neural network that can be trained on the generated encrypted feature
encrypted feature
vectors to classify the encrypted information with labels, and that is further
used to predict a
match to the trained labels or an unknown class based on subsequent input of
encrypted feature
vectors to the trained network. According to some embodiments, the predicted
match(es) can
be validated by comparing the input to the classification network (e.g.,
encrypted
embedding/feature vector) against encrypted embedding/feature vectors of the
identified
match(es). Various distance metrics can be used to compare the encrypted
embeddings,
including least squares analysis, L2 analysis, distance matrix analysis, sum-of-squared-errors, cosine measure, etc.
In various embodiments, authentication capture and/or validation can be
augmented by
a plurality of helper networks configured to improve identification of
information to capture
from provided authentication information, improve validation, improve
authentication entropy,
among other options. The authentication architecture can be separated in various embodiments. For example, the system can be configured with a trained classification neural
network and receive from another processing component, system, or entity,
encrypted feature
vectors to use for prediction with the trained classification network.
In a further example, the system configured to generate encrypted authentication information can be coupled with various helper networks configured to facilitate capture and processing of the unencrypted authentication information into filtered data that can be used in
generating one-way encryptions. According to various embodiments, system 2504
can accept,
create or receive original biometric information (e.g., input 2502). The input
2502 can include
images of people, images of faces, thumbprint scans, voice recordings, sensor
data, etc.
Further, the voice inputs can be requested by the system, and correspond to a
set of randomly
selected biometric instances (including for example, randomly selected words)
as part of
liveness validation. According to various embodiments, the inputs can be
processed for
identity matching and in conjunction the inputs can be analyzed to determine
matching to the
randomly selected biometric instances for liveness verification. As discussed
above, the
system 2504 can also be architected to provide a prediction on input of an
encrypted feature
vector, and another system or component can accept unencrypted biometrics
and/or generate
encrypted feature vectors, and communicate the same for processing.
According to one embodiment, the system can include a biometric processing
component 2508. A biometric processing component (e.g., 2508) can be
configured to crop
received images, sample voice biometrics, eliminate noise from microphone
captures, etc., to
focus the biometric information on distinguishable features (e.g.,
automatically crop image
around face, eliminate background noise for voice sample, normalized health
data received,
generate samples of received health data, etc.). Various forms of pre-
processing can be
executed on the received biometrics, and the pre-processing can be executed to
limit the
biometric information to important features or to improve identification by
eliminating noise,
reducing an analyzed area, etc. In some embodiments, the pre-processing (e.g.,
via 2508) is
not executed or not available. In other embodiments, only biometrics that meet
quality
standards are passed on for further processing.
According to further embodiments, the system can also include a plurality of
neural
networks that facilitate processing of plaintext authentication information
and the
transformation of the same into fully private or one-way encrypted
authentication information.
Processed biometrics can also be used to generate additional training data,
for example,
to enroll a new user, and/or train a classification component/network to
perform predictions.
According to one embodiment, the system 2504 can include a training generation
component
2510, configured to generate new biometrics for use in training to identify a
user. For example,
the training generation component 2510 can be configured to create new images
of the user's
face or voice having different lighting, different capture angles, etc.,
different samples, filtered
noise, introduced noise, etc., in order to build a larger training set of
biometrics. In one
example, the system includes a training threshold specifying how many training
samples to
generate from a given or received biometric. In another example, the system
and/or training
generation component 2510 is configured to build twenty five additional images
from a picture
of a user's face. Other numbers of training images, or voice samples, etc.,
can be used. In
further examples, additional voice samples can be generated from an initial
set of biometric
inputs to create a larger set of training samples for training a voice network
(e.g., via 2510).
In some other embodiments, the training generation component can include a
plurality
of helper networks configured to homogenize input
identification/authentication information
based on a credential modality (e.g., face biometric data, voice biometric
data, behavioral data,
etc.).
According to one embodiment, the system is configured to generate encrypted
feature
vectors from an identification/authentication information input (e.g., process
images from input
and/or generated training images, process voice inputs and/or voice samples
and/or generated
training voice data, among other options). In various embodiments, the system
2504 can
include an embedding component 2512 configured to generate encrypted
embeddings or
encrypted feature vectors (e.g., image feature vectors, voice feature vectors,
health data feature
vectors, etc.). The term authentication information input can be used to refer to information used for identification, for identification and authentication, and for authentication, and each implementation is contemplated, unless other context requires.
According to one embodiment, component 2512 executes a convolutional neural
network
("CNN") to process image inputs (and for example, facial images), where the
CNN includes a
layer which generates geometrically (e.g., distance, Euclidean, cosine, etc.)
measurable output.
The embedding component 2512 can include multiple neural networks each
tailored to specific
biometric inputs, and configured to generate encrypted feature vectors (e.g.,
for captured
images, for voice inputs, for health measurements or monitoring, etc.) that
are distance
measurable. According to various embodiments, the system can be configured to require biometric inputs of various types, and pass each type of input to respective neural networks for processing to capture respective encrypted feature vectors, among other options. In various embodiments, one or more processing neural networks are instantiated as part of the embedding component 2512, and the respective neural networks process unencrypted biometric inputs to generate encrypted feature vectors.
In one example, the processing neural network is a convolutional neural
network
constructed to create encrypted embeddings from unencrypted biometric input.
In one
example, encrypted feature vectors can be extracted from a neural network at
the layers
preceding a softmax layer (including for example, the n-1 layer). As discussed
herein, various
neural networks can be used to define embeddings or feature vectors with each
tailored to an
analyzed biometric (e.g., voice, image, health data, etc.), where an output of
or with the model
is Euclidean measurable. Some examples of these neural network include a model
having a
softmax layer. Other embodiments use a model that does not include a softmax
layer to
generate Euclidean measurable feature vectors. Various embodiments of the
system and/or
embedding component are configured to generate and capture encrypted feature
vectors for the
processed biometrics in the layer or layers preceding the softmax layer.
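For illustration, a minimal sketch of capturing the n-1 layer output as the embedding, assuming a trained tf.keras classifier (the layer indexing is an assumption about the model's structure):

import tensorflow as tf

def embedding_model(trained_classifier):
    """Truncate a trained model at the layer preceding softmax so the n-1
    layer's output serves as the distance measurable embedding."""
    return tf.keras.Model(inputs=trained_classifier.input,
                          outputs=trained_classifier.layers[-2].output)

# embeddings = embedding_model(cnn).predict(preprocessed_biometrics)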
Optional processing of the generated encrypted biometrics can include filter
operations
prior to passing the encrypted biometrics to classifier neural networks (e.g.,
a DNN). For
example, the generated encrypted feature vectors can be evaluated for distance
to determine
that they meet a validation threshold. In various embodiments, the validation
threshold is used
by the system to filter noisy or encrypted values that are too far apart.
According to one aspect, filtering of the encrypted feature vectors improves
the
subsequent training and prediction accuracy of the classification networks. In
essence, if a set
of encrypted embeddings for a user are too far apart (e.g., distances between
the encrypted
values are above the validation threshold) the system can reject the
enrollment attempt, request
new biometric measurements, generate additional training biometrics, etc.
Each set of encrypted values can be evaluated against the validation threshold
and
values with too great a distance can be rejected and/or trigger requests for
additional/new
biometric submission. In one example, the validation threshold is set so that
no distance
between comparisons (e.g., of face image vectors) is greater than 0.85. In
another example,
the threshold can be set such that no distance between comparisons is greater
than 1.0. Stated
broadly, various embodiments of the system are configured to ensure that a set
of enrollment
vectors are of sufficient quality for use with the classification DNN, and in
further
embodiments configured to reject enrollment vectors that are bad (e.g., too
dissimilar).
According to some embodiments, the system can be configured to handle noisy enrollment conditions. For example, validation thresholds can be tailored to accept distance measures having an average distance greater than 0.85 but less than 1.0, where the minimum distance between compared vectors in an enrollment set is less than 0.06. Different thresholds can be implemented in different embodiments, and can vary within 10%, 15%, and/or 20% of
the examples provided. In further embodiments, each authentication credential
instance (e.g.,
face, voice, retina scan, behavioral measurement, etc.) can be associated with
a respective
validation threshold. Additionally, the system can use identification
thresholds that are more
constrained than the validation threshold. For example, in the context of
facial identification,
the system can require a validation threshold of no greater than a Euclidean
distance of 1
between enrollment face images of an entity to be identified. In one example,
the system can
be configured to require better precision in actual identification, and for
example, that the
subsequent authentication/identification measure be within 0.85 Euclidean
distance to return a
match.
According to some embodiments, the system 2504 can include a classifier
component
2514. The classifier component can include one or more deep neural networks
trained on
encrypted feature vector and label inputs for respective users and their
biometric inputs. The
trained neural network can then be used during prediction operations to return
a match to a
person (e.g., from among a group of labels and people (one to many matching)
or from a
singular person (one to one matching)) or to return a match to an unknown
class.
During training of the classifier component 2514, the feature vectors from the embedding component 2512 or system 2504 are used by the classifier component 2514 to bind a user to a classification (i.e., mapping biometrics to a matchable/searchable identity).
According to one embodiment, a deep learning neural network (e.g., enrollment
and prediction
network) is executed as a fully connected neural network ("FCNN") trained on
enrollment data.
In one example, the FCNN generates an output identifying a person or
indicating an
UNKNOWN individual (e.g., at 2506). Other examples can implement different
neural
networks for classification and return a match or unknown class accordingly.
In some
examples, the classifier is a neural network but does not require a fully
connected neural
network.
According to various embodiments, a deep learning neural network (e.g., which
can be
an FCNN) must differentiate between known persons and the UNKNOWN. In some
examples,
the deep learning neural network can include a sigmoid function in the last
layer that outputs
probability of class matching based on newly input biometrics or that outputs
values showing
failure to match. Other examples achieve matching based on executing a hinge
loss function
to establish a match to a label/person or an unknown class.
In further embodiments, the system 2504 and/or classifier component 2514 are
configured to generate a probability to establish when a sufficiently close
match is found. In
some implementations, an unknown person is determined based on negative return
values (e.g.,
the model is tuned to return negative values for no match found). In other
embodiments,
multiple matches can be developed by the classifier component 2514 and voting
can also be
used to increase accuracy in matching.
Various implementations of the system (e.g., 2504) have the capacity to use
this
approach for more than one set of input. In various embodiments, the approach
itself is
biometric agnostic. Various embodiments employ encrypted feature vectors that
are distance
measurable (e.g., Euclidean, homomorphic, one-way encrypted, etc.), generation
of which is
handled using the first neural network or a respective first network tailored
to a particular
biometric.
In some embodiments, the system can invoke multiple threads or processes to
handle
volumes of distance comparisons. For example, the system can invoke multiple
threads to
accommodate an increase in user base and/or volume of authentication requests.
According to
various aspects, the distance measure authentication is executed in a brute
force manner. In
such settings, as the user population grows so does the complexity or work
required to resolve
the analysis in a brute force (e.g., check all possibilities (e.g., until
match)) fashion. Various
embodiments are configured to handle this burden by invoking multiple threads,
and each
thread can be used to check a smaller segment of authentication information to
determine a
match.
In some examples, different neural networks are instantiated to process different types of biometrics. Using that approach, the vector generating neural network may be swapped for, or used in conjunction with, a different neural network, where each is capable of creating a distance measurable encrypted feature vector based on the respective biometric. Similarly, the system may enroll on two or more biometric types (e.g., use two or more vector generating networks) and predict on the feature vectors generated for both types of biometrics using both neural networks for processing respective biometric types, which can also be done simultaneously. In one embodiment, feature vectors from each type
of biometric
can likewise be processed in respective deep learning networks configured to
predict matches
based on the feature vector inputs (or return unknown). The co-generated
results (e.g., one
from each biometric type) may be used to identify a user using a voting scheme
and may perform better by executing multiple predictions simultaneously. For each biometric
type used, the
system can execute multi-phase authentication approaches with a first
generation network and
distance measures in a first phase, and a network trained on encrypted feature
vectors in a
second phase. At various times each of the phases may be in use; for example,
an enrolled
user can be authenticated with the trained network (e.g., second phase), while
a newly enrolling
user is enrolled and/or authenticated via the generation network and distance
measure phase.
In some embodiments, the system can be configured to validate an unknown
determination. It is realized that accurately determining that an input to the
authentication
system is an unknown is an unsolved problem in this space. Various embodiments
leverage
the deep learning construction (including, for example, the classification
network) described
herein to enable identification/return of an unknown result. In some
embodiments, the DNN
can return a probability of match that is below a threshold probability. If
the result is below
the threshold, the system is configured to return an unknown result. Further
embodiments
leverage the distance store to improve the accuracy of the determination of
the unknown result.
In one example, upon a below threshold determination output from the DNN, the
system can
validate the below threshold determination by performing distance
comparison(s) on the
authentication vectors and the vectors in the distance store for the most
likely match (e.g.,
greatest probability of match under the threshold).
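A minimal sketch of this two-step unknown validation (both thresholds and the data structures are illustrative assumptions):

import numpy as np

def predict_with_unknown(classifier, distance_store, vector,
                         prob_threshold=0.5, dist_threshold=0.85):
    """Return a label, or UNKNOWN when the below-threshold DNN match also
    fails distance validation against stored enrollment vectors."""
    probs = classifier.predict(vector[np.newaxis, :])[0]
    best = int(np.argmax(probs))
    if probs[best] >= prob_threshold:
        return best
    # Validate the most likely under-threshold match via the distance store.
    distances = [np.linalg.norm(vector - v) for v in distance_store[best]]
    return best if min(distances) <= dist_threshold else "UNKNOWN"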
According to another aspect, generating accurate (e.g., greater than 90%
accuracy in
example executions described below) identification is only a part of a
complete authentication
system. In various embodiments, identification is coupled with liveness
testing to ensure that
authentication credential inputs are not, for example, being recorded and
replayed for
verification or faked in another manner. For example, the system 2504 can
include a liveness
component 2518. According to one embodiment, the liveness component can be
configured to
generate a random set of biometric instances that the system requests a user
submit. The
random set of biometric instances can serve multiple purposes. For example,
the biometric
instances provide a biometric input that can be used for identification, and
can also be used for
liveness (e.g., validate matching to random selected instances). If both tests
are valid, the
system can provide an authentication indication or provide access or execution
of a requested
function. Further embodiments can require multiple types of biometric input
for identification,
and couple identification with liveness validation. In yet other embodiments,
liveness testing
can span multiple biometric inputs as well.
According to one embodiment, the liveness component 2518 is configured to
generate
a random set of words that provide a threshold period of voice data from a
user requesting
authentication. In one example, the system is configured to require a five
second voice signal
for processing, and the system can be configured to select the random
biometric instances
accordingly. Other thresholds can be used (e.g., one, two, three, four, six,
seven, eight, nine
seconds or fractions thereof, among other examples), each having respective
random selections
that are associated with a threshold period of input.
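A minimal sketch of sizing the random selection to a time threshold (the word list, speaking rate, and function name are assumptions):

import random

WORDS = ["apple", "river", "candle", "orbit", "meadow", "copper"]  # example list
SECONDS_PER_WORD = 0.5   # assumed average speaking rate

def random_phrase(target_seconds=5.0):
    """Pick enough random words to yield roughly target_seconds of speech."""
    count = max(1, int(target_seconds / SECONDS_PER_WORD))
    return " ".join(random.choices(WORDS, k=count))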
According to other embodiments, liveness validation can be the accumulation of many authentication dimensions (e.g., biometric and/or behavioral dimensions). For example, the system can be configured to test a set of authentication credentials to determine liveness. In another example, the system can build a confidence score reflecting a level of assurance that certain inputs are "live" or not faked. According to various embodiments, instead of using just one measure (e.g., voice) to test liveness, the system is configured to manage an ensemble model of many dimensions. As an example, the system can be configured to require the user to read a sentence from the screen (to prove he/she is alive), but by using user behavior analytics ("UBA") the system can validate on any number of additional metrics (additional dimensions) to determine a liveness score. In further embodiments, each factor being analyzed also contributes to the user's identity score.
Various embodiments of the system are configured to handle multiple different
behavioral inputs including, for example, health profiles that are based at
least in part on health
readings from health sensors (e.g., heart rate, blood pressure, EEG signals,
body mass scans,
genome, etc.), and can, in some examples, include behavioral biometric
capture/processing.
Once processed through a generation network as discussed herein, such UBA data
becomes
private such that no user actions or behaviors are ever transmitted across the
internet in plain
form.
According to various aspects, the system is configured to manage liveness determinations based on an ensemble of models. In some embodiments, the system uses a behavioral biometric model to get an identity. In various embodiments, the system is configured to bifurcate processing in the following ways: any one test is a valid liveness measure, and all the tests together make for a higher measure of confidence that the system has accurately determined the user's identity. In further aspects, each test of liveness provides a certain level of confidence that a user is being properly identified, and each additional test of liveness increases that level of confidence, in essence stepping up the strength of the identification. Some embodiments can require different levels of authentication confidence to permit various actions, and more secure or risky actions can require ever increasing confidence thresholds.
According to further embodiments, the system (e.g. 2504) can be configured to
incorporate new identification classes responsive to receiving new biometric
information. In
one embodiment, the system 2504 includes a retraining component configured to
monitor a
number of new biometrics (e.g., per user/identification class or by a total
number of new
biometrics) and automatically trigger a re-enrollment with the new feature
vectors derived from
the new biometric information (e.g., produced by 2512). In other embodiments,
the system can
be configured to trigger re-enrollment on new feature vectors based on time or
time period
elapsing.
The system 2504 and/or retraining component 2516 can be configured to store
feature
vectors as they are processed, and retain those feature vectors for retraining
(including for
example feature vectors that are unknown to retrain an unknown class in some
examples).
Various embodiments of the system are configured to incrementally retrain the
classification
model (e.g., classifier component 2514 and/or a DNN) on system assigned
numbers of newly
received biometrics. Further, once a system set number of incremental re-
trainings have
occurred the system is further configured to complete a full retrain of the
model.
According to various aspects, the incremental retrain execution avoids the
conventional
approach of fully retraining a neural network to recognize new classes and
generate new
identifications and/or to incorporate new feature vectors as they are input.
Incremental re-
training of an existing model to include a new identification without
requiring a full retraining
provides significant execution efficiency benefits over conventional
approaches.
According to various embodiments, the variables for incremental retraining and
full
retraining can be set on the system via an administrative function. Some
defaults include
incremental retrain every 3, 4, 5, 6, etc., identifications, and full retrain
every 3, 4, 5, 6, 7, 8, 9,
10, etc., incremental retrains. Additionally, this requirement may be met by
using calendar
time, such as retraining once a year. These operations can be performed on
offline (e.g.,
locked) copies of the model, and once complete, the offline copy can be made
live.
Additionally, the system 2504 and/or retraining component 2516 is configured
to
update the existing classification model with new users/identification
classes. According to
various embodiments, the system builds a classification model for an initial
number of users,
which can be based on an expected initial enrollment. The model is generated
with empty or
unallocated spaces to accommodate new users. For example, a fifty user base is
generated as
a one hundred user model. This over allocation in the model enables incremental training to be executed to incorporate, for example, new classes without requiring fully retraining the classification model. When a new user is added, the system and/or retraining component 2516 is configured to incrementally retrain the classification model, ultimately saving significant computation time over conventional retraining executions. Once the over allocation is exhausted (e.g., all identification classes are used) a full retrain with an additional over allocation can be made (e.g., fully retrain the 100 classes to a model with 150 classes). In other embodiments, an incremental retrain process can be executed to add additional unallocated slots.
Even with the reduced time retraining, the system can be configured to operate
with
multiple copies of the classification model. One copy may be live that is used
for authentication
or identification. A second copy may be an updated version that is taken
offline (e.g., locked
from access) to accomplish retraining while permitting identification
operations to continue
with a live model. Once retraining is accomplished, the updated model can be
made live and
the other model locked and updated as well. Multiple instances of both live
and locked models
can be used to increase concurrency.
According to some embodiments, the system 2500 can receive feature vectors instead of original biometrics, and processing of original biometrics can occur on different systems; in these cases system 2500 may not include, for example, 2508, 2510, 2512, and instead receives feature vectors from other systems, components, or processes.
Example Liveness Execution And Considerations
According to one aspect, in establishing identity and authentication, an authentication system is configured to determine if the source presenting the features is, in fact, a live source. In conventional password systems, there is no check for liveness. A typical example of a conventional approach includes a browser where the user fills in the fields for username and password, or saved information is pre-filled in a form on behalf of the user. The browser is not a live feature; rather, the entry of the password is pulled from the browser's form history and essentially replayed. This is an example of replay, and according to another aspect, presents many challenges where biometric input could be copied and replayed.
The inventors have realized that biometrics have the potential to increase
security and
convenience simultaneously. However, there are many issues associated with
such
implementation, including, for example, liveness. Some conventional approaches have attempted to introduce biometrics; applying the browser example above, an approach can replace authentication information with an image of a person's face or a video of the face. Conventional systems that do not employ liveness checks may be compromised by using a stored image of the face or a stored video and replaying it for authentication.
The inventors have realized that use of biometrics (e.g., face, voice, fingerprint, etc.) includes the consequence of the biometric potentially being offered in non-live forms, thus allowing a replayed biometric to be offered as a plausible input to the system. Without liveness, the plausible input will likely be accepted. The inventors have further realized that determining whether a biometric is live is an increasingly difficult problem. Examined are some approaches for resolving the liveness problem, which are treated broadly as two classes of liveness approaches (e.g., liveness may be subdivided into active liveness and passive liveness problem domains). Active liveness requires the user to do something to prove the biometric is not a replica. Passive liveness makes no such requirement of the user, and the system alone must prove the biometric is not a replica. Various embodiments and examples are directed to active liveness validation (e.g., random words supplied by a user); however, further examples can be applied in a passive context (e.g., system triggered video capture during input of biometric information, ambient sound validation, etc.). Table X (Figs. 26A-B) illustrates example implementations that may be employed, and includes analysis of potential issues for various interactions of the example approaches. In some embodiments, various ones of the examples in Table X can be combined to reduce inefficiencies (e.g., potential vulnerabilities) in the implementation. Although some issues are present in the various comparative
embodiments, the implementation can be used, for example, where the potential
for the
identified replay attacks can be minimized or reduced.
According to one embodiment, randomly requested biometric instances in conjunction with identity validation on the same random biometric instances provide a high level of assurance of both identity and liveness. In one example (Row 8), the random biometric instances include a set of random words selected for liveness validation in conjunction with voice-based identification.
According to one embodiment, an authentication system assesses liveness by asking the user to read a few random words or a random sentence. This can be done, in various embodiments, via execution of process 2900, Fig. 27. According to various embodiments, process 2900 can begin at 2902 with a request to a user to supply a set of random biometric instances. Process 2900 continues with concurrent (or, for example, simultaneous) authentication functions, identity and liveness, at 2904. For example, an authentication system can concurrently or simultaneously process the received voice signal through two algorithms (e.g., a liveness algorithm and an identity algorithm, e.g., by executing 2904 of process 2900), returning a result in less than one second. The first algorithm (e.g., liveness) performs a speech-to-text function to compare the pronounced text to the requested text (e.g., random words) to verify that the words were read correctly, and the second algorithm uses a prediction function (e.g., a prediction application programming interface (API)) to perform a one-to-many (1:N) identification on a private voice biometric to ensure that the input correctly identifies the expected person. At 2908, for example, process 2900 can return an authentication value for identified and live inputs (2906 YES). If either check fails (2906 NO), process 2900 can return an invalid indicator at 2910 or alter a confidence score associated with authentication.
Further embodiments implement multiple biometric factor identification with
liveness
to improve security and convenience. In one example, a first factor, face
(e.g., image capture),
is used to establish identity. In another example, the second factor, voice
(e.g., via random set
of words), is used to confirm identity, and establish authentication with the
further benefit of
confirming (or not) that the source presenting the biometric input is live. In
yet other
embodiments, the system can implement comprehensive models of liveness
validation that
span multiple authentication credentials (e.g., biometric and/or behavioral
instances).
Various embodiments of private biometric systems are configured to execute
liveness.
The system generates random text that is selected to take roughly 5 seconds to
speak (in
whatever language the user prefers ¨ and with other example threshold minimum
periods). The
user reads the text and the system (e.g., implemented as a private biometrics
cloud service or
component) then captures the audio and performs a speech to text process,
comparing the
pronounced text to the requested text. The system allows, for example, a
private biometric
component to assert the liveness of the requestor for authentication. In
conjunction with
liveness, the system compares the random text voice input and performs an
identity assertion
on the same input to ensure the voice that spoke the random words matches the
user's identity.
For example, input audio is now used for liveness and identity.
In other embodiments, liveness is determined based on multiple dimensions. For example, the system can be configured to handle multiple different behavioral biometric inputs, including even health profiles that are based at least in part on health readings from health sensors (e.g., heart rate, blood pressure, EEG signals, body mass scans, genome, etc.), and can, in some examples, include behavioral biometric capture/processing. Once processed through a generation neural network, such UBA data becomes private, such that no user actions or behaviors are ever transmitted across the internet; rather, the encrypted form output by the generation network is used.
According to one embodiment, the solution for liveness uses an ensemble of models. The system can initially use a behavioral biometric model to establish an identity; on authentication, the system can use any one dimension of the model as a test to determine a valid liveness measure. Based on the action being requested and/or the confidence thresholds established for that action, the system can be configured to test additional dimensions until the threshold is satisfied.
An example flow for multiple dimension liveness testing can include any one or more of the following steps (a sketch of this flow appears after the example executions below):
1. Gather plaintext behavioral biometric input (e.g., face, fingerprint, voice, UBA) and use the data as input to the first DNN to generate encrypted embeddings.
2. A second DNN (a classifier network) classifies the encrypted embeddings from Step #1 and returns an identity score (or, put another way, the system gathers an original behavioral biometric identity via a prediction after transmitting the embedding).
3. One example test of liveness can be executed with a spoken random liveness sentence to make sure the person making the request is active (alive). If the user's spoken words match the requested words (above a predetermined threshold), the system establishes a liveness dimension.
4. The same audio from Step #1 is employed by the system to predict an identity. If the identities from Step #2 and Step #4 are the same, we have another liveness dimension.
5. The system can then also use private UBA to determine identity and liveness. For example, current actions are input to private UBA (as in Step #1) to return an identity and a probability that the measurements reflect that identity. If the behavior identity is the same as the previous identity, we have an additional liveness dimension.
Example executions can include the following: acquire accelerometer and
gyroscope
data to determine if the user is holding the phone in the usual manner;
acquire finger tapping
data to determine if the user is touching the phone in the expected manner;
and/or acquire
optical heart sensor data from a watch to determine if the user's heart is
beating in the expected
manner.
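By way of a non-limiting sketch, the multi-dimension flow above might be composed as follows; generate_embedding, classify_embedding, transcribe, predict_voice_identity, and predict_uba_identity are hypothetical stand-ins for the generation and classification networks described, and the thresholds are illustrative assumptions.

```python
# Illustrative sketch of Steps 1-5; every helper on `nets` is a hypothetical
# stand-in for the generation/classification networks described above.
def liveness_dimensions(nets, behavior_input, audio, requested_words):
    dims = 0
    embedding = nets.generate_embedding(behavior_input)   # Step 1: first DNN
    identity = nets.classify_embedding(embedding)         # Step 2: classifier DNN
    spoken = set(nets.transcribe(audio).lower().split())  # Step 3: random sentence
    matched = sum(1 for w in requested_words if w.lower() in spoken)
    if matched / len(requested_words) > 0.8:               # assumed threshold
        dims += 1
    if nets.predict_voice_identity(audio) == identity:     # Step 4: same audio
        dims += 1
    if nets.predict_uba_identity(behavior_input) == identity:  # Step 5: private UBA
        dims += 1
    return identity, dims

def authorize(nets, behavior_input, audio, requested_words, required_dims=2):
    identity, dims = liveness_dimensions(nets, behavior_input, audio, requested_words)
    return identity if dims >= required_dims else None    # test dimensions until satisfied
```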
Table XI describes various example behavioral instances that can be used as
input to a
generation network to output distance measurable encrypted versions of the
input.
TABLE XI
Human behavioral biometrics: fingerprint, iris, face, voice, palm, clothing, vascular scans, time history, cheek/ear, skin color/features, hair style/color, beard/moustache, eye movement (eye tracking), heart beat, gait, gestures, behavior, psychological, contextual behavior, finger tapping, location, posture.
Machine behavioral biometrics: mouse, time, network access, fingerprint sensor, camera (faces), camera (average light), microphone/audio, audio magnitude, touch sensor, temperature (ambient), accelerometer, device access, app access, cloud access, credit card payments, payment methods, health monitoring, SIM card, gyroscope, location, proximity, GPS, latency, WiFi, packets, geolocation, Bluetooth, Bluetooth connections, magnetic field, linear acceleration, gravity, orientation, pedometer, screen state, log messages, app usage, Android configuration, browsing history, Android app usage, magnetometer, watch accelerometer, watch compass, location (quick), phone state (app status, battery state, WiFi availability, on the phone, time of day), environment (air pressure, humidity, temperature), and watch sensors (e.g., Galaxy Watch: MEMS accelerometer, MEMS gyroscope, MEMS barometer, electro-optical sensor for heart rate monitoring, photodetector for ambient light; Apple Watch: GPS and GLONASS, optical heart sensor, ECG/EKG electrical heart sensor, accelerometer, gyroscope, ambient light sensor).
According to various aspects, the system can be configured to evaluate
liveness as an
ensemble model of many dimensions, in addition to embodiments that evaluate
single liveness
measures (e.g., voice).
Thus, any confidence measure can be obtained using UBA by evaluating a nearly unlimited number of additional metrics (additional dimensions) contributing to the liveness score. And, as described in the example Steps 1-5, each UBA factor can also contribute a system-generated identity score as well.
Stated broadly, multi-dimension liveness can include one or more of the following operations: 1) a set of plaintext UBA input points is acquired as input data to a model; 2) the first DNN (e.g., a generation network tailored to the UBA input points) generates encrypted embeddings based on the plaintext input, and the system operates on the embeddings such that the actual user behavior data is never transmitted (for example, the encrypted behavioral embeddings have no correlation to any user action, nor can any user action data be inferred from the embeddings); and 3) the behavioral embeddings are sent for processing (e.g., from a mobile device to a server) to generate a liveness measure as a probability through a second DNN (second network or classification network/model).
Example Technical Models for UBA (e.g., Generation Network)
Various neural networks can be used to accept plaintext behavioral information as input and output distance measurable encrypted feature vectors. According to one example, the first neural network (i.e., the generation neural network) can be architected as a Long Short-Term Memory (LSTM) model, which is a type of Recurrent Neural Network (RNN). In various embodiments, the system is configured to invoke these models to process UBA, which is time series data. In other embodiments, different first or generation networks can be used to create distance measurable encrypted embeddings from behavioral inputs. For example, the system can use a Temporal Convolutional Network (TCN) as the model to process behavioral information, and in another example, a Gated Recurrent Unit (GRU) network as the model.
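The following is a minimal sketch of one possible LSTM-based generation network, assuming TensorFlow/Keras; the layer sizes, input shape, and the 64-float embedding width are illustrative assumptions, not values fixed by this disclosure.

```python
# Illustrative sketch of a generation network for time-series UBA input.
import tensorflow as tf

def build_generation_network(timesteps=50, features=8, embedding_dim=64):
    return tf.keras.Sequential([
        tf.keras.layers.LSTM(128, return_sequences=True,
                             input_shape=(timesteps, features)),  # UBA time series in
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(embedding_dim),          # distance measurable embedding out
        # L2 normalization keeps Euclidean/cosine comparisons well behaved
        tf.keras.layers.Lambda(lambda x: tf.math.l2_normalize(x, axis=1)),
    ])
```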
According to some embodiments, once the first network generates distance measurable embeddings, a second network can be trained to classify on the embeddings and return an identification label or an unknown result. For example, the second DNN (e.g., the classification network) can be a fully connected neural network ("FCNN"), commonly called a feed forward neural network ("FFNN"). In various embodiments, the system is configured to implement this type of model to facilitate processing of attribute data, as opposed to image or binary data.
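A minimal sketch of such a classifier follows, again assuming TensorFlow/Keras; the hidden-layer sizes and class count are illustrative assumptions.

```python
# Illustrative sketch of the second (classification) DNN over encrypted embeddings.
import tensorflow as tf

def build_classifier(embedding_dim=64, num_identities=100):
    return tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu",
                              input_shape=(embedding_dim,)),   # encrypted embedding in
        tf.keras.layers.Dense(128, activation="relu"),
        # one output per enrolled identity; an "unknown" decision is made
        # downstream when no class probability clears the match threshold
        tf.keras.layers.Dense(num_identities, activation="softmax"),
    ])
```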
According to some embodiments, the second DNN model used for classifying is a FCNN which outputs classes and probabilities. In this setting, the feature vectors are used by the classifier component to bind a user's behavioral biometrics to a classification (i.e., mapping
behavioral biometrics to a matchable/searchable identity). According to one embodiment, the deep learning neural network (e.g., enrollment and prediction network) can be executed by the system as an RNN trained on enrollment data. For example, the RNN is configured to generate an output identifying a person or indicating an UNKNOWN individual. In various embodiments, the second network (e.g., the classification network, which can be a deep learning neural network (e.g., an RNN)) is configured to differentiate between known persons and UNKNOWN.
According to another embodiment, the system can implement this functionality as a sigmoid function in the last layer that outputs the probability of a class matching based on newly input behavioral biometrics or shows a failure to match. In further examples, the system can be configured to achieve matching based on one or more hinge loss functions. As discussed, the system and/or classifier component are configured to generate a probability to establish when a sufficiently close match is found. In one example, an "unknown" person is determined responsive to negative return values being generated by the classifier network. In a further example, multiple matches on a variety of authentication credentials can be developed, and voting can also be used based on the identification results of each to increase accuracy in matching.
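By way of a non-limiting sketch, the thresholding and voting described above might look as follows; the 0.9 cutoff and the majority rule are illustrative assumptions.

```python
# Illustrative sketch of unknown detection and cross-credential voting.
from collections import Counter

def decide(probabilities, labels, threshold=0.9):
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    # below-threshold top probability is treated as no match (UNKNOWN)
    return labels[best] if probabilities[best] >= threshold else "UNKNOWN"

def vote(decisions):
    # decisions: one identity (or "UNKNOWN") per credential type (face, voice, UBA, ...)
    label, count = Counter(decisions).most_common(1)[0]
    return label if label != "UNKNOWN" and count > len(decisions) // 2 else "UNKNOWN"
```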
According to various embodiments, the authentication system is configured to test liveness and test behavioral biometric identity using fully encrypted reference behavioral biometrics. For example, the system is configured to execute comparisons directly on the encrypted behavioral biometrics (e.g., encrypted feature vectors of the behavioral biometric or encrypted embeddings derived from unencrypted behavioral information) to determine authenticity with a learning neural network. In further embodiments, a first neural network is used to process unencrypted behavioral biometric inputs and generate distance or Euclidean measurable encrypted feature vectors or encrypted embeddings (e.g., distance measurable encrypted values), referred to as a generation network. The encrypted feature vectors are used to train a classification neural network. Multiple learning networks (e.g., deep neural networks, which can be referred to as classification networks) can be trained and used to predict matches on different types of authentication credentials (e.g., behavioral biometric input such as facial/feature behavioral biometrics, voice behavioral biometrics, health/biologic data behavioral biometrics, etc.). In some examples, multiple behavioral biometric types can be processed by an authentication system to increase accuracy of identification.
Various embodiments of the system can incorporate liveness, multi-dimensional
liveness and various confidence thresholds for validation. A variety of
processes can be
executed to support such operation.
Fig. 28 is an example process flow 3000 for executing identification and
liveness
validation. Process 3000 can be executed by an authentication system (e.g.,
2704, Fig. 25 or
2304, Fig. 16). According to one embodiment, process 3000 begins with
generation of a set of
random biometric instances (e.g., set of random words) and triggering a
request for the set of
random words at 3002. In various embodiments, process 3000 continues under
multiple
threads of operation. At 3004, a first biometric type can be used for a first
identification of a
user in a first thread (e.g., based on images captured of a user during input
of the random
words). Identification of the first biometric input (e.g., facial
identification) can proceed as
discussed herein (e.g., process unencrypted biometric input with a first
neural network to output
encrypted feature vectors, predict a match on the encrypted feature vectors
with a DNN, and
return an identification or unknown and/or use a first phase for distance
evaluation), and as
described in, for example, process 2200 and/or process 2250 below. At 3005, an
identity
corresponding to the first biometric or an unknown class is returned. At 3006,
a second
biometric type can be used for a second identification of a user in a second
thread. For example,
the second identification can be based upon a voice biometric. According to
one embodiment,
processing of a voice biometric can continue at 3008 with capture of at least a threshold amount of the biometric (e.g., 5 seconds of voice). In some examples, the amount of voice data used for identification can be reduced at 3030 with biometric pre-processing. In one embodiment, voice data can be reduced with execution of pulse code modulation. Various approaches for processing voice data can be applied, including pulse code modulation, amplitude modulation, etc., to convert input voice to a common format for processing. Some example functions that can be applied (e.g., as part of 3030) include librosa (e.g., to eliminate background sound, normalize amplitude, etc.); pydub (e.g., to convert between mp3 and .wav formats); librosa (e.g., for a phase shift function); scipy (e.g., to increase low frequency); librosa (e.g., for pulse code modulation); and/or soundfile (e.g., for read and write sound file operations).
In various embodiments, processed voice data is converted to the frequency domain via a Fourier transform (e.g., fast Fourier transform, discrete Fourier transform, etc.), which can be provided by the numpy or scipy libraries. Once in the frequency domain, the two-dimensional frequency array can be used to generate encrypted feature vectors.
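A minimal sketch of this pre-processing step follows, assuming the librosa and numpy libraries named above; the sample rate and FFT size are illustrative assumptions.

```python
# Illustrative sketch of voice pre-processing into a 2-D frequency array.
import librosa
import numpy as np

def voice_to_frequency_array(path, sr=16000, n_fft=512):
    audio, _ = librosa.load(path, sr=sr, mono=True)   # resample to a common format
    audio = librosa.util.normalize(audio)             # normalize amplitude
    # short-time Fourier transform -> two-dimensional frequency array
    spectrum = np.abs(librosa.stft(audio, n_fft=n_fft))
    return spectrum                                   # input to the embedding network
```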
In some embodiments, voice data is input to a pre-trained neural network to
generate
encrypted voice feature vectors at 3012. In one example, the frequency arrays
are used as input
to a pre-trained convolutional neural network ("CNN") which outputs encrypted
voice feature
vectors. In other embodiments, different pre-trained neural networks can be
used to output
encrypted voice feature vectors from unencrypted voice input. As discussed
throughout, the
function of the pre-trained neural network is to output distance measurable
encrypted feature
vectors upon voice data input. Once encrypted feature vectors are generated at
3012, the
unencrypted voice data can be deleted. Some embodiments receive encrypted feature vectors for processing rather than generating them from unencrypted voice directly; in such embodiments, there is no unencrypted voice to delete.
In one example, a CNN is constructed with the goal of creating embeddings, not for its conventional purpose of classifying inputs. In a further example, the CNN can employ a triplet loss function (including, for example, a hard triplet loss function), which enables the CNN to converge more quickly and accurately during training than some other implementations. In further examples, the CNN is trained on hundreds or thousands of voice inputs. Once trained, the CNN is configured for creation of embeddings (e.g., encrypted feature vectors). In one example, the CNN accepts a two dimensional array of frequencies as an input and provides floating point numbers (e.g., 32, 64, 128, 256, 1028, ... floating point numbers) as output.
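By way of illustration, a triplet-style loss of the kind referenced above can be sketched as follows, assuming TensorFlow; the margin value is an illustrative assumption.

```python
# Illustrative sketch of a triplet loss over distance measurable embeddings.
import tensorflow as tf

def triplet_loss(anchor, positive, negative, margin=0.2):
    # squared Euclidean distances between embeddings
    d_pos = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    d_neg = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    # pull same-speaker pairs together, push different-speaker pairs apart
    return tf.reduce_mean(tf.maximum(d_pos - d_neg + margin, 0.0))
```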
In some executions of process 3000, the initial voice capture and processing (e.g., the request for random words, 3002-3012) can be executed on a user device (e.g., a mobile phone) and the resulting encrypted voice feature vector can be communicated to a remote service via an authentication API hosted and executed on cloud resources. In some other executions, the initial processing and prediction operations can be executed on the user device as well. Various execution architectures can be provided, including fully local authentication, fully remote authentication, and hybrids of both options.
In one embodiment, process 3000 continues with communication of the voice
feature
vectors to a cloud service (e.g., authentication API) at 3014. The voice
feature vectors can then
be processed by a fully connected neural network ("FCNN") for predicting a
match to enrolled
feature vectors and returning a trained label at 3016. As discussed, the input
to the FCNN is
an embedding generated by a first pre-trained neural network (e.g., an
embedding comprising
32, 64, 128, 256, 1028, etc. floating point numbers). Prior to execution of
process 3000, the
FCNN is trained with a threshold number of people for identification (e.g.,
500, 750, 1000,
1250, 1500 ... etc.). The initial training can be referred to as "priming" the
FCNN. The
priming function is executed to improve accuracy of prediction operations
performed by the
FCNN.
At 3018, the FCNN returns a result matching a label or an unknown class; i.e., it matches to an identity from among a group of candidates or does not match to a known identity. The result is communicated for evaluation of each thread's result at 3022.
According to various embodiments, the third thread of operation is executed to determine that the input biometrics used for identification are live (i.e., not spoofed, recorded, or replayed). For example, at 3020 the voice input is processed to determine whether the input words match the set of random words requested. In one embodiment, a speech recognition function is executed to determine the words input, and matching is executed against the randomly requested words to determine an accuracy of the match. If any unencrypted voice input remains in memory, the unencrypted voice data can be deleted as part of 3020. In various embodiments, processing of the third thread can be executed locally on a device requesting authorization, on a remote server, on a cloud resource, or any combination thereof. If remote processing is executed, a recording of the voice input can be communicated to a server or cloud resource as part of 3020, and the accuracy of the match (e.g., input to random words) determined remotely. Any unencrypted voice data can be deleted once encrypted feature vectors are generated and/or once matching accuracy is determined.
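A minimal sketch of the accuracy check at 3020 follows; the speech_to_text callable is a hypothetical recognizer supplied by the caller, not an API defined by this disclosure.

```python
# Illustrative sketch of the random-word match accuracy computed at 3020.
def match_accuracy(audio, requested_words, speech_to_text):
    spoken = set(speech_to_text(audio).lower().split())  # hypothetical recognizer
    hits = sum(1 for w in requested_words if w.lower() in spoken)
    return hits / len(requested_words)                   # fraction of words read correctly
```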
In further embodiments, the results of each thread are joined to yield an authorization or an invalidation. At 3024, the first thread returns an identity or unknown for the first biometric, the second thread returns an identity or unknown for the second biometric, and the third thread returns an accuracy of match between a random set of biometric instances and the input biometric instances. At 3024, process 3000 provides a positive authentication indication where the first thread identity matches the second thread identity and one of the biometric inputs is determined to be live (e.g., above a threshold accuracy, such as 33% or greater, among other options). If not positive, process 3000 can be re-executed (e.g., a threshold number of times) or a denial can be communicated.
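By way of a non-limiting sketch, the join at 3024 might be expressed as follows; the 0.33 cutoff mirrors the example liveness threshold given above and is otherwise an assumption.

```python
# Illustrative sketch of joining the three thread results at 3024.
def join_results(first_identity, second_identity, liveness_accuracy, threshold=0.33):
    identified = first_identity is not None and first_identity == second_identity
    live = liveness_accuracy >= threshold
    return identified and live   # positive authentication only if both hold
```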
According to various embodiments, process 3000 can include concurrent, branched, and/or simultaneous execution of the authentication threads to return a positive authentication or a denial. In further embodiments, process 3000 can be reduced to a single biometric type such that one identification thread and one liveness thread are executed to return a positive authentication or a denial. In further embodiments, the various steps described can be executed together or in a different order, and may invoke other processes (e.g., to generate encrypted feature vectors to process for prediction) as part of determining identity and liveness of biometric input. In yet other embodiments, additional biometric types can be tested to confirm identity, with at least one liveness test on one of the biometric inputs to provide assurance that
submitted biometrics are not replayed or spoofed. In a further example, multiple biometric types can be used for identity and multiple biometric types can be used for liveness validation.
Example Authentication System With Liveness
In some embodiments, an authentication system interacts with any application
or
system needing authentication service (e.g., a Private Biometrics Web
Service). According to
one embodiment, the system uses private voice biometrics to identify
individuals in a datastore
(and provides one to many (1:N) identification) using any language in one
second. Various
neural networks measure the signals inside of a voice sample with high
accuracy and thus allow
private biometrics to replace "username" (or other authentication schemes) and
become the
primary authentication vehicle.
In some examples, the system employs face (e.g., images of the user's face) as the first biometric and voice as the second biometric type, providing for at least two factor authentication ("2FA"). In various implementations, the system employs voice for identity and liveness, as the voice biometric can be captured together with the face biometric. Similar biometric pairings can be executed to provide a first biometric identification and a second biometric identification for confirmation, coupled with a liveness validation.
In some embodiments, an individual wishing to authenticate is asked to read a
few
words while looking into a camera and the system is configured to collect the
face biometric
and voice biometric while the user is speaking. According to various examples,
the same audio
that created the voice biometric is used (along with the text the user was
requested to read) to
check liveness and to ensure the identity of the user's voice matches the
face.
Such authentication can be configured to augment security in a wide range of
environments. For example, private biometrics (e.g., voice, face, health
measurements, etc.)
can be used for common identity applications (e.g., "who is on the phone?")
and single factor
authentication (1FA) by call centers, phone, watch and TV apps, physical
security devices
(door locks), and other situations where a camera is unavailable.
Additionally, where additional biometrics can be captured, 2FA or better can provide greater assurance of identity along with the liveness validation.
Broadly stated, various aspects implement similar approaches for privacy-
preserving
encryption for processed biometrics (including, for example, face and voice
biometrics).
Generally stated, after collecting an unencrypted biometric (e.g., voice
biometric), the system
creates a private biometric (e.g., encrypted feature vectors) and then
discards the original
unencrypted biometric template. As discussed herein, these private biometrics
enable an
authentication system and/or process to identify a person (i.e., authenticate
a person) while still
guaranteeing individual privacy and fundamental human rights by only operating
on biometric
data in the encrypted space.
To transform the unencrypted voice biometric into a private biometric, various embodiments are configured to pre-process the voice signal and reduce the voice data to a smaller form (for example, without any loss). The Nyquist sampling rate for this example is two times the highest frequency of the signal. In various implementations, the system is configured to sample the resulting data and use this sample as input to a Fourier transform. In one example, the resulting frequencies are used as input to a pre-trained voice neural network capable of returning a set of embeddings (e.g., encrypted voice feature vectors). These embeddings, for example sixty-four floating point numbers, provide the system with private biometrics which then serve as input to a second neural network for classification.
Example Validation Augmentation
Fig. 29 is an example process flow 3100 for validating an output of a
classification
network. According to some embodiments, a classification network can accept
encrypted
authentication credentials as an input and return a match or unknown result
based on analyzing
the encrypted authentication credential. According to one embodiment, process
3100 can be
executed responsive to generation of an output by the classification network.
For example, at
3102 a classification output is tested. At 3104, the testing determines if any
of the output values
meet or exceed a threshold for determining a match. If yes (e.g., 3104 YES),
the matching
result is returned at 3106.
If the threshold is not met (3104 NO), process 3100 continues at 3108. According to one embodiment, a reference encrypted credential associated with the closest matches determined by the classification network can be retrieved at 3108. Although the probability of the match may be too low to return an authentication or identification result, the highest probability matches can be used to retrieve stored encrypted authentication credentials for those matches or for the highest probability match. At 3110, the retrieved credentials can be compared to the input that was processed by the classification network (e.g., a new encrypted authentication credential).
According to one example, the comparison at 3110 can include a distance evaluation between the input authentication credential and the reference authentication credentials associated with known labels/entities. If the distance evaluation meets a threshold (3112 YES), process 3100 continues at 3116 and returns a match to the known label/entity. If the threshold is not met (3112 NO), then process 3100 continues at 3114 with a return of no match. Post classification
validation can be used in cases where a threshold probability is not met, as well as cases where a threshold is satisfied (e.g., to confirm a high probability match), among other options.
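By way of a non-limiting sketch, the post-classification validation of process 3100 might be composed as follows, assuming numpy; the probability and distance thresholds, and the stored-credential lookup, are illustrative assumptions.

```python
# Illustrative sketch of post-classification distance validation (process 3100).
import numpy as np

def validate(output_probs, labels, new_embedding, stored, p_thresh=0.9, d_thresh=0.5):
    best = int(np.argmax(output_probs))
    if output_probs[best] >= p_thresh:                    # 3104 YES: match by probability
        return labels[best]
    reference = stored[labels[best]]                      # 3108: stored credential of closest match
    distance = np.linalg.norm(new_embedding - reference)  # 3110: distance evaluation
    return labels[best] if distance <= d_thresh else None # 3112 YES -> 3116, else 3114
```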
The terms "program" or "software" are used herein in a generic sense to refer
to any
type of computer code or set of processor-executable instructions that can be
employed to
program a computer or other processor to implement various aspects of
embodiments as
discussed above. Additionally, it should be appreciated that according to one
aspect, one or
more computer programs that when executed perform methods of the disclosure
provided
herein need not reside on a single computer or processor, but may be
distributed in a modular
fashion among different computers or processors to implement various aspects
of the
disclosure provided herein.
As described herein "authentication system" includes systems that can be used
for
authentication as well as systems that be used for identification. Various
embodiments
describe helper network that can be used to improve operation in either
context. The various
functions, processes, and algorithms can be executed in the context of
identifying an entity
and/or in the context of authenticating an entity.
Processor-executable instructions may be in many forms, such as program
modules,
executed by one or more computers or other devices. Generally, program modules
include
routines, programs, objects, components, data structures, etc. that perform
particular tasks or
implement particular abstract data types. Typically, the functionality of the
program modules
may be combined or distributed as desired in various embodiments.
Also, data structures may be stored in one or more non-transitory computer-
readable
storage media in any suitable form. For simplicity of illustration, data
structures may be
shown to have fields that are related through location in the data structure.
Such relationships
may likewise be achieved by assigning storage for the fields with locations in
a non-transitory
computer-readable medium that convey relationship between the fields. However,
any
suitable mechanism may be used to establish relationships among information in
fields of a
data structure, including through the use of pointers, tags or other
mechanisms that establish
relationships among data elements.
Also, various inventive concepts may be embodied as one or more processes, of
which examples (e.g., the processes described with reference to Figs. 4-7, 9-
11, etc.) have
been provided. The acts performed as part of each process may be ordered in
any suitable
way. Accordingly, embodiments may be constructed in which acts are performed
in an order
different than illustrated, which may include performing some acts
simultaneously, even
though shown as sequential acts in illustrative embodiments.
All definitions, as defined and used herein, should be understood to control
over
dictionary definitions, and/or ordinary meanings of the defined terms. As used
herein in the
specification and in the claims, the phrase "at least one," in reference to a
list of one or more
elements, should be understood to mean at least one element selected from any
one or more
of the elements in the list of elements, but not necessarily including at
least one of each and
every element specifically listed within the list of elements and not
excluding any
combinations of elements in the list of elements. This definition also allows
that elements
may optionally be present other than the elements specifically identified
within the list of
elements to which the phrase "at least one" refers, whether related or
unrelated to those
elements specifically identified. Thus, as a non-limiting example, "at least
one of A and B"
(or, equivalently, "at least one of A or B," or, equivalently "at least one of
A and/or B") can
refer, in one embodiment, to at least one, optionally including more than one,
A, with no B
present (and optionally including elements other than B); in another
embodiment, to at least
one, optionally including more than one, B, with no A present (and optionally
including
elements other than A); in yet another embodiment, to at least one, optionally
including more
than one, A, and at least one, optionally including more than one, B (and
optionally including
other elements); etc.
The phrase "and/or," as used herein in the specification and in the claims,
should be
understood to mean "either or both" of the elements so conjoined, i.e.,
elements that are
conjunctively present in some cases and disjunctively present in other cases.
Multiple
elements listed with "and/or" should be construed in the same fashion, i.e.,
"one or more" of
the elements so conjoined. Other elements may optionally be present other than
the elements
specifically identified by the "and/or" clause, whether related or unrelated
to those elements
specifically identified. Thus, as a non-limiting example, a reference to "A
and/or B", when
used in conjunction with open-ended language such as "comprising" can refer,
in one
embodiment, to A only (optionally including elements other than B); in another
embodiment,
to B only (optionally including elements other than A); in yet another
embodiment, to both A
and B (optionally including other elements); etc.
Use of ordinal terms such as "first," "second," "third," etc., in the claims
to modify a
claim element does not by itself connote any priority, precedence, or order of
one claim
element over another or the temporal order in which acts of a method are
performed. Such
terms are used merely as labels to distinguish one claim element having a
certain name from
another element having a same name (but for use of the ordinal term).
The phraseology and terminology used herein is for the purpose of description
and
should not be regarded as limiting. The use of "including," "comprising,"
"having,"
"containing", "involving", and variations thereof, is meant to encompass the
items listed
thereafter and additional items.
Having described several embodiments of the techniques described herein in
detail,
various modifications, and improvements will readily occur to those skilled in
the art. Such
modifications and improvements are intended to be within the spirit and scope
of the
disclosure. Accordingly, the foregoing description is by way of example only,
and is not
intended as limiting. The techniques are limited only as defined by the
following claims and
the equivalents thereto.
Administrative Status
Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2021-08-12
(87) PCT Publication Date 2022-02-17
(85) National Entry 2023-02-14

Abandonment History

There is no abandonment history.

Maintenance Fee

Last Payment of $100.00 was received on 2023-08-04


 Upcoming maintenance fee amounts

Description Date Amount
Next Payment if standard fee 2024-08-12 $125.00
Next Payment if small entity fee 2024-08-12 $50.00

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee 2023-02-14 $421.02 2023-02-14
Maintenance Fee - Application - New Act 2 2023-08-14 $100.00 2023-08-04
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
PRIVATE IDENTITY LLC
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


Document Description    Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
Abstract 2023-02-14 2 77
Claims 2023-02-14 5 196
Drawings 2023-02-14 34 949
Description 2023-02-14 91 5,471
Representative Drawing 2023-02-14 1 12
Patent Cooperation Treaty (PCT) 2023-02-14 1 37
Patent Cooperation Treaty (PCT) 2023-02-14 1 69
International Search Report 2023-02-14 1 57
National Entry Request 2023-02-14 6 178
Cover Page 2023-07-19 1 48