Patent Summary 2403249

Third-Party Information Liability Disclaimer

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract Availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 2403249
(54) French Title: METHODE DU GRADIENT POUR RESEAUX NEURONAUX, ET APPLICATION DANS LE CADRE D'UN MARKETING CIBLE
(54) English Title: GRADIENT CRITERION METHOD FOR NEURAL NETWORKS AND APPLICATION TO TARGETED MARKETING
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06N 3/08 (2006.01)
  • G06Q 30/00 (2006.01)
(72) Inventors:
  • GALPERIN, YURI (United States of America)
  • FISHMAN, VLADIMIR (United States of America)
(73) Owners:
  • GALPERIN, YURI (Not Available)
  • FISHMAN, VLADIMIR (Not Available)
(71) Applicants:
  • MARKETSWITCH CORPORATION (United States of America)
(74) Agent: FETHERSTONHAUGH & CO.
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2000-03-15
(87) Open to Public Inspection: 2000-09-21
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2000/006735
(87) International Publication Number: WO2000/055790
(85) National Entry: 2002-09-13

(30) Application Priority Data:
Application No. | Country/Territory | Date
60/124,217 | United States of America | 1999-03-15
Abstracts

French Abstract

La présente invention concerne une application unique de la méthode statistique du maximum de vraisemblance aux techniques des réseaux neuronaux commerciaux. La présente invention utilise la nature spécifique du résultat de problèmes de marketing ciblé, et permet la production de résultats prévisionnels plus précis par une minimisation d'un gradient visant à produire des pondérations de modèles permettant d'obtenir le résultat assorti du maximum de vraisemblance. Ce procédé s'utilise, de préférence, pour les données bruitées et lorsque l'on cherche à déterminer la précision générale d'une distribution, ou la meilleure description générale de la réalité.


English Abstract

The present invention is drawn to a unique application of the Maximum
Likelihood statistical method to commercial neural network technologies. The
present invention utilizes the specific nature of the output in target
marketing problems and makes it possible to produce more accurate and
predictive results by minimizing a gradient criterion to produce model weights
to get the maximum likelihood result. It is best used on "noisy" data and when
one is interested in determining a distribution's overall accuracy, or best
general description of reality.

Claims

Note: Claims are shown in the official language in which they were submitted.



CLAIMS

7. A system for training neural networks with a maximum likelihood utility function, comprising:

a central application server;

a modeling database connected to said central application server;

at least one workstation networked to said central application server;

at least one multithreaded calculation engine networked to said central application server; and

software instructions on said central application server, at least one workstation and at least one multithreaded calculation engine so as to provide for:

said at least one workstation to select an initial model function for a propensity score g(X,W), where W is a set of weights of the neural network and X is a vector of customer attributes from a modeling database; and

said at least one multithreaded calculation engine to

calculate propensity scores for the customers in the modeling database;

calculate a training error Err, where

Err = -ln L = Σ_{i=1..N} ln(1 + g_i) - Σ_{i∈resp} ln(g_i) - Σ_{i∈non-resp} ln(1 - g_i);

measure the error to check for convergence below a desired value;

obtain a new model and apply it to new data when convergence occurs;

minimize the error to solve for new weights W by minimizing the gradient criterion defined by the formula:

∂Err/∂W = Σ_{i=1..N} g_i'/(1 + g_i) - Σ_{i∈resp} g_i'/g_i + Σ_{i∈non-resp} g_i'/(1 - g_i); and

begin a new iteration of the process by calculating new propensity scores for the customers in the modeling database.

8. The system for training neural networks with a maximum likelihood utility
function of claim 7, further comprising:

a customer database connected to said central application server and said
at least one multithreaded calculation engine; and

software instructions to apply the new model to customer data from said
customer database upon being selected by said at least one workstation.

9. The system for training neural networks with a maximum likelihood utility function of claim 7, further comprising:

software instructions on said at least one multithreaded calculation engine to:

define f as a normalized propensity score related to g(X,W) by the formula:

g(X,W) = f^(1/τ)(X,W)

where f is the output of the neural network; and

choose the parameter τ in such a way that f may be of the order of 0.5;

wherein R is an average response rate in the sample and the above condition is satisfied if:

τ = 1 / ln((1 - R)/R)

wherein:

Err = -ln P = Σ_{i=1..N} ln(1 + f_i^(1/τ)) - (1/τ) Σ_{i∈resp} ln(f_i) - Σ_{i∈non-resp} ln(1 - f_i^(1/τ))

and the gradient criterion is computed as follows:

∂Err/∂W = (1/τ) [ Σ_{i=1..N} f_i^(1/τ-1) f_i'/(1 + f_i^(1/τ)) - Σ_{i∈resp} f_i'/f_i + Σ_{i∈non-resp} f_i^(1/τ-1) f_i'/(1 - f_i^(1/τ)) ]

10. The system for training neural networks with a maximum likelihood utility
function of claim 7, further comprising:

a customer database connected to said central application server and said
at least one multithreaded calculation engine; and

software instructions to apply the new model to a top 20% of a targeted
marketing sample customer pool selected from said customer database by said at
least one workstation.

11. The system for training neural networks with a maximum likelihood utility
function of claim 9, further comprising:

a customer database connected to said central application server and said
at least one multithreaded calculation engine; and

software instructions to apply the new model to a top 20% of a targeted
marketing sample customer pool selected from said customer database by said at
least one workstation.



Claims 7-12 added to define the apparatus of the invention.

All the remaining claims are unchanged.


Description

Note: Descriptions are shown in the official language in which they were submitted.



CA 02403249 2002-09-13
WO 00/55790 PCT/US00/06735
TITLE OF THE INVENTION: Gradient Criterion Method for Neural Networks and Application to Targeted Marketing

FIELD OF THE INVENTION:

This invention relates generally to the development of neural network models to optimize the effects of targeted marketing programs. More specifically, this invention is an improvement on the Maximum Likelihood method of training neural networks using a gradient criterion, and is specially designed for binary output having a strongly uneven proportion, which is typical for direct marketing problems.

BACKGROUND OF THE INVENTION:

The goal of most modeling procedures is to minimize the discrepancy between real results and model outputs. If the discrepancy, or error, can be accumulated on a record-by-record basis, it is suitable for gradient algorithms like Maximum Likelihood.

The goal of target marketing modeling is typically to find a method to calculate the probability that any prospect in the list will respond to an offer. The neural network model is built based on the experimental data (a test mailing), and the traditional approach to this problem is to choose a model and compute model parameters with a model fitting procedure.

The topology of the model (for example, number of nodes, input and transfer functions) defines the formula that expresses the probability of response as a function of attributes.

In a special model fitting procedure, the output of the model is tested against actual output (from the results of a test mailing) and the discrepancy is accumulated in a special error function. Different types of error functions can be used (e.g., mean square, absolute error); model parameters are determined to minimize the error function. The best fitting of model parameters is an implicit indication that the model is good (not necessarily the best) in terms of its original objective.

Thus the model building process is defined by two entities: the type of model and the error (or utility) function. The type of model defines the ability of the model to discern various patterns in the data. For example, increasing the number of nodes results in more complicated formulae, so a model can more accurately discern complicated patterns.

The "goodness" of the model is ultimately defined by the choice of an error function, since it is the error function that is minimized during the model training process.

To reach the goal of modeling, one wants to use a utility function that assigns probabilities that are most in compliance with the results of the experiment (the test mailing). The Maximum Likelihood criterion is the explicit measure of this compliance. However, the modeling process as it exists today has a significant drawback: it uses conventional utility functions (least mean square, cross entropy) only because there is a mathematical apparatus developed for these utility functions.

What would really be useful is a process that builds a response model that directly maximizes Maximum Likelihood.

For example, suppose a random variable X exists with the distribution p(X, A), where A is an unknown vector of parameters to be estimated based on the independent observations of X: (x1, x2, ..., xN). The goal is to find a vector A that makes the probability of the output p(x1,A)*p(x2,A)*...*p(xN,A) maximally possible. Note that the function p(X, A) should be a known function of two variables. The Maximum Likelihood technique provides the mathematical apparatus to solve this optimization problem.
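The single-parameter Bernoulli case makes this concrete. The sketch below is our own illustration, not part of the patent; all names are ours, and a simple grid search stands in for a proper optimizer, which is adequate for one parameter:

```python
import numpy as np

# Illustrative only: maximum-likelihood estimation of the Bernoulli
# parameter p from independent binary observations, i.e. choosing the
# p that maximizes p(x1,p)*p(x2,p)*...*p(xN,p).
def bernoulli_log_likelihood(p, xs):
    xs = np.asarray(xs, dtype=float)
    # log of the product of per-observation probabilities
    return float(np.sum(xs * np.log(p) + (1.0 - xs) * np.log(1.0 - p)))

observations = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # hypothetical test outcomes
grid = np.linspace(0.01, 0.99, 99)             # candidate parameter values
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, observations))
```

As expected for a Bernoulli variable, the maximizer coincides with the sample mean (0.3 here).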
3 In general, the Maximum Likelihood method can be applied to neural
4 networks as follows. Let the neural network calculate a value of the output
variable y
based on the input vector X. The observed values (y1, y2, ..., yN) represent
the actual
6 output with some error e. Assuming that this error has, for example, a
normal
7 distribution, the method can fmd weights W of the neural network that makes
a
8 probability of the output p(yl,W)*p( y2,W)* ~ ~-*p( Yrr~W) maximally
possible. In
9 the case of a normal probability function, the Maximum Likelihood criterion
is
l0 equivalent to the Least Mean Square criterion-which is, in fact, most
widely used for
11 neural network training.
12 In the case of target marketing, the observed output X is a binary variable
that
13 is equal to 1 if a customer responded to the offer, and is 0 otherwise. The
normality
14 assumption is too rough, and leads to a sub-optimal set of neural network
weights if
used in neural network training. This is a typical direct marketing scenario.
16
1~ SUMMARY OF THE INVENTION:
18 The present invention represents a unique application of the Maximum
19 Likelihood statistical method to commercial neural network technologies.
The present
2o invention utilizes the specific nature of the output in target marketing
problems and
21 makes it possible to produce more accurate and predictive results. It is
best used on
22 "noisy" data and when one is interested in determining a distribution's
overall
23 accuracy, or best general description of reality.
24 The present invention provides a competitive advantage over off the-shelf
modeling packages in that it greatly enhances the application of Maximum
Likelihood
3


CA 02403249 2002-09-13
WO 00/55790 PCT/US00/06735
1 to quantitative marketing applications such as customer acquisition, cross-
selling/up
2 selling, predictive customer profitability modeling, and channel
optimization.
3 Specifically, the superior predictive modeling capability provided by using
the present
4 invention means that marketing analysts will be better able to:
~ Predict the propensity of individual prospects to respond to an offer, thus
enabling
6 marketers to better identify target markets.
7 ~ Identify customers and prospects who are most likely to default on loans,
so that
8 remedial action can be taken, or so that those prospects can be excluded
from
9 certain offers.
l0 ~ Identify customers or prospects who are most likely to prepay loans, so a
better
11 estimate can be made of revenues.
12 ~ Identify customers who are most amenable to cross-sell and up-sell
opportunities.
13 ~ Predict claims experience, so that insurers can better establish risk and
set
14 premiums appropriately.
~ Identify instances of credit-card fraud.
16
17 BRIEF DESCRIPTION OF THE DRAWINGS
18 Figure 1 shows the dataflow of the method of training the model of the
present
19 invention.
Figure 2 illustrates a preferred system architecture for employing the present
21 invention.
22
23 DETAILED DESCRIPTION OF THE INVENTION
4


The present invention uses the neural network to calculate a propensity score g(X,W), where W is a set of weights of the neural network and X is a vector of customer attributes (the input vector). The probability that a customer with attributes X responds to an offer can be calculated by the formula:

p = g(X,W) / (1 + g(X,W))

If there are N independent samples and among them n are responders, the probability of such output is:

L = [ Π_{i∈resp} g(X_i,W) · Π_{i∈non-resp} (1 - g(X_i,W)) ] / Π_{i=1..N} (1 + g(X_i,W))

The logarithm of L is used as a training criterion (training error) in the form:

Err(W) = -ln L = Σ_{i=1..N} ln(1 + g_i) - Σ_{i∈resp} ln(g_i) - Σ_{i∈non-resp} ln(1 - g_i)
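A minimal sketch of this training error, assuming scores g_i lie in (0, 1) so every logarithm is defined (the function and variable names are ours, not the patent's):

```python
import numpy as np

def training_error(g, responded):
    """Err = -ln L = sum_i ln(1+g_i) - sum_resp ln(g_i) - sum_non-resp ln(1-g_i)."""
    g = np.asarray(g, dtype=float)
    resp = np.asarray(responded, dtype=bool)
    return (np.sum(np.log1p(g))                 # all N samples
            - np.sum(np.log(g[resp]))           # responders
            - np.sum(np.log(1.0 - g[~resp])))   # non-responders

# Scoring the responder high and the non-responder low gives a smaller
# error (higher likelihood) than uninformative mid-range scores.
good = training_error([0.9, 0.1], [True, False])
flat = training_error([0.5, 0.5], [True, False])
```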
The neural network training procedure finds the optimal weights W that minimize Err and thus maximize the likelihood of the observed output L. One can use back propagation or a similar method to perform training. The gradient criterion that is required by a training procedure is computed as follows:

∂Err/∂W = Σ_{i=1..N} g_i'/(1 + g_i) - Σ_{i∈resp} g_i'/g_i + Σ_{i∈non-resp} g_i'/(1 - g_i)

where g_i' denotes the derivative of g_i with respect to the weights W.
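The gradient criterion can be sketched the same way, assuming the per-sample scores g_i and their derivatives g_i' with respect to a weight are supplied (in practice they would come from back propagation; the names here are ours):

```python
import numpy as np

def gradient_criterion(g, g_prime, responded):
    """dErr/dW = sum_i g'_i/(1+g_i) - sum_resp g'_i/g_i + sum_non-resp g'_i/(1-g_i)."""
    g = np.asarray(g, dtype=float)
    gp = np.asarray(g_prime, dtype=float)
    resp = np.asarray(responded, dtype=bool)
    return (np.sum(gp / (1.0 + g))
            - np.sum(gp[resp] / g[resp])
            + np.sum(gp[~resp] / (1.0 - g[~resp])))

# Single responder with score 0.5 and dg/dw = 1:
# 1/1.5 - 1/0.5 = -4/3, so the error falls as the responder's score rises.
val = gradient_criterion([0.5], [1.0], [True])
```

Each summand is exactly the derivative of the corresponding term of Err above, so driving this quantity to zero is equivalent to a stationary point of the training error.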
In order for the training procedure to be robust and stable, the output of the neural network should be in the middle of the working interval [0, 1]. To ensure that, the present invention introduces the normalized propensity score f, which is related to g as:

g(X,W) = f^(1/τ)(X,W)


Now, let f be the output of the neural network and choose the parameter τ in such a way that f may be of the order of 0.5.

Let R be an average response rate in the sample. The above condition is satisfied if:

τ = 1 / ln((1 - R)/R)
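A quick numerical check of this choice of τ, using a hypothetical 2% response rate (the variable names are ours). A customer at the base odds g = R/(1-R) gets a normalized score f = g^τ = e^-1 ≈ 0.37, which is indeed "of the order of 0.5", keeping the network output mid-interval:

```python
import math

R = 0.02                                   # hypothetical average response rate
tau = 1.0 / math.log((1.0 - R) / R)        # tau = 1 / ln((1-R)/R)
f_at_base_odds = (R / (1.0 - R)) ** tau    # normalized score at the base odds
# f_at_base_odds = exp(tau * ln(R/(1-R))) = exp(-1), independent of R
```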
While training the model, the criterion is optimized, so the calculation is based on the output of the neural network using the formula:

Err = -ln P = Σ_{i=1..N} ln(1 + f_i^(1/τ)) - (1/τ) Σ_{i∈resp} ln(f_i) - Σ_{i∈non-resp} ln(1 - f_i^(1/τ))

The gradient criterion is computed as follows:

∂Err/∂W = (1/τ) [ Σ_{i=1..N} f_i^(1/τ-1) f_i'/(1 + f_i^(1/τ)) - Σ_{i∈resp} f_i'/f_i + Σ_{i∈non-resp} f_i^(1/τ-1) f_i'/(1 - f_i^(1/τ)) ]
The method was tested on a variety of business cases against both Least Mean Square and Cross-Entropy criteria. In all cases the method gave a 20% - 50% improvement in the lift on the top 20% of the target marketing sample customer pools.
As shown in figure 1, the method inputs data from modeling database 11 into a selected model 12 to calculate scores 13. The error 14 is calculated from comparison with the known responses from modeling database 11 and checked for convergence 15 below a desired level. When convergence occurs, a new model 16 is the result to be used for targeted marketing 17. Otherwise, the process minimizes the error and solves for a new set of weights at 18 and begins a new iteration.
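The loop of figure 1 can be sketched end to end. This is our own toy stand-in, not the patent's engine: a single-layer logistic score plays the role of the neural network, a synthetic "test mailing" plays the role of the modeling database, and a safeguarded numerical-gradient descent plays the role of back propagation:

```python
import numpy as np

rng = np.random.default_rng(0)

def g_score(X, W):
    # toy stand-in for the neural network: logistic score in (0, 1)
    return 1.0 / (1.0 + np.exp(-X @ W))

def err(g, resp):
    # training error Err from the criterion above
    return (np.sum(np.log1p(g))
            - np.sum(np.log(g[resp]))
            - np.sum(np.log(1.0 - g[~resp])))

# synthetic "test mailing": 200 customers with 2 attributes each
X = rng.normal(size=(200, 2))
true_W = np.array([1.5, -1.0])
p_true = g_score(X, true_W) / (1.0 + g_score(X, true_W))
resp = rng.random(200) < p_true            # observed binary responses

def total_err(W):
    return err(g_score(X, W), resp)

W = np.zeros(2)                            # initial model
lr = 0.5
err0 = total_err(W)
for _ in range(100):                       # score -> error -> new weights
    grad = np.array([(total_err(W + 1e-6 * e) - total_err(W - 1e-6 * e)) / 2e-6
                     for e in np.eye(2)])  # numerical gradient of Err
    step = W - lr * grad / len(X)
    if total_err(step) < total_err(W):     # accept only improving steps
        W = step
    else:
        lr *= 0.5                          # safeguard: shrink the step
```

Each pass through the loop mirrors the figure: scores are recomputed, the error is measured against the known responses, and a new set of weights is solved for until the error converges.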
The present invention operates on a computer system and is used for targeted marketing purposes. In a preferred embodiment as shown in figure 2, the system runs on a three-tier architecture that supports CORBA as an intercommunications protocol. The desktop client software on targeted marketing workstations 20 supports JAVA. The central application server 22 and multithreaded calculation engines 24, 25 run on Windows NT or UNIX. Modeling database 26 is used for training new models to be applied for targeted marketing related to customer database 28. The recommended minimum system requirements for application server 22 and multithreaded calculation engines 24, 25 are as follows:
HP Platform
Processor: HP
Memory: 256 MB
Disk Space: 10 MB*

*Approximately 100 MB per 1 million records in customer database. The above assumes the user client is installed on a PC with the recommended configuration found below.


Permissions: Read/Write permissions in area of server installation (no root permissions)
Operating System: HP/UX 11 (32 Bit)
Protocol: TCP/IP
Daemons: Telnet and FTP (Optional)

The recommended minimum requirements for the targeted marketing workstations 20 are as follows:
Using the present invention in conjunction with a neural network, the present invention provides a user with data indicating the individuals or classes of individuals who are most likely to respond to direct marketing.

Representative Drawing
A single figure which represents a drawing illustrating the invention.
Administrative Status

For a better understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History should be consulted.

Administrative Status

Title | Date
Forecasted Issue Date | Unavailable
(86) PCT Filing Date | 2000-03-15
(87) PCT Publication Date | 2000-09-21
(85) National Entry | 2002-09-13
Dead Application | 2004-12-16

Abandonment History

Abandonment Date | Reason | Reinstatement Date
2003-12-16 | Failure to respond to office letter |
2004-03-15 | Failure to pay application maintenance fee |

Payment History

Fee Type | Anniversary | Due Date | Amount Paid | Date Paid
Reinstatement of rights | | | $200.00 | 2002-09-13
Filing of a patent application | | | $300.00 | 2002-09-13
Maintenance Fee - Application - New Act 2 | | 2002-03-15 | $100.00 | 2002-09-13
Maintenance Fee - Application - New Act 3 | | 2003-03-17 | $100.00 | 2002-10-22
Owners on Record

The current and past owners on record are displayed in alphabetical order.

Current Owners on Record
GALPERIN, YURI
FISHMAN, VLADIMIR

Past Owners on Record
N/A

Past owners that do not appear in the "Owners on Record" list will appear in other documentation within the application.
Documents



Document Description | Date (yyyy-mm-dd) | Number of pages | Size of Image (KB)
Abstract | 2002-09-13 | 2 | 65
Claims | 2002-09-13 | 4 | 85
Drawings | 2002-09-13 | 2 | 53
Representative Drawing | 2003-01-13 | 1 | 6
Cover Page | 2003-01-14 | 1 | 38
Description | 2002-09-13 | 8 | 345
PCT | 2002-09-13 | 10 | 331
Assignment | 2002-09-13 | 3 | 100
Correspondence | 2003-01-10 | 1 | 25