Patent 2356891 Summary

(12) Patent Application: (11) CA 2356891
(54) English Title: METHODS FOR ROBUST DISCRIMINATION OF PROFILES
(54) French Title: PROCEDES DE DISCRIMINATION ROBUSTE DE PROFILS
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/10 (2006.01)
  • G01N 33/50 (2006.01)
(72) Inventors :
  • FRIEND, STEPHEN H. (United States of America)
  • STOUGHTON, ROLAND (United States of America)
(73) Owners :
  • ROSETTA INPHARMATICS, INC.
(71) Applicants :
  • ROSETTA INPHARMATICS, INC. (United States of America)
(74) Agent: OSLER, HOSKIN & HARCOURT LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 1999-12-21
(87) Open to Public Inspection: 2000-07-06
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US1999/030577
(87) International Publication Number: US1999030577
(85) National Entry: 2001-06-22

(30) Application Priority Data:
Application No. Country/Territory Date
09/220,274 (United States of America) 1998-12-23
09/222,597 (United States of America) 1998-12-28

Abstracts

English Abstract


Methods for discriminating between the subtle effects of a first perturbation
and a second perturbation on a biological sample are provided. Further,
methods for identifying disease states in patients and methods for optimizing
drug therapy regimens in diseased subjects are provided. Finally, improved
methods for determining the subtle effects of pharmacological agents on a
biological system are provided.


French Abstract

La présente invention concerne des procédés de discrimination entre les effets subtiles d'une première perturbation et d'une seconde perturbation sur un échantillon biologique. En outre, l'invention se rapporte à des procédés destinés à identifier des états maladifs chez des patients, ainsi que des procédés d'optimisation de régimes de thérapie médicamenteuse chez des malades. Enfin, l'invention concerne des procédés améliorés destinés à déterminer les effets discrets d'agents pharmacologiques sur un système biologique.

Claims

Note: Claims are shown in the official language in which they were submitted.


WHAT IS CLAIMED IS:
1. A method for determining a degree of similarity between an effect of a
first perturbation
and an effect of a second perturbation on a biological sample, the method
comprising:
(a) determining a first set of constituent profiles, each constituent profile
of said
first set is determined using a different one of a plurality of initial states
of said
biological sample by measuring a response of said biological sample to said
first
perturbation when said biological sample is in said different one of said
initial
states;
(b) determining a second set of constituent profiles, each constituent profile
of
said second set determined using a different one of said plurality of initial
states
by measuring a response of said biological sample to said second perturbation
when said biological sample is in said different one of said initial states;
(c) combining said first set of constituent profiles into a first augmented
profile;
(d) combining said second set of constituent profiles into a second augmented
profile; and
(e) comparing said first augmented profile with said second augmented profile
to determine said degree of similarity.
2. A method for determining a degree of similarity between an effect of a
first perturbation
and an effect of a second perturbation on a biological sample, the method
comprising:
(a) combining a first set of constituent profiles into a first augmented
profile;
each constituent profile in said first set determined by:
a different one of a plurality of initial states of said biological sample
wherein a response of said biological sample to said first perturbation
when said biological sample is in said different one of said initial states
is measured;
(b) combining a second set of constituent profiles into a second augmented
profile; each constituent profile of said second set determined by:
a different one of said plurality of initial states of said biological sample;
wherein a response of said biological sample to said second perturbation
when said biological sample is in said different one of said initial states
is measured; and
(c) comparing said first augmented profile with said second augmented profile
to determine said degree of similarity.
3. A method for determining a degree of similarity between an effect of a
first perturbation
and an effect of a second perturbation on a biological sample by comparing a
first
augmented profile with a second augmented profile to determine said degree of
similarity; wherein:
(i) said first augmented profile is determined by combining a first set of
constituent profiles; each constituent profile of said first set determined
with a
different one of a plurality of initial states of said biological sample by
measuring a response of said biological sample to said first perturbation when
said biological sample is in said different one of said initial states; and
(ii) said second augmented profile is determined by combining a second set of
constituent profiles; each constituent profile of said second set is
determined
with said different one of said plurality of initial states of said biological
sample
by measuring a response of said biological sample to said second perturbation
when said biological sample is in said different one of said initial states.
4. The method of claim 1, 2 or 3 wherein each said initial state is different.
5. The method of claim 1, 2 or 3 wherein two or more of said initial states
are the same.
6. The method of claim 1, 2 or 3 wherein at least one constituent profile in
said first set
of constituent profiles is a first response profile and at least one
constituent profile in
said second set of constituent profiles is a second response profile, wherein
said first
response profile is determined by at least one measurement of at least one
cellular
constituent in said biological sample when said biological sample is in an
initial state
selected from said plurality of initial states, and said second response
profile is
determined by at least one measurement of at least one said cellular
constituent in said
biological sample when said biological sample is in said initial state.
7. The method of claim 6 wherein said first response profile and said second
response profile are determined by said initial state of said biological sample
at a time when said measurements are made.
8. The method of claim 1, 2 or 3 wherein at least one constituent profile in
said first set of
constituent profiles is a first projected profile and at least one constituent
profile in said
second set of constituent profiles is a second projected profile, wherein said
first and
said second projected profile each contain a plurality of cellular constituent
set values
derived according to a definition of co-varying cellular constituent sets.
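As an illustration of the projected profile described in claim 8, the following minimal sketch collapses a response profile onto predefined co-varying constituent sets. The claim does not state how a set value is derived; averaging the members of each set is an assumption made here, and the function name `project_profile`, the gene labels, and the set names are hypothetical.

```python
from typing import Dict, List


def project_profile(
    response: Dict[str, float],
    constituent_sets: Dict[str, List[str]],
) -> Dict[str, float]:
    """Reduce a per-constituent response profile to one value per co-varying set.

    Assumption: the set value is the arithmetic mean of its members' responses.
    """
    projected: Dict[str, float] = {}
    for set_name, members in constituent_sets.items():
        values = [response[member] for member in members]
        projected[set_name] = sum(values) / len(values)
    return projected


# Hypothetical response profile (e.g. log expression ratios) and set definition.
response = {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": -2.0}
sets = {"set_1": ["GENE_A", "GENE_B"], "set_2": ["GENE_C"]}
projected = project_profile(response, sets)
```

The projected profile has one element per set rather than one per constituent, so it can be concatenated and compared in the same way as an unprojected profile.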
9. The method of claim 8 wherein said first projected profile and said second
projected profile are determined by an initial state selected from said
plurality of initial states.
10. The method of claim 8 wherein said definition is based upon co-variation
of said
cellular constituents under a plurality of different perturbations.
11. The method of claim 8 wherein said definition of co-varying cellular
constituent sets is
defined by a similarity tree derived by a cluster analysis of said cellular
constituents
under said plurality of perturbations.
12. The method of claim 11 wherein said co-varying cellular constituent sets
are defined as
branches of said similarity tree.
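Claims 11 and 12 define the co-varying sets as branches of a similarity tree obtained by cluster analysis. The sketch below shows one way such branches might be obtained, under stated assumptions: single-linkage agglomerative clustering on a correlation distance, stopped when a chosen number of branches remains. The claims name neither a linkage rule nor a distance metric, and `cut_similarity_tree` and the sample data are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def correlation_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Return 1 - Pearson correlation; inputs must not be constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / math.sqrt(var_x * var_y)


def cut_similarity_tree(
    responses: Dict[str, Sequence[float]], n_sets: int
) -> List[List[str]]:
    """Merge the two closest clusters repeatedly until n_sets branches remain."""
    clusters = [[name] for name in sorted(responses)]
    while len(clusters) > n_sets:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    correlation_distance(responses[a], responses[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(cluster) for cluster in clusters]


# Hypothetical responses of four constituents across four perturbations.
responses = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.0, 2.5],
}
branches = cut_similarity_tree(responses, n_sets=2)
```

With this data the constituents that rise together (`g1`, `g2`) end up in one branch and those that fall together (`g3`, `g4`) in the other.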
13. The method of claim 1, 2, or 3 wherein said biological sample is an
organism having
a cell wall and at least one initial state selected from said plurality of
initial states is
determined by altering said biological sample in a manner that alters said
cell wall
permeability.
14. The method of claim 1, 2, or 3 wherein said biological sample is a cell
line.
15. The method of claim 14 wherein said biological sample is substantially
isogenic to
Saccharomyces cerevisiae.
16. The method of claim 14 wherein said cell line expresses a macromolecule
that has an
ability to act as a drug efflux pump and an initial state that is selected
from said plurality
of initial states is determined by a mutant activity of said macromolecule in
said cell
line.
17. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by a first set of culture growth conditions
and a second
initial state that is selected from said plurality of initial states is
determined by a second
set of culture growth conditions, wherein said first culture growth conditions
and said
second culture growth conditions vary by an amount of a component of said
culture
growth conditions.
18. The method of claim 17 wherein said component of said culture growth
conditions is
an amount of a nutrient that is necessary for viability of said cell line.
19. The method of claim 17 wherein said component of said culture growth
conditions is
an amount of a trace element.
20. The method of claim 17 wherein said component is selected from the group
consisting
of iron, manganese, zinc, copper, molybdenum, boron, chlorine, calcium,
sodium,
chromium, potassium, magnesium, and selenium.
21. The method of claim 17 wherein said component of said culture growth
conditions is
an incubation temperature.
22. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by a culture growth density of said cell line
and a second
initial state that is selected from said plurality of initial states is
determined by a second
culture growth density of said cell line, wherein said first culture growth
density and
said second culture growth density vary by an amount.
23. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by a first amount of a pharmacological agent
that is
contacted with said biological sample and a second initial state that is
selected from said
plurality of initial states is determined by a second amount of a
pharmacological agent
that is contacted with said biological sample.
24. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by incubating said cell line on a surface.
25. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by incubating said cell line in a liquid.
26. The method of claim 14 wherein said biological sample is incubated in a
container and
a first initial state that is selected from said plurality of initial states
is determined by the
container that said biological sample is incubated in and the container is
selected from
the group consisting of shaker flasks, culture plates, incubators, 96-well
microtiter
plates, and 384-well microtiter plates.
27. The method of claim 1, 2, or 3 wherein a first initial state that is
selected from said
plurality of initial states is determined by a genetic feature of said
biological sample.
28. The method of claim 27 wherein the biological sample is substantially
isogenic to
Saccharomyces cerevisiae having a genome; and a first initial state, which is
selected
from said plurality of initial states, is determined by a genetic feature
selected from the
group consisting of a haploid state of said genome, a diploid state of said
genome, a
heterozygous state of a gene included in said genome, a homozygous state of a
gene
included in said genome, a mutation of a gene included in said genome, a
deletion of a
portion of a gene from said genome, an alteration of a regulatory sequence of
a gene in
said genome, an exogenous gene integrated into said genome and an exogenous
oligonucleotide integrated into said genome.
29. The method of claim 27 wherein said biological sample is a cell line
having a genome;
wherein a first initial state is selected from said plurality of initial
states; wherein said
first initial state is determined by a genetic feature selected from the group
consisting
of a heterozygous state of a gene included in said genome, a homozygous state
of a gene
included in said genome, a mutation of a gene included in said genome, a
deletion of a
portion of a gene from said genome, an alteration of a regulatory sequence of
a gene in
said genome, an exogenous gene integrated into said genome of said cell line,
and an
exogenous oligonucleotide integrated into said genome.
30. The method of claim 29 wherein a second initial state that is selected
from said plurality
of initial states is determined by contacting said biological sample with an
amount of
a composition; wherein said composition comprises a pharmacological agent, an
endogenous hormone, a growth factor, a peptide, or an oligonucleotide.
31. The method of claim 14 wherein a first initial state that is selected from
said plurality
of initial states is determined by a state of a biological pathway; wherein
said biological
pathway is selected from a compendium of biological pathways present in said
cell line.
32. The method of claim 31 wherein said biological sample is substantially
isogenic to
Saccharomyces cerevisiae and said biological pathway is a mating pathway.
33. The method of claim 1, 2, or 3 wherein said first perturbation is a first
amount of a first
pharmacological agent that is contacted with said biological sample.
34. The method of claim 33 wherein said second perturbation is a second amount
of said
first pharmacological agent that is contacted with said biological sample,
wherein said
first and said second amount of said first pharmacological agent are
different.
35. The method of claim 33 wherein said second perturbation is an amount of a
second
pharmacological agent that is contacted with said biological sample.
36. The method of claim 1, 2, or 3 wherein said biological sample includes a
genome and said
first perturbation is determined by the introduction of an exogenous gene into
said
genome.
37. The method of claim 1, 2, or 3 wherein said biological sample includes a
genome and
said first perturbation includes a deletion of at least a substantial portion
of one gene in
said genome.
38. The method of claim 1, 2, or 3 wherein said first perturbation is a
method, the method
comprising: contacting said biological sample with an agent selected from the
group
consisting of a hormone, a drug, a peptide, an oligonucleotide, a mineral, a
composition
of media, a phage, a trace element, a salt, a colony stimulating factor, and a
source of
irradiation.
39. The method of claim 1, 2, or 3 wherein said first perturbation is a
method, the method
comprising: contacting an amount of an organic compound that has a molecular
weight
less than 1000 Daltons with said biological sample.
40. The method of claim 1, 2, or 3 wherein said first set of constituent
profiles is combined
into said first augmented profile by concatenating said first set of
constituent profiles
and said second set of constituent profiles is combined into said second
augmented
profile by concatenating said second set of constituent profiles.
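The concatenation step of claim 40 can be sketched as follows. This is a minimal illustration, not the applicants' implementation; the function name `build_augmented_profile`, the state labels, and the numeric values are hypothetical. The one design point it assumes is that both augmented profiles must be built with the same ordering of initial states so that corresponding elements line up.

```python
from typing import Dict, List, Sequence


def build_augmented_profile(
    constituent_profiles: Dict[str, Sequence[float]],
    state_order: Sequence[str],
) -> List[float]:
    """Concatenate per-state constituent profiles in a fixed state order."""
    augmented: List[float] = []
    for state in state_order:
        augmented.extend(constituent_profiles[state])
    return augmented


# Hypothetical responses of three cellular constituents to one perturbation,
# measured with the sample in two different initial states.
drug_a = {
    "wild_type": [0.1, -0.4, 0.0],
    "efflux_mutant": [0.9, -1.2, 0.3],
}
states = ["wild_type", "efflux_mutant"]
augmented_a = build_augmented_profile(drug_a, states)
```

Here `augmented_a` holds six elements: the three wild-type responses followed by the three mutant responses.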
41. The method of claim 1, 2, or 3 wherein said first augmented profile is:
$P_i = [P'_1; \ldots; P'_N]$
wherein,
$P_i$ is said first augmented profile;
$P'_1$ is a first constituent profile in said first set of constituent profiles
that is determined by measuring a response of said biological sample to said
first perturbation when said biological sample is in said first biological state;
$P'_N$ is an Nth constituent profile in said first set of constituent profiles
that is determined by measuring a response of said biological sample to said
first perturbation when said biological sample is in an Nth biological state
selected from said plurality of initial states; and
said second augmented profile is:
$P_j = [P''_1; \ldots; P''_N]$
wherein,
$P_j$ is said second augmented profile;
$P''_1$ is a first constituent profile in said second set of constituent profiles
that is determined by measuring a response of said biological sample to said
second perturbation when said biological sample is in said first biological
state;
$P''_N$ is an Nth constituent profile in said second set of constituent profiles
that is determined by measuring a response of said biological sample to said
second perturbation when said biological sample is in an Nth biological state
selected from said plurality of initial states;
and
N is the number of states in said plurality of initial states; and said step
of comparing said first augmented profile with said second augmented profile
to determine said correlation is performed by comparing $P_i$ to $P_j$ using a
quantitative measure of similarity.
42. The method of claim 41 wherein said quantitative measure of similarity is
a
generalized dot product:
$r_{ij} = P_i \cdot P_j / (|P_i| \, |P_j|)$
wherein $\cdot$ denotes dot product, $|\;|$ denotes vector norm and $r_{ij}$ denotes
similarity.
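The generalized dot product of claim 42 is the dot product of the two augmented profiles divided by the product of their vector norms. A minimal sketch follows, assuming the Euclidean norm and profiles of equal length; the function name `profile_similarity` is hypothetical.

```python
import math
from typing import Sequence


def profile_similarity(p_i: Sequence[float], p_j: Sequence[float]) -> float:
    """Return (p_i . p_j) / (|p_i| |p_j|) for two equal-length profiles."""
    if len(p_i) != len(p_j):
        raise ValueError("augmented profiles must have the same length")
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm_i = math.sqrt(sum(a * a for a in p_i))
    norm_j = math.sqrt(sum(b * b for b in p_j))
    if norm_i == 0.0 or norm_j == 0.0:
        raise ValueError("similarity is undefined for an all-zero profile")
    return dot / (norm_i * norm_j)
```

The value lies between -1 and 1: identical response patterns give 1, orthogonal patterns give 0, and exactly opposite patterns give -1, regardless of overall magnitude.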
43. The method of claim 41 wherein said quantitative measure of similarity is
derived
from Shannon mutual information theory.
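Claim 43 does not say which information-theoretic measure is used. As one possible reading, the sketch below computes the Shannon mutual information, in bits, between two profiles whose elements have already been reduced to discrete symbols (for example -1, 0 and 1). The function name `mutual_information` and the choice of empirical symbol frequencies as probabilities are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Mutual information in bits between two equal-length symbol sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("profiles must be non-empty and of equal length")
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        p_a, p_b = count_x[a] / n, count_y[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

Two identical profiles with equally frequent up and down calls share 1 bit; two profiles whose calls are statistically independent share 0 bits.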
44. The method of claim 1, 2 or 3 wherein each constituent profile includes a
plurality of
elements, each element representing an amount of a cellular constituent in
said
biological sample.
45. The method of claim 44 wherein each said element of at least one
constitutive profile
in said first set and each said element of at least one constitutive profile
in said
second set is assigned a
"-1", if said element exceeds a negative threshold,
"1", if said element exceeds a positive threshold, and
"0", if said element does not exceed said positive and said negative
threshold;
and
said positive threshold corresponds to a first amount of one or more cellular
constituents in said biological sample and said second threshold corresponds
to a
second amount of one or more cellular constituents in said biological sample.
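The three-level assignment of claim 45 can be sketched as below. One interpretive assumption is made: "exceeds a negative threshold" is read as falling below that threshold. The function name `discretize` and the threshold values are hypothetical.

```python
from typing import List, Sequence


def discretize(
    profile: Sequence[float],
    negative_threshold: float,
    positive_threshold: float,
) -> List[int]:
    """Map each element to -1, 0 or 1 according to the two thresholds."""
    result: List[int] = []
    for value in profile:
        if value > positive_threshold:
            result.append(1)
        elif value < negative_threshold:
            # Assumed reading: more negative than the negative threshold.
            result.append(-1)
        else:
            result.append(0)
    return result


# Hypothetical log-ratio responses with thresholds at -0.5 and +0.5.
calls = discretize([0.8, -0.9, 0.1, -0.2], -0.5, 0.5)
```

Discretizing in this way discards magnitude and keeps only the direction of change, which can make comparisons less sensitive to measurement noise near zero.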
46. The method of claim 44 wherein each said cellular constituent is
independently
selected from the group consisting of a gene expression level, an amount of an
mRNA encoding a gene, an amount of a protein, an amount of an enzymatic
activity,
an amount of an epitope presented by a macromolecule, an amount of a divalent
cation, an amount of a phosphorylated protein, an amount of a dephosphorylated
protein, an amount of a hormone, and an amount of a peptide.
47. The method of claim 1, 2, or 3 wherein each said initial state of said
biological
sample is provided by selecting said biological sample at a different time.
48. The method of claim 1, 2, or 3 wherein said second set of constituent
profiles
represents a baseline state of said biological sample.
49. The method of claim 1, 2, or 3 wherein said second perturbation is wild-type
activity and
said second set of constituent profiles represents a wild-type state of said
biological
sample.
50. A method of determining an effect of a first perturbation on a subject,
the method
comprising:
(a) determining a plurality of augmented profiles; each augmented profile
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is determined by obtaining a biological sample from said subject
at a different time; and
each constituent profile in said constituent profile set is determined by
measuring a biological response of said biological sample to a
different second perturbation selected from a plurality of
perturbations;
and
(b) comparing said plurality of augmented profiles to determine said effect of
said first perturbation on said subject.
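The comparison step of claim 50 leaves the comparison method open. The sketch below illustrates one possibility under stated assumptions: each sampling time yields one augmented profile, and every later profile is compared with the earliest one using a normalized dot product. The names `cosine` and `similarity_to_first` and the day-numbered data are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine(p: Sequence[float], q: Sequence[float]) -> float:
    """Normalized dot product of two equal-length, non-zero profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


def similarity_to_first(
    augmented_by_time: Dict[int, Sequence[float]]
) -> Dict[int, float]:
    """Compare each later augmented profile with the earliest one."""
    times = sorted(augmented_by_time)
    reference = augmented_by_time[times[0]]
    return {t: cosine(reference, augmented_by_time[t]) for t in times[1:]}


# Hypothetical augmented profiles from samples taken on days 0, 7 and 14.
drift = similarity_to_first({0: [1.0, 0.0], 7: [1.0, 0.0], 14: [0.0, 1.0]})
```

A similarity that falls over time suggests that the subject's response pattern is moving away from its starting state.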
51. The method of claim 50, wherein said first perturbation is selected from
the group
consisting of a diseased state, introduction of an exogenous gene into the
genome of
said subject, and a behavioral health risk.
52. A method of determining an effect of a first perturbation on a subject,
the method
comprising:
(a) determining a plurality of augmented profiles; each augmented profile
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is determined by obtaining a biological sample from said subject
at a different stage of an environmental insult; and
each constituent profile in said constituent profile set is determined by
measuring a biological response of said biological sample to a
different second perturbation selected from a plurality of
perturbations;
and
(b) comparing said plurality of augmented profiles to determine said effect of
said first perturbation on said subject.
53. The method of claim 52, wherein the environmental insult is a disease that
has
afflicted said subject.
54. The method of claim 50 or 52 wherein a first constituent profile set in
said plurality
of constituent profiles sets represents a baseline state and all other
constituent profile
sets in said plurality of constituent profile sets are expressed as a ratio of
said first
constituent profile set.
55. The method of claim 50 or 52 wherein a first constituent profile set in
said plurality
of constituent profiles sets represents a baseline state and all other
constituent profile
sets in said plurality of constituent profile sets are expressed as a
logarithmic ratio of
said first constituent profile set.
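Expressing a profile as a logarithmic ratio of a baseline, as in claim 55, can be sketched as follows. The claim does not fix the base of the logarithm; base 2 is an assumption here, as is the requirement that all measured amounts be positive. The function name `log_ratio_profile` is hypothetical.

```python
import math
from typing import List, Sequence


def log_ratio_profile(
    profile: Sequence[float], baseline: Sequence[float]
) -> List[float]:
    """Element-wise log2(profile / baseline); all values must be positive."""
    if len(profile) != len(baseline):
        raise ValueError("profile and baseline must have the same length")
    return [math.log2(value / base) for value, base in zip(profile, baseline)]


# Hypothetical abundances of three constituents, relative to a baseline sample.
ratios = log_ratio_profile([4.0, 1.0, 3.0], [2.0, 2.0, 3.0])
```

On this scale a doubling is 1, a halving is -1, and no change is 0, so increases and decreases of the same fold size are symmetric about zero. Omitting the logarithm gives the plain ratio of claim 54.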
56. The method of claim 50 or 52 wherein said first perturbation is a drug
that is taken
by said subject at regular intervals.
57. A method of determining a biological state of a first subject, the method
comprising:
(a) determining a first set of constituent profiles, each constituent profile
of
said first set being determined by measuring a response of a biological
sample derived from said first subject to a perturbation at a different time;
(b) determining a second set of constituent profiles, each constituent profile
of said second set being determined by measuring a response of a second
biological sample, which is derived from a second subject having a known
biological state, to said perturbation at a different time;
(c) combining said first set of constituent profiles into a first augmented
profile;
(d) combining said second set of constituent profiles into a second augmented
profile; and
(e) comparing said first augmented profile with said second augmented
profile to predict the biological state of said first subject.
58. A method of diagnosing a disease state in a subject, the method
comprising:
(a) determining a first set of constituent profiles, each constituent profile
of
said first set being determined by measuring a response of a biological
sample obtained from said subject to a different perturbation selected from a
plurality of perturbations;
(b) combining said first set of constituent profiles into a first augmented
profile; and
(c) comparing said first augmented profile with a library of augmented
profiles, wherein each augmented profile in said library of augmented
profiles is derived from a different biological sample with a known biological
state, to diagnose said disease state.
59. The method of claim 58 wherein the comparing step includes the step of
clustering said
library into groups based upon similarities to said first augmented profile.
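The library comparison of claims 58 and 59 might be sketched as follows. This is a simplified stand-in, not a full clustering algorithm: library profiles are ranked by normalized dot product against the subject's augmented profile and then split into two groups at a similarity cutoff. The names `rank_library` and `group_by_similarity`, the cutoff, and the library contents are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(p: Sequence[float], q: Sequence[float]) -> float:
    """Normalized dot product of two equal-length, non-zero profiles."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q)))


def rank_library(
    query: Sequence[float], library: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (label, similarity) pairs ordered from most to least similar."""
    scored = [(label, cosine(query, profile)) for label, profile in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def group_by_similarity(
    ranked: List[Tuple[str, float]], cutoff: float
) -> Tuple[List[str], List[str]]:
    """Split library labels into those at or above the cutoff and the rest."""
    similar = [label for label, score in ranked if score >= cutoff]
    dissimilar = [label for label, score in ranked if score < cutoff]
    return similar, dissimilar


# Hypothetical library of augmented profiles with known biological states.
library = {"healthy": [1.0, 0.0, 0.0, 1.0], "disease_x": [0.0, 1.0, 1.0, 0.0]}
ranked = rank_library([0.9, 0.1, 0.0, 1.0], library)
similar, dissimilar = group_by_similarity(ranked, cutoff=0.5)
```

The known state of the best-matching library entry then serves as the candidate diagnosis for the subject.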
60. A method of drug discovery, the method comprising the steps of:
(a) determining a plurality of augmented profiles; each augmented profile
being
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is
determined by use of a test compound; and
each constituent profile in said constituent profile set is determined by
contacting said test compound with a cell line that is in a different
biological
state selected from a plurality of biological states;
and
(b) comparing said plurality of augmented profiles to determine the effect of
said test
compound on said cell line.
61. A method of determining a biological state of a first subject, the method
comprising
comparing a first augmented profile with a second augmented profile to predict
said
biological state of said first subject wherein said first and said second
augmented
profile are derived by
(a) determining a first set of constituent profiles, each constituent profile
of said first
set is determined by measuring a response of a biological sample derived from
said
first subject to a perturbation at a different time;
(b) determining a second set of constituent profiles, each constituent profile
of said
second set is determined by measuring a response of a second biological
sample,
which is derived from a second subject having a known biological state, to
said
perturbation at a different time;
(c) combining said first set of constituent profiles into a first augmented
profile; and
(d) combining said second set of constituent profiles into a second augmented
profile.
62. A method of diagnosing a disease state in a subject, the method comprising
comparing a first augmented profile with a library of augmented profiles,
wherein
each augmented profile in said library of augmented profiles is derived from a
different biological sample with a known biological state and said first
augmented profile is derived by
(a) determining a first set of constituent profiles, each constituent profile
of said first
set of constituent profiles is determined by measuring a response of a
biological
sample obtained from said subject to a different perturbation selected from a
plurality of perturbations; and
(b) combining said first set of constituent profiles to derive said first
augmented
profile.
63. A computer system for determining a degree of similarity between an effect
of a first
perturbation and an effect of a second perturbation on a biological system,
the
computer system comprising a processor, and a memory encoding one or more
programs coupled to the processor, wherein the one or more programs cause the
processor to perform a method comprising:
(a) combining a first set of constituent profiles into a first augmented
profile;
each constituent profile in said first set determined by:
a different one of a plurality of initial states of said biological system
wherein a response of said biological system to said first perturbation
when said biological system is in said different one of said initial
states is measured;
(b) combining a second set of constituent profiles into a second augmented
profile; each constituent profile of said second set determined by:
a different one of said plurality of initial states of said biological
sample; wherein a response of said biological sample to said second
perturbation when said biological sample is in said different one of
said initial states is measured; and
(c) comparing said first augmented profile with said second augmented
profile to determine said degree of similarity.
64. A computer system for determining a degree of similarity between an effect
of a first
perturbation and an effect of a second perturbation on a biological sample,
the
computer system comprising a processor, and a memory encoding one or more
programs coupled to the processor, wherein the one or more programs cause the
processor to perform a method that comprises comparing a first augmented
profile
with a second augmented profile to determine said degree of similarity;
wherein:
(i) said first augmented profile is determined by combining a first set of
constituent profiles; each constituent profile of said first set determined
with
a different one of a plurality of initial states of said biological sample by
measuring a response of said biological sample to said first perturbation
when said biological sample is in said different one of said initial states;
and
(ii) said second augmented profile is determined by combining a second set
of constituent profiles; each constituent profile of said second set is
determined with said different one of said plurality of initial states of said
biological sample by measuring a response of said biological sample to said
second perturbation when said biological sample is in said different one of
said initial states.
65. A computer system for determining an effect of a first perturbation on a
subject, the
computer system comprising a processor, and a memory encoding one or more
programs coupled to the processor, wherein the one or more programs cause the
processor to perform a method comprising:
(a) determining a plurality of augmented profiles; each augmented profile
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is determined by obtaining a biological sample from said subject
at a different time; and
each constituent profile in said constituent profile set is determined by
measuring a biological response of said biological sample to a
different second perturbation selected from a plurality of
perturbations;
and
(b) comparing said plurality of augmented profiles to determine said effect of
said first perturbation on said subject.
66. A computer system for determining an effect of a first perturbation on a
subject, the
computer system comprising a processor, and a memory encoding one or more
programs coupled to the processor, wherein the one or more programs cause the
processor to perform a method comprising:
(a) determining a plurality of augmented profiles; each augmented profile
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is determined by obtaining a biological sample from said subject
at a different stage of an environmental insult; and
each constituent profile in said constituent profile set is determined by
measuring a biological response of said biological sample to a
different second perturbation selected from a plurality of
perturbations;
and
(b) comparing said plurality of augmented profiles to determine said effect of
said first perturbation on said subject.
67. A computer system for determining a biological state of a first subject,
the computer
system comprising a processor, and a memory encoding one or more programs
coupled to the processor, wherein the one or more programs cause the processor
to
perform a method comprising:
(a) determining a first set of constituent profiles, each constituent profile
of
said first set being determined by measuring a response of a biological
sample derived from said first subject to a perturbation at a different time;
(b) determining a second set of constituent profiles, each constituent profile
of said second set being determined by measuring a response of a second
biological sample, which is derived from a second subject having a known
biological state, to said perturbation at a different time;
(c) combining said first set of constituent profiles into a first augmented
profile;
(d) combining said second set of constituent profiles into a second augmented
profile; and
(e) comparing said first augmented profile with said second augmented
profile to predict the biological state of said first subject.
68. A computer system for diagnosing a disease state in a subject, the
computer system
comprising a processor, and a memory encoding one or more programs coupled to
the processor, wherein the one or more programs cause the processor to perform
a
method comprising:
(a) determining a first set of constituent profiles, each constituent profile
of
said first set being determined by measuring a response of a biological
sample obtained from said subject to a different perturbation selected from a
plurality of perturbations;
(b) combining said first set of constituent profiles into a first augmented
profile; and
(c) comparing said first augmented profile with a library of augmented
profiles, wherein each augmented profile in said library of augmented
profiles is derived from a different biological sample with a known biological
state, to diagnose said disease state.
69. A computer system for advancing drug discovery, the computer system
comprising a
processor, and a memory encoding one or more programs coupled to the
processor,
wherein the one or more programs cause the processor to perform a method
comprising the steps of:
(a) determining a plurality of augmented profiles; each augmented profile
being
determined by combining a constituent profile set selected from a plurality of
constituent profile sets wherein:
each said constituent profile set in said plurality of constituent profile
sets is
determined by use of a test compound; and
each constituent profile in said constituent profile set is determined by
contacting said test compound with a cell line that is in a different
biological
state selected from a plurality of biological states;
and
(b) comparing said plurality of augmented profiles to determine the effect of
said test compound on said cell line.
70. A computer system for determining a biological state of a first subject,
the computer
system comprising a processor, and a memory encoding one or more programs
coupled to the processor, wherein the one or more programs cause the processor
to
perform a method comprising comparing a first augmented profile with a second
augmented profile to predict said biological state of said first subject
wherein said
first and said second augmented profile are derived by:
(a) determining a first set of constituent profiles, each constituent profile
of said first
set is determined by measuring a response of a biological sample derived from
said
first subject to a perturbation at a different time;
(b) determining a second set of constituent profiles, each constituent profile
of said
second set is determined by measuring a response of a second biological
sample,
which is derived from a second subject having a known biological state, to
said
perturbation at a different time;
(c) combining said first set of constituent profiles into a first augmented
profile; and
(d) combining said second set of constituent profiles into a second augmented
profile.
71. A computer system for diagnosing a disease state in a subject, the
computer system
comprising a processor, and a memory encoding one or more programs coupled to
the processor, wherein the one or more programs cause the processor to perform
a
method comprising comparing a first augmented profile with a library of
augmented
profiles, wherein each augmented profile in said library of augmented profiles
is
derived from a different biological sample with a known biological state and
said
first augmented profile is derived by:
(a) determining a first set of constituent profiles, each constituent profile
of said first
set of constituent profiles is determined by measuring a response of a
biological
sample obtained from said subject to a different perturbation selected from a
plurality of perturbations; and
(b) combining said first set of constituent profiles to derive said first
augmented
profile.

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 02356891 2001-06-22
WO 00/39337 PCT/US99/30577
Methods for Robust Discrimination of Profiles
This is a continuation-in-part of copending application serial number
09/220,274, by Stoughton et al., filed December 23, 1998, entitled "Methods for
Robust Discrimination of Profiles", which is incorporated by reference herein
in its entirety.
1 FIELD OF THE INVENTION
The field of this invention relates to methods for discriminating between the
subtle effects of a first perturbation and a second perturbation on a
biological sample. The invention also relates to improved methods for
identifying disease states in patients. In addition, the invention provides
improved methods for optimizing drug therapy regimens in diseased subjects. The
invention also generally relates to improved methods for determining the subtle
effects of pharmacological agents on a biological system.
2 BACKGROUND OF THE INVENTION
2.1 Profiles of Cellular Constituents
"Cellular constituents" include gene expression levels, abundance of mRNA
encoding specific genes, and protein expression levels in a biological sample.
Levels of various constituents of a cell, such as mRNA encoding genes and/or
protein expression levels, are known to change in response to drug treatments
and other perturbations of the cell's biological state. Measurements of a
plurality of such "cellular constituents" therefore contain a wealth of
information about the effect of perturbations on the cell's biological state.
The collection of such measurements is generally referred to as the "profile"
of the cell's biological state.
There may be on the order of 100,000 different cellular constituents for
mammalian cells. Consequently, the profile of a particular cell is typically
complex. The profile of any given state of a biological sample is often
measured after the sample has been subjected to a perturbation. Such
perturbations include, for example, exposure of the sample to a drug candidate,
the introduction of an exogenous gene, the deletion of a gene from the sample,
or changes in culture conditions. Comprehensive measurements of cellular
constituents, or profiles of gene and protein expression and their response to
perturbations in the cell, therefore have a wide range of utility including the
ability to compare and understand the effects of drugs, diagnose disease, and
optimize patient drug regimens. In addition, they have further application in
basic life science research.

Within the past decade, several technological advances have made it possible to
accurately measure cellular constituents and therefore derive profiles. For
example, new techniques provide the ability to monitor the expression level of
a large number of transcripts at any one time (see, e.g., Schena et al., 1995,
Quantitative monitoring of gene expression patterns with a complementary DNA
micro-array, Science 270:467-470; Lockhart et al., 1996, Expression monitoring
by hybridization to high-density oligonucleotide arrays, Nature Biotechnology
14:1675-1680; Blanchard et al., 1996, Sequence to array: Probing the genome's
secrets, Nature Biotechnology 14, 1649; U.S. Patent 5,569,588, issued October
29, 1996 to Ashby et al. entitled "Methods for Drug Screening"). In organisms
for which the complete genome is known, it is possible to analyze the
transcripts of all genes within the cell. With other organisms, such as humans,
for which there is an increasing knowledge of the genome, it is possible to
simultaneously monitor large numbers of the genes within the cell.
On another front, the direct measurement of protein abundance has been improved
by the use of microcolumn reversed-phase liquid chromatography electrospray
ionization tandem mass spectrometry (LC/MS/MS) to directly identify proteins
contained in mixtures. This technology promises to push the dynamic range for
which protein abundance can be measured in a biological sample. Using LC/MS/MS,
McCormack et al. have demonstrated that proteins presented in sample mixtures
can be readily identified with a 30-fold difference in molar quantity, that the
identifications are reproducible, and that proteins within the mixture can be
identified at low femtomole levels. McCormack et al., 1997, Direct analysis and
identification of proteins in mixtures by LC/MS/MS and database searching at
the low-femtomole level, Anal. Chem. 69:767-776. In a review of tandem mass
spectrometry, Chait points out that an additional advantage of this technology
is that it is orders of magnitude faster than more conventional approaches such
as Edman sequencing. Chait, 1996, Trawling for proteins in the post-genome era,
Nat. Biotech. 14:1544.
Other technological advances have provided for the ability to specifically
perturb biological samples with individual genetic mutations. For example,
Mortensen et al. describe a method for producing embryonic stem (ES) cell lines
whereby both alleles are inactivated by homologous recombination. Using the
methods of Mortensen et al., it is possible to obtain homozygous mutationally
altered cells, i.e., double knockouts of ES cell lines. Mortensen et al.
propose that their method may be generally applicable to other genes and to
cell lines other than ES cells. Mortensen et al., 1992, Production of
homozygous mutant ES cells with a single targeting construct, Mol. Cell. Biol.
12:2391-2395. In another promising technology, Wach et al. provide a dominant
resistance module for selection of S. cerevisiae transformants which entirely
consists of heterologous DNA. The module can also be used to provide PCR-based
gene disruptions. Wach et al., 1994,
New heterologous modules for classical or PCR-based gene disruptions in
Saccharomyces cerevisiae, Yeast 10:1793-1808.
Technological advances, such as the use of DNA microarrays, are already being
used in drug discovery (see, e.g., Morton et al., 1998, Drug target validation
and identification of secondary drug target effects using DNA microarrays,
Nature Medicine in press; Gray et al., 1998, Exploiting chemical libraries,
structure, and genomics in the search for kinase inhibitors, Science
281:533-538).
2.2 Profile comparisons
Comparison of profiles with other profiles in a database (see, e.g., U.S.
Patent 5,777,888, issued July 7, 1998 to Rine et al. entitled "Systems for
generating and analyzing stimulus-response output signal matrices") or
clustering of profiles by similarity can give clues to the molecular targets of
drugs and related functions, efficacy and toxicity of drug candidates and/or
pharmacological agents. Such comparisons may also be used to derive consensus
profiles representative of ideal drug activities or disease states. Profile
comparison can also help detect diseases in a patient at an early stage and
provide improved clinical outcome projections for a patient diagnosed with a
disease.
At the center of all these profile comparison efforts is the need for robust
discrimination of subtle differences in activity of the experimental conditions
("perturbations") that are often associated with the different profiles. To
date such robust discrimination has not been achieved. In a typical
perturbation experiment, the responses of several thousand cellular
constituents are measured, yet only a small number of constituents change
significantly. Frequently none of the cellular constituents change at all.
Consequently, there is frequently not enough information available in a
conventional profile to provide an accurate assessment of the subtle effects of
a perturbation. Figure 1 illustrates this art-recognized problem. In Figure 1,
the results of 365 mRNA transcription profiling experiments are shown. The 365
experiments include experiments with/without drugs at different concentrations,
with/without specific genes in the yeast strain, combinations of drug treatment
and gene deletion, changes in culture density, growth temperature, medium
composition, and stimulations with endogenous hormones like mating factor.
Although several thousand cellular constituents are being profiled in each
experiment depicted in Figure 1, typically only a small number of constituents
change significantly, and often none at all. As a consequence, a profile
derived from any of the 365 experiments in Figure 1 would not provide enough
information to determine the subtle effects of a particular perturbation.
Consequently, profile comparisons using conventional profiles suffer from a
failure to provide sufficient information to discern the subtle effects of a
perturbation on a biological system.
According to the above background, there is a great demand in the art for
robust
profile comparison methods.
Discussion or citation of a reference herein shall not be construed as an
admission that such reference is prior art to the present invention.
3 SUMMARY OF THE INVENTION
This invention provides robust profile comparison methods. These methods are
used to determine a degree of similarity between an effect of a first
perturbation and a second perturbation on a biological system. The methods of
this invention have extensive applications in the areas of preventive health
care, drug discovery, drug candidate lead selection, drug candidate validation,
drug regimen optimization in a variety of patient populations, development of
clinical trial protocols to satisfy United States Food and Drug Administration
(FDA) requirements including those for investigative new drugs, satisfaction of
related clinical trial protocol requirements in administrative agencies that
are equivalent to the FDA in countries other than the United States, drug
and/or drug candidate efficacy, drug and/or drug candidate toxicity, diagnostic
applications such as disease monitoring in a variety of patient populations,
and for the prediction of the clinical outcome of a patient.
One aspect of the invention includes a method comprising the steps of (a)
determining a first set of constituent profiles, wherein each constituent
profile in the set is determined using a different one of a plurality of
initial states of a biological sample by measuring a response of the biological
sample to the first perturbation when the biological sample is in the selected
initial state; (b) determining a second set of constituent profiles, each
constituent profile of the second set determined using a different one of a
plurality of initial states of the biological sample by measuring a response of
the biological sample to a second perturbation when the biological sample is in
the selected initial state; (c) combining the first set of constituent profiles
into a first augmented profile; (d) combining the second set of constituent
profiles into a second augmented profile; and (e) comparing the first augmented
profile with the second augmented profile to determine the degree of similarity
between the first perturbation and the second perturbation.
In accord with a second aspect of the invention at least one constituent
profile in the first set of constituent profiles is a first response profile
and at least one constituent profile in the second set of constituent profiles
is a second response profile. The first response profile is determined by at
least one measurement of at least one cellular constituent in the biological
sample when the biological sample is in an initial state selected from a
plurality of initial states, and the second response profile is determined by
at least one measurement of at least one cellular constituent in said
biological sample when said biological sample is in the selected initial state.
In accord with another aspect of the invention at least one constituent profile
in the first set of constituent profiles is a first projected profile and at
least one constituent profile in the second set of constituent profiles is a
second projected profile. In this aspect of the invention, the first and second
projected profiles each contain a plurality of cellular constituent set values
derived according to a definition of co-varying cellular constituent sets. The
first and second projected profiles could be determined by an initial state
selected from said plurality of initial states of the biological sample. An
augmented profile could include any combination of projected profiles and
response profiles.
In accord with another aspect of the invention the biological sample is a cell
line. The cell line could be that of a unicellular organism and at least one
initial state included in a plurality of initial states could be determined by
altering the biological sample in a manner that alters cell wall permeability.
In another aspect the biological sample is substantially isogenic to
Saccharomyces cerevisiae.
In another aspect of the invention, the biological sample is a cell line that
expresses a macromolecule that serves as a drug efflux pump. In this
embodiment, some of the initial biological states are generated by selecting
isogenic cell lines that do not possess macromolecules that have an ability to
act as a drug efflux pump.
In another embodiment, the biological sample is a cell line and the first
initial state that is selected from a plurality of initial states is determined
by a first set of culture growth conditions and a second initial state that is
selected from a plurality of initial states is determined by a second set of
culture growth conditions. In this embodiment, the first culture growth
conditions and the second culture growth conditions vary by a variable such as
an amount of a nutrient that is necessary for viability of said cell line, an
amount of a trace element, an amount of a mineral, a culture temperature,
and/or the nature of the container the sample is cultured in. Examples of
containers include but are not limited to shaker flasks, culture plates and
incubators.
In another aspect of the invention, the biological sample is a cell line and a
first initial state that is selected from a plurality of initial states is
determined by a culture growth density of the cell line and a second initial
state that is selected from a plurality of initial states is determined by a
second culture growth density of the cell line, wherein the two culture growth
densities vary by an amount.
In another aspect of the invention, the biological sample is a cell line and a
first initial state that is selected from a plurality of initial states is
determined by a first amount of a pharmacological agent that is contacted with
the biological sample and a second initial state that is selected from said
plurality of initial states is determined by a second amount of a
pharmacological agent that is contacted with the biological sample.
In accord with another aspect of the invention, a first initial state is
determined by a genetic feature of the biological sample. In this aspect of the
invention, the biological
sample could be Saccharomyces cerevisiae having a genome and the first initial
state that is selected from a plurality of initial states is determined by a
genetic feature selected from the group consisting of a haploid state of the
genome, a diploid state of the genome, a heterozygous state of a gene included
in the genome, a homozygous state of a gene included in the genome, a mutation
of a gene included in the genome, a deletion of a portion of a gene from the
genome, an alteration of a regulatory sequence of a gene in the genome, an
exogenous gene integrated into the genome and an exogenous oligonucleotide
integrated into the genome.
In accordance with another aspect of the invention, the biological sample could
be a cell line having a genome wherein the first initial state that is selected
from a plurality of initial states is determined by a genetic feature selected
from the group consisting of a heterozygous state of a gene included in the
genome, a homozygous state of a gene included in the genome, a mutation of a
gene included in the genome, a deletion of a portion of a gene from the genome,
an alteration of a regulatory sequence of a gene in the genome, an exogenous
gene integrated into the genome, and an exogenous oligonucleotide integrated
into the genome.
In another aspect of the invention, the biological sample is a cell line and
the first initial state that is selected from a plurality of initial states is
determined by a state of a biological pathway that is selected from a
compendium of biological pathways present in the cell line. In one aspect of
the invention, the biological sample is substantially isogenic with
Saccharomyces cerevisiae and the biological pathway is a mating pathway.
In yet another aspect of the invention, the first perturbation is a first
amount of a first pharmacological agent that is contacted with the biological
sample. In another aspect, the second perturbation is a second amount of the
first pharmacological agent that is contacted with the biological sample, and
the first and second amounts of pharmacological agent vary. In another aspect,
the second perturbation is a second amount of a second pharmacological agent
that is contacted with said biological sample.
In accordance with another aspect of the invention, the biological sample
includes a genome and the first perturbation is determined by the introduction
of an exogenous gene into the genome, and/or deletion of at least one gene in
the genome.
In accordance with another aspect of the invention, the first perturbation is a
method, the method comprising: contacting said biological sample with a
hormone, a drug, a peptide, an oligonucleotide, a mineral, a composition of
media, a phage, a trace element, a salt, a colony stimulating factor or a
source of irradiation. In another aspect, the first perturbation is a method,
the method comprising: contacting an amount of an organic compound that has a
molecular weight less than 1000 Daltons with said biological sample.
In accordance with another aspect of the invention, the first augmented
profile is
expressed as:
$P^{i} = [P^{i}_{1}; \ldots; P^{i}_{N}]$
wherein,
$P^{i}$ is a first augmented profile;
$P^{i}_{1}$ is a first constituent profile in a first set of constituent
profiles that is determined by measuring a response of a biological sample to a
first perturbation when a biological sample is in a first biological state
selected from a plurality of initial states;
$P^{i}_{N}$ is an $N$th constituent profile in the first set of constituent
profiles that is determined by measuring a response of the biological sample to
the first perturbation when the biological sample is in an $N$th biological
state selected from the plurality of initial states; and
the second augmented profile is:
$P^{j} = [P^{j}_{1}; \ldots; P^{j}_{N}]$
wherein,
$P^{j}$ is a second augmented profile;
$P^{j}_{1}$ is a first constituent profile in a second set of constituent
profiles that is determined by measuring a response of the biological sample to
the second perturbation when the biological sample is in said first biological
state;
$P^{j}_{N}$ is an $N$th constituent profile in the second set of constituent
profiles that is determined by measuring a response of the biological sample to
the second perturbation when the biological sample is in an $N$th biological
state selected from the plurality of initial states; and
$N$ is the number of states in the plurality of initial states.
In this embodiment the step of comparing the first augmented profile with the
second augmented profile to determine the correlation is performed by comparing
$P^{i}$ to $P^{j}$ using a quantitative measure of similarity. In one aspect
this quantitative measure of similarity is a generalized dot product:
$r_{ij} = P^{i} \cdot P^{j} / (|P^{i}|\,|P^{j}|)$
wherein $\cdot$ denotes dot product, $|\ |$ denotes vector norm and $r_{ij}$
denotes similarity. In another aspect of the invention, the quantitative
measure of similarity is derived from Shannon mutual information theory.
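As an illustration only, the combining and comparing steps can be sketched in
Python. The sketch assumes that "combining" means simple concatenation of the
per-state constituent profiles and that the similarity is the dot product of
the two augmented profiles divided by the product of their vector norms. The
function names `augment` and `similarity`, and the numeric profiles, are
hypothetical and are not taken from the specification.

```python
import math


def augment(constituent_profiles):
    """Concatenate per-state constituent profiles into one augmented profile."""
    return [value for profile in constituent_profiles for value in profile]


def similarity(p_i, p_j):
    """Dot product of two profiles divided by the product of their norms."""
    if len(p_i) != len(p_j):
        raise ValueError("profiles must have the same length")
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm = math.sqrt(sum(a * a for a in p_i)) * math.sqrt(sum(b * b for b in p_j))
    # Assumption: an all-zero profile carries no information, so report 0.
    return dot / norm if norm else 0.0


# Hypothetical responses of four constituents to two drugs, each measured
# in two initial states (one inner list per state).
drug_a = [[0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, -1.0]]
drug_b = [[0.0, 1.0, 0.0, 0.0], [-1.0, 0.0, 0.0, 1.0]]

r_state1 = similarity(drug_a[0], drug_b[0])
r_augmented = similarity(augment(drug_a), augment(drug_b))
print(r_state1, r_augmented)
```

In this made-up data the two drugs look identical in the first state alone,
while the augmented profiles separate them, which is the kind of
discrimination the augmented comparison is meant to provide.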
In another aspect of the invention, each constituent profile includes a
plurality of
elements that each represent an amount of a cellular constituent in a
biological sample.
Accordingly, the cellular constituents are independently selected from the
group consisting
of a gene expression level, an amount of an mRNA encoding a gene, an amount of
a
protein, an amount of an enzymatic activity, an amount of an epitope presented
by a
macromolecule, an amount of a divalent cation, an amount of a phosphorylated
protein, an
amount of a dephosphorylated protein, an amount of a hormone, and an amount of
a
peptide.
Another aspect of the invention is a method of determining an effect of a first
perturbation on a subject, the method comprising: (a) determining a plurality
of augmented profiles; each augmented profile determined by combining a
constituent profile set selected from a plurality of constituent profile sets
wherein:
each constituent profile set in the plurality of constituent profile sets is
determined by obtaining a biological sample from the subject at a different
time; and
each constituent profile in the constituent profile set is determined by
measuring a biological response of the biological sample to a different second
perturbation selected from a plurality of perturbations;
and
(b) comparing the plurality of augmented profiles to determine the effect of
the first perturbation on the subject. The first perturbation may be selected
from the group consisting of a diseased state, introduction of an exogenous
gene into the genome of the subject, and a behavioral health risk. Optionally,
the first constituent profile set in the plurality of constituent profile sets
represents a baseline state and all other constituent profile sets in the
plurality of constituent profile sets are expressed as a ratio or logarithmic
ratio of the first constituent profile set. Optionally, the first perturbation
is a drug that is taken by the subject of interest at regular intervals.
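The optional baseline normalization can be sketched as follows. This is a
minimal illustration, assuming each constituent profile set is a list of
profiles (one per second perturbation) holding positive abundance values, and
that the logarithm is base 10. The function name `log_ratio_to_baseline`, the
`floor` guard against zero values, and the sample numbers are hypothetical.

```python
import math


def log_ratio_to_baseline(profile_sets, floor=1e-9):
    """Express every later profile set as log10 ratios to the first set.

    profile_sets[t][k][c] is the abundance of constituent c measured at
    time t in response to the k-th second perturbation.
    """
    baseline = profile_sets[0]
    normalized = []
    for profile_set in profile_sets[1:]:
        normalized.append([
            [
                # The floor is an assumption that avoids log of zero.
                math.log10(max(value, floor) / max(base, floor))
                for value, base in zip(profile, base_profile)
            ]
            for profile, base_profile in zip(profile_set, baseline)
        ])
    return normalized


# Hypothetical abundances: two time points, two perturbations, two constituents.
baseline_set = [[10.0, 100.0], [20.0, 50.0]]
later_set = [[100.0, 100.0], [20.0, 5.0]]

ratios = log_ratio_to_baseline([baseline_set, later_set])
print(ratios)
```

Each normalized profile set can then be concatenated into an augmented profile
for that time point and compared with the others.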
4 BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 represents the results of 365 mRNA transcription profiling experiments.
Methods were as described for a subset of these experiments in Section 6,
supra. Each of the 365 rows in this image has, when printed at full resolution,
6000 gray-scale pixels representing the ratio in mRNA expression of the 6000
yeast genes between the pair of cell conditions in that experiment pair. Black
denotes upregulation of a gene's transcription, white denotes downregulation,
and the middle gray denotes very little or no change. The gray-scale bar at the
bottom of Figure 1 indicates a scale from log10(ratio) = -1 (ten fold
downregulation) to log10(ratio) = +1 (ten fold upregulation) for reference. The
365 condition pairs include comparisons of with/without drugs at different
concentrations, with/without specific genes in the yeast strain, combinations
of drug treatment and gene deletion, changes in culture density, growth
temperature, medium composition, and stimulations with endogenous hormones like
mating factor.
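The gray-scale encoding described for Fig. 1 can be sketched as a simple
mapping. The sketch assumes a linear scale, clipping outside the range
log10(ratio) = -1 to +1, and 8-bit pixel levels (0 is black, 255 is white);
none of these details is stated in the text, and the function name
`log_ratio_to_gray` is hypothetical.

```python
def log_ratio_to_gray(log_ratio, lo=-1.0, hi=1.0):
    """Map a log10 expression ratio to an 8-bit gray level.

    Upregulation (log_ratio >= hi) maps to black (0), downregulation
    (log_ratio <= lo) maps to white (255), and no change maps to mid gray.
    """
    clipped = max(lo, min(hi, log_ratio))
    fraction = (hi - clipped) / (hi - lo)  # 0.0 at hi, 1.0 at lo
    return round(255 * fraction)


# One hypothetical row of log10 ratios for five genes.
row = [1.0, 0.0, -1.0, 2.5, -0.5]
pixels = [log_ratio_to_gray(value) for value in row]
print(pixels)
```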
Fig. 2 represents profiles to drugs in multiple conditions. Although the
response to
response to
the drugs under starting State 1 may be small or nonexistent, the concatenated
response
profiles obtained in different states may provide robust discrimination of the
activities of the
different compounds. ↑ denotes upregulation. ↓ denotes downregulation. Absence
of an arrow denotes no change for that cellular constituent.
Fig. 3A illustrates a profile for the immunosuppressant drug cyclosporin A.
Fig. 3B illustrates a profile for the immunosuppressant drug FK506. In both
figures, the horizontal axis is the intensity of the individual hybridized
spots on the microarray, representing individual mRNA species abundance in the
two cultures. The vertical axis is the log10 of the ratio of the intensity
measured for one fluorescent label (Culture 1) to that measured for the other
label (Culture 2). Error bars and names are displayed only for those genes
which had up or down regulations due to the drug that were significant at the
95% confidence level or better.
Fig. 4 shows the high correlation (similarity) between the effects of
cyclosporin A and FK506 on S. cerevisiae that had been cultured in the presence
of 1 µg/ml of FK506 and 30 µg/ml of cyclosporin respectively.
Fig. 5A illustrates a response profile for the gene deletion strain FPR
cultured in the presence of 1 µg/ml of FK506.
Fig. 5B illustrates a response profile for the gene deletion strain CPH1
cultured in the presence of 1 µg/ml of FK506.
Fig. 5C illustrates a response profile for the gene deletion strain FPR
cultured in the presence of 50 µg/ml of cyclosporin.
Fig. 5D illustrates a response profile for the gene deletion strain CPH1
cultured in the presence of 50 µg/ml of cyclosporin.
Fig. 6 illustrates the reduced correlation between the effects of cyclosporin
and FK506 in yeast when augmented profiles are used.
Fig. 7 illustrates a computer system useful for embodiments of the invention.
5 DETAILED DESCRIPTION OF THE INVENTION
A basis for the present invention is the unexpected discovery that augmented
profiles provide a method for robustly discriminating between the subtle
effects of a first
perturbation and a second perturbation on a biological sample. Augmented
profiles are
derived by the combination of a plurality of response profiles and/or
projection profiles that
are in turn based upon the measurement of cellular constituents within a
biological sample
biological sample
as the biological sample is placed in a series of different starting states.
This section
presents a detailed description of the invention and its applications.
5.1 INTRODUCTION
To appreciate the methods of the present invention, an understanding of some
preliminary concepts such as biological state, response profiles, and
projection profiles is
necessary. After these concepts are understood, one skilled in the art will
understand the
concept of an augmented profile. Further, the improvements that augmented
profiles
provide in the field of profile comparison will be appreciated after the
details of the present
invention are described and an example is presented.
5.1.1 GENERAL DEFINITIONS
Biological sample and/or Biological system: As used herein, a biological sample
and/or biological system includes a cell line, a culture of a cell line, a
tissue sample obtained from a subject, a Homo sapiens, a mammal, a yeast
substantially isogenic to Saccharomyces cerevisiae, or any other art-recognized
biological system.
Perturbation: As used herein, a perturbation includes the exposure of a
biological sample to a drug candidate or pharmacologic agent, the introduction
of an exogenous gene into a biological sample, the deletion of a gene from the
biological sample, changes in the culture conditions of the biological sample,
or any other art-recognized method of perturbing a biological sample.
Constituent Profile: A constituent profile is a profile used in the formation
of an
augmented profile. The constituent profile may, for example, be a response
profile or a
projected profile, which are described infra.
Behavioral Health Risk: As used herein, a behavioral health risk includes, but
is not
limited to, consumption of alcohol and cigarette smoking.
5.1.2 BIOLOGICAL SAMPLE
As used in herein, the term "biological sample" is broadly defined to include
any
cell, tissue, organ or multiceIlula.r organism. A biological sample can be
derived, for
example, from cell or tissue cultures in vitro. Alternatively, a biological
sample can be
derived from a living organism or from a population of single cell organisms.
The state of a
biological sample can be measured by the content, activities or structures of
its cellular
constituents. The state of a biological sample, as used herein, is determined
by the state of a
collection of cellular constituents, which are sufficient to
characterize the cell or organism
for an intended purpose including characterizing the effects of a drug or
other perturbation.
The term "cellular constituent" is broadly defined herein to encompass any
kind of
measurable biological variable. The measurements and/or observations made on
the state of
these constituents can be of their abundances (i.e., amounts or concentrations
in a biological
sample), their activities, their states of modification (e.g.,
phosphorylation), or other art
recognized measurements relevant to the physiological state of a biological
sample. In
various embodiments, this invention includes making such measurements and/or
observations on different collections of cellular constituents. These
different collections of
cellular constituents are also called aspects of the biological state of a
biological sample.
One aspect of the biological state of a biological sample (e.g., a cell or
cell culture)
usefully measured in the present invention is its transcriptional state. The
transcriptional
state of a biological sample includes the identities and abundances of the
constituent RNA
species, especially mRNAs, in the cell under a given set of conditions.
Often, a substantial
fraction of all constituent RNA species in the biological sample are measured,
but at least a
sufficient fraction is measured to characterize the action of a drug or other
perturbation of
interest. The transcriptional state of a biological sample can be conveniently
determined by
measuring cDNA abundances by any of several existing gene expression
technologies.
DNA arrays for measuring mRNA or transcript level of a large number of genes
can be
employed to ascertain the biological state of a sample.
Another aspect of the biological state of a biological sample usefully
measured is its
translational state. The translational state of a biological sample includes
the identities and
abundances of the constituent protein species in the biological sample under a
given set of
conditions. Preferably a substantial fraction of all constituent protein
species in the
biological sample is measured, but at least a sufficient fraction is measured
to characterize
the action of a drug of interest. The transcriptional state is often
representative of the
translational state.
Other aspects of the biological state of a biological sample are also of use
in this
invention. For example, the activity state of a biological sample includes the
activities of
the constituent protein species (and also optionally catalytically active
nucleic acid species)
in the biological sample under a given set of conditions. As is known to
those of skill in the
art, the translational state is often representative of the activity state.
This invention is also adaptable, where relevant, to "mixed" aspects of the
biological
state of a biological sample in which measurements of different aspects of the
biological
state of a biological sample are combined. For example, in one mixed aspect,
the
abundances of certain RNA species and of certain protein species, are combined
with
measurements of the activities of certain other protein species. Further, it
will be
appreciated from the following that this invention is also adaptable to any
other aspect of a
biological state of a biological sample that is measurable.
The biological state of a biological sample (e.g., a cell or cell culture) can
be
represented by a profile of some number of cellular constituents. Such a
profile of cellular
constituents can be represented by the vector S:
$$S = [S_1, \ldots, S_i, \ldots, S_k] \tag{1}$$
where $S_i$ is the level of the i'th cellular constituent, for example, the
transcript level of
gene i, or alternatively, the abundance or activity level of protein i.
In some embodiments, cellular constituents are measured as continuous
variables.
For example, transcriptional rates are typically measured as number of
molecules
synthesized per unit of time. Transcriptional rate may also be measured as
percentage of a
control rate. However, in some other embodiments, cellular constituents may be
measured
as categorical variables. For example, transcriptional rates may be measured
as either "on"
or "off", where the value "on" indicates a transcriptional rate above a
predetermined
threshold and the value "off" indicates a transcriptional rate below that
threshold.
5.1.3 RESPONSE PROFILES
The responses of a biological sample to a perturbation, such as a pharmacological
agent, can be measured by observing the changes in the biological state of the
biological
sample. A response profile is a collection of changes of cellular
constituents. The response
profile of a biological sample (e.g., a cell or cell culture) to the
perturbation m may be
defined as the vector $v^{(m)}$:
$$v^{(m)} = [v_1^{(m)}, \ldots, v_i^{(m)}, \ldots, v_k^{(m)}] \tag{2}$$
where $v_i^{(m)}$ is the amplitude of response of cellular constituent i under the
perturbation m. In some embodiments of response profiles, biological response to
the
the
application of a pharmacological agent is measured by the induced change in
the transcript
level of at least 2 genes, preferably more than 10 genes, more preferably more
than 100
genes and most preferably more than 1,000 genes.
In some embodiments, biological response profiles comprise simply the
difference
between biological variables before and after perturbation. In some preferred
embodiments,
the biological response is defined as the ratio of cellular constituents
before and after a
perturbation is applied.
In some preferred embodiments, $v_i^{(m)}$ is set to zero if the response of gene i is
below
some threshold amplitude or confidence level determined from knowledge of the
measurement error behavior. In such embodiments, those cellular constituents
whose
measured responses are lower than the threshold are given the response value
of zero,
whereas those cellular constituents whose measured responses are greater than
the threshold
retain their measured response values. This truncation of the response vector
is suitable
when most of the smaller responses are expected to be greatly dominated by
measurement
error. After the truncation, the response vector $v^{(m)}$ also approximates a
'matched detector'
(see, e.g., Van Trees, 1968, Detection, Estimation, and Modulation Theory, Vol.
I, Wiley &
Sons) for the existence of similar perturbations. It is apparent to those
skilled in the art that
the truncation levels can be set based upon the purpose of detection and the
measurement
errors. For example, in some embodiments, genes whose transcript level
changes are lower
than two-fold or more preferably four-fold are given the value of zero.
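The truncation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `response_profile`, the use of a log10 ratio as the response amplitude, and the sample values are all hypothetical and are not taken from the specification.

```python
import math

def response_profile(before, after, fold_threshold=2.0):
    """Return a truncated response vector (one entry per cellular constituent).

    Assumption: the response amplitude is log10(after / before), and responses
    smaller than the fold-change threshold are set to zero.
    """
    cutoff = math.log10(fold_threshold)
    profile = []
    for b, a in zip(before, after):
        v = math.log10(a / b)
        # Keep responses at or above the threshold, zero out the rest.
        profile.append(v if abs(v) >= cutoff else 0.0)
    return profile

# Hypothetical transcript levels before and after a perturbation.
baseline = [100.0, 50.0, 10.0, 80.0]
treated = [110.0, 200.0, 2.0, 80.0]
v = response_profile(baseline, treated)
```

Here only the second and third constituents change by two-fold or more, so only those entries of `v` remain non-zero.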
In some preferred embodiments of response profiles, perturbations are applied
at
several levels of strength. For example, different amounts of a drug may be
applied to a
biological sample to observe its response. In such embodiments, the
perturbation responses
may be interpolated by approximating each by a single parameterized "model"
function of
the perturbation strength u. An exemplary model function appropriate for
approximating
transcriptional state data is the Hill function, which has adjustable
parameters a, uo, and n.
ar(ulu~" (3)
H(~ --
1 v (u/u~n
The adjustable parameters are selected independently for each cellular
constituent of the
perturbation response. Preferably, the adjustable parameters are selected for
each cellular
constituent so that the sum of the squares of the differences between the
model function
(e.g., the Hill function, Equation 3) and the corresponding experimental data
at each
perturbation strength is minimized. This preferable parameter adjustment
method is known
in the art as a least squares fit. Other possible model functions are based on
polynomial
fitting. More detailed description of model fitting and biological response
has been
disclosed in Friend and Stoughton, Methods of Determining Protein Activity
Levels Using
Gene Expression Profiles, U.S. Provisional Application Serial No. 60/084,742,
filed on
May 8, 1998, which is incorporated herein by reference in its entirety for
all purposes.
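As a rough illustration of the least squares fit described above, the following Python sketch fits the Hill function to one constituent's dose-response data. The grid search over $u_0$ and n is an assumption made for brevity (any least squares optimizer could be substituted); for fixed $u_0$ and n the amplitude a is linear in the model and is solved in closed form. The names `hill` and `fit_hill` and the sample doses are hypothetical.

```python
def hill(u, a, u0, n):
    """Hill function H(u) = a (u/u0)^n / (1 + (u/u0)^n)."""
    x = (u / u0) ** n
    return a * x / (1.0 + x)

def fit_hill(doses, responses, u0_grid, n_grid):
    """Grid-search least squares fit; returns (sse, a, u0, n)."""
    best = None
    for u0 in u0_grid:
        for n in n_grid:
            # Shape of the curve for unit amplitude.
            g = [hill(u, 1.0, u0, n) for u in doses]
            gg = sum(x * x for x in g)
            if gg == 0.0:
                continue
            # Closed-form least squares amplitude for this (u0, n).
            a = sum(y * x for y, x in zip(responses, g)) / gg
            sse = sum((y - a * x) ** 2 for y, x in zip(responses, g))
            if best is None or sse < best[0]:
                best = (sse, a, u0, n)
    return best

# Synthetic, noise-free data generated from a = 2, u0 = 1, n = 2.
doses = [0.1, 0.3, 1.0, 3.0, 10.0]
responses = [hill(u, 2.0, 1.0, 2.0) for u in doses]
u0_grid = [0.25 * k for k in range(1, 13)]   # 0.25 ... 3.0
n_grid = [0.5 * k for k in range(1, 9)]      # 0.5 ... 4.0
sse, a, u0, n = fit_hill(doses, responses, u0_grid, n_grid)
```

In practice the fit would be repeated independently for each cellular constituent, as the text states.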
5.1.4 PROJECTED PROFILES
The methods of the invention are useful for comparing augmented profiles that
contain any number of response profiles and/or projected profiles. Projected
profiles are best
understood after a discussion of genesets, which are co-regulated genes.
Projected profiles
are useful for analyzing many types of cellular constituents including
genesets.
5.1.4.1 CO-REGULATED GENES AND GENESETS
Certain genes tend to increase or decrease their expression in groups. Genes
tend to
increase or decrease their rates of transcription together when they possess
similar
regulatory sequence patterns, i.e., transcription factor binding sites. This
is the mechanism
for coordinated response to particular signaling inputs (see, e.g., Madhani
and Fink, 1998,
The riddle of MAP kinase signaling specificity, Transactions in Genetics
14:151-155;
Arnone and Davidson, 1997, The hardwiring of development: organization and
function of
genomic regulatory systems, Development 124:1851-1864). Separate genes which
make
different components of a necessary protein or cellular structure will tend to
co-vary.
Duplicated genes (see, e.g., Wagner, 1996, Genetic redundancy caused by gene
duplications
and its evolution in networks of transcriptional regulators, Biol. Cybern.
74:557-567) will
also tend to co-vary to the extent mutations have not led to functional
divergence in the
regulatory regions. Further, because regulatory sequences are modular (see,
e.g., Yuh et
al., 1998, Genomic cis-regulatory logic: experimental and computational
analysis of a sea
urchin gene, Science 279:1896-1902), the more modules two genes have in
common, the
greater the variety of conditions under which they are expected to co-vary
their
transcriptional rates. Separation between modules also is an important
determinant since
co-activators also are involved. In summary therefore, for any finite
set of conditions, it is
expected that genes will not all vary independently, and that there are
simplifying subsets of
genes and proteins that will co-vary. These co-varying sets of genes form a
complete basis
in the mathematical sense with which to describe all the profile changes
within that finite
set of conditions.
5.1.4.2 GENESET CLASSIFICATION BY CLUSTER ANALYSIS
For many applications, it is desirable to find basis genesets that are
co-regulated
over a wide variety of conditions. A preferred embodiment for identifying such
basis
genesets involves clustering algorithms (for reviews of clustering algorithms,
see, e.g.,
Fukunaga, 1990, Statistical Pattern Recognition, 2nd Ed., Academic Press, San
Diego;
Everitt, 1974, Cluster Analysis, London: Heinemann Educ. Books; Hartigan,
1975,
Clustering Algorithms, New York: Wiley; Sneath and Sokal, 1973, Numerical
Taxonomy,
Freeman; Anderberg, 1973, Cluster Analysis for Applications, Academic Press:
New York).
In some embodiments employing cluster analysis, the expression of a large
number
of genes is monitored as biological samples are subjected to a wide
variety of perturbations.
A table of data containing the gene expression measurements is used for
cluster analysis. In
order to obtain basis genesets that contain genes which co-vary over a wide
variety of
conditions, multiple perturbations or conditions are employed. Cluster analysis
operates on
a table of data which has the dimension m x k wherein m is the total number of
conditions or
perturbations and k is the number of genes measured.
A number of clustering algorithms are useful for clustering analysis.
Clustering
algorithms use dissimilarities or distances between objects when forming
clusters. In some
embodiments, the distance used is Euclidean distance in multidimensional
space:
$$I(x,y) = \left( \sum_i (X_i - Y_i)^2 \right)^{1/2} \tag{4}$$
where I(x,y) is the distance between gene X and gene Y; $X_i$ and $Y_i$ are gene
expression
responses under perturbation i. The Euclidean distance may be squared to place
progressively greater weight on objects that are further apart. Alternatively,
the distance
measure may be the Manhattan distance e.g., between gene X and Y, which is
provided by:
$$I(x,y) = \sum_i |X_i - Y_i| \tag{5}$$
Again, $X_i$ and $Y_i$ are gene expression responses under perturbation i. Some
other definitions
other definitions
of distances are Chebychev distance, power distance, and percent disagreement.
Percent
disagreement, defined as I(x,y) = (number of $X_i \neq Y_i$)/i, is particularly useful
for the method
of this invention, if the data for the dimensions are categorical in nature.
Another useful
distance definition, which is particularly useful in the context of
cellular response, is
I = 1 - r, where r is the correlation coefficient between the response vectors
X, Y, also called
the normalized dot product $X \cdot Y / (|X||Y|)$.
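The distance definitions above can be written out as a short Python sketch. The function names are hypothetical, and the correlation-based distance uses the normalized dot product form given in the text; the sketch assumes that percent disagreement is divided by the number of perturbations compared.

```python
import math

def euclidean(x, y):
    """Euclidean distance (Equation 4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Manhattan (city-block) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

def percent_disagreement(x, y):
    """Fraction of perturbations under which two categorical responses differ."""
    return sum(1 for a, b in zip(x, y) if a != b) / len(x)

def correlation_distance(x, y):
    """I = 1 - r, with r taken as the normalized dot product X.Y / (|X||Y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm

# Hypothetical categorical responses of two genes under four perturbations.
gene_x = ["on", "off", "on", "on"]
gene_y = ["on", "on", "on", "off"]
disagreement = percent_disagreement(gene_x, gene_y)
```

Note that the correlation-based distance is insensitive to the overall amplitude of the two response vectors, which is one reason it suits comparisons of cellular responses.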
Various cluster linkage rules are useful for defining genesets. Single
linkage, a
nearest neighbor method, determines the distance between the two closest
objects. By
contrast, complete linkage methods determine distance by the greatest distance
between any
two objects in the different clusters. This method is particularly useful in
cases when genes
or other cellular constituents form naturally distinct "clumps."
Alternatively, the
unweighted pair-group average defines distance as the average distance
between all pairs of
objects in two different clusters. This method is also very useful for
clustering genes or
other cellular constituents to form naturally distinct "clumps." Finally, the
weighted pair-
group average method may also be used. This method is the same as the
unweighted pair-
group average method except that the size of the respective clusters is used
as a weight.
This method is particularly useful for embodiments where the cluster size is
suspected to be
greatly varied (Sneath and Sokal, 1973, Numerical Taxonomy, San Francisco: W.
H.
Freeman & Co.). Other cluster linkage rules, such as the unweighted and
weighted pair-
group centroid and Ward's method are also useful for some embodiments of the
invention.
See, e.g., Ward, 1963, J. Am. Stat. Assn. 58:236; Hartigan, 1975, Clustering
Algorithms,
New York: Wiley.
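The single, complete, and unweighted pair-group average linkage rules described above differ only in how the pairwise distances between two clusters are summarized. A minimal Python sketch follows; the function names and the one-dimensional sample points are hypothetical, and any of the distance measures discussed earlier could be passed as `dist`.

```python
def pair_distances(cluster_a, cluster_b, dist):
    """All pairwise distances between members of two clusters."""
    return [dist(x, y) for x in cluster_a for y in cluster_b]

def single_linkage(cluster_a, cluster_b, dist):
    """Distance between the two closest objects (nearest neighbor)."""
    return min(pair_distances(cluster_a, cluster_b, dist))

def complete_linkage(cluster_a, cluster_b, dist):
    """Greatest distance between any two objects in the different clusters."""
    return max(pair_distances(cluster_a, cluster_b, dist))

def average_linkage(cluster_a, cluster_b, dist):
    """Unweighted pair-group average: mean distance over all pairs."""
    d = pair_distances(cluster_a, cluster_b, dist)
    return sum(d) / len(d)

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

# Two small hypothetical clusters of one-dimensional response vectors.
cluster_a = [[0.0], [1.0]]
cluster_b = [[3.0], [5.0]]
d_single = single_linkage(cluster_a, cluster_b, manhattan)
d_complete = complete_linkage(cluster_a, cluster_b, manhattan)
d_average = average_linkage(cluster_a, cluster_b, manhattan)
```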
As the diversity of perturbations in the clustering set becomes very large,
the
genesets which are clearly distinguishable get smaller and more numerous.
However, even
over very large experiment sets, there are small genesets that retain their
coherence. These
genesets are termed irreducible genesets. Typically, a large number of diverse
perturbations are applied to obtain such irreducible genesets.
Often, the clustering of genesets is represented graphically and is termed a
'tree'.
Genesets may be defined based on the many smaller branches of a tree, or a
small number
of larger branches by cutting across the tree at different levels. The choice
of cut level may
be made to match the number of distinct response pathways expected. If little
or no prior
information is available about the number of pathways, then the tree should be
divided into
as many branches as are truly distinct. 'Truly distinct' may be defined by a
minimum
distance value between the individual branches. Typical values are in the
range 0.2 to 0.4
where 0 is perfect correlation and 1 is zero correlation, but may be larger
for poorer quality
data or fewer experiments in the training set, or smaller in the case of
better data and more
experiments in the training set.
Preferably, 'truly distinct' may be defined with an objective test of
statistical
significance for each bifurcation in the tree. In one aspect of the invention,
the Monte
Carlo randomization of the experiment index for each cellular constituent's
responses
across the set of experiments is used to define an objective test.
In some embodiments, the objective test is defined in the following manner:
Let $p_{ki}$ be the response of constituent k in experiment i. Let $\Pi(i)$ be a
random
permutation of the experiment index. Then for each of a large (about 100 to
1000) number
of different random permutations, construct $p_{k\Pi(i)}$. For each branching in the
original tree,
original tree,
for each permutation:
(1) perform hierarchical clustering with the same algorithm ('hclust' in this
case)
used on the original unpermuted data;
(2) compute fractional improvement f in the total scatter with respect to
cluster
centers in going from one cluster to two clusters
$$f = 1 - \sum_k D_k^{(2)} \Big/ \sum_k D_k^{(1)} \tag{6}$$
where $D_k$ is the square of the distance measure for constituent k with respect
to the center
to the center
(mean) of its assigned cluster. Superscript 1 or 2 indicates whether it is
with respect to the
center of the entire branch or with respect to the center of the appropriate
cluster out of the
two subclusters. There is considerable freedom in the definition of the
distance function D
used in the clustering procedure. In these examples, D = 1 - r, where r is
the correlation
coefficient between the responses of one constituent across the experiment set
vs. the
responses of the other (or vs. the mean cluster response).
The distribution of fractional improvements obtained from the Monte Carlo
procedure is an estimate of the distribution under the null hypothesis that a
given branching
was not significant. The actual fractional improvement for that branching with
the
unpermuted data is then compared to the cumulative probability distribution
from the null
hypothesis to assign significance. Standard deviations are derived by fitting
a log normal
model for the null hypothesis distribution. Using this procedure, a standard
deviation
greater than about 2, for example, indicates that the branching is significant
at the 95%
confidence level. Genesets defined by cluster analysis typically have
underlying biological
significance.
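The Monte Carlo test described above can be sketched in Python as follows. This is a simplified illustration, not the procedure of the specification: a crude seed-based two-way split (`split_in_two`, hypothetical) stands in for the hierarchical clustering algorithm, r is taken as the normalized dot product, significance is reported as the empirical fraction of permutations rather than through a log normal fit, and the sample data are invented.

```python
import math
import random

def corr(x, y):
    """Normalized dot product, used here as the correlation r (assumption)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return dot / norm if norm > 0 else 0.0

def scatter(rows):
    """Sum over constituents of D_k = (1 - r)^2 relative to the cluster mean."""
    center = [sum(col) / len(rows) for col in zip(*rows)]
    return sum((1.0 - corr(row, center)) ** 2 for row in rows)

def split_in_two(rows):
    """Hypothetical stand-in for hierarchical clustering: seed two clusters
    with the most dissimilar pair and assign each row to the closer seed."""
    pairs = [(i, j) for i in range(len(rows)) for j in range(i + 1, len(rows))]
    s0, s1 = max(pairs, key=lambda ij: 1.0 - corr(rows[ij[0]], rows[ij[1]]))
    return [0 if corr(r, rows[s0]) >= corr(r, rows[s1]) else 1 for r in rows]

def fractional_improvement(rows):
    """f = 1 - (scatter with two clusters) / (scatter with one cluster)."""
    labels = split_in_two(rows)
    d1 = scatter(rows)
    d2 = sum(scatter([r for r, l in zip(rows, labels) if l == g]) for g in (0, 1))
    return 1.0 - d2 / d1 if d1 > 0 else 0.0

def branching_p_value(rows, n_perm=200, seed=0):
    """Empirical probability that a permuted data set splits as well as the original."""
    rng = random.Random(seed)
    f_obs = fractional_improvement(rows)
    exceed = 0
    for _ in range(n_perm):
        # Permute the experiment index independently for each constituent.
        permuted = [rng.sample(row, len(row)) for row in rows]
        if fractional_improvement(permuted) >= f_obs:
            exceed += 1
    return f_obs, exceed / n_perm

# Hypothetical responses: six constituents (rows) across eight experiments.
rows = [
    [1, 2, 3, 4, 4, 3, 2, 1], [1, 2, 3, 5, 4, 3, 2, 1], [2, 2, 3, 4, 4, 3, 2, 1],
    [4, 1, 4, 1, 4, 1, 4, 1], [5, 1, 4, 1, 4, 1, 4, 1], [4, 1, 4, 2, 4, 1, 4, 1],
]
f_obs, p_value = branching_p_value(rows)
```

A small `p_value` suggests that the observed branching is unlikely to arise from constituents that respond independently of one another.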
Another aspect of the cluster analysis method provides the definition of basis
vectors for use in profile projection described in the following sections.
A set of basis vectors V has k x n dimensions, where k is the number of genes
and n
is the number of genesets.
$$V = \begin{bmatrix} V_1^{(1)} & \cdots & V_1^{(n)} \\ \vdots & & \vdots \\ V_k^{(1)} & \cdots & V_k^{(n)} \end{bmatrix} \tag{7}$$
$V_k^{(n)}$ is the amplitude contribution of gene index k in basis vector n. In some
embodiments,
$V_k^{(n)} = 1$ if gene k is a member of geneset n, and $V_k^{(n)} = 0$ if gene k is not
a member of
geneset n. In some embodiments, $V_k^{(n)}$ is proportional to the response of gene
k in geneset n
over the training data set used to define the genesets.
In some preferred embodiments, the elements $V_k^{(n)}$ are normalized so
that each basis vector $V^{(n)}$
has unit length by dividing by the square root of the number of genes in
geneset n. This
produces basis vectors which are not only orthogonal (the genesets derived
from cutting the
clustering tree are disjoint), but also orthonormal (unit length). With this
choice of
normalization, random measurement errors in profiles project onto the $V^{(n)}$ in
such a way
that the amplitudes tend to be comparable for each n. Normalization prevents
large
genesets from dominating the results of similarity calculations.
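The membership-based, normalized basis vectors described above can be illustrated with a short Python sketch. The function name `basis_vectors` and the example genesets are hypothetical; each basis vector is stored as a list over gene indices.

```python
import math

def basis_vectors(genesets, n_genes):
    """One basis vector per geneset: 1/sqrt(size) for member genes, 0 otherwise."""
    basis = []
    for members in genesets:
        scale = 1.0 / math.sqrt(len(members))
        basis.append([scale if g in members else 0.0 for g in range(n_genes)])
    return basis

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Three hypothetical disjoint genesets over seven genes (indices 0..6).
genesets = [{0, 1, 2, 3}, {4, 5}, {6}]
basis = basis_vectors(genesets, 7)
```

Because the genesets are disjoint, distinct basis vectors have zero dot product, and the scaling by the square root of the geneset size gives each one unit length.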
5.1.4.3 GENESET CLASSIFICATION BASED UPON
MECHANISMS OF REGULATION
Genesets can also be defined based upon the mechanism of the regulation of
genes.
Genes whose regulatory regions have the same transcription factor binding
sites are more
likely to be co-regulated. In some preferred embodiments, the regulatory
regions of the
genes of interest are compared using multiple alignment analysis to decipher
possible
shared transcription factor binding sites (Stormo and Hartzell, 1989,
Identifying protein
binding sites from unaligned DNA fragments, Proc Natl Acad Sci 86:1183-1187;
Hertz and
Stormo, 1995, Identification of consensus patterns in unaligned DNA and
protein
sequences: a large-deviation statistical basis for penalizing gaps, Proc of
3rd Intl Conf on
Bioinformatics and Genome Research, Lim and Cantor, eds., World Scientific
Publishing
Co., Ltd. Singapore, pp. 201-216). For example, as Example 3, infra, shows,
common
promoter sequence responsive to Gcn4 in 20 genes may be responsible for those
20 genes
being co-regulated over a wide variety of perturbations.
The co-regulation of genes is not limited to those with binding sites for the
same
transcriptional factor. Co-regulated (co-varying) genes may be in the
up-stream/down-stream
relationship where the products of up-stream genes regulate the
activity of down-stream
genes. It is well known to those of skill in the art that there are
numerous varieties of
gene regulation networks. One of skill in the art also understands that the
methods of this
invention are not limited to any particular kind of gene regulation mechanism.
If it can be
derived from the mechanism of regulation that two genes are co-regulated in
terms of their
activity change in response to perturbation, the two genes may be clustered
into a geneset.
Because of lack of complete understanding of the regulation of genes of
interest, it is
often preferred to combine cluster analysis with regulatory mechanism
knowledge to derive
better defined genesets. In some embodiments, K-means clustering may be used
to cluster
genesets when the regulation of genes of interest is partially known. K-means
clustering is
particularly useful in cases where the number of genesets is predetermined by
the
understanding of the regulatory mechanism. In general, K-means clustering is
constrained to
produce exactly the number of clusters desired. Therefore, if promoter
sequence
comparison indicates the measured genes should fall into three genesets,
K-means
clustering may be used to generate exactly three genesets with greatest
possible distinction
between clusters.
5.1.4.4 REPRESENTING PROJECTED PROFILES
The expression value of genes can be converted into the expression value for
genesets. This process is referred to as projection. In some embodiments, the
projection is
as follows:
$$P = [P_1, \ldots, P_i, \ldots, P_n] = p \cdot V \tag{8}$$
wherein p is the expression profile, P is the projected profile, $P_i$ is the
expression value for
geneset i and V is a predefined set of basis vectors. The basis vectors have
been previously
defined in Equation 7 (Section 5.1.4.2, supra) as:
$$V = \begin{bmatrix} V_1^{(1)} & \cdots & V_1^{(n)} \\ \vdots & & \vdots \\ V_k^{(1)} & \cdots & V_k^{(n)} \end{bmatrix} \tag{9}$$
wherein $V_k^{(n)}$ is the amplitude of cellular constituent index k of basis
vector n.
In one preferred embodiment, the value of geneset expression is simply the
average
of the expression value of the genes within the geneset. In some other
embodiments, the
average is weighted so that highly expressed genes do not dominate the geneset
value. The
collection of the expression values of the genesets is the projected profile.
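A minimal Python sketch of projection follows, covering both the matrix form of Equation 8 and the simple averaging embodiment. The function names, the example genesets, and the expression values are hypothetical; V is stored as k rows (genes) by n columns (genesets) and is assumed to use the normalized membership form.

```python
import math

def project(p, V):
    """Projected profile P = p . V, with V given as k rows by n columns."""
    n = len(V[0])
    return [sum(p[k] * V[k][j] for k in range(len(p))) for j in range(n)]

def geneset_means(p, genesets):
    """Alternative embodiment: geneset value is the mean over member genes."""
    return [sum(p[g] for g in members) / len(members) for members in genesets]

# Two hypothetical genesets over five genes.
genesets = [[0, 1], [2, 3, 4]]
k = 5
V = [[(1.0 / math.sqrt(len(m)) if g in m else 0.0) for m in genesets]
     for g in range(k)]

p = [1.0, 3.0, 2.0, 4.0, 6.0]        # expression profile over genes
P = project(p, V)                     # projected profile over genesets
means = geneset_means(p, genesets)    # averaged geneset values
```

With this choice of V, each projected value equals the geneset mean multiplied by the square root of the geneset size, so the two embodiments differ only by a per-geneset scale factor.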
5.2 PROFILE COMPARISON AND CLASSIFICATION
Once the basis genesets are chosen, projected profiles $P_i$ may be obtained for
any set
of profiles indexed by i. Similarities between the $P_i$ may be more clearly seen
than between
the original profiles $p_i$ for two reasons. First, measurement errors in
extraneous genes have
been excluded or averaged out. Second, the basis genesets tend to capture the
biology of the
profiles $p_i$ and so are matched detectors for their individual response
components.
Classification and clustering of the profiles both are based on an objective
similarity metric,
call it S, where one useful definition is
$$S_{ij} = S(P_i, P_j) = P_i \cdot P_j / (|P_i||P_j|) \tag{10}$$
This definition is the generalized angle cosine between the vectors $P_i$ and
$P_j$. It is the
projected version of the conventional correlation coefficient between $p_i$ and
$p_j$. Profile $p_i$ is
deemed most similar to that other profile $p_j$ for which $S_{ij}$ is maximum. New
profiles may be
profiles may be
classified according to their similarity to profiles of known biological
significance, such as
the response patterns for known drugs or perturbations in specific biological
pathways. Sets
of new profiles may be clustered using the distance metric
$$D_{ij} = 1 - S_{ij} \tag{11}$$
where this clustering is analogous to clustering in the original larger space
of the entire set
of response measurements, but has the advantages just mentioned of reduced
measurement
error effects and enhanced capture of the relevant biology.
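The similarity metric, the derived distance metric, and classification against profiles of known significance can be sketched in Python as follows. The function names, the library entries, and the query values are hypothetical.

```python
import math

def similarity(P_i, P_j):
    """Generalized angle cosine between two projected profiles (Equation 10)."""
    dot = sum(a * b for a, b in zip(P_i, P_j))
    norm = math.sqrt(sum(a * a for a in P_i)) * math.sqrt(sum(b * b for b in P_j))
    return dot / norm

def distance(P_i, P_j):
    """Distance metric for clustering, D = 1 - S."""
    return 1.0 - similarity(P_i, P_j)

def most_similar(query, library):
    """Name of the library profile with maximum similarity to the query."""
    return max(library, key=lambda name: similarity(query, library[name]))

# Hypothetical projected profiles for two known drugs and one new compound.
library = {"drug_A": [2.0, -1.0, 0.0], "drug_B": [0.0, 1.0, 3.0]}
query = [1.1, -0.4, 0.1]
best = most_similar(query, library)
```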
The statistical significance of any observed similarity $S_{ij}$ may be assessed
using an
empirical probability distribution generated under the null hypothesis of no
correlation.
This distribution is generated by performing the projection, Equations (9) and
(10) for many
different random permutations of the constituent index in the original profile
p. That is, the
ordered set $p_k$ are replaced by $p_{\Pi(k)}$ where $\Pi(k)$ is a permutation, for 100 to
1000 different
random permutations. The probability of the similarity $S_{ij}$ arising by chance
is then the
fraction of these permutations for which the similarity $S_{ij}$ (permuted)
exceeds the similarity
observed using the original unpermuted data.
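The permutation test described above can be sketched in Python as follows. The function names, the genesets, and the profile values are hypothetical, and the projection assumes the normalized membership basis; only the overall procedure (permute the constituent index, project, recompute the similarity, count exceedances) follows the text.

```python
import math
import random

def project(p, genesets):
    """Projection onto normalized membership basis vectors (assumed form)."""
    return [sum(p[g] for g in m) / math.sqrt(len(m)) for m in genesets]

def similarity(a, b):
    """Generalized angle cosine; returns 0 for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def chance_probability(p_query, P_ref, genesets, n_perm=500, seed=0):
    """Fraction of permuted query profiles more similar to P_ref than observed."""
    rng = random.Random(seed)
    s_obs = similarity(project(p_query, genesets), P_ref)
    exceed = 0
    for _ in range(n_perm):
        # Randomly permute the constituent index of the original profile.
        shuffled = rng.sample(p_query, len(p_query))
        if similarity(project(shuffled, genesets), P_ref) > s_obs:
            exceed += 1
    return s_obs, exceed / n_perm

# Three hypothetical genesets of four genes each.
genesets = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
p_ref = [2.0, 1.8, 2.2, 1.9, -1.0, -1.2, -0.9, -1.1, 0.1, -0.1, 0.05, 0.0]
p_query = [2.1, 1.7, 2.3, 1.8, -1.1, -1.1, -1.0, -1.0, 0.0, 0.1, 0.0, 0.05]
s_obs, p_chance = chance_probability(p_query, project(p_ref, genesets), genesets)
```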
5.3 AUGMENTED PROFILES AND ROBUST DISCRIMINATION
5.3.1 METHOD FOR ROBUST DISCRIMINATION
In the methods of this invention, a biological sample is placed in alternative
states by, for example, introducing mutations or changing growth conditions,
to make the
biological sample more responsive to a given perturbation. This concept is
illustrated in
Figure 2. Under State 1, in Figure 2, the drugs have only limited responses
and comparison
of their effects is tenuous and based on little information. By forming
augmented profiles
consisting of concatenated profiles from multiple states or conditions, the
profiles become
much more informative. Because they are more informative, they can provide
improved
detail on the effects of different perturbations, such as drugs in the
illustration, on a patient.
The different states may be different culture growth conditions, background
genetic strains,
or additional drug treatments, to name a few. These additional states may be
chosen based
on prior biological knowledge to elicit specific responses in otherwise
unresponsive cells, or
they may be chosen more or less at random with the knowledge that the
resulting additional
diversity in the augmented response profile will tend to allow better
discrimination, on
average. Techniques to change the initial state and possibly elicit responses
include, for
example, inhibiting drug efflux pumps or enhancing cell wall permeability by
genetic
modification of the organism, growing in nutrient-poor media, growing on
plates vs. in
volume culture, adding certain trace elements or minerals to the media, using
haploid,
diploid, and heterozygous background strains, activating pathways such as the
mating
pathway which have widespread effects on cell state and are likely to change
the
responsiveness to the stimuli that are being compared.
Robust augmented profiles comparison has wide ranging applicability, such as
providing a method for robust discrimination of drug activities or disease
states in vivo. In
such applications, multiple conditions are provided by following a patient in
time or through
other environmental or medical insults and by concatenation of the multiple
profiles
obtained under these different host conditions. Profiles may be expressed as
departure
profiles from baseline states by forming the ratio or log(ratio) of
constituent levels with
respect to a baseline state, or any second perturbation.
Mathematically, comparisons of augmented profiles are done in a manner that is
analogous to the comparison of profiles obtained in a single state as
described in section 5.2.
The concatenated profile may be written P = [$p_1$; $p_2$; ...; $p_N$] of length NL, where
$p_1$ is the
profile in the first state, N is the number of states and L is the number of
cellular
constituents measured in a single state. Measures of similarity, such as a
generalized dot
product,
$$r_{ij} = P^{i} * P^{j} / (|P^{i}||P^{j}|) \tag{12}$$
can be defined on the concatenated profiles, as they would be defined on
single-state
profiles $p_1$. In Equation (12), * denotes dot product and | | denotes vector
norm (length).
Many other quantitative measures of similarity are possible, such as Shannon
mutual
information [S.E. Shannon and W. Weaver, The mathematical theory of
communication,
University of Illinois Press, Urbana, IL, 1949], or modifications of Equation
(12) where
elements of the profiles are set to "1" ("-1") if they exceed a positive
(negative) threshold
and "0" if they do not.
These measures of similarity then support searches of augmented-profile
libraries for
the profile most similar to a query profile, and clustering of sets of
augmented profiles into
groups that are likely to share characteristics like toxicity or
effectiveness.
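The benefit of concatenating profiles from several states can be illustrated with a small Python sketch. All names and numbers are hypothetical: two drugs are given nearly identical, weak responses in a first state and opposite responses in a second state, so that only the augmented profiles separate them. The thresholded variant mentioned above is included as `ternary`.

```python
import math

def augment(state_profiles):
    """Concatenate N single-state profiles of length L into one of length N*L."""
    return [x for profile in state_profiles for x in profile]

def similarity(a, b):
    """Generalized dot product of Equation (12); 0 for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def ternary(profile, pos=0.5, neg=-0.5):
    """Set elements to 1 / -1 beyond the positive / negative threshold, else 0."""
    return [1 if x > pos else -1 if x < neg else 0 for x in profile]

# Hypothetical response profiles of two drugs in two starting states.
drug_a = [[0.1, 0.0, 0.0], [1.0, 0.0, -1.0]]
drug_b = [[0.1, 0.0, 0.0], [-1.0, 0.0, 1.0]]

r_state1 = similarity(drug_a[0], drug_b[0])                  # state 1 only
r_augmented = similarity(augment(drug_a), augment(drug_b))   # both states
a_ternary = ternary(augment(drug_a))
```

In this contrived case the single-state comparison suggests the drugs are alike, while the augmented comparison shows that they act in opposite directions once the second state makes the cells responsive.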
5.3.2 ILLUSTRATIVE DRUG DISCOVERY APPLICATIONS
Robust discrimination of augmented profiles has wide applicability to several
aspects of drug discovery as outlined in the following sections.
5.3.2.1 DRUG CANDIDATE LEAD SELECTION
The methods of the present invention have applicability to the field of drug
candidate
lead selection. In many drug discovery efforts, a target enzyme will be
screened against a
large library of proprietary and/or nonproprietary compounds. Such a screening
effort is
referred to as a primary assay. Primary assays are often reduced to a robotic
format in which
thousands of compounds are screened per day. These efforts will result
in a large number of
compounds that produce the desired activity, which is typically the inhibition
of the activity
of a selected target enzyme.
Compounds that are successful in the proprietary assay are typically called
hits or
leads. Hits from the primary assay are typically screened in appropriately
designed
secondary assays. While the format of the secondary assay may vary depending
on the
scope of the drug discovery project, a typical secondary assay includes the
dose response
of a compound on whole cells. Thus in such a cell-based assay, the presence of
some
cellular constituent, such as TNF secretion, is measured as the cells are
incubated in
increasing concentrations of test compound.
In order to measure the suitability of a test compound, secondary assays are
typically
used to compare the activity of hits from the primary assay with the activity
of some
reference compound. The reference compound may be one that has proven efficacy
in the
appropriate clinical setting, a known drug or simply a prior lead. Comparison
of newly
developed compounds against the active reference compound serves as an
excellent tool for
measuring progress and for determining what is to be expected of new compounds.
In one aspect, the methods of the present invention will serve as an improved
secondary assay. Accordingly, the effect of dosing an appropriate cell line
with a reference
compound can be compared to the effect of dosing the same cell line with each
of the hits
from the primary screening assay. In this embodiment, appropriate cellular
constituents of
the cell line can be measured using any of the techniques described in this
specification or
known in the art. Further, these measurements can be done when the cell line
is placed in a
variety of different initial biological states. For example, cell response
profiles can be
measured when the reference compound has been contacted with the cells after
they have
been cultured in a variety of cell. culture densities, temperatures, or other
culture conditions.
These response profiles are combined to form a reference augmented
profile.
Similar augmented profiles are created for each of the hits from the primary
screening assay
and these augmented profiles are compared with the reference profile. By
comparing
augmented profiles generated from each compound of interest rather than
individual
response or projected profiles, subtle differences between the effects of each
test compound
can be detected. Even small changes in cellular constituents associated with a
known
toxicity or a desired physiologic event will become statistically meaningful
using the
methods of this invention.
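By way of illustration only, the comparison described above might be sketched as follows. The sketch assumes that an augmented profile is formed by concatenating the response profiles measured under each culture condition, and that similarity is scored with a Pearson correlation coefficient. Both choices, together with the function names and all numerical values, are assumptions made for this example and are not prescribed by the passage above.

```python
from math import sqrt


def augment(profiles):
    """Concatenate per-condition response profiles into one augmented profile.

    Concatenation is an assumption of this sketch.
    """
    return [value for profile in profiles for value in profile]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical responses of four cellular constituents, each measured
# under two culture conditions (illustrative values, not real data).
reference_runs = [[0.1, 1.2, -0.8, 0.0], [0.2, 1.0, -0.9, 0.1]]
hit_runs = [[0.0, 1.1, -0.7, 0.1], [0.9, 0.1, -0.2, 0.8]]

reference_augmented = augment(reference_runs)
hit_augmented = augment(hit_runs)

# Similarity of the hit to the reference across all conditions at once.
similarity = pearson(reference_augmented, hit_augmented)

# Similarity when only the first culture condition is considered.
single_condition = pearson(reference_runs[0], hit_runs[0])
```

In this toy data the hit resembles the reference under the first culture condition but diverges under the second, so the correlation computed on the first condition alone is higher than the correlation of the augmented profiles. A difference that one condition would hide becomes visible once the profiles are combined.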
5.3.2.2 DRUG CANDIDATE VALIDATION
Often in the drug discovery process, a potential drug candidate will exhibit
excellent
activity in the primary in vitro assay and secondary cell-based assays. Even
if a compound
is successful in both primary and secondary assays, there remains a need to
validate the
compound. Compound validation addresses the difficult issue of verifying that
a test
compound was successful in the primary and secondary assays because of
selective effects
on the desired target rather than unselective effects on multiple
physiological processes.
Compounds that selectively affect the desired target are preferred over
compounds that
nonselectively affect a wide variety of cellular constituents. For example, a
compound that is
excessively hydrophobic may bind to the target enzyme by unselective
hydrophobic
interactions. The problem with such an excessively hydrophobic compound is that
it is likely to
unselectively bind and/or inhibit several cellular constituents as well.
Compounds that
nonselectively inhibit all enzymes in a class are also undesirable. For
example, in addition
to inhibiting a target kinase of interest, a nonselective kinase inhibitor
such as staurosporine
will bind and inhibit dozens of kinases. A test compound may perform well in
the
secondary assay because it is toxic to the cells or because the compound
knocked out a
biological pathway that is unrelated to the biological pathway of interest.
The methods of the present invention provide improved means for validating
test
compounds in a drug discovery effort. In this embodiment of the invention,
augmented
profiles (reference augmented profiles) based on the compounds that have a known
effect on a
biological sample are compared with augmented profiles generated from
compounds that
need validation. For example, reference compounds that have a general toxic
effect on the
biological sample will have distinct augmented profiles. Thus a low
correlation between
such reference toxic compounds and test compounds of interest is desired.
Similarly, a high
correlation between an augmented profile derived from a previously validated
compound
and a test compound would indicate that the test compound is selectively
influencing the
proper biological pathway. A previously validated compound may be obtained
from animal
trials or from prior scientific publications.
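As a non-limiting sketch, the validation criterion described above could be expressed as a pair of correlation tests: a test compound passes when its augmented profile correlates strongly with that of a previously validated compound and weakly with that of a generally toxic compound. The Pearson coefficient, the threshold values `high` and `low`, the function names, and the example profiles are all hypothetical choices made for illustration and are not taken from the text above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def is_selective(test, validated_ref, toxic_ref, high=0.8, low=0.3):
    """Return True when the test profile resembles the validated reference
    and does not resemble the toxic reference.

    The thresholds `high` and `low` are hypothetical and would need to be
    chosen for the assay at hand.
    """
    r_validated = pearson(test, validated_ref)
    r_toxic = pearson(test, toxic_ref)
    return r_validated >= high and r_toxic <= low


# Illustrative augmented profiles (eight measurements each, made-up values).
validated_ref = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
toxic_ref = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

selective_test = [0.9, 0.1, -1.1, 0.0, 1.0, -0.1, -0.9, 0.1]
toxic_like_test = [0.1, 0.9, 0.0, -1.0, -0.1, 1.1, 0.1, -1.1]

selective_result = is_selective(selective_test, validated_ref, toxic_ref)
toxic_like_result = is_selective(toxic_like_test, validated_ref, toxic_ref)
```

Here the first test compound tracks the validated reference and is accepted, while the second tracks the toxic reference and is rejected. Treating a strongly negative correlation with the toxic reference as acceptable is a simplification of this sketch.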
5.3.2.3 DRUG REGIMEN OPTIMIZATION IN A VARIETY OF PATIENT
POPULATIONS
The methods of the present invention provide improved methods for
optimizing drug regimens in a variety of patient populations. In one
embodiment,
augmented profiles developed from biological samples obtained from a patient
can be
compared with reference augmented profiles that represent model drug responses
of patients
with favorable clinical outcome. Data derived from such comparisons would then
be used
to optimize a particular drug regimen thus maximizing the effectiveness of
drug treatment
and reducing its costs in terms of response time and financial expenditure.
The augmented profiles taken from patients can also be used to discover
unsatisfactory therapeutic responses caused by inadequate drug exposure or
undesirable
side-effects before they manifest in unfavorable symptoms. Robust augmented
profile
comparison can also be used to detect poor compliance with a dosage regimen.
In another
embodiment, regular comparison of augmented profiles can be used to detect and
monitor
interactions with co-ingested medications or the effects of changes in the
physical status of
the patient.
5.3.3 ILLUSTRATIVE DIAGNOSTIC APPLICATION
5.3.3.1 PREVENTIVE HEALTH CARE
Because of its improved ability to measure the subtle effects of a
perturbation on a
biological sample, comparison of augmented profiles will provide an invaluable
service in
the field of preventive health care. In one embodiment of the invention,
biological samples
are obtained from subjects on a routine basis over time. Augmented profiles
are developed
based upon these biological samples. Comparison of these augmented profiles
to a database
that includes several model disease states provides advance warning that the
subject has a
particular disease before the disease manifests itself in any outward clinical
symptoms.
Such a diagnostic tool is particularly valuable in diseases such as cancer
because early
treatment leads to improved chances of recovery and/or survival. Appropriately
chosen
augmented profile comparisons will also provide useful information on health
risks in a
subject. Thus appropriately designed augmented profiles will be used to
determine if a
patient should alter their diet, exercise more, take certain vitamins, or
alter other behavioral
aspects. As the database of reference augmented profiles is enriched, the
utility of the robust
profile compariso

Administrative Status

Event History

Description Date
Inactive: IPC expired 2018-01-01
Inactive: IPC from MCD 2006-03-12
Inactive: IPC from MCD 2006-03-12
Application Not Reinstated by Deadline 2003-12-22
Time Limit for Reversal Expired 2003-12-22
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2002-12-23
Inactive: Cover page published 2001-12-11
Inactive: First IPC assigned 2001-12-10
Inactive: Office letter 2001-09-25
Letter Sent 2001-09-21
Letter Sent 2001-09-21
Inactive: Notice - National entry - No RFE 2001-09-21
Application Received - PCT 2001-09-20
Application Published (Open to Public Inspection) 2000-07-06

Abandonment History

Abandonment Date Reason Reinstatement Date
2002-12-23

Maintenance Fee

The last payment was received on 2001-11-21

Fee History

Fee Type Anniversary Year Due Date Paid Date
Registration of a document 2001-06-22
Basic national fee - standard 2001-06-22
MF (application, 2nd anniv.) - standard 02 2001-12-21 2001-11-21
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ROSETTA INPHARMATICS, INC.
Past Owners on Record
ROLAND STOUGHTON
STEPHEN H. FRIEND
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Description 2001-06-21 53 3,563
Claims 2001-06-21 15 765
Drawings 2001-06-21 11 265
Abstract 2001-06-21 1 53
Reminder of maintenance fee due 2001-09-23 1 116
Notice of National Entry 2001-09-20 1 210
Courtesy - Certificate of registration (related document(s)) 2001-09-20 1 136
Courtesy - Certificate of registration (related document(s)) 2001-09-20 1 136
Courtesy - Abandonment Letter (Maintenance Fee) 2003-01-19 1 176
Correspondence 2001-09-20 1 13
PCT 2001-06-21 9 421