Language selection

Search

Patent 2434677 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 2434677
(54) English Title: MOLECULES FOR DIAGNOSTICS AND THERAPEUTICS
(54) French Title: MOLECULES UTILISEES A DES FINS DIAGNOSTIQUES ET THERAPEUTIQUES
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/12 (2006.01)
  • A01K 67/027 (2006.01)
  • A61K 38/16 (2006.01)
  • C07K 14/47 (2006.01)
  • C07K 16/18 (2006.01)
  • C12N 1/21 (2006.01)
  • C12N 5/10 (2006.01)
  • C12N 15/00 (2006.01)
  • C12Q 1/68 (2006.01)
  • G01N 33/68 (2006.01)
  • A61K 38/00 (2006.01)
(72) Inventors :
  • PANZER, SCOTT R. (United States of America)
  • LINCOLN, STEPHEN E. (United States of America)
  • ALTUS, CHRISTINA M. (United States of America)
  • DUFOUR, GERARD E. (United States of America)
  • HILLMAN, JENNIFER L. (United States of America)
  • JONES, ANISSA L. (United States of America)
  • DAM, TAM C. (United States of America)
  • LIU, TOMMY F. (United States of America)
  • HARRIS, BERNARD (United States of America)
  • FLORES, VINCENT (United States of America)
  • DAFFO, ABEL (United States of America)
  • MARWAHA, RAKESH (United States of America)
  • CHEN, ALICE J. (United States of America)
  • CHANG, SIMON C. (United States of America)
  • GERSTIN, JR. EDWARD H. (United States of America)
  • PERALTA, CAREYNA H. (United States of America)
  • DAVID, MARIE H. (United States of America)
  • LEWIS, SAMANTHA A. (United States of America)
(73) Owners :
  • PANZER, SCOTT R. (Not Available)
  • LINCOLN, STEPHEN E. (Not Available)
  • ALTUS, CHRISTINA M. (Not Available)
  • DUFOUR, GERARD E. (Not Available)
  • HILLMAN, JENNIFER L. (Not Available)
  • JONES, ANISSA L. (Not Available)
  • DAM, TAM C. (Not Available)
  • LIU, TOMMY F. (Not Available)
  • HARRIS, BERNARD (Not Available)
  • FLORES, VINCENT (Not Available)
  • DAFFO, ABEL (Not Available)
  • MARWAHA, RAKESH (Not Available)
  • CHEN, ALICE J. (Not Available)
  • CHANG, SIMON C. (Not Available)
  • GERSTIN, JR. EDWARD H. (Not Available)
  • PERALTA, CAREYNA H. (Not Available)
  • DAVID, MARIE H. (Not Available)
  • LEWIS, SAMANTHA A. (Not Available)
(71) Applicants :
  • INCYTE GENOMICS, INC. (United States of America)
(74) Agent: SMART & BIGGAR
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2002-01-09
(87) Open to Public Inspection: 2002-10-10
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2002/001009
(87) International Publication Number: WO2002/079473
(85) National Entry: 2003-07-09

(30) Application Priority Data:
Application No. Country/Territory Date
60/261,622 United States of America 2001-01-12
60/262,599 United States of America 2001-01-19
60/262,662 United States of America 2001-01-19
60/263,064 United States of America 2001-01-19
60/263,330 United States of America 2001-01-19
60/263,065 United States of America 2001-01-19
60/263,329 United States of America 2001-01-19
60/263,063 United States of America 2001-01-19
60/262,760 United States of America 2001-01-19
60/263,077 United States of America 2001-01-19
60/263,069 United States of America 2001-01-19
60/261,864 United States of America 2001-01-16
60/261,865 United States of America 2001-01-16
60/262,207 United States of America 2001-01-17
60/262,208 United States of America 2001-01-17
60/262,209 United States of America 2001-01-17
60/262,164 United States of America 2001-01-17
60/262,215 United States of America 2001-01-17
60/263,102 United States of America 2001-01-18

Abstracts

English Abstract




The present invention provides purified human polynucleotides for diagnostics
and therapeutics (dithp). Also encompassed are the polypeptides (DITHP)
encoded by dithp. The invention also provides for the use of dithp, or
complements, oligonucleotides, or fragments thereof in diagnostic assays. The
invention further provides for vectors and host cells containing dithp for the
expression of DITHP .The invention additionally provides for the use of
isolated and purified DITHP to induce antibodies and to screen libraries of
compounds and the use of anti-DITHP antibodies in diagnostic assays. Also
provided are microarrays containing dithp and methods of use.


French Abstract

La présente invention concerne des polynucléotides humains purifiés utilisés à des fins diagnostiques et thérapeutiques (dithp). Elle concerne également des polypeptides (DITHP) codés par les dithp. De plus, l'invention concerne l'utilisation de dithp, ou de compléments de dithp, d'oligonucléotides ou de fragments d'oligonucléotide , pour des analyses diagnostiques. L'invention concerne en outre des vecteurs et des cellules hôtes renfermant des dithp pour l'expression de DITHP .Par ailleurs, cette invention porte sur l'emploi de DITHP isolés et purifiés dans le but de produire des anticorps anti-DITHP dans des tests de diagnostic. L'invention concerne enfin des microréseaux renfermant des dithp et leurs modes d'utilisation.

Claims

Note: Claims are shown in the official language in which they were submitted.




CLAIMS

What is claimed is:

1. An isolated polynucleotide comprising a polynucleotide sequence selected
from the group
consisting of:
a) a polynucleotide sequence selected from the group consisting; of SEQ ID
NO:1-56,
b) a naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide
sequence selected from the group consisting of SEQ ID NO:1-56,
c) a polynucleotide sequence complementary to a),
d) a polynucleotide sequence complementary to b), and
e) an RNA equivalent of a) through d).

2. An isolated polynucleotide of claim 1, comprising a polynucleotide sequence
selected from
the group consisting of SEQ ID NO:1-56.

3. An isolated polynucleotide comprising at least 60 contiguous nucleotides of
a polynucleotide
of claim 1.

4. A composition for the detection of expression of diagnostic and therapeutic
polynucleotides comprising at least one of the polynucleotides of claim 1 and
a detectable label.

5. A method for detecting a target polynucleotide in a sample, said target
polynucleotide
having a sequence of a polynucleotide of claim 1, the method comprising:
a) amplifying said target polynucleotide or fragment thereof using polymerase
chain reaction
amplification, and
b) detecting the presence or absence of said amplified target polynucleotide
or fragment
thereof, and, optionally, if present, the amount thereof.

6. A method for detecting a target polynucleotide in a sample, said target
polynucleotide
comprising a sequence of a polynucleotide of claim 1, the method comprising:
a) hybridizing the sample with a probe comprising at least 20 contiguous
nucleotides
comprising a sequence complementary to said target polynucleotide in the
sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex
is formed between said probe and said target polynucleotide or fragments
thereof, and
b) detecting the presence or absence of said hybridization complex, and,
optionally, if present,

259



7. A method of claim 5, wherein the probe comprises at least 30 contiguous
nucleotides.

8. A method of claim 5, wherein the probe comprises at least 60 contiguous
nucleotides.

9. A recombinant polynucleotide comprising a promoter sequence operably linked
to a
polynucleotide of claim 1.

10. A cell transformed with a recombinant polynucleotide of claim 9.

11. A transgenic organism comprising a recombinant polynucleotide of claim 9.

12. A method for producing a diagnostic and therapeutic polypeptide, the
method comprising:
a) culturing a cell under conditions suitable for expression of the diagnostic
and therapeutic
polypeptide, wherein said cell is transformed with a recombinant
polynucleotide of claim 9, and
b) recovering the diagnostic and therapeutic polypeptide so expressed.

13. A purified diagnostic and therapeutic polypeptide (DITHP) encoded by at
least one of the
polynucleotides of claim 2.

14. An isolated antibody which specifically binds to a diagnostic and
therapeutic polypeptide
of claim 13.

15. A method of identifying a test compound which specifically binds to the
diagnostic and
therapeutic polypeptide of claim 13, the method comprising the steps of:
a) providing a test compound;
b) combining the diagnostic and therapeutic polypeptide with the test compound
for a
sufficient time and under suitable conditions for binding; and
c) detecting binding of the diagnostic and therapeutic polypeptide to the test
compound,
thereby identifying the test compound which specifically binds the diagnostic
and therapeutic
polypeptide.

16. A microarray wherein at least one element of the microarray is a
polynucleotide of claim
3.

260




17. A method for generating a transcript image of a sample which contains
polynucleotides,
the method comprising the steps of:
a) labeling the polynucleotides of the sample,
b) contacting the elements of the microarray of claim 16 with the labeled
polynucleotides of
the sample under conditions suitable for the formation of a hybridization
complex, and
c) quantifying the expression of the polynucleotides in the sample.

18. A method for screening a compound for effectiveness in altering expression
of a target
polynucleotide, wherein said target polynucleotide comprises a polynucleotide
sequence of claim 1, the
method comprising:
a) exposing a sample comprising the target polynucleotide to a compound, under
conditions
suitable for the expression of the target polynucleotide,
b) detecting altered expression of the target polynucleotide, and
c) comparing the expression of the target polynucleotide in the presence of
varying amounts
of the compound and in the absence of the compound.

19. A method for assessing toxicity of a test compound, said method
comprising:
a) treating a biological sample containing nucleic acids with the test
compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe
comprising at
least 20 contiguous nucleotides of a polynucleotide of claim 1 under
conditions whereby a specific
hybridization complex is formed between said probe and a target polynucleotide
in the biological
sample, said target polynucleotide comprising a polynucleotide sequence of a
polynucleotide of claim 1
or fragment thereof;
c) quantifying the amount of hybridization complex; and
d) comparing the amount of hybridization complex in the treated biological
sample with the
amount of hybridization complex in an untreated biological sample, wherein a
difference in the amount
of hybridization complex in the treated biological sample is indicative of
toxicity of the test compound.

20. An array comprising different nucleotide molecules affixed in distinct
physical locations
on a solid substrate, wherein at least one of said nucleotide molecules
comprises a first oligonucleotide
or polynucleotide sequence specifically hybridizable with at least 30
contiguous nucleotides of a target
polynucleotide, said target polynucleotide having a sequence of claim 1.

21. An array of claim 20, wherein said first oligonucleotide or polynucleotide
sequence is
completely complementary to at least 30 contiguous nucleotides of said target
polynucleotide.

261




22. An array of claim 20, wherein said first oligonucleotide or polynucleotide
sequence is
completely complementary to at least 60 contiguous nucleotides of said target
polynucleotide

23. An array of claim 20, which is a microarray.

24. An array of claim 20, further comprising said target polynucleotide
hybridized to said first
oligonucleotide or polynucleotide.

25. An array of claim 20, wherein a linker joins at least one of said
nucleotide molecules to
said solid substrate.

26. An array of claim 20, wherein each distinct physical location on the
substrate contains
multiple nucleotide molecules having the same sequence, and each distinct
physical location on the
substrate contains nucleotide molecules having a sequence which differs from
the sequence of
nucleotide molecules at another physical location on the substrate.

27. An isolated polypeptide comprising an amino acid sequence selected from
the group
consisting of:
a) an amino acid sequence selected from the group consisting of SEQ ID NO:57-
113,
b) a naturally occurring amino acid sequence at least 90% identical to an
amino acid
sequence selected from the group consisting of SEQ ID NO:57-113,
c) a biologically active fragment of an amino acid sequence selected from the
group
consisting of SEQ ID NO:57-113, and
d) an immunogenic fragment of an amino acid sequence selected from the group
consisting of
SEQ ID NO:57-113.

28. An isolated polypeptide of claim 27, comprising a polypeptide sequence
selected from the
group consisting of SEQ ID NO:57-113.

262

Description

Note: Descriptions are shown in the official language in which they were submitted.





DEMANDE OU BREVET VOLUMINEUX
LA PRESENTE PARTIE DE CETTE DEMANDE OU CE BREVET COMPREND
PLUS D'UN TOME.
CECI EST LE TOME 1 DE 2
CONTENANT LES PAGES 1 A 218
NOTE : Pour les tomes additionels, veuillez contacter 1e Bureau canadien des
brevets
JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 218
NOTE: For additional volumes, please contact the Canadian Patent Office
NOM DU FICHIER / FILE NAME
2002045
NOTE POUR LE TOME / VOLUME NOTE:


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
MOLECULES FOR DIAGNOSTICS AND THERAPEUTICS
TECHNICAL FIELD
The present invention relates to human molecules and to the use of these
sequences in the
diagnosis, study, prevention, and. treatment of diseases associated with, as
well as effects of
exogenous compounds on, the expression of human molecules.
BACKGROUND OF THE INVENTION
1o The human genome is comprised of thousands of genes, many encoding gene
products that
function in the maintenance and growth of the various cells and tissues in the
body. Aberrant
expression or mutations in these genes and their products is the cause of, or
is associated with, a
variety of human diseases such as cancer and other cell proliferative
disorders,
autoimmune/inflammatory disorders, infections, developmental disorders,
endocrine disorders,
15 metabolic disorders, neurological disorders, gastrointestinal disorders,
transport disorders, and
connective tissue disorders. The identification of these genes and their
products is the basis of an
ever-expanding effort to find markers fox early detection of diseases, and
targets for their prevention
and treatment. Therefore, these genes and their products are useful as
diagnostics and therapeutics.
These genes may encode, for example, enzyme molecules, molecules associated
with growth and
a o development, biochemical pathway molecules, extracellular information
transmission molecules,
receptor molecules, intracellular signaling molecules, membrane transport
molecules, protein
modification and maintenance molecules, nucleic acid synthesis and
modification molecules, adhesion
molecules, antigen recognition molecules, secreted and extracellular matrix
molecules, cytoskeletal
molecules, ribosomal molecules, electron transfer associated molecules,
transcription factor molecules,
2 s chromatin molecules, cell membrane molecules, and organelle associated
molecules.
For example, cancer represents a type of cell proliferative disorder that
affects nearly every
tissue in the body. A wide variety of molecules, either aberrantly expressed
or mutated, can be the
cause of, or involved with, various cancers because tissue growth involves
complex and oxdered
patterns of cell proliferation, cell differentiation, and apoptosis. Cell
proliferation must be regulated to
3 o maintain both the number of cells and their spatial organization. This
regulation depends upon the
appropriate expression of proteins which control cell cycle progression in
response to extracellular
signals such as growth factors and other xnitogens, and intracellular cues
such as DNA damage or
nutrient starvation. Molecules which directly or indirectly modulate cell
cycle progression fall into
several categories, including growth factors and their receptors, second
messenger and signal
3 5 transduction proteins, oncogene products, tumor-suppressor proteins, and
mitosis-promoting factors.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Aberrant expression or mutations in any of these gene products can result in
cell proliferative
disorders such as cancer. Oncogenes are genes generally derived from normal
genes that, through
abnormal expression or mutation, can effect the transformation of a normal
cell to a malignant one
(oncogenesis). Oncoproteins, encoded by oncogenes, can affect cell
proliferation in a vat~iety of ways
s and include growth factors, growth factor receptors, intracellular signal
transducers, nuclear
transcription factors, and cell-cycle control proteins. In contrast, tumor-
suppressor genes are involved
in inhibiting cell proliferation. Mutations which cause reduced function or
loss of function in
tumor-suppressor genes result in aberrant cell proliferation and cancer.
Although many different
genes and their products have been found to be associated with cell
proliferative disorders such as
Zo cancer, many more may exist that are yet to be discovered.
DNA-based arrays can provide a simple way to explore the expression of a
single
polymorphic gene or a large number of genes. When the expression of a single
gene is explored,
DNA-based arrays are employed to detect the expression of specific gene
variants. For example, a
p53 tumor suppressor gene array is used to determine whether individuals are
carrying mutations that
15 predispose them to cancer. A cytochrome p450 gene array is useful to
determine whether individuals
have one of a number of specific mutations that could result in increased drug
metabolism, drug
resistance or drug toxicity.
DNA-based array technology is especially relevant for the rapid screening of
expression of a
large number of genes. There is a growing awareness that gene expression is
affected in a global
2 o fashion. A genetic predisposition, disease or therapeutic treatment may
affect, directly or indirectly,
the expression of a large number of genes. In some cases the interactions may
be expected, such as
when the genes are part of the same signaling pathway. In other cases, such as
when the genes
participate in separate signaling pathways, the interactions may be totally
unexpected. Therefore,
DNA-based arrays can be used to investigate how genetic predisposition,
disease, or therapeutic
2s treatment affects the expression of a large number of genes.
Enzyme Molecules
The cellular processes of biogenesis and biodegradation involve a number of
key enzyme
classes including oxidoreductases, transferases, hydrolases, lyases,
isomerases, and ligases. These
3 o enzyme classes are each comprised of numerous substrate-specific enzymes
having precise and well
regulated functions. These enzymes function by facilitating metabolic
processes such as glycolysis,
the tricarboxylic cycle, and fatty acid metabolism; synthesis or degradation
of amino acids, steroids,
phospholipids, alcohols, etc.; regulation of cell signalling, proliferation,
inflamation, apoptosis, etc., and
through catalyzing critical steps in DNA replication and repair, and the
process of translation.
3 5 Oxidoreductases


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Many pathways of biogenesis and biodegradation require oxidoreductase
(dehydrogenase or
reductase) activity, coupled to the reduction or oxidation of a donor or
acceptor cofactor. Potential
cofactors include cytochromes, oxygen, disulfide, iron-sulfur proteins, flavin
adenine dinucleotide
(FAD), and the nicotinamide adenine dinucleotides NAD and NADP (Newsholme,
E.A. and A.R.
s Leech (1983) Biochemistryfor the Medical Sciences, John Wiley and Sons,
Chichester, U.K., pp.
779-793). Reductase activity catalyzes the transfer of electrons between
substrates) and cofactors)
with concurrent oxidation of the cofactor. The reverse dehydrogenase reaction
catalyzes the
reduction of a cofactor and consequent oxidation of the substrate.
Oxidoreductase enzymes are a
broad superfamily of proteins that catalyze numerous reactions in all cells of
organisms ranging from
Zo bacteria to plants to humans. These reactions include metabolism of sugar,
certain detoxification
reactions in. the liver, and the synthesis or degradation of fatty acids,
amino acids, glucocorticoids,
estrogens, androgens, and pxostaglandins. Different family members are named
according to the
direction in which their reactions are typically catalyzed; thus they may be
referred to as
oxidoreductases, oxidases, reductases, or dehydrogenases. In addition, family
members often have
2s distinct cellular localizations, including the cytosol, the plasma
membrane, mitochondrial inner or outer
membrane, and peroxisomes.
Short-chain alcohol dehydrogenases (SCADS) are a family of dehydrogenases that
only share
15% to 30°~o sequence identity, with similarity predominantly in the
coenzyme binding domain and the
substrate binding domain. In addition to the well-known role in detoxification
of ethanol, SCADS are
2 o also involved in synthesis and degradation of fatty acids, steroids, and
some prostaglandins, and are
therefore implicated in a variety of disorders such as lipid storage disease,
myopathy, SCAD
deficiency, and certain genetic disorders. For example, retinol dehydrogenase
is a SCAD-family
member (Simon, A. et al. (1995) J. Biol. Chem. 270:1107-1112) that converts
retinol to retinal, the
precursor of retinoic acid. Retinoic acid, a regulator of differentiation and
apoptosis, has been shown
2s to down-regulate genes involved in cell proliferation and inflammation
(Chaff, X. et al. (1995) J. Biol.
Chem. 270:3900-3904). In addition, retinol dehydrogenase has been linked to
hereditary eye diseases
such as autosomal recessive childhood-onset severe retinal dystrophy (Simon,
A. et al. (1996)
Genomics 36:424-430).
Propagation of nerve impulses, modulation of cell proliferation and
differentiation, induction of
s o the immune response, and tissue homeostasis involve neurotransmitter
metabolism (Weiss, B. (1991)
Neurotoxicology 12:379-386; Collins, S.M. et al. (1992) Ann. N.Y. Acad. Sci.
664:415-424; Brown,
J.K. and H. Imam (1991) J. Inherit. Metab. Dis. 14:436-458). Many pathways of
neurotransmitter
metabolism require oxidoreductase activity, coupled to reduction or oxidation
of a cofactor, such as
NAD+/NADH (Newsholme, E.A. and A.R. Leech (1983) Biochemistry for the Medical
Sciences,
~ 3ohn Wiley and Sons, Chichester, U.K. pp. 779-793). Degradation of
catecholamines (epinephrine or


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
norepmephrme) requires alcohol dehydrogenase (W the brain) or aldehyde
dehydrogenase (in
peripheral tissue). NAD+-dependent aldehyde dehydrogenase oxidizes 5-
hydroxyindole-3-acetate (the
product of 5-hydroxytryptamiue (serotonin) metabolism) in the brain, blood
platelets, liver and
pulmonary endothelium (Newsholme, su ra, p. 786). Other neurotransmitter
degradation pathways
s that utilize NAD+/NADH-dependent oxidoreductase activity include those of L-
DOPA (precursor of
dopamine, a neuronal excitatory compound), glycine (an inhibitory
neurotransmitter in the brain and
spinal cord), histamine (liberated from mast cells during the inflammatory
response), and taurine (an
inhibitory neurotransmitter of the brain stem, spinal cord and retina)
(Newsholme, supra, pp. 790, 792).
Epigenetic or genetic defects in neurotransmitter metabolic pathways can
result in a spectrum of
so disease states in different tissues including Parkinson disease and
inherited myoclonus (McCance,
K.L. and S.E. Huether (1994) Pathophysiologyy, Mosby-Year Book, Inc., St.
Louis MO, pp. 402-404;
Gundlach, A.L. (1990) FASEB J. 4:2761-2766).
Tetrahydrofolate is a derivatized glutamate molecule that acts as a carrier,
providing activated
one-carbon units to a wide variety of biosynthetic reactions, including
synthesis of purines, pyrimidines,
1 s and the amino acid methionine. Tetrahydrofolate is generated by the
activity of a holoenzyme
complex called tetrahydrofolate synthase, which includes three enzyme
activities: tetrahydrofolate
dehydrogenase, tetrahydrofolate cyclohydrolase, and tetrahydrofolate
synthetase. Thus,
tetrahydrofolate dehydrogenase plays an important role in generating building
blocks for nucleic and
amino acids, crucial to proliferating cells.
20 3-Hydroxyacyl-CoA dehydrogenase (3HACD) is involved in fatty acid
metabolism. It
catalyzes the reduction of 3-hydroxyacyl-CoA to 3-oxoacyl-CoA, with
concomitant oxidation of NAD
to NADH, in the mitochondria and peroxisomes of eukaryotic cells. In
peroxisomes, 3HACD and
enoyl-CoA hydratase form an enzyme complex called bifunctional enzyme, defects
in which are
associated with peroxisomal bifunctional enzyme deficiency. This interruption
in fatty acid metabolism
2 s produces accumulation of very-long chain fatty acids, disrupting
development of the brain, bone, and
adrenal glands. Infants born with this deficiency typically die within 6
months (Watkins, P. et al.
(1989) J. Clin. Invest. 83:771-777; Online Mendelian Inheritance in Man
(OMIM), #261515). The
neurodegeneration that is characteristic of Alzheimer's disease involves
development of extracellular
plaques in certain brain regions. A major protein component of these plaques
is the peptide amyloid-(3
3 0 (A(3), which is one of several cleavage products of amyloid precursor
protein (APP). 3HACD has
been shown to bind the A(3 peptide, and is overexpressed in neurons affected
in Alzheimer's disease.
In addition, an antibody against 3HACD can block the toxic effects of A~i in a
cell culture model of
Alzheimer's disease (Yau, S. et al. (1997) Nature 389:689-695; OMIM, #602057).
Steroids, such as estrogen, testosterone, corticosterone, and others, are
generated from a
s s common precursor, cholesterol, and are interconverted into one another. A
wide variety of enzymes
4


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
act upon cholesterol, including a number of dehydrogenases. Steroid
dehydrogenases, such as the
hydroxysteroid dehydrogenases, are involved in hypertension, fertility, and
cancer (Duax, W.L. and
D. Ghosh (1997) Steroids 62:95-100). One such dehydrogenase is 3-oxo-5-a-
steroid dehydrogenase
(DASD), a microsomal membrane protein highly expressed in prostate and other
androgen-responsive
tissues. DASD catalyzes the conversion of testosterone into
dihydrotestosterone, wluch is the most
potent androgen. Dihydrotestosterone is essential for the formation of the
male phenotype during
embryogenesis, as well as for proper androgen-mediated growth of tissues such
as the prostate and
male genitalia. A defect in OASD that prevents the conversion of testosterone
into
dihydrotestosterone leads to a rare form of male pseudohermaphroditis,
characterized by defective
so formation of the external genitalia (Andersson, S. et al. (1991) Nature
354:159-161; Labrie, F. et al.
(1992) Endocrinology 131:1571-1573; OMIM #264600). Thus, DASD plays a central
role in sexual
differentiation and androgen physiology.
17(3-hydroxysteroid dehydrogenase (17(3HSD6) plays an important role in the
regulation of the
male reproductive hormone, dihydrotestosterone (DHTT). 17(3HSD6 acts to reduce
levels of DHTT
by oxidizing a precursor of DHTT, 3a-diol, to androsterone which is readily
glucuronidated and
removed from tissues. 17[3HSD6 is active with both androgen and estrogen
substrates when
expressed in embryonic kidney 293 cells. At least five other isozymes of
17(3HSD have been
identified that catalyze oxidation andlor reduction reactions in various
tissues with preferences for
different steroid substrates (Biswas, M.G. and D.W. Russell (1997) J. Biol.
Chem. 272:15959-15966).
2 o For example, 17(3HSD1 preferentially reduces estradiol and is abundant in
the ovary and placenta.
17(3HSD2 catalyzes oxidation of androgens and is present in the endometrium
and placenta.
17(3HSD3 is exclusively a reductive enzyme in the testis (Geissler, W.M. et
al. (1994) Nat. Genet.
7:34-39). An excess of androgens such as DHTT can contribute to certain
disease states such as
benign prostatic hyperplasia and prostate cancer.
Oxidoreductases are components of the fatty acid metabolism pathways in
mitochondria and
peroxisomes. The main beta-oxidation pathway degrades both saturated and
unsaturated fatty acids,
while the auxiliary pathway performs additional steps required for the
degradation of unsaturated fatty
acids. The auxiliary beta-oxidation enzyme 2,4-dienoyl-CoA reductase catalyzes
the removal of even-
numbered double bonds from unsaturated fatty acids prior to their entry into
the main beta-oxidation
s o pathway. The enzyme may also remove odd-numbered double bonds from
unsaturated fatty acids
(Koivuranta, K.T. et al. (1994) Biochem. J. 304:787-792; Smeland, T.E. et al.
(1992) Proc. Natl.
Acad. Sci. USA 89:6673-6677). 2,4-dienoyl-CoA reductase is located in both
mitochondria and
peroxisomes. Inherited deficiencies in mitochondrial and peroxisomal beta-
oxidation enzymes are
associated with severe diseases, some of which manifest themselves soon after
birth and lead to death
Wlthln a few years. Defects inbeta-oxidation are associated with Reye's
syndrome, Zellweger


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
syndrome, neonatal adrenoleukodystrophy, infantile Refsum's disease, acyl-CoA
oxidase deficiency,
and bifunctional protein deficiency (Suzuki, Y. et al. (1994) Am. J. Hum.
Genet. 54:36-43; Hoefler,
supra; Cotran, R.S. et al. (1994) Robbins Pathologic Basis of Disease, W.B.
Saunders Co.,
Philadelphia PA, p.866). Peroxisomal beta-oxidation is impaired in cancerous
tissue. Although
s neoplastic human breast epithelial cells have the same number of peroxisomes
as do normal cells, fatty
acyl-CoA oxidase activity is lower than in control tissue (e1 Bouhtoury, F. et
al. (1992) J. Pathol.
166:27-35). Human colon carcinomas have fewer peroxisomes than normal colon
tissue and have
lower fatty-acyl-CoA oxidase and bifunctional enzyme (including enoyl-CoA
hydratase) activities than
normal tissue (Cable, S. et al. (1992) Virchows Arch. B Cell Pathol. Incl.
Mol. Pathol. 62:221-226).
so Another important oxidoreductase is isocitxate dehydrogenase, which
catalyzes the conversion of
isocitrate to a-ketoglutarate, a substrate of the citric acid cycle.
Isocitrate dehydrogenase can be
either NAD or NADP dependent, and is found in the cytosol, mitochondria, and
peroxisomes. Activity
of isocitrate dehydrogenase is regulated developmentally, and by hormones,
neurotransmitters, and
growth factors.
15 Hydroxypyruvate reductase (HPR), a peroxisomal 2 hydroxyacid dehydrogenase
in the
glycolate pathway, catalyzes the conversion of hydroxypyruvate to glycerate
with the oxidation of both
NADH and NADPH. The revexse dehydrogenase reaction reduces NAD+ and NADP+. HPR
recycles nucleotides and bases back into pathways leading to the synthesis of
ATP and GTP. ATP
and GTP are used to produce DNA and RNA and to control various aspects of
signal transduction
2 o and energy metabolism. Inhibitors of purine nucleotide biosynthesis have
long been employed as
antiproliferative agents to treat cancer and viral diseases. HPR also
regulates biochemical synthesis
of serine and cellular serine levels available for protein synthesis.
The mitochondrial electron transport (or respiratory) chain is a series of
oxidoreductase-type
enzyme complexes in the mitochondrial membrane that is responsible for the
transport of electrons
2s from NADH through a series of redox centers within these complexes to
oxygen, and tile coupling of
this oxidation to the synthesis of ATP (oxidative phosphorylation). ATP then
provides the primary
source of energy for driving a cell's many energy-requiring reactions. The key
complexes in the
respiratory chain are NADH:ubiquinone oxidoreductase (complex I),
succinate:ubiquinone
oxidoreductase (complex 1I), cytochrome ci b oxidoreductase (complex III),
cytochrome c oxidase
30 (complex IV), and ATP synthase (complex V) (Alberts, B. et al. (1994)
Molecular Biolo~y of the Cell,
Garland Publishing, Inc., New York NY, pp. 677-678). All of these complexes
are located on the
inner matrix side of the mitochondrial membrane except complex II, which is on
the cytosolic side.
Complex II transports electrons generated in the citric acid cycle to the
respiratory chain. The
electrons generated by oxidation of succinate to fumarate in the citric acid
cycle are transferred
3 5 through electron carriers in complex 1I to membrane bound ubiquinone (Q).
Transcriptional regulation
6


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
of these nuclear-encoded genes appears to be the predominant means for
controlling the biogenesis of
respiratory enzymes. Defects and altered expression of enzymes in the
respiratory chain are
associated with a variety of disease conditions.
Other dehydrogenase activities using NAD as a cofactor are also important in
mitochondria)
s function. 3-hydroxyisobutyrate dehydrogenase (3HBD), important in valine
catabolism, catalyzes the
NAD-dependent oxidation of 3-hydroxyisobutyrate to methylmalonate semialdehyde
within
mitochondria. Elevated levels of 3 hydroxyisobutyrate have been reported in a
number of disease
states, including ketoacidosis, methylmalonic acidemia, and other disorders
associated with
deficiencies in methylinalonate semialdehyde dehydrogenase (Rougraff, P.M. et
al. (1989) J. Biol.
2o Chem.264:5899-5903).
Another mitochondria) dehydrogenase important in amino acid metabolism is the
enzyme
isovaleryl-CoA-dehydrogenase (IVD). IVD is involved in leucine metabolism and
catalyzes the
oxidation of isovaleryl-CoA to 3-methylcrotonyl-CoA. Human IVD is a tetrameric
flavoprotein that is
encoded in the nucleus and synthesized in the cytosol as a 45 kDa precursor
with a mitochondria)
15 llllpOrt signal sequence. A genetic deficiency, caused by a mutation in the
gene encoding TVD, results
in the condition known as isovaleric acidemia. This mutation results in
inefficient mitochondria) import
and processing of the PJD precursor (Vockley, J. et al. (1992) J. Biol. Chem.
267:2494-2501).
Transferases
Transferases are enzymes that catalyze the transfer of molecular groups. The
reaction may
2 o involve an oxidation, reduction, or cleavage of covalent bonds, and is
often specific to a substrate or to
particular sites on a type of substrate. Transferases participate in reactions
essential to such functions
as synthesis and degradation of cell components, regulation of cell functions
including cell signaling,
cell proliferation, inflamation, apoptosis, secretion and excretion.
Transferases are involved in key
steps in disease processes involving these functions. Transferases are
frequently classified according
2 s to the type of group transferred. For example, methyl transferases
transfer one-carbon methyl groups,
amino transferases transfer nitrogenous amino groups, and similarly
denominated enzymes transfer
aldehyde or ketone, acyl, glycosyl, alkyl or aryl, isoprenyl, saccharyl,
phosphorous-containing, sulfur-
containing, or selenium-containing groups, as well as small enzymatic groups
such as Coenzyme A.
Acyl transferases include peroxisomal carnitine octanoyl transferase, which is
involved in the
3 o fatty acid beta-oxidation pathway, and mitochondria) carnitine palmitoyl
transferases, involved in fatty
acid metabolism and transport. Choline O-acetyl transferase catalyzes the
biosynthesis of the
neurotransmitter acetylcholine.
Amino transferases play key roles in protein synthesis and degradation, and
they contribute to
other processes as well. For example, the amino transferase 5-aminolevulinic
acid synthase catalyzes
3 5 the addition of succinyl-CoA to glycine, the first step in heme
biosynthesis. Other amino transferases
7


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
participate in pathways important for neurological function and metabolism.
For example, glutamine-
phenylpyruvate amino transferase, also known as glutamine transaminase K
(GTK), catalyzes several
reactions with a pyridoxal phosphate cofactor. GTK catalyzes the reversible
conversion of L-
glutamine and phenylpyruvate to 2-oxoglutaramate and L-phenylalanine. Other
amino acid substrates
s for GTK include L-methionine, L-histidine, and L-tyrosine. GTK also
catalyzes the conversion of
kynurenine to kynurenic acid, a tryptophan metabolite that is an antagonist of
the N-methyl-D-
aspartate (NMDA) receptor in the brain and may exert a neuromodulatory
function. Alteration of the
kynurenine metabolic pathway may be associated with several neurological
disorders. GTK also plays
a role in the metabolism of halogenated xenobiotics conjugated to glutathione,
leading to nephrotoxicity
Zo in rats and neurotoxicity in humans. GTK is expressed in kidney, liver, and
brain. Both human and rat
GTKs contain a putative pyridoxal phosphate binding site (ExPASy ENZYME: EC
2.6.1.64; Perry,
S.J. et al. (1993) Mol. Pharmacol. 43:660-665; Perry, S. et al. (1995) FEBS
Lett. 360:277-280; and
Alberati-Giani, D. et al. (1995) J. Neurochem. 64:1448-1455). A second amino
transferase associated
with this pathway is kynurenine/a-aminoadipate amino transferase (AadAT).
AadAT catalyzes the
is reversible conversion of a-aminoadipate and a-ketoglutarate to a-
ketoadipate and L-glutamate during
lysine metabolism. AadAT also catalyzes the transamination of kynurenine to
kynurenic acid. A
cytosolic AadAT is expressed in rat kidney, liver, anal brain (Nakatani, Y. et
al. (1970) Biochim.
Biophys. Acta 198:219-228; Buchli, R. et al. (1995) J. Biol. Chem. 270:29330-
29335).
Glycosyl transferases include the mammalian UDP-glucouronosyl trausferases, a
family of
2 o membrane-bound microsomal enzymes catalyzing the transfer of glucouronic
acid to lipophilic
substrates in reactions that play important roles in detoxification and
excretion of drugs, carcinogens,
and other foreign substances. Another mammalian glycosyl transferase,
mammalian UDP-galactose-
ceramide galactosyl trausferase, catalyzes the transfer of galactose to
ceramide in the synthesis of
galactocerebrosides in myelin membranes of the nervous system. The UDP-
glycosyl transferases
25 share a conserved signature domain of about 50 amino acid residues
(PROSITE: PDOC00359,
http://expasy.hcuge.ch/sprot/prosite.html).
Methyl transferases are involved in a variety of pharmacologically important
processes.
Nicotinamide N-methyl transferase catalyzes the N-methylation of nicotinamides
and other pyridines,
an important step in the cellular handling of drugs and other foreign
compounds. Phenylethanolamine
s o N-methyl transferase catalyzes the conversion of noradrenalin to
adrenalin. 6-O-methylguanine-DNA
methyl transferase reverses DNA methylation, an important step in
carcinogenesis. Uroporphyrin-I)I
C-methyl transferase, which catalyzes the transfer of two methyl groups from S-
adenosyl-L-
methionine to uroporphyrinogen III, is the first specific enzyme in the
biosynthesis of cobalamin, a
dietary enzyme whose uptake is deficient in pernicious anemia. Protein-
arginine methyl transferases
3 5 catalyze the posttranslational methylation of arginine residues in
proteins, resulting in the mono- and


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
dimethylation of arginine on the guanidino group. Substrates include histones,
myelin basic protein, and
heterogeneous nuclear ribonucleoproteins involved in mRNA processing,
splicing, and transport.
Protein-arginine methyl transferase interacts with proteins upregulated by
mitogens, with proteins
involved in chronic lymphocytic leukemia, and with interferon, suggesting an
important role for
methylation in cytokine receptor signaling (Lin, W.-J. et al. (1996) J. Biol.
Chem. 271:15034-15044;
Abramovich, C. et al. (1997) EMBO J. 16:260-266; and Scott, H.S. et al. (1998)
Genomics 48:330-
340).
Phosphotransferases catalyze the transfer of high-energy phosphate gxoups and
are important
in energy-requiring and -releasing reactions. The metabolic enzyme creative
kinase catalyzes the
~.o reversible phosphate transfer between creative/creatine phosphate and
ATP/ADP. Glycocyamine
kinase catalyzes phosphate transfer from ATP to guanidoacetate, and arginine
kinase catalyzes
phosphate transfer from ATP to arginine. A cysteine-containing active site is
conserved in this family
(PROSITE: PDOC00103).
Prenyl transferases are heterodimers, consisting of an alpha and a beta
subunit, that catalyze
s 5 the transfer of an isoprenyl group. An example of a prenyl transferase is
the mammalian protein
farnesyl transferase. The alpha subunit of farnesyl transferase consists of 5
repeats of 34 amino acids
each, with each repeat containing an invariant tryptophan (PROSTIB:
PDOC00703).
Saccharyl transferases are glycating enzymes involved in a variety of
metabolic processes.
Oligosacchryl transferase-48, for example, is a receptor for advanced
glycation endproducts.
2 o Accumulation of these endproducts is observed in vascular complications of
diabetes, macrovascular
disease, renal insufficiency, and Alzheimer's disease (Thornalley, P.J. (1998)
Cell Mol. Biol. (Noisy-
Le-Grand) 44:1013-1023).
Coenzyme A (CoA) transferase catalyzes the transfer of CoA between two
carboxylic acids.
Succinyl CoA:3-oxoacid CoA transferase, for example, transfers CoA from
succinyl-CoA to a
2s recipient such as acetoacetate. Acetoacetate is essential to the metabolism
of ketone bodies, which
accumulate in tissues affected by metabolic disorders such as diabetes
(PROSITE: PDOC00980).
Hydrolases
Hydrolysis is the breaking of a covalent bond in a substrate by introduction
of a molecule of
water. The reaction involves a nucleophilic attack by the water molecule's
oxygen atom on a target
3 o bond in the substrate. The water molecule is split across the target bond,
breaking the bond and
generating two product molecules. Hydrolases participate in reactions
essential to such functions as
synthesis and degradation of cell components, and for regulation of cell
functions including cell
signaling, cell proliferation, inflamation, apoptosis, secretion and
excretion. Hydrolases are involved in
key steps in disease processes involving these functions. Hydrolytic enzymes,
ox hydrolases, may be
3 5 grouped by substrate specificity into classes including phosphatases,
peptidases, lysophospholipases,


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
phosphodiesterases, glycosidases, and glyoxalases.
Phosphatases hydrolytically remove phosphate groups from proteins, an energy-
providing step
that regulates many cellular processes, including intracellular signaling
pathways that in turn control
cell growth and differentiation, cell-cell contact, the cell cycle, and
oncogenesis.
Lysophospholipases (LPLs) regulate intracellular lipids by catalyzing the
hydrolysis of ester
bonds to remove an acyl group, a key step in lipid degradation. Small LPL
isoforms, approximately
15-30 kD, function as hydrolases; larger isoforms function both as hydrolases
and transacylases. A
particular substrate for LPLs, lysophosphatidylcholine, causes lysis of cell
membranes. LPL activity is
regulated by signaling molecules important in numerous pathways, including the
inflammatory
1o response.
Peptidases, also called proteases, cleave peptide bonds that form the backbone
of peptide or
protein chains. Proteolytic processing is essential to cell growth,
differentiation, remodeling, and
homeostasis as well as inflammation and immune response. Since typical protein
half lives range from
hours to a few days, peptidases are continually cleaving precursor proteins to
their active form,
s5 removing signal sequences from targeted proteins, and degrading aged or
defective proteins.
Peptidases function in bacterial, parasitic, and viral invasion and
replication within a host. Examples of
peptidases include trypsin and chymotrypsin (components of the complement
cascade and the
blood-clotting cascade) lysosomal cathepsins, calpains, pepsin, renin, and
chymosin (Beynon, R.J. and
J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford University
Press, New York
ao NY, pp. 1-5).
The phosphodiesterases catalyze the hydrolysis of one of the two ester bonds
in a
phosphodiester compound. Phosphodiesterases are therefore crucial to a variety
of cellular processes.
Phosphodiesterases include DNA and RNA endo- and exo-nucleases, which are
essential to cell
growth and replication as well as protein synthesis. Another phosphodiesterase
is acid
25 sphingomyelinase, which hydrolyzes the membrane phospholipid sphingomyelin
to ceramide and
phosphorylcholine. Phosphorylcholine is used in the synthesis of
phosphatidylcholine, which is
involved in numerous intracellular signaling pathways. Ceramide is an
essential precursor for the
generation of gangliosides, membrane lipids found in high concentration in
neural tissue. Defective
acid sphingomyelinase phosphodiesterase leads to a build-up of sphingomyelin
molecules in lysosomes,
3 o resulting in Niemann-Pick disease.
Glycosidases catalyze the cleavage of hemiacetyl bonds of glycosides, which
are compounds
that contain one or more sugar. Mammalian lactase-phlorizin hydrolase, for
example, is an intestinal
enzyme that splits lactose. Mammalian beta-galactosidase removes the terminal
galactose from
gangliosides, glycoproteins, and glycosaminoglycans, and deficiency of this
enzyme is associated with
s s a gangliosidosis known as Morquio disease type B. Vertebrate lysosomal
alpha-glucosidase, which


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
hydrolyzes glycogen, maltose, and isomaltose, and vertebrate intestinal
sucrase-isomaltase, which
hydrolyzes sucrose, maltose, and isomaltose, are widely distributed members of
this family with highly
conserved sequences at their active sites.
The glyoxylase system is involved in gluconeogenesis, the production of
glucose from storage
compounds in the body. It consists of glyoxylase I, which catalyzes the
formation of S-D-
lactoylglutathione from methyglyoxal, a side product of triose-phosphate
energy metabolism, and
glyoxylase II, which hydrolyzes S-D-lactoylglutathione to D-lactic acid and
reduced glutathione.
Glyoxylases are involved in hyperglycemia, non-insulin-dependent diabetes
mellitus, the detoxification
of bacterial toxins, and in the control of cell proliferation and microtubule
assembly.
1o L ases
Lyases are a class of enzymes that catalyze the cleavage of C-C, C-O, C-N, C-
S, C-(halide),
P-O or other bonds without hydrolysis or oxidation to form two molecules, at
least one of which
contains a double bond (Stryer, L. (1995) Biochemistry W.H. Freeman and Co.
New York, NY
p.620). Lyases are critical components of cellular biochemistry with roles in
metabolic energy
production including fatty acid metabolism, as well as other diverse enzymatic
processes. Further
classification of lyases reflects the type of bond cleaved as well as the
nature of the cleaved group.
The group of C-C lyases include carboxyl-lyases (decarboxylases), aldehyde-
lyases
(aldolases), oxo-acid-lyases and others. The C-O lyase group includes hydro-
lyases, lyases acting on
polysaccharides and other lyases. The C-N lyase group includes ammonia-lyases,
amidine-lyases,
2 o amine-lyases (deaminases) and other lyases.
Proper regulation of lyases is critical to normal physiology. For example,
mutation induced
deficiencies in the uroporphyrinogen decarboxylase can lead to photosensitive
cutaneous lesions in the
genetically-linked disorder familial porphyria cutanea tarda (Mendez, M. et
al. (1998) Am. J. Genet.
63:1363-1375). It has also been shown that adenosine deaminase (ADA)
deficiency stems from
genetic mutations in the ADA gene, resulting in the disorder severe combined
immunodeficiency
disease (SC1D) (Hershfteld, M.S. (1998) Semin. Hematol. 35:291-298).
Isomerases
Isomerases are a class of enzymes that catalyze geometric or structural
changes within a
molecule to form a single product. This class includes racemases and
epimerases, cis-trans-
s o isomerases, intramolecular oxidoreductases, intramolecular transferases
(mutases) and intramolecular
lyases. Isomerases are critical components of cellular biochemistry with roles
in metabolic energy
production including glycolysis, as well as other diverse enzymatic processes
(Stryer, L. (1995)
Biochemistry, W.H. Freeman and Co., New York NY, pp.483-507).
Racemases are a subset of isomerases that catalyze inversion of a molecules
configuration
s s around the asymmetric carbon atom in a substrate having a single center of
asymmetry, thereby
11


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
interconverting two racemers. Epimerases are another subset of isomerases that
catalyze inversion of
configuration around an asymmetric carbon atom in a substrate with more than
one center of
symmetry, thereby interconverting two epimers. Racemases and epimerases can
act on amino acids
and derivatives, hydroxy acids and derivatives, as well as carbohydrates and
derivatives. The
interconversion of UDP-galactose and UDP-glucose is catalyzed by UDP-galactose-
4'-epimerase.
Proper regulation and function of this epimerase is essential to the synthesis
of glycoproteins and
glycolipids. Elevated blood galactose levels have been correlated with UDP-
galactose-4'-epimerase
deficiency in screening programs of infants (Gitzelmann, R. (1972) Helv.
Paediat. Acta 27:125-130).
Oxidoreductases can be isomerases as well. Oxidoreductases catalyze the
reversible transfer
so of electrons from a substrate that becomes oxidized to a substrate that
becomes reduced. This class
of enzymes includes dehydrogenases, hydroxylases, oxidases, oxygenases,
peroxidases, and
reductases. Proper maintenance of oxidoreductase levels is physiologically
important. For example,
genetically-linked deficiencies in lipoamide dehydrogenase can result in
lactic acidosis (Robinson, B.H.
et al. (1977) Pediat. Res. 11:1198-1202).
Another subgroup of isomerases are the transferases (or mutases). Transferases
transfer a
chemical group from one compound (the donor) to another compound (the
acceptor). The types of
groups transferred by these enzymes include acyl groups, amino groups,
phosphate groups
(phosphotransferases or phosphomutases), and others. The trausferase carnitine
paltnitoyltransferase
is an important component of fatty acid metabolism. Genetically-linked
deficiencies in this transferase
zo can lead to myopathy (Scriver, C.R. et al. (1995) The Metabolic and
Molecular Basis of Inherited
Disease, McGraw-Hill, New York NY, pp.1501-1533).
Yet another subgroup of isomerases are the topoisomersases. Topoisomerases are
enzymes
that affect the topological state of DNA. For example, defects in
topoisomerases or their regulation
can affect normal physiology. Reduced levels of topoisomerase II have been
correlated with some of
the DNA processing defects associated with the disorder ataxia-telangiectasia
(Singh, S.P. et al.
(1988) Nucleic Acids Res. 16:3919-3929).
Lig_ases
Ligases catalyze the formation of a bond between two substrate molecules. The
process
involves the hydrolysis of a pyrophosphate bond in ATP or a similar energy
donor. Ligases are
3 o classified based on the nature of the type of bond they form, which can
include carbon-oxygen,
carbon-sulfur, carbon-nitrogen, carbon-carbon and phosphoric ester bonds.
Ligases forming carbon-oxygen bonds include the aminoacyl-transfer RNA (tRNA)
synthetases which are important RNA-associated enzymes with roles in
translation. Protein
biosynthesis depends on each amino acid forming a linkage with the appropriate
tRNA. The
aminoacyl-tRNA synthetases are responsible for the activation and correct
attachment of an amino
12


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
acid with its cognate tRNA. The 20 aminoacyl-tRNA synthetase enzymes can be
divided into two
structural classes, and each class is characterized by a distinctive topology
of the catalytic domain.
Class I enzymes contain a catalytic domain based on the nucleotide-binding
Rossman fold. Class II
enzymes contain a central catalytic domain, which consists of a seven-stranded
antiparalle113-sheet
s motif, as well as N- and C- terminal regulatory domains. Class If enzymes
are separated into two
groups based on the heterodimeric or homodimeric structure of the enzyme; the
latter group is further
subdivided by the structure of the N- and C-terminal regulatory domains
(Hartlein, M. and S. Cusack
(1995) J. Mol. Evol. 40:519-530). Autoantibodies against aminoacyl-tRNAs are
generated by patients
with dermatomyositis and polymyositis, and correlate strongly with
complicating interstitial lung disease
~o (ILD). These antibodies appear to be generated in response to viral
infection, and coxsackie virus has
been used to induce experimental viral myositis in animals.
Ligases forming carbon-sulfur bonds (Acid-thiol ligases) mediate a large
number of cellular
biosynthetic intermediary metabolism processes involve intermolecular transfer
of carbon
atom-containing substrates (carbon substrates). Examples of such reactions
include the tricarboxylic
s5 acid cycle, synthesis of fatty acids and long-chain phospholipids,
synthesis of alcohols and aldehydes,
synthesis of intermediary metabolites, and reactions involved in the amino
acid degradation pathways.
Some of these reactions require input of energy, usually in the form of
conversion of ATP to either
ADP or AMP and pyrophosphate.
In many cases, a carbon substrate is derived from a small molecule containing
at least two
2 o carbon atoms. The carbon substrate is often covalently bound to a larger
molecule which acts as a
carbon substrate carrier molecule within the cell. In the biosynthetic
mechanisms described above, the
carrier molecule is coenzyme A. Coenzyme A (CoA) is structurally related to
derivatives of the
nucleotide ADP and consists of 4'-phosphopantetheine linked via a
phosphodiester bond to the alpha
phosphate group of adenosine 3',5'-bisphosphate. The terminal thiol group of
4'-phosphopantetheine
2s acts as the site for carbon substrate bond formation. The predominant
carbon substrates which utilize
CoA as a carrier molecule during biosynthesis and intermediary metabolism in
the cell are acetyl,
succinyl, and propionyl moieties, collectively referred to as acyl groups.
Other carbon substrates
include enoyl lipid, which acts as a fatty acid oxidation intermediate, and
carnitine, which acts as an
acetyl-CoA flux regulator/ mitochondrial acyl group transfer protein. Acyl-CoA
and acetyl-CoA are
3 o synthesized in the cell by acyl-CoA synthetase and acetyl-CoA synthetase,
respectively.,
Activation of fatty acids is mediated by at least three forms of acyl-CoA
synthetase activity: i)
acetyl-CoA synthetase, which activates acetate and several other low molecular
weight~carboxylic
acids and is found in muscle mitochondria and the cytosol of other tissues;
ii) medium-chain acyl-CoA
synthetase, which activates fatty acids containing between four and eleven
carbon atoms
3 s (predominantly from dietary sources), and is present only in liver
mitochondria; and iii) acyl CoA
13


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
synthetase, which is specific for long chain fatty acids with between six and
twenty carbon atoms, and
is found in microsomes and the mitochondria. Proteins associated with acyl-CoA
synthetase activity
have been identified from many sources including bacteria, yeast, plants,
mouse, and man. The
activity of acyl-CoA synthetase may be modulated by phosphorylation of the
enzyme by
s cAMP-dependent protein kinase.
Ligases forming carbon-nitrogen bonds include amide syntheses such as
glutamine synthetase
(glutamate-ammonia ligase) that catalyzes the amination of glutamic acid to
glutamine by ammonia
using the energy of ATP hydrolysis. Glutamine is the primary source for the
amino group in various
amide transfer reactions involved in de novo pyrimidine nucleotide synthesis
and in purine and
1o pyrimidine ribonucleotide interconversions. Overexpression of glutamine
synthetase has been
observed in primary liver cancer (Christa, L. et al. (1994) Gastroent.
106:1312-1320).
Acid-amino-acid ligases (peptide syntheses) are represented by the ubiquitin
proteases which
are associated with the ubiquitin conjugation system (UCS), a major pathway
for the degradation of
cellular proteins in eukaryotic cells and some bacteria. The UCS mediates the
elimination of abnormal
s5 proteins and regulates the half-lives of important regulatory proteins that
control cellular processes
such as gene transcription and cell cycle progression. In the UCS pathway,
proteins targeted for
degradation are conjugated to a ubiquitin (Ub), a small heat stable protein.
Ub is first activated by a
ubiquitin-activating enzyme (E1), anal then transferred to one of several Ub-
conjugating enzymes
(E2). E2 then links the Ub molecule through its C-terminal glycine to an
internal lysine (acceptor
20 lysine) of a target protein. The ubiquitinated protein is then recognized
and degraded by proteasome, a
large, multisubunit proteolytic enzyme complex, and ubiquitin is released for
reutilization by ubiquitin
protease. The UCS is implicated in the degradation of mitotic cyclic kinases,
oncoproteins, tumor
suppressor genes such as p53, viral proteins, cell surface receptors
associated with signal transduction,
transcriptional regulators, and mutated or damaged proteins (Ciechanover, A.
(1994) Cell 79:13-21).
2 s A murine proto-oncogene, Unp, encodes a nuclear ubiquitin protease whose
overexpression leads to
oncogenic transformation of NIH3T3 cells, and the human homolog of this gene
is consistently
elevated in small cell tumors and adenocarcinomas of the lung (Gray, D.A.
(1995) Oncogene
10:2179-2183).
Cyclo-ligases and other carbon-nitrogen ligases comprise various enzymes and
enzyme
3 o complexes that participate in the de novo pathways to purine and
pyrimidine biosynthesis. Because
these pathways are critical to the synthesis of nucleotides for replication of
both RNA and DNA,
many of these enzymes have been the targets of clinical agents for the
treatment of cell proliferative
disorders such as cancer and infectious diseases.
Purine biosynthesis occurs de novo from the amino acids glycine and glutamine,
and other
3 s small molecules. Three of the key reactions in this process are catalyzed
by a trifunctional enzyme
14


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
composed of glycinamide-ribonucleotide synthetase (GARS), aminoimidazole
ribonucleotide synthetase
(AIRS), and glycinamide ribonucleotide transformylase (GART). Together these
three enzymes
combine ribosylamine phosphate with glycine to yield phosphoribosyl
aminoimidazole, a precursor to
both adenylate and guanylate nucleotides. This trifunctional protein has been
implicated in the
s pathology of Downs syndrome (Aimi, J. et al. (1990) Nucleic Acid Res.
18:6665-6672).
Adenylosuccinate synthetase catalyzes a later step in purine biosynthesis that
converts inosinic acid to
adenylosuccinate, a key step on the path to ATP synthesis. This enzyme is also
similar to another
carbon-nitrogen ligase, argininosuccinate synthetase, that catalyzes a similar
reaction in the urea cycle
(Powell, S.M. et al. (1992) FEBS Lett. 303:4-10).
1o Like the de novo biosynthesis of purines, de novo synthesis of the
pyrimidine nucleotides
uridylate and cytidylate also arises from a common precursor, in this instance
the nucleotide orotidylate
derived from orotate and phosphoribosyl pyrophosphate (PPRP). Again a
trifunctional enzyme
comprising three carbon-nitrogen ligases plays a key role in the process. In
this case the enzymes
aspartate transcarbamylase (ATCase), carbamyl phosphate synthetase II, and
dihydroorotase
15 (DHOase) are encoded by a single gene called CAD. Together these three
enzymes combine the
initial reactants in pyrimidine biosynthesis, glutamine, COZ, and ATP to form
dihydroorotate, the
precursor to orotate and orotidylate (Iwahana, H. et al. (1996) Biochem.
Biophys. Res. Comrnun.
219:249-255). Further steps then lead to the synthesis of uridine nucleotides
from orotidylate.
Cytidine nucleotides are derived from uridine-5 =triphosphate (UTP) by the
amidation of UTP using
2 o glutamine as the amino donor and the enzyme CTP synthetase. Regulatory
mutations in the human
CTP synthetase are believed to confer multi-drug resistance to agents widely
used in cancer therapy
(Yamauchi, M. et al. (1990) EMBO J. 9:2095-2099).
Ligases forming carbon-carbon bonds include the carboxylases acetyl-CoA
carboxylase and
pyruvate carboxylase. Acetyl-CoA carboxylase catalyzes the carboxylation of
acetyl-CoA from COZ
z s and Hz0 using the energy of ATP hydrolysis. Acetyl-CoA carboxylase is the
rate-limiting step in the
biogenesis of long-chain fatty acids. Two isoforms of acetyl-CoA carboxylase,
types I and types II,
are expressed in human in a tissue-specific manner (Ha, J. et al. (1994) Eur.
J. Biochem. 219:297-
306). Pyruvate carboxylase is a nuclear-encoded mitochondrial enzyme that
catalyzes the conversion
of pyruvate to oxaloacetate, a key intermediate in the citric acid cycle.
3 o Ligases forming phosphoric ester bonds include the DNA ligases involved in
both DNA
replication and xepair. DNA ligases seal phosphodiester bonds between two
adjacent nucleotides in a
DNA chain using the energy from ATP hydrolysis to first activate the free 5'-
phosphate of one
nucleotide and then react it with the 3 =OH group of the adjacent nucleotide.
This resealing reaction is
used in both DNA replication to join small DNA fragments called Okazaki
fragments that are
3 5 transiently formed in the process of replicating new DNA, and in DNA
repair. DNA repair is the


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
process by which accidental base changes, such as those produced by oxidative
damage, hydrolytic
attack, or uncontrolled methylation of DNA, are corrected before replication
or transcription of the
DNA can occur. Bloom's syndrome is an inherited human disease in which
individuals are partially
deficient in DNA ligation and consequently have an increased incidence of
cancer (Alberts, B. et al.
(1994) The Molecular Biology of the Cell, Garland Publishing Inc., New York
NY, p. 247).
Molecules Associated with Growth and Development
Human growth and development requires the spatial and temporal regulation of
cell
differentiation, cell proliferation, and apoptosis. These processes
coordinately control reproduction,
1o aging, embryogenesis, morphogenesis, organogenesis, and tissue repair and
maintenance. At the
cellular level, growth and development is governed by the cell's decision to
enter into or exit from the
cell division cycle and by the cell's commitment to a terminally
differentiated state. These decisions
are made by the cell in response to extracellular signals and other
environmental cues it receives. The
following discussion focuses on the molecular mechanisms of cell division,
reproduction, cell
differentiation and proliferation, apoptosis, and aging.
Cell Division
Cell division is the fundamental process by which all living things grow and
reproduce. In
unicellular organisms such as yeast and bacteria, each cell division doubles
the number of organisms,
while in multicellular species many rounds of cell division are required to
replace cells lost by wear or
2 o by programmed cell death, and for cell differentiation to produce a new
tissue or organ. Details of the
cell division cycle may vary, but the basic process consists of three
principle events. The first event,
interphase, involves preparations for cell division, replication of the DNA,
and production of essential
proteins. In the second event, mitosis, the nuclear material is divided and
separates to opposite sides
of the cell. The final event, cytokinesis, is division and fission of the cell
cytoplasm. The sequence
and timing of cell cycle transitions is under the control of the cell cycle
regulation system which
controls the process by positive or negative regulatory circuits at various
check points.
Regulated progression of the cell cycle depends on the integration of growth
control pathways
with the basic cell cycle machinery. Cell cycle regulators have been
identified by selecting for human
and yeast cDNAs that block or activate cell cycle arrest signals in the yeast
mating pheromone
s o pathway when they are overexpressed. Known regulators include human CPR
(cell cycle progression
restoration) genes, such as CPR8 and CPR2, and yeast CDC {cell division
control) genes, including
CDC91, that block the arrest signals. The CPR genes express a variety of
proteins including cyclins,
tumor suppressor binding proteins, chaperones, transcription factors,
translation factors, and
RNA binding proteins {Edwards, M.C. et al.{1997) Genetics 147:1063-1076).
3 5 Several cell cycle transitions, including the entry and exit of a cell
from mitosis, are dependent
16


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
upon the activation and inhibition of cyclin-dependent kinases (Cdks). The
Cdks are composed of a
kinase subunit, Cdk, and an activating subunit, cyclin, in a complex that is
subject to many levels of
regulation. There appears to be a single Cdk in Saccharomyces cerevisiae and
Saccharomyces
ombe whereas mammals have a variety of specialized Cdks. Cyclins act by
binding to and activating
s cyclin-dependent protein kinases which then phosphorylate and activate
selected proteins involved in
the mitotic process. The Cdk-cyclin complex is both positively and negatively
regulated by
phosphorylation, and by targeted degradation involving molecules such as CDC4
and CDC53. In
addition, Cdks are further regulated by binding to inhibitors and other
proteins such as Suc1 that
modify their specificity or accessibility to regulators (Patra, D. and W.G.
Dunphy (1996) Genes Dev.
so 10:1503-1515; and Mathias, N. et al. (1996) Mol. Cell Biol. 16:6634-6643).
Reproduction
The male and female reproductive systems are complex and involve many aspects
of growth
and development. The anatomy and physiology of the male and female
reproductive systems are
reviewed in (Guyton, A.C. (1991) Textbook of Medical Physiolo~y, W.B. Saunders
Co., Philadelphia
15 PA, pp. 899-928).
The male reproductive system includes the process of spermatogenesis, in which
the sperm
are formed, and male reproductive functions are regulated by various hormones
and their effects on
accessory sexual organs, cellular metabolism, growth, and other bodily
functions.
Spermatogenesis begins at puberty as a result of stimulation by gonadotropic
hormones
2 o released from the anterior pituitary. Immature sperm (spermatogonia)
undergo several mitotic cell
divisions before undergoing meiosis and full maturation. The testes secrete
several male sex
hormones, the most abundant being testosterone, that is essential for growth
and division of the
immature sperm, and for the masculine characteristics of the male body. Three
other male sex
hormones, gonadotropin-releasing hormone (GnRII), luteinizing hormone (LIT),
and follicle-stimulating
25 hormone (FSI~ control sexual function.
The uterus, ovaries, fallopian tubes, vagina, and breasts comprise the female
reproductive
system. The ovaries and uterus are the source of ova and the location of fetal
development,
respectively. The fallopian tubes and vagina are accessory organs attached to
the top and bottom of
the uterus, respectively. Both the uterus and ovaries have additional roles in
the development and loss
s o of reproductive capability during a female's lifetime. The primary role of
the breasts is lactation.
Multiple endocrine signals from the ovaries, uterus, pituitary, hypothalamus,
adrenal glands, and other
tissues coordinate reproduction and lactation. These signals vary during the
montbly menstruation
cycle and during the female's lifetime. Similarly, the sensitivity of
reproductive organs to these
endocrine signals varies during the female's lifetime.
35 A combination of positive and negative feedback to the ovaries, pituitary
and hypothalamus
17


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
glands controls physiologic changes during the monthly ovulation and
endometrial cycles. The anterior
pituitary secretes two major gonadotropin hormones, follicle-stimulating
hormone (FSH) and luteinizing
hormone (LH), regulated by negative feedback of steroids, most notably by
ovarian estradiol. If
fertilization does not occur, estrogen and progesterone levels decrease. This
sudden reduction of the
s ovarian hormones leads to menstruation, the desquamation of the endometrium.
Hormones further govern all the steps of pregnancy, parturition, lactation,
and menopause.
During pregnancy large quantities of human chorionic gonadotropin (hCG),
estrogens, progesterone,
and human chorionic somatomammotropin (hCS) are formed by the placenta. hCG, a
glycoprotein
similar to luteinizing hormone, stimulates the corpus luteum to continue
producing more progesterone
so and estrogens, rather than to involute as occurs if the ovum is not
fertilized. hCS is similar to growth
hormone and is crucial for fetal nutrition.
The female breast also matures during pregnancy. Large amounts of estrogen
secreted by
the placenta trigger growth and branching of the breast milk ductal system
while lactation is initiated
by the secretion of prolactin by the pituitary gland.
15 Parturition involves several hormonal changes that increase uterine
contractility toward the
end of pregnancy, as follows. The levels of estrogens increase more than those
of progesterone.
Oxytocin is secreted by the neurohypophysis. Concomitantly, uterine
sensitivity to oxytocin increases.
The fetus itself secretes oxytocin, cortisol (from adrenal glands), and
prostaglandins.
Menopause occurs when most of the ovarian follicles have degenerated. The
ovary then
2o produces less estradiol, reducing the negative feedback on the pituitary
and hypothalamus glands.
Mean levels of circulating FSH and LH increase, even as ovulatory cycles
continue. Therefore, the
ovary is less responsive to gonadotropins, and there is an increase in the
time between menstrual
cycles. Consequently, menstrual bleeding ceases and reproductive capability
ends.
Cell Differentiation and Proliferation
2s Tissue growth involves complex and ordered patterns of cell proliferation,
cell differentiation,
and apoptosis. Cell proliferation must be regulated to maintain both the
number of cells and their
spatial organization. This regulation depends upon the appropriate expression
of proteins which control
cell cycle progression in response to extracellular signals, such as growth
factors and other mitogens,
and intracellular cues, such as DNA damage or nutrient starvation. Molecules
which directly or
3 o indirectly modulate cell cycle progression fall into several categories,
including growth factors and
their receptors, second messenger and signal transduction proteins, oncogene
products, tumor-
suppressor proteins, and mitosis-promoting factors.
Growth factors were originally described as serum factors required to promote
cell
proliferation. Most growth factors are large, secreted polypeptides that act
on cells in their local
s s environment. Growth factors bind to and activate specific cell surface
receptors and initiate
1~


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
intracellular signal transduction cascades. Many growth factor receptors are
classified as receptor
tyrosine kinases which undergo autophosphorylation upon ligand binding.
Autophosphorylation enables
the receptor to interact with signal transduction proteins characterized by
the presence of SH2 or SH3
domains (Src homology regions 2 or 3). These proteins then modulate the
activity state of small G-
proteins, such as Ras, Rab, and Rho, along with GTPase activating proteins
(GAPS), guanine
nucleotide releasing proteins (GNRPs), and other guanine nucleotide exchange
factors. Small G
proteins act as molecular switches that activate other downstream events, such
as mitogen-activated
protein kinase (MAP kinase) cascades. MAP kinases ultimately activate
transcription of mitosis-
promoting genes.
so In addition to growth factors, small signaling peptides and hormones also
influence cell
proliferation. These molecules bind primarily to another class of receptor,
the trimeric G-protein
coupled receptor (GPCR), found predominantly on the surface of immune,
neuronal and
neuroendocrine cells. Upon ligand binding, the GPCR activates a trimeric G
protein which in turn
triggers increased levels of intracellular second messengers such as
phospholipase C, Ca2+, and cyclic
AMP. Most GPCR-mediated signaling pathways indirectly promote cell
proliferation by causing the
secretion or breakdown of other signaling molecules that have direct mitogenic
effects. These
signaling cascades often involve activation of kinases and phosphatases. Some
growth factors, such
as some members of the transforming growth factor beta (TGF-(3) family, act on
some cells to
stimulate cell proliferation and on other cells to inhibit it. Growth factors
may also stimulate a cell at
one concentration and inhibit the same cell at another concentration. Most
growth factors also have a
multitude of other actions besides the regulation of cell growth and division:
they can control the
proliferation, survival, differentiation, migration, or function of cells
depending on the circumstance.
For example, the tumor necrosis factor/nerve growth factor (~'NF/NGF) family
can activate or inhibit
cell death, as well as regulate proliferation and differentiation. The cell
response depends on the type
2 s of cell, its stage of differentiation and transformation status, which
surface receptors are stimulated,
and the types of stimuli acting on the cell (Smith, A. et al. (1994) Cell
76:959-962; and Nocentini, G. et
al. (1997) Proc. Natl. Acad. Sci. USA 94:6216-6221).
Neighboring cells in a tissue compete for growth factors, and when provided
with 'unlimited"
quantities in a perfused system will grow to even higher cell densities before
reaching density-
s o dependent inhibition of cell division. Cells often demonstrate au
anchorage dependence of cell division
as well. This anchorage dependence may be associated with the formation of
focal contacts lirkirig
the cytoskeleton with the extracellular matrix (ECM). The expression of ECM
components can be
stimulated by growth factors. For example, TGF-(3 stimulates fibroblasts to
produce a variety of ECM
proteins, including fibronectin, collagen, and tenascin (Pearson, C.A. et al.
(1988) EMBO J. 7:2677-
35 2981). Iu fact, for some cell types specific ECM molecules, such as laminin
or fibronectin, may act as
19


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
growth factors. Tenascin-C and -R, expressed in developing and lesioned neural
tissue, provide
stimulatorylanti-adhesive or inhibitory properties, respectively, for axonal
growth (Faissner, A. (1997)
Cell Tissue Res. 290:331-341).
Cancers are associated with the activation of oncogenes which are derived from
normal
cellular genes. These oncogenes encode oncoproteins which convert normal cells
into malignant cells.
Some oncoproteins are mutant isoforms of the normal protein, and other
oncoproteins are abnormally
expressed with respect to location or amount of expression. The latter
category of oncoprotein causes
cancer by altering transcriptional control of cell proliferation. Five classes
of oncoproteins are known
to affect cell cycle controls. These classes include growth factors, growth
factor receptors,
1 o intracellular signal transducers, nuclear transcription factors, and cell-
cycle control proteins. Viral
oncogenes are integrated into the human genome after infection of human cells
by certain viruses.
Examples of viral oncogenes include v-src, v-abl, and v-fps.
Many oncogenes have been identified and characterized. These include sis,
erbA, erbB, her
2, mutated GS, src, abl, ras, crk, jun, fos, myc, and mutated tumor-suppressor
genes such as RB, p53,
mdm2, Cipl, p16, and cyclin D. Transformation of normal genes to oncogenes may
also occur by
chromosomal translocation. The Philadelphia chromosome, characteristic of
chronic myeloid leukemia
and a subset of acute lymphoblastic leukemias, results from a reciprocal
translocation between
chromosomes 9 and 22 that moves a truncated portion of the proto-oncogene c-
abl to the breakpoint
cluster region (bcr) on chromosome 22.
2 o Tumor-suppressor genes are involved in regulating cell proliferation.
Mutations which cause
reduced or loss of function in tumor-suppressor genes result in uncontrolled
cell proliferation. For
example, the retinoblastoma gene product (RB), in a non-phosphorylated state,
binds several early-
response genes and suppresses their transcription, thus blocking cell
division. Phosphorylation of RB
causes it to dissociate from the genes, releasing the suppression, and
allowing cell division to proceed.
2 5 A '~OptOSls
Apoptosis is the genetically controlled process by which unneeded or defective
cells undergo
programmed cell death. Selective elimination of cells is as important for
morphogenesis and tissue
remodeling as is cell proliferation and differentiation. Lack of apoptosis may
result in hyperplasia and
other disorders associated with increased cell proliferation. Apoptosis is
also a critical component of
3 o the immune response. Immune cells such as cytotoxic T-cells and natural
killer cells prevent the
spread of disease by inducing apoptosis in tumor cells and virus-infected
cells. In addition, immune
cells that fail to distinguish self molecules from foreign molecules must be
eliminated by apoptosis to
avoid an autoimmune response.
Apoptotic cells undergo distinct morphological changes. Hallmarks of apoptosis
include cell
3 5 shrinkage, nuclear and cytoplasmic condensation, and alterations in plasma
membrane topology.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Biochemically, apoptotic cells are characterized by increased intracellular
calcium concentration,
fragmentation of chromosomal DNA, and expression of novel cell surface
components.
The molecular mechanisms of apoptosis are highly conserved, and many of the
key protein
regulators and effectors of apoptosis have been identified. Apoptosis
generally proceeds in response
s to a signal which is transduced intracellularly and results in altered
patterns of gene expression and
protein activity. Signaling molecules such as hormones and cytokines are known
both to stimulate and
to inlubit apoptosis through interactions with cell surface receptors.
Transcription factors also play an
important role in the onset of apoptosis. A number of downstream effector
molecules, particularly
proteases such as the cysteine proteases called caspases, have been implicated
in the degradation of
so cellular components and the proteolytic activation of other apoptotic
effectors.
A~m~ and Senescence
Studies of the aging process or senescence have shown a number of
characteristic cellular
and molecular changes (Fauci et al. (1998) Ilarrison's Princi.,~les of
Internal Medicine, McGraw-Hill,
New York NY, p.37). These characteristics include increases in chromosome
structural
1s abnormalities, DNA cross-linking, incidence of single-stranded breaks in
DNA, losses in DNA
methylation, and degradation of telomere regions. In addition to these DNA
changes, post-
translational alterations of proteins increase including, deamidation,
oxidation, cross-linking, and
nonenzymatic glycation. Still further molecular changes occur in the
mitochondria of aging cells
through deterioration of structure. These changes eventually contribute to
decreased function in every
z o organ of the body.
Biochemical Pathway Molecules
Biochemical pathways are responsible for regulating metabolism, growth and
development,
protein secretion and trafficking, environmental responses, and ecological
interactions including
immune response and response to parasites.
DNA replication
Deoxyribonucleic acid (DNA), the genetic material, is found in both the
nucleus and
mitochondria of human cells. The bulk of human DNA is nuclear, in the form of
linear chromosomes,
while mitochondria) DNA is circular. DNA replication begins at specific sites
called origins of
3 o replication. Bidirectional synthesis occurs from the origin via two
growing forks that move in opposite
directions. Replication is semi-conservative, with each daughter duplex
containing one old strand and
its newly synthesized complementary partner. Proteins involved in DNA
replication include DNA
polymerises, DNA primase, telomerase, DNA helicase, topoisomerases, DNA
ligases, replication
factors, and DNA-binding proteins.
3 5 DNA Recombination and Repair
21


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Cells are constantly faced with replication errors and environmental assault
(such as
ultraviolet irradiation) that can produce DNA damage. Damage to DNA consists
of any change that
modifies the structure of the molecule. Changes to DNA can be divided into two
general classes,
single base changes and structural distortions. Any damage to DNA can produce
a mutation, and the
s mutation may produce a disorder, such as cancer.
Changes in DNA are recognized by repair systems within the cell. These repair
systems act
to correct the damage and thus prevent any deleterious affects of a mutational
event. Repair systems
can be divided into three general types, direct repair, excision repair, and
retrieval systems. Proteins
involved in DNA repair include DNA polymerase, excision repair proteins,
excision and cross link
so repair proteins, recombination and repair proteins, RAD51 proteins, and BLN
and WRN proteins that
are homologs of RecQ helicase. When the repair systems are eliminated, cells
become exceedingly
sensitive to environmental mutagens, such as ultraviolet irradiation. Patients
with disorders associated
with a loss in DNA repair systems often exhibit a high sensitivity to
environmental mutagens.
Examples of such disorders include xeroderma pigmentosum (XP), Bloom's
syndrome (BS), and
15 Werner's syndrome (WS) (Yamagata, K.. et al. (1998) Proc. Natl. Acad. Sci.
USA 95:8733-8738),
ataxia telangiectasia, Cockayne's syndrome, and Fanconu's anemia.
Recombination is the process whereby new DNA sequences are generated by the
movements of large pieces of DNA. In homologous recombination, which occurs
during meiosis and
DNA repair, parent DNA duplexes align at regions of sequence similarity, and
new DNA molecules
2 o form by the breakage and joining of homologous segments. Proteins involved
include RAD51
recombinase. In site-specific recombination, two speciftc but not necessarily
homologous DNA
sequences are exchanged. In the immune system this process generates a diverse
collection of
antibody and T cell receptor genes. Proteins involved in site-specific
recombination in the immune
system include recombination activating genes 1 and 2 (RAGl and RAG2). A
defect in immune
25 system site-specific recombination causes severe combined immunodeficiency
disease in mice.
RNA Metabolism
Ribonucleic acid (RNA) is a linear single-stranded polymer of four
nucleotides, ATP, CTP,
UTP, and GTP. In most organisms, RNA is transcribed as a copy of DNA, the
genetic material of
the organism. In. retroviruses RNA rather than DNA serves as the genetic
material. RNA copies of
s o the genetic material encode proteins or serve various structural,
catalytic, or regulatory roles in
organisms. RNA is classified according to its cellular localization and
function. Messenger RNAs
(mRNAs) encode polypeptides. Ribosomal RNAs (rRNAs) are assembled, along with
ribosomal
proteins, into ribosomes, which are cytoplasmic particles that translate mRNA
into polypeptides.
Transfer RNAs (tRNAs) are cytosolic adaptor molecules that function in mRNA
translation by
3 s recognizing both an mRNA codon and the amino acid that matches that codon.
Heterogeneous
22


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
nuclear RNAs (huRNAs) include mRNA precursors and other nuclear RNAs of
various sizes. Small
nuclear RNAs (snRNAs) are a part of the nuclear spliceosome complex that
removes intervening,
non-coding sequences (introits) and rejoins exons in pre-mRNAs.
RNA Transcription
s The transcription process synthesizes an RNA copy of DNA. Proteins involved
include
mufti-subunit RNA polymerases, transcription factors IIA, IlB, Im, BE, BF',
IIH, and ILT. Many
transcription factors incorporate DNA binding structural motifs which comprise
either a helices or (3-
sheets that bind to the major groove of DNA. Four well-characterized
structural motifs are helix-turn-
helix, zinc finger, leucine zipper, and helix-loop-helix.
1o RNA Processing
Various proteins are necessary for processing of transcribed RNAs in the
nucleus. Pre-
mRNA processing steps include capping at the S' end with methylguanosine,
polyadenylating the 3'
end, and splicing to remove introits. The spliceosomal complex is comprised of
five small nuclear
ribonucleoprotein particles (snRNPs) designated U1, U2, U4, U5, and U6. Each
snRNP contains a
15 single species of snRNA and about ten proteins. The RNA components of some
snRNPs recognize
and base-pair with introit consensus sequences. The protein components mediate
spliceosome
assembly and the splicing reaction. Autoantibodies to snRNP proteins are found
in the blood of
patients with systemic lupus erythematosus (Stryer, L. (1995) Biochemistry
W.H. Freeman and
Company, New fork NY, p. 863).
2 o Heterogeneous nuclear ribonucleoproteins (hnRNPs) have been identified
that have roles in
splicing, exporting of the mature RNAs to the cytoplasm, and mRNA translation
(Biamonti, G. et al.
(1998) Clip. Exp. Rheumatol. 16:317-326). Some examples of hnRNPs include the
yeast proteins
Hrplp, involved in cleavage and polyadenylation at the 3' end of the RNA;
Cbp80p, involved in
capping the 5' end of the RNA; and Npl3p, a homolog of mammalian hnRNP A1,
involved in export of
25 mRNA from the nucleus (Shen, E.C. et al. (1998) Genes Dev. 12:679-691).
HnRNPs have been
shown to be important targets of the autoimmune response in rheumatic diseases
(Biainonti, su ra).
Many snRNP proteins, hnRNP proteins, and alternative splicing factors are
characterized by
an RNA recognition motif (RRM). (Reviewed in Birney, E. et al. (1993) Nucleic
Acids Res. 21:5803-
5816.) The RRM is about 80 amino acids in length and forms four (3-strands and
two a-helices
3 o arranged in an a/(3 sandwich. The RRM contains a core RNP-1 octapeptide
motif along with
surrounding conserved sequences.
RNA Stability and Degradation
RNA helicases alter and regulate RNA conformation and secondary structure by
using
energy derived from ATP hydrolysis to destabilize and unwind RNA duplexes. The
most well-
3 s characterized and ubiquitous family of RNA helicases is the DEAD-box
family, so named for the
23


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
conserved t3-type A'TP-binding motif which is diagnostic of proteins in this
family. Over 40 DEAD-
box helicases have been identified in organisms as diverse as bacteria,
insects, yeast, amphibians,
mammals, and plants. DEAD-box helicases function in diverse processes such as
translation initiation,
splicing, ribosome assembly, and RNA editing, transport, and stability. Some
DEAD-box helicases
s play tissue- and stage-specific roles in spermatogenesis and embryogenesis.
(Reviewed in Linden, P.
et al. (1989) Nature 337:121-122.)
Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the
progression of
neuroblastoma (Nb) and retinoblastoma (Rb) tumors. Other DEAD-box helicases
have been
implicated either directly or indirectly in ultraviolet light-induced tumors,
B cell lymphoma, and myeloid
so malignancies. (Reviewed in Godbout, R. et al. (1998) J. Biol. Chem.
273:21161-21168.)
Ribonucleases (RNases) catalyze the hydrolysis of phosphodiester bonds in RNA
chains, thus
cleaving the RNA. For example, RNase P is a ribonucleoprotein enzyme which
cleaves the 5' end of
pre-tRNAs as part of their maturation process. RNase H digests the RNA strand
of an RNA/DNA
hybrid. Such hybrids occur in cells invaded by retroviruses, and RNase H is an
important enzyme in
~s the retroviral replication cycle. RNase H domains are often found as a
domain associated with
reverse transcriptases. RNase activity in serum and cell extracts is elevated
in a variety of cancers
and infectious diseases (Schein, C.H. (1997) Nat. Biotechnol. 15:529-536).
Regulation of RNase
activity is being investigated as a means to control tumor angiogenesis,
allergic reactions, viral
infection and replication, and fungal infections.
a o Protein Translation
The eukaryotic ribosome is composed of a 60S (large) subunit and a 40S (small)
subunit,
which together form the 80S ribosome. In addition to the 185, 28S, 5S, and
5.85 rRNAs, the ribosome
also contains more than fifty proteins. The ribosomal proteins have a preftx
which denotes the subunit
to which they belong, either L (large) or S (small). Three important sites are
identified on the
2s ribosome. The aminoacyl-tRNA site (A site) is where charged tRNAs (with the
exception of the
initiator-tRNA) bind on arnval at the ribosome. The peptidyl-tRNA site (P
site) is where new peptide
bonds are formed, as well as where the initiator tRNA binds. The exit site (E
site) is where
deacylated tRNAs bind prior to their release from the ribosome. (Translation
is reviewed in Stryer, L.
(1995) Biochemistry, W.H. Freeman and Company, New York NY, pp. 875-908; and
Lodish, H. et al.
30 (1995) Molecular Cell Biology, Scientific American Books, New York NY, pp.
119-138.)
tRNA Charging
Protein biosynthesis depends on each amino acid forming a linkage with the
appropriate
tRNA. The aminoacyl-tRNA synthetases are responsible for the activation and
correct attachment of
an amino acid with its cognate tRNA. The 20 aminoacyl-tRNA synthetase enzymes
can be divided
3 5 lnto two structural classes, Class I and Class II. Autoantibodies against
aminoacyl-tRNAs are
24


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
generated by patients with dermatomyositis and polymyositis, and correlate
strongly with complicating
interstitial lung disease (IL,D). These antibodies appear to be generated in
response to viral infection,
and coxsackie virus has been used to induce experimental viral myositis in
animals.
Translation Initiation
s , Initiation of translation can be divided into three stages. The first
stage brings an initiator
transfer RNA (Met-tRNAf) together with the 40S ribosomal subunit to form the
43S preinitiation
complex. The second stage binds the 43S preinitiation complex to the mRNA,
followed by migration
of the complex to the correct AUG initiation codon. The third stage brings the
60S ribosomal subunit
to the 40S subunit to generate an 80S ribosome at the initiation codon.
Regulation of translation
so primarily involves the first and second stage in the initiation process
(Pain, V.M. (1996) Eur. J.
Biochem. 236:747-771).
Several initiation factors, many of which contain multiple subunits, are
involved in bringing an
initiator tRNA and 40S ribosomal subunit together. eIF2, a guanine nucleotide
binding protein, recruits
the initiator tRNA to the 40S ribosomal subunit. Only when eIF2 is bound to
GTP does it associate
1s with the initiator tRNA. eIF2B, a guanine nucleotide exchange protein, is
responsible for converting
eIF2 from the GDP-bound inactive form to the GTP-bound active form. Two other
factors, eIFlA
and eIF3 bind and stabilize the 40S subunit by interacting with 18S ribosomal
RNA and specific
ribosomal structural proteins. elF3 is also involved in association of the 40S
ribosomal subunit with
mRNA. The Met-tRNAf, elFlA, elF3, and 40S ribosomal subunit together make up
the 43S
a o preinitiation complex (Pain, sera).
Additional factors are required for binding of the 43S preinitiation complex
to an mRNA
molecule, and the process is regulated at several levels. eIF4F is a complex
consisting of three
proteins: eIF4E, eIF4A, and elF4G. elF4E recognizes and binds to the mRNA 5
=terniinal m'GTP
cap, eIF4A is a bidirectional RNA-dependent helicase, and elF4G is a
scaffolding polypeptide. eIF4G
2 s has three binding domains. The N-terminal third of eIF4G interacts with
elF4E, the central third
interacts with eIF4A, and the C-terminal third interacts with elF3 bound to
the 43S preinitiation
complex. Thus, elF4G acts as a bridge between the 40S ribosomal subunit and
the mRNA (Hentze,
M.W. (1997) Science 275:500-501).
The ability of eIF4F to initiate binding of the 43S preinitiation complex is
regulated by
3 o structural features of the mRNA. The mRNA molecule has an untranslated
region (UTR) between
the 5' cap and the AUG start codon. In some mRNAs this region forms secondary
structures that
impede binding of the 43S preinitiation complex. The helicase activity of
elF4A is thought to function
in removing this secondary structure to facilitate binding of the 43S
preinitiation complex (Pain, s-upra).
Translation Elongation
3 5 Elongation is the process whereby additional amino acids are joined to the
initiator methionine


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
to form the complete polypeptide chain. The elongation factors EFla, EF1(3 y,
and EFZ are involved
in elongating the polypeptide chain following initiation. EFla is a GTP-
binding protein. Iu EFla's
GTP-bound form, it brings an arninoacyl-tRNA to the ribosome's A site. The
amino acid attached to
the newly arrived aminoacyl-tRNA forms a peptide bond with the initiator
methionine. The GTP on
s EFla is hydrolyzed to GDP, and EFla-GDP dissociates from the ribosome. EF1(3
y binds EFla -
GDP and induces the dissociation of GDP from EFla, allowing EFla to bind GTP
and a new cycle to
begin.
As subsequent aminoacyl-tRNAs are brought to the ribosome, EF-G, another GTP
binding
protein, catalyzes the translocation of tRNAs from the A site to the P site
and finally to the E site of
1o the ribosome. This allows the processivity of translation.
Translation Termination
The release factor eRF carries out termination of translation. eRF recognizes
stop codons in
the mRNA, leading to the release of the polypeptide chain from the ribosome.
Post-Translational Pathways
15 Proteins may be modified after translation by the addition of phosphate,
sugar, prenyl, fatty
acid, and other chemical groups. These modifications are often required for
proper protein activity.
Enzymes involved in post-trauslational modification include kinases,
phosphatases,
glycosyltransferases, and prenyltransferases. The conformation of proteins may
also be modified
after translation by the introduction and rearrangement of disulfide bonds
(rearrangement catalyzed by
2 o protein disulfide isomerase), the isomerization of proline sidechains by
prolyl isomerase, and by
interactions with molecular chaperone proteins.
Proteins may also be cleaved by proteases. Such cleavage may result in
activation,
inactivation, or complete degradation of the protein. Proteases include serine
proteases, cysteine
proteases, aspartic proteases, and metalloproteases. Signal peptidase in the
endoplasmic reticulum
2 s (ER) lumen cleaves the signal peptide from membrane or secretory proteins
that are imported into the
ER. Ubiquitin proteases are associated with the ubiquitin conjugation system
(UCS), a major pathway
for the degradation of cellular proteins in eukaryotic cells and some
bacteria. The UCS mediates the
elimination of abnormal proteins and regulates the half lives of important
regulatory proteins that
control cellular processes such as gene transcription and cell cycle
progression. In the UCS pathway,
3 o proteins targeted for degradation are conjugated to a ubiquitin, a small
heat stable protein. Proteins
involved in the UCS include ubiquitin-activating enzyme, ubiquitin-conjugating
enzymes, ubiquitiu-
ligases, and ubiquitin C-terminal hydrolases. The ubiquitinated protein is
then recognized and degraded
by the proteasome, a large, multisubunit proteolytic enzyme complex, and
ubiquitin is released for
reutilization by ubiquitin protease.
3 5 Lipid Metabolism
26


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Lipids are water-insoluble, oily or greasy substances that are soluble in
nonpolar solvents such
as chloroform or ether. Neutral fats (triacylglycerols) serve as major fuels
and energy stores. Polar
lipids, such as phospholipids, sphingolipids, glycolipids, and cholesterol,
are key structural components
of cell membranes.
Lipid metabolism is involved in human diseases and disorders. In the arterial
disease
atherosclerosis, fatty lesions form on the inside of the arterial wall. These
lesions promote the loss of
arterial flexibility and the formation of blood clots (Guyton, A.C. Textbook
of Medical Physiolo~y
(1991) W.B. Saunders Company, Pluladelphia PA, pp.760-763). In Tay-Sachs
disease, the GMZ
ganglioside (a sphingolipid) accumulates in lysosomes of the central nervous
system due to a lack of
1o the enzyme N-acetylhexosaminidase. Patients suffer nervous system
degeneration leading to early
death (Fauci, A.S. et al. (1998) Harrison's Principles of Internal Medicine
McGraw-Hill, New York
NY, p. 2171). The Niemann-Pick diseases are caused by defects in lipid
metabolism. Niemann-Pick
diseases types A and B are caused by accumulation of sphingomyelin (a
sphingolipid) and other lipids
in the central nervous system due to a defect in the enzyme sphingomyelinase,
leading to
15 neurodegeneration and lung disease. Niemann-Pick disease type C results
from a defect in
cholesterol transport, leading to the accumulation of sphingomyelin and
cholesterol in lysosomes and a
secondary reduction in sphingomyelinase activity. Neurological symptoms such
as grand mal seizures,
ataxia, and loss of previously learned speech, manifest 1-2 years after birth.
A mutation in the NPC
protein, which contains a putative cholesterol-sensing domain, was found in a
mouse model of
2o Niemann-Pick disease type C (Fauci, su ra, p. 2175; Loftus, S.K. et al.
(1997) Science 277:232-235).
(Lipid metabolism is reviewed in Stryer, L. (1995) Biochemistry, W.H. Freeman
and Company, New
York NY; Lehninger, A. (1982) Principles of Biochemistry Worth Publishers,
Inc., New York NY;
and ExPASy "Biochemical Pathways" index of Boehringer Mannheim World Wide Web
site.)
Fatty Acid Synthesis
2s Fatty acids are long-chain organic acids with a single carboxyl group and a
long non-polar
hydrocarbon tail. Long-chain fatty acids are essential components of
glycolipids, phospholipids, and
cholesterol, which are building blocks for biological membranes, and of
triglycerides, which are
biological fuel molecules. Long-chain fatty acids are also substrates for
eicosanoid production, and
are important in the functional modification of certain complex carbohydrates
and proteins. 16-carbon
s o and 18-carbon fatty acids are the most common.
Fatty acid synthesis occurs in the cytoplasm. In the first step, acetyl-
Coenzyme A (CoA)
carboxylase (ACC) synthesizes malonyl-CoA from acetyl-CoA and bicarbonate. The
enzymes which
catalyze the remaining reactions are covalently linked into a single
polypeptide chain, referred to as the
multifunctional enzyme fatty acid synthase (FAS). FAS catalyzes the synthesis
of paltnitate from
s acetyl-CoA and malonyl-CoA. FAS contains acetyl transferase, malonyl
transferase, ~i-ketoacetyl
27


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
synthase, acyl carrier protein, (3-ketoacyl reductase, dehydratase, enoyl
reductase, and thioesterase
activities. The final product of the FAS reaction is the 16-carbon fatty acid
palmitate. Further
elongation, as well as unsaturation, of palmitate by accessory enzymes of the
ER produces the variety
of long chain fatty acids required by the individual cell. These enzymes
include a NADH-cytochrome
s b5 reductase, cytochrome b5, and a desaturase.
Phospholipid and Triac~glycerol Synthesis
Triacylglycerols, also known as triglycerides and neutral fats, are major
energy stores in
animals. Triacylglycerols are esters of glycerol with three fatty acid chains.
Glycerol-3-phosphate is
produced from dihydroxyacetone phosphate by the enzyme glycerol phosphate
dehydrogenase or from
o glycerol by glycerol kinase. Fatty acid-CoA's are produced from fatty acids
by fatty acyl-CoA .
synthetases. Glyercol-3-phosphate is acylated with two fatty acyl-CoA's by the
enzyme glycerol
phosphate acyltransferase to give phosphatidate. Phosphatidate phosphatase
converts phosphatidate
to diacylglycerol, which is subsequently acylated to a triacylglyercol by the
enzyme diglyceride
acyltransferase. Phosphatidate phosphatase and diglyceride acyltransferase
form a triacylglyerol
s s synthetase complex bound to the ER membrane.
A major class of phospholipids are the phosphoglycerides, which are composed
of a glycerol
backbone, two fatty acid chains, and a phosphorylated alcohol.
Phosphoglycerides are components of
cell membranes. Principal phosphoglycerides are phosphatidyl choline,
phosphatidyl ethanolamine,
phosphatidyl serine, phosphatidyl inositol, and diphosphatidyl glycerol. Many
enzymes involved in
2o phosphoglyceride synthesis are associated with membranes (Meyers, R.A.
(1995) Molecular Biology
and Biotechnolo~y, VCH Publishers Inc., New York NY, pp. 494-501).
Phosphatidate is converted to
CDP-diacylglycerol by the enzyme phosphatidate cytidylyltransferase (ExPASy
ENZYME EC
2.7.7.41). Transfer of the diacylglyeerol group from CDP-diacylglycerol to
serine to yield
phosphatidyl serine, or to inositol to yield phosphatidyl inositol, is
catalyzed by the enzymes CDP-
z5 diacylglycerol-serine O-phosphatidyltransferase and CDP-diacylglycerol-
inositol 3-
phosphatidyltransferase, respectively (ExPASy ENZYME EC 2.7.8.8; ExPASy ENZYME
EC
2.7.8.11). The enzyme phosphatidyl serine decarboxylase catalyzes the
conversion of phosphatidyl
serine to phosphatidyl ethanolamine, using a pyruvate cofactor (Voelker, D.R.
(1997) Biochim.
Biophys. Acta 1348:236-244). Phosphatidyl choline is formed using diet-derived
choline by the
3 o reaction of CDP-choline with 1,2-diacylglycerol, catalyzed by
diacylglycerol
cholinephosphotransferase (ExPASy ENZYME 2.7.8.2).
Sterol, Steroid, and Isoprenoid Metabolism
Cholesterol, composed of four fused hydrocarbon rings with an alcohol at one
end, moderates
the fluidity of membranes in which it is incorporated. In addition,
cholesterol is used in the synthesis of
35 steroid hormones such as cortisol, progesterone, estrogen, and
testosterone. Bile salts derived from
28


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cholesterol facilitate the digestion of lipids. C;halesterol in the skin forms
a barrier that prevents excess
water evaporation from the body. Farnesyl and geranylgeranyl groups, which are
derived from
cholesterol biosynthesis intermediates, are post-translationally added to
signal transduction proteins
such as ras and protein-targeting proteins such as rob. These modifications
are important for the
s activities of these proteins (Guyton, su ra; Stryer, supra, pp. 279-280, 691-
702, 934).
Mammals obtain cholesterol derived from both de novo biosynthesis and the
diet. The livex is
the major site of cholesterol biosynthesis in mammals. Two acetyl-CoA
molecules initially condense
to form acetoacetyl-CoA, catalyzed by a thiolase. Acetoacetyl-CoA condenses
with a third acetyl-
CoA to form hydroxymethylglutaryl-CoA (HMG-CoA), catalyzed by HMG-CoA
synthase.
so Conversion of HMG-CoA to cholesterol is accomplished via a series of
enzymatic steps known as the
mevalonate pathway. The rate-limiting step is the conversion of HMG-CoA to
mevalonate by HMG
CoA reductase. The drug lovastatin, a potent inhibitor of HMG-CoA reductase,
is given to patients to
reduce their serum cholesterol levels. Other mevalonate pathway enzymes
include mevalonate kinase,
phosphomevalonate kinase, diphosphomevalonate decarboxylase,
isopentenyldiphosphate isomerase,
1s dimethylallyl transferase, gexanyl transferase, farnesyl-diphosphate
farnesyltransferase, squalene
monooxygenase, lanosterol synthase, lathosterol oxidase, and 7-
dehydrocholesterol reductase.
Cholesterol is used in the synthesis of steroid hormones such as cortisol,
progesterone,
aldosterone, estrogen, and testosterone. First, cholesterol is converted to
pregnenolone by cholesterol
monooxygenases. The other steroid hormones are synthesized from pregnenolone
by a series of
2 o enzyme-catalyzed reactions including oxidations, isomerizations,
hydroxylations, reductions, and
demethylations. Examples of these enzymes include steroid ~-isomerase, 3(3-
hydroxy-OS-steroid
dehydrogenase, steroid 21-monooxygenase, steroid 19-hydroxylase, and 3(3-
hydroxysteroid
dehydrogenase. Cholesterol is also the precursor to vitamin. D.
Numerous compounds contain 5-carbon isoprene units derived from the mevalonate
pathway
2s intermediate isopentenyl pyrophosphate. Isoprenoid groups are found in
vitamin K, ubiquinone, retinal,
dolichol phosphate (a carrier of oligosaccharides needed for N-linked
glycosylation), and farnesyl and
geranylgeranyl groups that modify proteins. Enzymes involved include farnesyl
transferase, polyprenyl
transferases, dolichyl phosphatase, and dolichyl kinase.
Sn~olipid Metabolism
s o Sphingolipids are an important class of membrane lipids that contain
sphingosine, a long chain
amino alcohol. They are composed of one long-chain fatty acid, one polar head
alcohol, and
sphingosine or sphingosine derivative. The three classes of sphingolipids are
sphingomyelins,
cerebrosides, and gangliosides. Sphingomyelins, which contain phosphocholine
or
phosphoethanolamine as their head group, are abundant in the myelin sheath
surrounding nerve cells.
35 Galactocerebrosides, which contain a glucose or galactose head group, are
characteristic of the brain.
29


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Other cerebrosides are found in nonneural tissues. Gangliosides, whose head
groups contain multiple
sugar units, are abundant in the brain, but are also found in nonneural
tissues.
Sphingolipids are built on a sphingosine backbone. Sphingosine is acylated to
ceramide by the
enzyme sphingosine acetyltransferase. Ceramide and phosphatidyl choline are
converted to
s sphingomyelin by the enzyme ceramide choline phosphotransferase.
Cerebrosides are synthesized by
the linkage of glucose or galactose to ceramide by a transfexase. Sequential
addition of sugar
residues to ceramide by transferase enzymes yields gangliosides.
Eicosanoid Metabolism
Eicosanoids, including prostaglandins, prostacyclin, thromboxanes, and
leukotrienes, are 20-
~o carbon molecules derived from fatty acids. Eicosanoids are signaling
molecules which have roles in
pain, fever, and inflammation. The precursor of all eicosanoids is
arachidonate, which is generated
from phospholipids by phospholipase AZ and from diacylglycerols by
diacylglycerol lipase.
Leukotrienes are produced from arachidonate by the action of lipoxygenases.
Prostaglandin synthase,
reductases, and isomerases are responsible for the synthesis of the
prostaglandins. Prostaglandins
15 have roles in inflammation, blood flow, ion transport, synaptic
transnussion, and sleep. Prostacyclin
and the thromboxanes are derived from a precursor prostaglandin by the action
of prostacyclin
synthase and thromboxane syntheses, respectively:
Ketone Body Metabolism
Pairs of acetyl-CoA molecules derived from fatty acid oxidation in the liver
can condense to
~o form acetoacetyl-CoA, which subsequently forms acetoacetate, D-3-
hydroxybutyrate, and acetone.
These three products are known as ketone bodies. Enzymes involved in ketone
body metabolism
include HMG-CoA synthetase, HMG-CoA cleavage enzyme, D-3-hydroxybutyrate
dehydrogenase,
acetoacetate decarboxylase, and 3-ketoacyl-CoA transferase. Ketone bodies are
a normal fuel supply
of the heart and renal cortex. Acetoacetate produced by the liver is
transported to cells where the
2s acetoacetate is converted back to acetyl-CoA and enters the citric acid
cycle. In times of starvation,
ketone bodies produced from stored triacylglyerols become an important fuel
source, especially for the
brain. Abnormally high levels of ketone bodies are observed in diabetics.
Diabetic coma can result if
ketone body levels become too great.
Lipid Mobilization
3 o Within cells, fatty acids are transported by cytoplasmic fatty acid
binding proteins (Online
Mendelian Inheritance in Man (OMIM) *134650 Fatty Acid-Binding Protein 1,
Liver; FABP1).
Diazepam binding inhibitor (DBI), also known as endozepine and acyl CoA
binding protein, is an
endogenous ~y-aminobutyric acid (GABA) receptor ligand which is thought to
down-regulate the
effects of GABA. DBI binds medium- and long-chain acyl-CoA esters with very
high affinity and
3 s may function as an intracellular carrier of acyl-CoA esters (OM1M * 125950
Diazepam Binding


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Inhibitor; DBI; PROSITE PDOC00686 Acyl-CoA-binding protein signature).
Fat stored in liver and adipose triglycerides may be released by hydrolysis
and transported in
the blood. Free fatty acids are transported in the blood by albumin.
Triacylglycerols and cholesterol
esters in the blood are transported in lipoprotein particles. The particles
consist of a core of
s hydrophobic lipids surrounded by a shell of polar lipids and
apolipoproteins. The protein components
serve in the solubilization of hydrophobic lipids and also contain cell-
targeting signals. Lipoproteins
include chylomicrons, chylomicron remnants, very-low-density lipoproteins
(VLDL), intermediate-
density lipoproteins (IDL), low-density lipoproteins (LDL), and high-density
lipoproteins (HDL).
There is a strong inverse correlation between the levels of plasma HDL and
risk of premature
1 o coronary heart disease.
Triacylglycerols in chylomicrons and VLDL are hydrolyzed by lipoprotein
lipases that line
blood vessels in muscle and other tissues that use fatty acids. Cell surface
LDL receptors bind LDL
particles which are then internalized by endocytosis. Absence of the LDL
receptor, the cause of the
disease familial hypercholesterolemia, leads to increased plasma cholesterol
levels and ultimately to
1s atherosclerosis. Plasma cholesteryl ester transfer protein mediates the
transfer of cholesteryl esters
from HDL to apolipoprotein B-containing lipoproteins. Cholesteryl ester
transfer protein is important
in the reverse cholesterol transport system and may play a role in
atherosclerosis (Yamashita, S. et al.
(1997) Curr. Opin. Lipidol. 8:101-110). Macrophage scavenger receptors, which
bind and internalize
modified lipoproteins, play a role in lipid transport and may contribute to
atherosclerosis (Greaves,
2o D.R. et a1. (1998) Curr. Opin. Lipidol. 9:425-432).
Proteins involved in cholesterol uptake and biosynthesis are tightly regulated
in response to
cellular cholesterol levels. The sterol regulatory element binding pxotein
(SREBP) is a sterol-
xesponsive transcription factox. Under normal cholesterol conditions, SREBP
resides in the ER
membrane. When cholesterol levels are low, a regulated cleavage of SREBP
occurs which releases
2s the extracellular domain of the protein. This cleaved domain is then
transported to the nucleus where
it activates the transcription of the LDL receptor gene, and genes encoding
enzymes of cholesterol
synthesis, by binding the sterol regulatory element (SRE) upstream of the
genes (Yang, J. et al. (1995)
J. Biol. Chem. 270:12152-12161). Regulation of cholesterol uptake and
biosynthesis also occurs via
the oxysterol-binding protein (OSBP). OSBP is a high-affinity intracellular
receptor for a variety of
s o oxysterols that down-regulate cholesterol synthesis and stimulate
cholesterol esterification (Lagace,
T.A. et al. (1997) Biochem. J. 326:205-213).
Beta-oxidation
Mitochondrial and peroxisomal beta-oxidation enzymes degrade saturated and
unsaturated
fatty acids by sequential removal of two-carbon units from CoA-activated fatty
acids. The main beta-
3 s oxidation pathway degrades both saturated and unsaturated fatty acids
while the auxiliary pathway
31


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
performs additional steps required for the degradation of unsaturated fatty
acids.
The pathways of mitochondrial and peroxisomal beta-oxidation use similar
enzymes, but have
different substrate specificities and functions. Mitochondria oxidize short-,
medium-, and long-chain
fatty acids to produce energy for cells. Mitochondrial beta-oxidation is a
major energy source for
s cardiac and skeletal muscle. In liver, it provides ketone bodies to the
peripheral circulation when
glucose levels are low as in starvation, endurance exercise, and diabetes
(Eaton, S, et al. (1996)
Biochem. T. 320:345-357). Peroxisomes oxidize medium-, long-, and very-long-
chain fatty acids,
dicarboxylic fatty acids, branched fatty acids, prostaglandins, xenobiotics,
and bile acid intermediates.
The chief roles of peroxisomal beta-oxidation are to shorten toxic lipophilic
carboxylic acids to
s o facilitate their excretion and to shorten very-long-chain fatty acids
prior to mitochondrial beta-oxidation
(Mannaerts, G.P. and P.P. van Veldhoven (1993) Biochimie 75:147-158).
Enzymes involved in beta-oxidation include acyl CoA synthetase, carnitine
acyltransferase,
acyl CoA dehydrogenases, enoyl CoA hydratases, L-3-hydroxyacyl CoA
dehydrogenase, (3-
ketothiolase, 2,4-dienoyl CoA reductase, and isomerase.
15 Lipid Cleayage and Degradation
Triglycerides are hydrolyzed to fatty acids and glycerol by lipases.
Lysophospholipases
(LPLs) are widely distributed enzymes that metabolize intracellular lipids,
and occur in numerous
isoforms. Small isoforms, approximately 15-30 kD, function as hydrolases;
large isoforms, those
exceeding 60 kD, function both as hydrolases and transacylases. A particular
substrate for LPLs,
20 lysophosphatidylcholine, causes lysis of cell membranes when it is formed
or imported into a cell.
LPLs are regulated by lipid factors including acylcarnitine, arachidonic acid,
and phosphatidic acid.
These lipid factors are signaling molecules important in numerous pathways,
including the
inflammatory response. (Anderson, R. et al. (1994) Toxicol. Appl. Pharmacol.
125:176-183; Selle, H.
et al. (1993); Eur. J. Biochem. 212:411-416.)
25 The secretory phospholipase AZ (PLA2) superfamily comprises a number of
heterogeneous
enzymes whose common feature is to hydrolyze the sn-2 fatty acid acyl ester
bond of
phosphoglycerides. Hydrolysis of the glycerophospholipids releases free fatty
acids and
lysophospholipids. PLA2 activity generates precursors for the biosynthesis of
biologically active lipids,
hydroxy fatty acids, and platelet-activating factor. PLA2 hydrolysis of the sn-
2 ester bond in
3 o phospholipids generates free fatty acids, such as arachidonic acid and
lysophospholipids.
Carbon and Carbohydrate Metabolism
Carbohydrates, including sugars or saccharides, starch, and cellulose, are
aldehyde or ketone
compounds with multiple hydroxyl groups. The importance of carbohydrate
metabolism is
demonstrated by the sensitive regulatory system in place for maintenance of
blood glucose levels.
3 5 Two pancreatic hormones, insulin and glucagon, promote increased glucose
uptake and storage by
32


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cells, and increased glucose release from cells, respectively. Carbohydrates
have three unportant
roles in mammalian cells. First, carbohydrates are used as energy stores,
fuels, and metabolic
intermediates. Carbohydrates are broken down to form energy in glycolysis and
are stored as
glycogen for later use. Second, the sugars deoxyribose and ribose form part of
the structural support
of DNA and RNA, respectively. Third, carbohydrate modifications are added to
secreted and
membrane proteins and lipids as they traverse the secretory pathway. Cell
surface carbohydrate-
containing macromolecules, including glycoproteins, glycolipids, and
transmembrane proteoglycans,
mediate adhesion with other cells and with components of the extracellular
matrix. The extracellular
matrix is comprised of diverse glycoproteins, glycosaminoglycans (GAGS), and
carbohydrate-binding
so proteins which are secreted from the cell and assembled into an organized
meshwork in close
association with the cell surface. The interaction of the cell with the
surrounding matrix profoundly
influences cell shape, strength, flexibility, motility, and adhesion. These
dynamic properties are
intimately associated with signal transduction pathways controlling call
proliferation and differentiation,
tissue construction, and embryonic development.
Carbohydrate metabolism is altered in several disorders including diabetes
mellitus,
hyperglycemia, hypoglycemia, galactosemia, galactokinase deficiency, and UDP-
galactose-4-
epimerase deficiency (Fauci, A.S. et al. (1998) Harrison's Princ~leS of
Internal Medicine, McGraw-
Hill, New York NY, pp. 2208-2209). Altered carbohydrate metabolism is
associated with cancer.
Reduced GAG and proteoglycan expression is associated with human lung
carcinomas (Nackaerts, K.
2o et al. (1997) Int. J. Cancer 74:335-345). The carbohydrate determinants
sialyl Lewis A and sialyl
Lewis ~ are frequently expressed on human cancer cells (Kannagi, R. (1997)
Glycoconj. J. 14:577-
584). Alterations of the N-linked carbohydrate core structure of cell surface
glycoproteins are linked
to colon and pancreatic cancers (Schwarz, R.E. et al. (1996) Cancer Lett.
107:285-291). Reduced
expression of the Sda blood group carbohydrate structure in cell surface
glycolipids and glycoproteins
is observed in gastrointestinal cancer (Dohi, T. et al. (1996) Int. J. Cancer
67:626-663). (Carbon and
carbohydrate metabolism is reviewed in Stryer, L. (1995) Biochemistry W.H.
Freeman and Company,
New York NY; Lehninger, A.L. (1982) Principles of Biochemistry Worth
Publishers Inc., New York
NY; and Lodish, H. et al. (1995) Molecular Cell Biolo~y Scientific American
Books, New York NY.)
Glycolysis
3 o Enzymes of the glycolytic pathway convert the sugar glucose to pyruvate
while simultaneously
producing ATP. The pathway also provides building blocks for the synthesis of
cellular components
such as long-chain fatty acids. After glycolysis, pyrvuate is converted to
acetyl-Coenzyme A, which,
in aerobic organisms, enters the citric acid cycle. Glycolytic enzymes include
hexokinase,
phosphoglucose isomerase, phosphofructokinase, aldolase, triose phosphate
isomerase, glyceraldehyde
3 5 3-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglyceromutase,
enolase, and pyruvate
33


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
kinase. Of these, phosphofructokinase, hexokinase, and pyruvate kinase are
important in regulating
the rate of glycolysis.
Gluconeo eg nesis
Gluconeogenesis is the synthesis of glucose from noncarbohydrate precursors
such as lactate
and amino acids. The pathway, wluch functions mainly in times of starvation
and intense exercise,
occurs mostly in the liver and kidney. Responsible enzymes include pyruvate
carboxylase,
phosphoenolpyruvate carboxykinase, fructose 1,6-bisphosphatase, and glucose-6-
phosphatase.
Pentose Phosphate Pathway
Pentose phosphate pathway enzymes are responsible for generating the reducing
agent
so NADPH, while at the same time oxidizing glucose-6-phosphate to ribose-5-
phosphate. Ribose-5-
phosphate anal its derivatives become part of important biological molecules
such as ATP, Coenzyme
A, NAD+, FAD, RNA, and DNA. The pentose phosphate pathway has both oxidative
and non-
oxidative branches. The oxidative branch steps, which are catalyzed by the
enzymes glucose-6-
phosphate dehydrogenase, lactonase, and 6-phosphogluconate dehydrogenase,
convert glucose-6-
phosphate and NADP+ to ribulose-6-phosphate and NADPH. The non-oxidative
branch steps, which
are catalyzed by the enzymes phosphopentose isomerase, phosphopentose
epimerase, transketolase,
and transaldolase, allow the interconversion of three-, four-, five-, six-,
and seven-carbon sugars.
Glucouronate Metabolism
Glucuronate is a monosaccharide which, in the form of D-glucuronic acid, is
found in the
ao GAGs chondroitin and dermatan. D-glucuronic acid is also important in the
detoxification and
excretion of foreign organic compounds such as phenol. Enzymes involved in
glucuronate metabolism
include UDP-glucose dehydrogenase and glucuronate reductase.
Disaccharide Metabolism
Disaccharides must be hydrolyzed to monosaccharides_ o be digested. Lactose, a
2s disaccharide found in milk, is hydrolyzed to galactose and glucose by the
enzyme lactase. Maltose is
derived from plant starch and is hydrolyzed to glucose by the enzyme maltase.
Sucrose is derived
from pleats and is hydrolyzed to glucose and fructose by the enzyme sucrase.
Trehalose, a
disaccharide found mainly in insects and mushrooms, is hydrolyzed to glucose
by the enzyme trehalase
(OMIM *275360 Trehalase; Ruf, J. et al. (1990) J. Biol. Chem. 265:15034-
15039). Lactase, maltase,
3 o sucrase, and trehalase are bound to mucosal cells lining the small
intestine, where they participate in
the digestion of dietary disaccharides. The enzyme lactose synthetase,
composed of the catalytic
subunit galactosyltransferase and the modifier subunit a-lactalbumin, converts
UDP-galactose and
glucose to lactose in the mammary glands.
Glycogen, Starch, and Chitin Metabolism
35 Glycogen is the storage form of carbohydrates in mammals. Mobilization of
glycogen
34


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
mamtatns glucose levels between meals and during muscular actimty. Glycogen is
stored mainly in the
liver and in skeletal muscle in the form of cytoplasmic granules. 'These
granules contain enzymes that
catalyze the synthesis and degradation of glycogen, as well as enzymes that
regulate these processes.
Enzymes that catalyze the degradation of glycogen include glycogen
phosphorylase, a trausferase, a-
1,6-glucosidase, and phosphoglucomutase. Enzymes that catalyze the synthesis
of glycogen include
UDP-glucose pyrophosphorylase, glycogen synthetase, a branching enzyme, and
nucleoside
diphosphokinase. The enzymes of glycogen synthesis and degradation are tightly
regulated by the
hormones insulin, glucagon, and epinephrine. Starch, a plant-derived
polysaccharide, is hydrolyzed to
maltose, maltotriose, and a-dextrin by a-amylase, an enzyme secreted by the
salivary glands and
Zo pancreas. Chitin is a polysaccharide found in insects and crustacea. A
chitotriosidase is secreted by
macrophages and may play a role in the degradation of chitin-containing
pathogens (Boot, R.G. et al.
(1995) J. Biol. Chem. 270:26252-26256).
Peptido~lycans and Glycosamino~lucans
Glycosaminoglycans (GAGS) are anionic linear unbranched polysaccharides
composed of
s5 repetitive disaccharide units. These repetitive units contain a derivative
of an amino sugar, either
glucosamine or galactosamine. GAGS exist free or as part of proteoglycans,
large molecules
composed of a core protein attached to one or more GAGS. GAGS are found on the
cell surface,
inside cells, and in the extracellular matrix. Changes in GAG levels are
associated with several
autoimmune diseases including autoimmune thyroid disease, autoimmune diabetes
mellitus, and
ao systemic lupus erythematosus (Hausen, C. et al. (1996) Clip. Exp. Rheum. 14
(Suppl. 15):559-567).
GAGS include chondroitin sulfate, keratan sulfate, heparin, heparan sulfate,
dermatan sulfate, and
hyaluronan.
'The GAG hyaluronan (HA) is found in the extracellular matrix of many cells,
especially in soft
connective tissues, and is abundant in synovial fluid (Pitsillides, A.A. et
al. (1993) Int. J. Exp. Pathol.
2 s 74:27-34). HA seems to play important roles in cell regulation,
development, and differentiation
(Laurent, T.C. and J.R. Fraser (1992) FASEB J. 6:2397-2404). Hyaluronidase is
an enzyme that
degrades HA to oligosaccharides. Hyaluronidases may function in cell adhesion,
infection,
angiogenesis, signal transduction, reproduction, cancer, and inflammation.
Proteoglycans, also known as peptidoglycans, are found in the extracellular
matrix of
3 o connective tissues such as cartilage and are essential for distributing
the load in weight bearing joints.
Cell-surface-attached proteoglycans anchor cells to the extracellular matrix.
Both extracellular and
cell-surface proteoglycans bind growth factors, facilitating their binding to
cell-surface receptors and
subsequent triggering of signal transduction pathways.
Amino Acid and Nitrogen Metabolism
35 NH.~+ is assimilated into amino acids by the actions of two enzymes,
glutamate dehydrogenase


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
and glutamine synthetase. The carbon skeletons of amino acids come from the
intermediates of
glycolysis, the pentose phosphate pathway, or the citric acid cycle. Of the
twenty amino acids used in
proteins, humans can synthesize only thirteen (nonessential amino acids). The
remaining nine must
come from the diet (essential amino acids). Enzymes involved in nonessential
amino acid biosynthesis
s include glutamate kinase dehydrogenase, pyrroline carboxylate reductase,
asparagine synthetase,
phenylalanine oxygenase, methionine adenosyltransferase,
adenosylhomocysteinase, cystathionine (3-
synthase, cystathionine ~y-lyase, phosphoglycerate dehydrogenase,
phosphoserine transaminase,
phosphoserine phosphatase, serine hydroxylmethyltransferase, and glycine
synthase.
Metabolism of amino acids takes place almost entirely in the liver, where the
amino group is
so removed by aminotransferases (transaminases), for example, alanine
aminotrausferase. The amino
group is transferred to a-ketoglutarate to form glutamate. Glutamate
dehydrogenase converts
glutamate to NH4+ and a-ketoglutarate. NHa+is converted to urea by the urea
cycle which is
catalyzed by the enzymes arginase, ornithine transcarbamoylase,
arginosuccinate synthetase, and
arginosuccinase. Carbarnoyl phosphate synthetase is also involved in urea
formation. Enzymes
1s involved in the metabolism of the carbon skeleton of amino acids include
serine dehydratase,
asparaginase, glutaminase, propionyl CoA carboxylase, methyhnalonyl CoA
mutase, branched-chain
a-keto dehydrogenase complex, isovaleryl CoA dehydrogenase, (3-methylcrotonyl
CoA carboxylase,
phenylalanine hydroxylase, p hydroxylphenylpyruvate hydroxylase, and
homogentisate oxidase.
Polyamines, which include spermidine, putrescine, and spermine, bind tightly
to nucleic acids
2 o and are abundant in rapidly proliferating cells. Enzymes involved in
polyamine synthesis include
ornithine decarboxylase.
Diseases involved in amino acid and nitrogen metabolism include
hyperammonemia,
carbamoyl phosphate synthetase deficiency, urea cycle enzyme deficiencies,
methylmalonic aciduria,
maple syrup disease, alcaptonuria, and phenylketonuria.
2 s Enemy Metabolism
Cells derive energy from metabolism of ingested compounds that may be roughly
categorized
as carbohydrates, fats, or proteins. Energy is also stored in polymers such as
triglycerides (fats) and
glycogen (carbohydrates). Metabolism proceeds along separate reaction pathways
connected by key
intermediates such as acetyl coenzyme A (acetyl-CoA). Metabolic pathways
feature anaerobic and
s o aerobic degradation, coupled with the energy-requiring reactions such as
phosphorylation of adenosine
diphosphate (ADP) to the triphosphate (ATP) or analogous phosphorylations of
guanosine
(GDP/GTP), uridine (UDP/LJTP), or cytidine (CDP/CT'P). Subsequent
dephosphorylation of the
triphosphate drives reactions needed for cell maintenance, growth, and
proliferation.
Digestive enzymes convert carbohydrates and sugars to glucose; fructose and
galactose are
3 s converted in the liver to glucose. Enzymes involved in these conversions
include galactose-1-
36


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
phosphate uridyl transferase and UDP-galactose-4 epimerase. In the cytoplasm,
glycolysis converts
glucose to pyruvate in a series of reactions coupled to ATP synthesis.
Pyruvate is transported into the mitochondria and converted to acetyl-CoA for
oxidation via
the citric acid cycle, involving pyruvate dehydrogenase components,
dihydrolipoyl transacetylase, and
dihydrolipoyl dehydrogenase. Enzymes involved in the citric acid cycle
include: citrate synthetase,
aconitases, isocitrate dehydrogenase, alpha-ketoglutarate dehydrogenase
complex including
transsuccinylases, succinyl CoA synthetase, succinate dehydrogenase,
fumarases, and malate
dehydrogenase. Acetyl CoA is oxidized to COz with concomitant formation of
NADH, FADH2, and
GTP. In oxidative phosphorylation, the transport of electrons from NADH and
FADHZ to oxygen by
so dehydrogenases is coupled to the synthesis of ATP from ADP and Pi by the
FoF'1 ATPase complex in
the mitochondrial inner membrane. Enzyme complexes responsible for electron
transport and ATP
synthesis include the FoF1 ATPase complex, ubiquinone(CoQ)-cytochrome c
reductase, ubiquinone
reductase, cytochrome b, cytochrome c1, FeS protein, and cytochrome c oxidase.
Triglycerides are hydrolyzed to fatty acids and glycerol by lipases. Glycerol
is then
~5 phosphorylated to glycerol-3-phosphate by glycerol kinase and glycerol
phosphate dehydrogenase, and
degraded by the glycolysis. Fatty acids are transported into the mitochondria
as fatty acyl-carnitine
esters and undergo oxidative degradation.
In addition to metabolic disorders such as diabetes and obesity, disorders of
energy
metabolism are associated with cancers (Dorward, A. et al. (1997) J. Bioenerg.
Biomembr. 29:385-
20 392), autism (Lombard, J. (1998) Med. Hypotheses 50:497-500),
neurodegenerative disorders (Alexi,
T. et al. (1998) Neuroreport 9:857-64), and neuromuscular disorders (DiMauro,
S. et al. (1998)
Biochim. Biophys. Acta 1366:199-210). The myocardium is heavily dependent on
oxidative
metabolism, so metabolic dysfunction often leads to heart disease (DiMauro, S.
and M. Hirano (1998)
Curr. Opin. Cardiol. 13:190-197).
25 For a review of energy metabolism enzymes and intermediates, see Stryer, L.
et al. (1995)
Biochemistry, W.H. Freeman and Co., San Francisco CA, pp. 443-652. For a
review of energy
metabolism regulation, see Lodish, H. et al. (1995) Molecular Cell Biolo~y,
Scientific American Books,
New York NY, pp. 744-770.
Cofactor Metabolism
3 o Cofactors, including coenzymes and prosthetic groups, are small molecular
weight inorganic or
organic compounds that are required for the action of an enzyme. Many
cofactors contain vitamins as
a component. Cofactors include thiamine pyrophosphate, flavin adenine
dinucleotide, flavin
mononucleotide, nicotinamide adenine dinucleotide, pyridoxal phosphate,
coenzyme A,
tetrahydrofolate, lipoamide, and heme. The vitamins biotin and cobalamin are
associated with
35 enzymes as well. Heme, a prosthetic group found in myoglobin and
hemoglobin, consists of
37


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
protoporphyrin group bound to iron. Porphyrin groups contain four substituted
pyrroles covalently
joined in a ring, often with a bound metal atom. Enzymes involved in porphyrin
synthesis include 8-
aminolevulinate synthase, 8-aminolevulinate dehydrase, porphobilinogen
deaminase, and cosynthase.
Deficiencies in heme formation cause porphyrias. Heme is broken down as a part
of erythrocyte
s turnover. Enzymes involved in heme degradation include heme oxygenase and
biJiverdin reductase.
Iron is a required cofactor for many enzymes. Besides the heme-containing
enzymes, iron is
found in iron-sulfur clusters in proteins including aconitase, succinate
dehydrogenase, and NADH-Q
reductase. Iron is transported in the blood by the protein transferrin.
Binding of transferrin to the
transferrin receptor on cell surfaces allows uptake by receptor mediated
endocytosis. Cytosolic iron is
Zo bound to ferritin protein.
A molybdenum-containing cofactor (molybdopterin) is found in enzymes including
sulfite
oxidase, xanthine dehydrogenase, and aldehyde oxidase. Molybdopterin
biosynthesis is performed by
two molybdenum cofactor synthesizing enzymes. Deficiencies in these enzymes
cause mental
retardation and lens dislocation. Other diseases caused by defects in cofactor
metabolism include
15 pernicious anemia and methylmalonic aciduria.
Secretion and Trafficking
Eukaryotic cells are bound by a lipid bilayer membrane and subdivided into
functionally
distinct, membrane bound compartments. The membranes maintain the essential
differences between
the cytosol, the extracellular environment, and the lumenal space of each
intracellular organelle. As
z o lipid membranes are highly impermeable to most polar molecules, transport
of essential nutrients,
metabolic waste products, cell signaling molecules, macromolecules and
proteins across lipid
membranes and between organelles must be mediated by a variety of transport-
associated molecules.
Protein Trafficking
In eukaryotes, some proteins are synthesized on ER bound ribosomes, co-
translationally
2 s imported into the ER, delivered from the ER to the Golgi complex for post-
translational processing and
sorting, and transported from the Golgi to specific intracellular and
extracellular destinations. All cells
possess a constitutive transport process which maintains homeostasis between
the cell and its
environment. In many differentiated cell types, the basic machinery is
modified to carry out specific
transport functions. For example, in endocrine glands, hormones and other
secreted proteins are
3 o packaged into secretory granules for regulated exocytosis to the cell
exterior. In macrophage, foreign
extracellular material is engulfed (phagocytosis) and delivered to lysosomes
for degradation. In fat
and muscle cells, glucose transporters are stored in vesicles which fuse with
the plasma membrane
only in response to insulin stimulation.
The Secretor~Pathway
35 Synthesis of most integral membrane proteins, secreted proteins, and
proteins destined for the
3~


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
lumen of a particular organelle occurs on ER-bound ribosornes. These proteins
are co-translationally
~~ imported into the ER. The proteins leave the ER via membrane-bound vesicles
which bud off the ER
at specific sites and fuse with each other (homotypic fusion) to form the ER-
Golgi Intermediate
Compartment (ERGIC). The ERGIC matures progressively through the cis, medial,
and tr~ayis
s cisternal stacks of the Golgi, modifying the enzyme composition by
retrograde transport of specific
Golgi enzymes. In this way, proteins moving through the Golgi undergo post-
translational modification,
such as glycosylation. The final Golgi compartment is the Trans-Golgi Network
(TGN), where both
membrane and lumenal proteins are sorted for their final destination.
Transport vesicles destined for
intracellular compartments, such as the Iysosome, bud off the TGN. What
remains is a secretory
Zo vesicle which contains proteins destined for the plasma membrane, such as
receptors, adhesion
molecules, and ion channels, and secretory proteins, such as hormones,
neurotransmitters, and
digestive enzymes. Secretory vesicles eventually fuse with the plasma membrane
(Glick, B.S. and V.
Malhotxa (1998) Cell 95:883-889).
The secretory process can be constitutive or regulated. Most cells have a
constitutive
15 pathway for secretion, whereby vesicles derived from maturation of the TGN
require no specific
signal to fuse with the plasma membrane. In many cells, such as endocrine
cells, digestive cells, and
neurons, vesicle pools derived from the TGN collect in the cytoplasm and do
not fuse with the plasma
membrane until they are directed to by a specific signal.
Endoc osis
2 o Endocytosis, wherein cells internalize material from the extracellular
environment, is essential
for transmission of neuronal, metabolic, and proliferative signals; uptake of
many essential nutrients;
and defense against invading organisms. Most cells exhibit two forms of
endocytosis. The first,
phagocytosis, is an actin-driven process exemplified in macrophage and
neutrophils. Material to be
endocytosed contacts numerous cell surface receptors which stimulate the
plasma membrane to
2 s extend and surround the particle, enclosing it in a membrane-bound
phagosome. In the mammalian
immune system, IgG-coated particles bind Fc receptors on the surface of
phagocytic leukocytes.
Activation of the Fc receptors initiates a signal cascade involving src-family
cytosolic kinases and the
monomeric GTP-binding (G) protein Rho. The resulting actin reorganization
leads to phagocytosis of
the particle. This process is an important component of the humoral immune
response, allowing the
s o processing and presentation of bacterial-derived peptides to antigen-
specific T-lymphocytes.
The second form of endocytosis, pinocytosis, is a more generalized uptake of
material from
the external milieu. Like phagocytosis, pinocytosis is activated by ligand
binding to cell surface
receptors. Activation of individual receptors stimulates an internal response
that includes coalescence
of the receptor-ligand complexes and formation of clathrin-coated pits.
Invagination of the plasma
3 s membrane at clathrin-coated pits produces an endocytic vesicle within the
cell cytoplasm. These
39


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
vesicles undergo homotypic fusion to form an early endosomal (EE) compartment.
The
tubulovesicular EE serves as a sorting site for incoming material. ATP-driven
proton pumps in the EE
membrane lowers the pH of the EE lumen (pH 6.3-6.8). The acidic environment
causes many ligands
to dissociate from their receptors. The receptors, along with membrane and
other integral membrane
s proteins, are recycled back to the plasma membrane by budding off the
tubular extensions of the EE in
recycling vesicles (RV). This selective removal of recycled components
produces a carrier vesicle
containing ligand and other material from the external environment. The
carrier vesicle fuses with
TGN-derived vesicles which contain hydrolytic enzymes. The acidic environment
of the resulting late
endosome (LE) activates the hydrolytic enzymes which degrade the Iigands and
other material. As
a.o digestion takes place, the LE fuses with the lysosome where digestion is
completed (Mellman, I.
(1996) Annu. Rev. Cell Dev. Biol. 12:575-625).
Recycling vesicles may return directly to the plasma membrane. Receptors
internalized and
returned directly to the plasma membrane have a turnover rate of 2-3 minutes.
Some RVs undergo
microtubule-directed relocation to a perinuclear site, from which they then
return to the plasma
15 membrane. Receptors following this route have a turnover rate of 5-10
minutes. Still other RVs are
retained within the cell until an appropriate signal is received (Mellman,
supra; and James, D.E. et al.
(1994) Trends Cell Biol. 4:120-126).
Vesicle Formation
Several steps in the txansit of material along the secretory and endocytic
pathways require the
2 o formation of transport vesicles. Specifically, vesicles form at the
transitional endoplasmic reticulum
(tER), the xim of Golgi cisternae, the face of the Trans-Golgi Network (TGN),
the plasma membrane
(PM), and tubular extensions of the endosomes. The process begins with the
budding of a vesicle out
of the donor membrane. The membrane-bound vesicle contains proteins to be
transported and is
surrounded by a protective coat made up of protein subunits recruited from the
cytosol. The initial
2s budding and coating processes are controlled by a cytosolic ras-like GTP-
binding protein, ADP-
ribosylating factor (Arf), and adapter proteins (AP). Different isoforms of
both Arf and AP are
involved at different sites of budding. Another small G-protein, dynamin,
forms a ring complex around
the neck of the forming vesicle and may provide the mechanoehemical force to
accomplish the final
step of the budding process. The coated vesicle complex is then transported
through the cytosol.
3 o During the transport process, Arf bound GTP is hydrolyzed to GDP and the
coat dissociates from the
transport vesicle (West, M.A. et al. (1997) J. Cell Biol. 138:1239-1254). Two
different classes of coat
protein have also been identified. Clathrin coats form on the TGN and PM
surfaces, whereas
coatomer or COP coats form on the ER and Golgi. COP coats can further be
distinguished as COPI,
involved in retrograde traffic through the Golgi and from the Golgi to the ER,
and COPII, involved in
3 s anterograde traffic from the ER to the Golgi (Mellman, su re). The COP
coat consists of two major


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
components, a G-protein (Arf or Sar) and coat protomer (coatomer). Coatomer is
an equimolar
complex of seven proteins, termed alpha-, beta-, beta'-, gamma-, delta-,
epsilon- and zeta-COP.
(Hatter, C. and F.T. Wieland (1998) Proc. Natl. Acad. Sci. USA 95:11649-
11654.)
Membrane Fusion
Transport vesicles undergo homotypic or heterotypic fusion in the secretory
and endocytotic
pathways. Molecules required for appropriate targeting and fusion of vesicles
with their target
membrane include proteins incorporated in the vesicle membrane, the target
membrane, and proteins
recruited from the cytosol. During budding of the vesicle from the donor
compartment, an integral
membrane protein, VAMP (vesicle-associated membrane protein) is incorporated
into the vesicle.
1o Soon after the vesicle uncoats, a cytosolic prenylated GTP-binding protein,
Rab (a member of the Ras
superfamily), is inserted into the vesicle membrane. GTP bound Rab proteins
are directed into
nascent transport vesicles where they interact with VAMP. Following vesicle
transport, GTPase
activating proteins (GAPS) in the target membrane convert Rab proteins to the
GDP-bound form. A
cytosolic protein, guanine-nucleotide dissociation inhibitor (GDI) helps
return GDP-bound Rab proteins
15 to their membrane of origin. Several Rab isoforms have been identified and
appear to associate with
specific compartments within the cell. Rab proteins appear to play a role in
mediating the function of
a viral gene, Rev, which is essential for replication of HIV-1, the virus
responsible for AIDS (Flavell,
R.A. et al. (1996) Proc. Natl. Acad. Sci. USA 93:4421-4424).
Docking of the transport vesicle with the target membrane involves the
formation of a
2 o complex between the vesicle SNAP receptor (v-SNARE), target membrane (t-)
SNAREs, and
certain other membrane and cytosolic proteins. Many of these other proteins
have been identified
although their exact functions in the docking complex remain uncertain
(Tellam, J.T. et al. (1995) J.
Biol. Chem. 270:5857-5863; and Hata, Y. and T.C. Sudhof (1995) J. Biol. Chem.
270:13022-13028).
N-ethylmaleimide sensitive factor (NSF) and soluble NSF-attachment protein (a-
SNAP and (3-SNAP)
2 s are two such proteins that are conserved from yeast to man and function in
most intracellular
membrane fusion reactions. Sec1 represents a family of yeast proteins that
function at many different
stages in the secretoxy pathway including membrane fusion. Recently, mammalian
homologs of Secl,
called Munc-18 proteins, have been identified (Katagiri, H. et a1. (1995) J.
Biol. Chem. 270:4963-4966;
Hata et al. supra).
s o The SNARE complex involves three SNARE molecules, one in the vesicular
membrane and
two in the target membrane. Synaptotagmin is an integral membrane protein in
the synaptic vesicle
which associates with the t-SNARE syntaxin in the docking complex.
Synaptotagmin binds calcium in
a complex with negatively charged phospholipids, which allows the cytosolic
SNAP protein to displace
synaptotagmin from syntaxin and fusion to occur. Thus, synaptotagmin is a
negative regulator of
35 fusion in the neuron (Littleton, J.T. et al. (1993) Cell 74:1125-1134). The
most abundant membrane
41


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
protein of synaptic vesicles appears to be the glycoprotein synaptophysin, a
38 kDa protein with four
transmembrane domains.
Specificity between a vesicle and its target is derived from the v-SNARE, t-
SNARES, and
associated proteins involved. Different isoforms of SNARES and Rabs show
distinct cellular and
s subcellular distributions. VAMP-1/synaptobrevin, membrane-anchored
synaptosome-associated
protein of 25 kDa (SNAP-25), syntaxin-1, Rab3A, Rabl5, and Rab23 are
predominantly expressed in
the brain and nervous system. Different syntaxin, VAMP, and Rab proteins are
associated with
distinct subcellular compartments and their vesicular carriers.
Nuclear Transport
so Transport of proteins and RNA between the nucleus and the cytoplasm occurs
through
nuclear pore complexes (NPCs). NPC-mediated transport occurs in both
directions through the
nuclear envelope. All nuclear proteins are imported from the cytoplasm, their
site of synthesis. tRNA
and mRNA are exported from the nucleus, their site of synthesis, to the
cytoplasm, their site of
function. Processing of small nuclear RNAs involves export into the cytoplasm,
assembly with
15 proteins and modifications such as hypermethylation to produce small
nuclear ribonuclear proteins
(snRNPs), and subsequent import of the snRNPs back into the nucleus. The
assembly of ribosomes
requires the initial import of ribosomal proteins from the cytoplasm, their
incorporation with RNA into
ribosomal subunits, and export back to the cytoplasm. (Gorlich, D. and LW.
Mattaj (1996) Science
271:1513-1518.)
2 o The transport of proteins and mRNAs across the NPC is selective, dependent
on nuclear
localization signals, and generally requires association with nuclear
transport factors. Nuclear
localization signals (NLS) consist of short stretches of amino acids enriched
in basic residues. NLS
are found on proteins that are targeted to the nucleus, such as the
glucocorticoid receptor. The NLS
is recognized by the NLS receptor, importin, which then interacts with the
monomeric GTP-binding
2 s protein Ran. This NLS protein/receptor/Ran complex navigates the nuclear
pore with the help of the
homodimeric protein nuclear transport factor 2 (NTF2). NTF2 binds the GDP
bound form of Ran and
to multiple proteins of the nuclear pore complex containing FXFG repeat
motifs, such as p62.
(Paschal, B. et al. (1997) J. Biol. Chem. 272:21534-21539; and Wong, D.H. et
al. (1997) Mol. Cell
Biol. 17:3755-3767). Some proteins are dissociated before nuclear mRNAs are
transported across the
3 o NPC while others are dissociated shortly after nuclear mRNA transport
across the NPC and are
reimported into the nucleus.
Disease Correlation
The etiology of numerous human diseases and disorders can be attributed to
defects in the
transport or secretion of proteins. For example, abnormal hormonal secretion
is linked to disorders
3 s such as diabetes insipidus (vasopressin), hyper- and hypoglycemia
(insulin, glucagon), Grave's disease
42


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
and goiter (thyroid hormone), and Cushing's and Addison's diseases
(adrenocorticotropic hormone,
ACTH). Moreover, cancer cells secrete excessive amounts of hormones or other
biologically active
peptides. Disorders related to excessive secretion of biologically active
pep"tides by tumor cells include
fasting hypoglycemia due to increased insulin secretion from insulinoma-islet
cell tumors; hypertension
s due to increased epinephrine and norepinephrine secreted from
pheochromocytomas of the adrenal
medulla and sympathetic paraganglia; and carcinoid syndrome, which is
characterized by abdominal
cramps, diarrhea, and valvular heart disease caused by excessive amounts of
vasoactive substances
such as serotonin, bradykinin, histamine, prostaglandins, and polypeptide
hormones, secreted from
intestinal tumors. Biologically active peptides that are ectopically
synthesized in and secreted from
so tumor cells include ACTH and vasopressin (lung and pancreatic cancers);
parathyroid hormone (lung
and bladder cancers); calcitonin (lung and breast cancers); and thyroid-
stimulating hormone
(medullary thyroid carcinoma). Such peptides may be useful as diagnostic
markers for tumorigenesis
(Schwartz, M.Z. (1997) Semin. Pediatr. Surg. 3:141-146; and Said, S.I. and
G.R. Faloona (1975) N.
Engl. J. Med. 293:155-160).
15 Defective nuclear transport may play a role in cancer. The BRCA1 protein
contains three
potential NLSs which interact with importin alpha, and is transported into the
nucleus by the
importin/NPC pathway. In breast cancer cells the BRCA1 protein is aberrantly
localized in the
cytoplasm. The mislocation of the BRCA1 protein in breast cancer cells may be
due to a defect in the
NPC nuclear import pathway (Chen, C.F. et al. (1996) J. Biol. Chem. 271:32863-
32868).
2 o It has been suggested that in some breast cancers, the tumor-suppressing
activity of p53 is
inactivated by the sequestration of the protein in the cytoplasm, away from
its site of action in the cell
nucleus. Cytoplasmic wild-type p53 was also found in human cervical carcinoma
cell lines. (Moll,
U.M. et al. (1992) Proc. Natl. Acad. Sci. USA 89:7262-7266; and Liang, X.H. et
al. (1993) Oncogene
8:2645-2652. )
2 s Enyironmental Responses
Organisms respond to the environment by a number of pathways. Heat shock
proteins,
including hsp 70, hsp60, hsp90, and hsp 40, assist organisms in coping with
heat damage to cellular
proteins.
Aquaporins (AQP) are channels that transport water and, in some cases,
nonionic small
3 o solutes such as urea and glycerol. Water movement is important for a
number of physiological
processes including renal fluid filtration, aqueous humor generation in the
eye, cerebrospinal fluid
production in the brain, and appropriate hydration of the lung. Aquaporins are
members of the major
intrinsic protein (MIP) family of membrane transporters (King, L.S. and P.
Agre (1996) Annu. Rev.
Physiol. 58:619-648; Ishibashi, K. et al. (1997) J. Biol. Chem: 272:20782-
20786). The study of
s s aquaporins may have relevance to understanding edema formation and fluid
balance in both normal
43


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
physiology and disease states (King, su ra). Mutations in AQPZ cause autosomal
recessive
nephrogenic diabetes insipidus (OMIM *107777 Aquaporin 2; AQP2). Reduced AQP4
expression in
skeletal muscle may be associated with Duchenne muscular dystrophy (Frigeri,
A. et al. (1998) J.
Clin. Invest. 102:695-703). Mutations in AQPO cause autosomal dominant
cataracts in the mouse
s (OMIM ~ 154050 Major Intrinsic Protein of Lens Fiber; MIP).
The metallothioneins (MTs) are a group of small (61 amino acids), cysteine-
rich proteins that
bind heavy metals such as cadmium, zinc, mercury, lead, and copper and are
thought to play a role in
metal detoxification or the metabolism and homeostasis of metals. Arsenate-
resistance proteins have
been identified in hamsters that are resistant to toxic levels of arsenate
(Rossman, T.G. et al. (1997)
so Mutat. Res. 386:307-314).
Humans respond to light and odors by specific protein pathways. Proteins
involved in light
perception include rhodopsin, transducin, and cGMP phosphodiesterase. Proteins
involved in odor
perception include multiple olfactory receptors. Other proteins are important
in human Circadian
rhythms and responses to wounds.
1s Immunity and Host Defense
All vertebrates have developed sophisticated and complex immune systems that
provide
protection from viral, bacterial, fungal and parasitic infections. Included in
these systems are the
processes of humoral immunity, the complement cascade and the inflammatory
response (Paul, W.E.
(1993) Fundamental Itnmunolo~y, Raven Press, Ltd., New York NY, pp.1-20).
20 The cellular components of the humoral immune system include six different
types of
leukocytes: monocytes, lymphocytes, polymorphonuclear grauulocytes (consisting
of neutrophils,
eosinopluls, and basophils) and plasma cells. Additionally, fragments of
megakaryocytes, a seventh
type of white blood cell in the bone marrow, occur in large numbers in the
blood as platelets.
Leukocytes are formed from two stem cell lineages in bone marrow. The myeloid
stem cell
line produces granulocytes and monocytes and, the lymphoid stem cell produces
lymphocytes.
Lymphoid cells travel to the thymus, spleen and lymph nodes, where they mature
and differentiate into
lymphocytes. Leukocytes are responsible for defending the body against
invading pathogens.
Neutrophils and monocytes attack invading bacteria, viruses, and other
pathogens and destroy them by
phagocytosis. Monocytes enter tissues and differentiate into macrophages which
are extremely
3 o phagocytic. Lymphocytes and plasma cells are a part of the immune system
which recognizes
specific foreign molecules and organisms and inactivates them, as well as
signals other cells to attack
the invaders.
Granulocytes and monocytes are formed and stored in the bone marrow until
needed.
Megakaryocytes are produced in bone marrow, where they fragment into platelets
and are released
~ into the bloodstream. The main function of platelets is to activate the
blood clotting mechanism.
44


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Lymphocytes and plasma cells are produced in various lymphogenous organs,
including the lymph
nodes, spleen, thymus, and tonsils.
Both neutrophils and macrophages exhibit chemotaxis towards sites of
inflammation. Tissue
inflammation in response to pathogen invasion results in production of chemo-
attractants for
s leukocytes, such as endotoxins or other bacterial products, prostaglandins,
and products of leukocytes
or platelets.
Basophils participate in the release of the chemicals involved in the
inflammatory process. The
main function of basophils is secretion of these chemicals to such a degree
that they have been
referred to as "unicellular endocrine glands." A distinct aspect of basophilic
secretion is that the
o contents of granules go directly into the extracellular environment, not
into vacuoles as occurs with
neutrophils, eosinophils and monocytes. Basophils have receptors for the Fc
fragment of
immunoglobulin E (IgE) that are not present on other leukocytes. Crosslinking
of membrane IgE with
anti-IgE or other ligands triggers degranulation.
Eosinophils are bi- or multi-nucleated white blood cells which contain
eosinophilic granules.
15 Their plasma membrane is characterized by Ig receptors, particularly IgG
and IgE. Generally,
eosinophils are stored in the bone marrow until recruited for use at a site of
inflammation or invasion.
They have specific functions in parasitic infections and allergic reactions,
and are thought to detoxify
some of the substances released by mast cells and basophils which cause
inflammation. Additionally,
they phagocytize antigen-antibody complexes and further help prevent spread of
the inflammation.
a o Macrophages are monocytes that have left the blood stream to settle in
tissue. Once
monocytes have migrated into tissues, they do not re-enter the bloodstream.
The mononuclear
phagocyte system is comprised of precursor cells in the bone marrow, monocytes
in circulation, and
macrophages in tissues. The system is capable of very fast and extensive
phagocytosis. A
macrophage may phagocytize over 100 bacteria, digest them and extrude
residues, and then survive
2s for many more months. Macrophages are also capable of ingesting large
particles, including red blood
cells and malarial parasites. They increase several-fold in size and transform
into macrophages that
are characteristic of the tissue they have entered, surviving in tissues for
several months.
Mononuclear phagocytes are essential in defending the body against invasion by
foreign
pathogens, particularly intracellular microorganisms such as M. tuberculosis,
listeria, leishmania and
3 o toxoplasma. Macrophages can also control the growth of tumorous cells, via
both phagocytosis and
secretion of hydrolytic enzymes. Another important function of macrophages is
that of processing
antigen and presenting them in a biochemically modified form to lymphocytes.
The immune system responds to invading microorganisms in two major ways:
antibody
production and cell mediated responses. Antibodies are immunoglobulin proteins
produced by
3 5 B-lymphocytes which bind to specific antigens and cause inactivation or
promote destruction of the


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
antigen by other cells. Cell-mediated immune responses involve T-lymphocytes
(T cells) that react
with foreign antigen on the surface of infected host cells. Depending on the
type of T cell, the
infected cell is either killed or signals are secreted which activate
macrophages and other cells to
destroy the infected cell (Paul, su ra).
s T-lymphocytes originate in the bone marrow or liver in fetuses. Precursor
cells migrate via
the blood to the thymus, where they are processed to mature into T-
lymphocytes. This processing is
crucial because of positive and negative selection of T cells that will react
with foreign antigen and not
with self molecules. After processing, T cells continuously circulate in the
blood and secondary
lymphoid tissues, such as lymph nodes, spleen, certain epithelium-associated
tissues in the
1o gastrointestinal tract, respiratory tract and skin. When T-lymphocytes are
presented with the
complementary antigen, they are stimulated to proliferate and release large
numbers of activated T
cells into the lymph system and the blood system. These activated T cells can
survive and circulate
for several days. At the same time, T memory cells are created, which remain
in the lymphoid tissue
for months or years. Upon subsequent exposure to that specific antigen, these
memory cells will
s5 respond more rapidly and with a stronger response than induced by the
original antigen. This creates
an "immunological memory' that can provide immunity for years.
There are two major types of T cells: cytotoxic T cells destroy infected host
cells, and helper
T cells activate other white blood cells via chemical signals. One class of
helper cell, TH1, activates
macrophages to destroy ingested microorganisms, while another, TH2, stimulates
the production of
2 o antibodies by B cells.
Cytotoxic T cells directly attack the infected target cell. In virus-infected
cells, peptides
derived from viral proteins are generated by the proteasome. These peptides
are transported into the
ER by the transporter associated with antigen processing (TAP) (Pamer, E. and
P. Cresswell (1998)
Annu. Rev. Tmmunol. 16:323-358). Once inside the ER, the peptides bind MHC I
chains, and the
z s peptide/MHC I complex is transported to the cell surface. Receptors on the
surface of T cells bind to
antigen presented on cell surface MHC molecules. Once activated by binding to
antigen, T cells
secrete y-interferon, a signal molecule that induces the expression of genes
necessary for presenting
viral (or other) antigens to cytotoxic T cells. Cytotoxic T cells kill the
infected cell by stimulating
programmed cell death.
3 o Helper T cells constitute up to 75 % of the total T cell population. They
regulate the immune
functions by producing a vaxiety of lymphokines that act on other cells in the
immune system and on
bone marrow. Among these lymphokines are: interleukins-2,3,4,5,6; granulocyte-
monocyte colony
stimulating factor, and y-interferon.
Helper T cells are required for most B cells to respond to antigen. When an
activated helper
35 cell contacts a B cell, its centrosome and Golgi apparatus become oriented
toward the B cell, aiding
46


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
the directing of signal molecules, such as transmembrane-bound protein called
CD40 ligand, onto the B
cell surface to interact with the CD40 transmembrane protein. Secreted signals
also help B cells to
proliferate and mature and, in some cases, to switch the class of antibody
being produced.
B-lymphocytes (B cells) produce antibodies which react with specific antigenic
proteins
s presented by pathogens. Once activated, B cells become filled with extensive
rough eudoplasmic
reticulum and are known as plasma cells. As with T cells, interaction of B
cells with antigen
stimulates proliferation of only those B cells which produce antibody specific
to that antigen. There
are five classes of antibodies, known as immunoglobulins, which together
comprise about 20% of total
plasma protein. Each class mediates a characteristic biological response after
antigen binding. Upon
z o activation by specific antigen B cells switch from making membrane bound
antibody to secretion of
that antibody.
Antibodies, or immunoglobulins (Ig), are the founding members of the Ig
superfamily and the
central components of the humoral immune response. Antibodies are either
expressed on the surface
of B cells or secreted by B cells into the circulation. Antibodies bind and
neutralize blood borne
15 foreign antigens. The prototypical antibody is a tetramer consisting of two
identical heavy polypeptide
chains (H-chains) and two identical light polypeptide chains (L-chains)
interlinked by disulfide bonds.
This arrangement confers the characteristic Y-shape to antibody molecules.
Antibodies are classified
based on their H-chain composition. The five antibody classes, IgA, IgD, IgE,
IgG and IgM, are
defined by the a, ~, s, y, and ~. H-chain types. There are two types of L-
chains, x and ~,, either of
2o which may associate as a pair with any H-chain pair. IgG, the most common
class of antibody found
in the circulation, is tetrameric, while the other classes of antibodies are
generally variants or
multimers of this basic structure.
H-chains and L-chains each contain an N-terminal variable region and a C-
terminal constant
region. Both H-chains and L-chains contain repeated Ig domains. For example, a
typical H-chain
2 s contains four Ig domains, three of which occur within the constant region
and one of which occurs'
within the variable region and contributes ~to the formation of the antigen
recognition site. Likewise, a
typical L-chain contains two Ig domains, one of which occurs within the
constant region and one of
which occurs within the variable region. In. addition, H chains such as ~.
have been shown to associate
with other polypeptides during differentiation of the B cell.
3 o Antibodies can be described in terms of their two main functional domains.
Antigen
recognition is mediated by the Fab (antigen binding fragment) region of the
antibody, while effector
functions are mediated by the Fc (crystallizable fragment) region. Binding of
antibody to an antigen,
such as a bacterium, triggers the destruction of the antigen by phagocytic
white blood cells such as
macrophages and neutrophils. These cells express surface receptors that
specifically bind to the
3 s antibody Fc region and allow the phagocytic cells to engulf, ingest, and
degrade the antibody bound
47


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
antigen. The Fc receptors expressed by phagocytic cells are single-pass
transmembrane glycoproteins
of about 300 to 400 amino acids (Sears, D.W. et al. (1990) J. Tmmunol. 144:371-
378). The
extracellular portion of the Fc receptor typically contains two or three Ig
domains.
Diseases which cause over- or under-abundance of any one type of leukocyte
usually result in
s the entire immune defense system becoming involved. A well-known autoimmune
disease is AIDS
(Acquired T_mmunodeficiency Syndrome) where the number of helper T cells is
depleted, leaving the
patient susceptible to infection by microorganisms and parasites. Another
widespread medical
condition attributable to the immune system is that of allergic reactions to
certain antigens. Allergic
reactions include: hay fever, asthma, anaphylaxis, and urticaria (hives).
Leukemias are an excess
zo ~ production of white blood cells, to the point where a major portion of
the body's metabolic resources
are directed solely at proliferation of white blood cells, leaving other
tissues to starve. Leukopenia or
agrauulocytosis occurs when the bone marrow stops producing white blood cells.
This leaves the
body unprotected against foreign microorganisms, including those which
normally inhabit skin, mucous
membranes, and gastrointestinal tract. If all white blood cell production
stops completely, infection
is will occur within two days and death may follow only 1 to 4 days later.
Impaired phagocytosis occurs in several diseases, including monocytic
leukemia, systemic
lupus, and granulomatous disease. In such a situation, macrophages can
phagocytize normally, but the
enveloped organism is not killed. A defect in the plasma membrane enzyme which
converts oxygen to
lethally reactive forms results in abscess formation in liver, lungs, spleen,
lymph nodes, and beneath the
2o skin. Eosinophilia is an excess of eosinophils commonly observed.in
patients with allergies (hay fever,
asthma), allergic reactions to drugs, rheumatoid arthritis, and cancers
(Hodgkin's disease, lung, and
liver cancer) (Isselbacher, K.J. et al. (1994) Harrison's Principles of
Internal Medicine, McGraw-Hill,
Inc., New York NY).
Host defense is further augmented by the complement system. The complement
system
z s serves as an effector system and is involved in infectious agent
recognition. It can function as an
independent immune network or in conjunction with other humoral immune
responses. The
complement system is comprised of numerous plasma and membrane proteins that
act in a cascade of
reaction sequences whereby one component activates the next. The result is a
rapid and amplified
response to infection through either an inflammatory response or increased
phagocytosis.
3 o The complement system has more than 30 protein components which can be
divided into
functional groupings including modified serine proteases, membrane binding
proteins and regulators of
complement activation. Activation occurs through two different pathways the
classical and the
alternative. Both pathways serve to destroy infectious agents through distinct
triggering mechanisms
that eventually merge with the involvement of the component C3.
3 5 The classical pathway requires antibody binding to infectious agent
antigens. The antibodies
48


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
serve to define the target and initiate the complement system cascade,
culminating m the destruction
of the infectious agent. In this pathway, since the antibody guides initiation
of the process, the
complement can be seen as an. effector arm of the humoral immune system.
The alternative pathway of the complement system does not require the presence
of pre-
s existing antibodies for targeting infectious agent destruction. Rather, this
pathway, through low levels
of an activated component, remains constantly primed and provides surveillance
in the non-immune
host to enable targeting and destruction of infectious agents. In this case
foreign material triggers the
cascade, thereby facilitating phagocytosis or lysis (Paul, supra, pp.918-919).
Another important component of host defense is the process of inflammation.
Inflammatory
so responses are divided into four categories on the basis of pathology and
include allergic inflammation,
cytotoxic antibody mediated inflammation, immune complex mediated inflammation
and monocyte
mediated inflammation. Inflammation manifests as a combination of each of
these forms with one
predominating.
Allergic acute inflammation is observed in individuals wherein specific
antigens stimulate IgE
is antibody production. Mast cells and basophils are subsequently activated by
the attachment of
antigen-IgE complexes, resulting in the release of cytoplasmic granule
contents such as histamine.
The products of activated mast cells can increase vascular permeability and
constrict the smooth
muscle of breathing passages, resulting in anaphylaxis or asthma. Acute
inflammation is also mediated
by cytotoxic antibodies and can result in the destruction of tissue through
the binding of complement-
2 o fixing antibodies to cells. The responsible antibodies are of the IgG or
IgM types. Resultant clinical
disorders include autoimmune hemolytic anemia and thrombocytopenia as
associated with systemic
lupus erythematosis.
T_tn_m__une complex mediated acute inflammation involves the IgG or IgM
antibody types which
combine with antigen to activate the complement cascade. When such immune
complexes bind to
2 s neutrophils and macrophages they activate the respiratory burst to form
protein- and vessel-damaging
agents such as hydrogen peroxide, hydroxyl radical, hypochlorous acid, and
chloramines. Clinical
manifestations include rheumatoid arthritis and systemic lupus erythematosus.
Iu chronic inflammation or delayed-type hypersensitivity, macrophages are
activated and
process antigen for presentation to T cells that subsequently produce
lymphokines and monokines.
3 o This type of inflammatory response is likely important for defense against
intracellular parasites and
certain viruses. Clinical associations include, granulomatous disease,
tuberculosis, leprosy, and
sarcoidosis (Paul, W.E., supra, pp.1017-1018).
Extracellular Information Transmission Molecules
3 5 Intercellular communication is essential for the growth and survival of
multicellular organisms,
49


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
and in particular, for the function of the endocrine, nervous, and immune
systems. In addition, ,
intercellular communication is critical for developmental processes such as
tissue construction and
organogenesis, in which cell proliferation, cell differentiation, and
morphogenesis must be spatially and
temporally regulated in a precise and coordinated manner. Cells communicate
with one another
s through the secretion and uptake of diverse types of signaling molecules
such as hormones, growth
factors, neuropeptides, and cytokines.
Hormones
Hormones are signaling molecules that coordinately regulate basic
physiological processes
from embryogenesis throughout adulthood. These processes include metabolism,
respiration,
~.o reproduction, excretion, fetal tissue differentiation and organogenesis,
growth and development,
homeostasis, and the stress response. Hormonal secretions and the nervous
system are tightly
integrated and interdependent. Hormones are secreted by endocrine glands,
primarily the
hypothalamus and pituitary, the thyroid and parathyroid, the pancreas, the
adrenal glands, and the
ovaries and testes.
15 The secretion of hormones into the circulation is tightly controlled.
Hormones are often
secreted in diurnal, pulsatile, and cyclic patterns. Hormone secretion is
regulated by perturbations in
blood biochemistry, by other upstream-acting hormones, by neural impulses, and
by negative feedback
loops. Blood hormone concentrations are constantly monitored and adjusted to
maintain optimal,
steady-state levels. Once secreted, hormones act only on those target cells
that express specific
2 o receptors.
Most disorders of the endocrine system are caused by either hyposecretion or
hypersecretion
of hormones. Hyposecretion often occurs when a hormone's gland of origin is
damaged or otherwise
impaired. Hypersecretion often results from the proliferation of tumors
derived from hormone-
secreting cells. Inappropriate hormone levels may also be caused by defects in
regulatory feedback
25 loops or in the processing of hormone precursors. Endocrine malfunction may
also occur when the
target cell fails to respond to the hormone.
Hormones can be classified biochemically as polypeptides, steroids,
eicosanoids, or amines.
Polypeptides, which include diverse hormones such as insulin and growth
hormone, vary in size and
function and are often synthesized as inactive precursors that are processed
intracellularly into mature,
3 o active forms. Am?res, which include epinephrine and dopamine, are amino
acid derivatives that
function in neuroendocrine signaling. Steroids, which include the cholesterol-
derived hormones
estrogen and testosterone, function in sexual development and reproduction.
Eicosanoids, which
include prostaglandins and prostacyclins, are fatty acid derivatives that
function in a variety of
processes. Most polypeptides and some amines are soluble in the circulation
where they are highly
3 s susceptible to proteolytic degradation within seconds after their
secretion. Steroids and lipids are


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
insoluble and must be transported in the circulation by carrier proteins. The
following discussion will
focus primarily on polypeptide hormones.
Hormones secreted by the hypothalamus and pituitary gland play a critical role
in endocrine
function by coordinately regulating hormonal secretions from other endocrine
glands in response to
neural signals. Hypothalamic hormones include thyrotropin-releasing hormone,
gonadotropin-releasing
hormone, somatostatin, growth-hormone releasing factor, corticotropin-
releasing hormone, substance
P, dopamine, and prolactin-releasing hormone. These hormones directly regulate
the secretion of
hormones from the anterior lobe of the pituitary. Hoxmones secreted by the
anterior pituitary include
adrenocorticotropic hormone (ACTH), melanocyte-stimulating hormone,
somatotropic hormones such
s o as growth hormone and prolactin, glycoprotein hormones such as thyroid-
stimulating hormone,
luteinizing hormone (LH), and follicle-stimulating hormone (FSH), (3-
lipotropin, and (3-endorphins.
These hormones regulate hormonal secretions from the thyroid, pancreas, and
adrenal glands, and act
directly on the reproductive organs to stimulate ovulation and
spermatogenesis. The posterior pituitary
synthesizes and secretes antidiuretic hormone (ADH, vasopressin) and oxytocin.
Disorders of the hypothalamus and pituitary often result from lesions such as
primary brain
tumors, adenomas, infarction associated with pregnancy, hypophysectomy,
aneurysms, vascular
malformations, thrombosis, infections, immunological disorders, and
complications due to head trauma.
Such disorders have profound effects on the function of other endocrine
glands. Disorders associated
with hypopituitarism include hypogonadism, Sheehan syndrome, diabetes
insipidus, I~allman's disease,
2o Hand-Schuller-Christian disease, Letterer-Siwe disease, sarcoidosis, empty
sella syndrome, and
dwarftsm. Disorders associated with hyperpituitarism include acromegaly,
giantism, and syndrome of
inappropriate ADH secretion (SIADH), often caused by benign adenomas.
Hormones secreted by the thyroid and parathyroid primarily control metabolic
rates and the
regulation of serum calcium levels, respectively. Thyroid hormones include
calcitonin, somatostatin,
2s and thyroid hormone. The parathyroid secretes parathyroid hormone.
Disorders associated with
hypothyroidism include goiter, myxedema, acute thyroiditis associated with
bacterial infection,
subacute thyroiditis associated with viral infection, autoimmune thyroiditis
(Hashimoto's disease), and
cretinism. Disorders associated with hyperthyroidism include thyrotoxieosis
and its various forms,
Grave's disease, pretibial myxedema, toxic multinodular goiter, thyroid
carcinoma, and Plummer's
s o disease. Disorders associated with hyperparathyroidism include Coon
disease (chronic hypercalemia)
leading to bone resorption and parathyroid hyperplasia.
Hormones secreted by the pancreas regulate blood glucose levels by modulating
the rates of
carbohydrate, fat, and protein metabolism. Pancreatic hormones include
insulin, glucagon, amylin, ~y-
aminobutyric acid, gastrin, somatostatin, and pancreatic polypeptide. The
principal disorder associated
3 s with pancreatic dysfunction is diabetes mellitus caused by insufficient
insulin activity. Diabetes
51


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
mellitus is generally classified as either Type I (insulin-dependent, juvenile
diabetes) or Type II (non-
insulin-dependent, adult diabetes). The treatment of both forms by insulin
replacement therapy is well
known. Diabetes mellitus often leads to acute complications such as
hypoglycemia (insulin shock),
coma, diabetic ketoacidosis, lactic acidosis, and chronic complications
leading to disorders of the eye,
s kidney, skin, bone, joint, cardiovascular system, nervous system, and to
decreased resistance to
infection.
The anatomy, physiology, and diseases related to hormonal function are
reviewed in
McCance, K.L. and S.E. Huether (1994) Pathoph si~y: The Biological Basis for
Disease in Adults
and Children, Mosby-Year Book, Inc., St. Louis MO; Greenspan, F.S. and J.D.
Baxter (1994) Basic
Zo . and Clinical Endocrinolo~y, Appleton and Lange, East Norwalk CT.
Gxowth Factors
Growth factors are secreted proteins that mediate intercellular communication.
Unlike
hormones, which travel great distances via the circulatory system, most growth
factors are primarily
local mediators that act on neighboring cells. Most growth factors contain a
hydrophobic N-terminal
15 signal peptide sequence which directs the growth factor into the secretory
pathway. Most growth
factors also undergo post-translational modifications within the secretory
pathway. These
modifications can include proteolysis, glycosylation, phosphorylation, and
intramolecular disulfide bond
formation. Once secreted, growth factors bind to specific receptors on the
surfaces of neighboring
target cells, and the bound receptors trigger intracellular signal
transduction pathways. These signal
20 transduction pathways elicit specific cellular responses in the target
cells. These responses can
include the modulation of gene expression and the stimulation or inhibition of
cell division, cell
differentiation, and cell motility.
Growth factors fall into at least two broad and overlapping classes. The
broadest class
includes the large polypeptide growth factors, which are wide-ranging in their
effects. These factors
include epidermal growth factor (EGF), fibroblast growth factor (FGF),
transforming growth factor-/3
(TGF-(3), insulin-like growth factor (IGF), nerve growth factor (NGF), and
platelet-derived growth
factor (PDGF), each defining a family of numerous related factors. The large
polypeptide growth
factors, with the exception of NGF, act as mitogens on diverse cell types to
stimulate wound healing,
bone synthesis and remodeling, extracellular matrix synthesis, and
proliferation of epithelial, epidermal,
3 o and connective tissues. Members of the TGF-(3, EGF, and FGF families also
function as inductive
signals in the differentiation of embryonic tissue. NGF functions specifically
as a neurotrophic factor,
promoting neuronal growth and differentiation.
Another class of growth factors includes the hematopoietic growth factors,
which are narrow
in their target specificity. These factors stimulate the proliferation and
differentiation of blood cells
3 s such as B-lymphocytes, T-lymphocytes, erythrocytes, platelets,
eosinopluls, basophils, neutrophils,
52


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
macrophages, and their stem cell precursors. 'These factors include the colony-
stimulating factors (Ci-
CSF, M-CSF, GM-CSF, and CSF1-3), erythropoietin, and the cytokines. The
cytokines are specialized
hematopoietic factors secreted by cells of the immune system and are discussed
in detail below.
Growth factors play critical roles in neoplastic transformation of cells in
vitro and in tumor
s progression in vivo. Overexpression of the large polypeptide growth factors
promotes the proliferation
and transformation of cells in culture. Inappropriate expression of these
growth factors by tumor cells
in yivo may contribute to tumor vascularization and metastasis. Inappropriate
activity of hematopoietic
growth factors can result in anemias, leukemias, and lymphomas. Moreover,
growth factors are both
structurally and functionally related to oncoproteins, the potentially cancer-
causing products of prnto-
oncogenes. Certain FGF and PDGF family members are themselves homologous to
oncoproteins,
whereas receptors for some members of the EGF, NGF, and FGF families are
encoded by proto-
oncogenes. Growth factors also affect the transcriptional regulation of both
proto-oncogenes and
oncosuppressor genes (Pimentel, E. (1994) Handbook of Growth Factors, CRC
Press, Ann Arbor MI;
McKay, I. and I. Leigh, eds. (1993) Growth Factors: A Practical Approach,
Oxford University Press,
New York NY; Habenicht, A., ed. (1990) Growth Factors, Differentiation
Factors, and Cytokines,
Springer-Verlag, New York NY).
In addition, some of the large polypeptide growth factors play crucial roles
in the induction of
the primordial germ layers in the developing embryo. This induction ultimately
results in the formation
of the embryonic mesoderm, ectoderm, and endoderm which in turn provide the
framework for the
2 o entire adult body plan. Disruption of this inductive process would be
catastrophic to embryonic
development.
Small Peptide Factors - Neuropeptides and Vasomediators
Neuropeptides and vasomediators (NP/VM) comprise a family of small peptide
factors,
typically of 20 amino acids or less. These factors generally function in
neuronal excitation and
2s inhibition of vasoconstriction/vasodilation, muscle contraction, and
hormonal secretions from the brain
and other endocrine tissues. Included in this family are neuropeptides and
neuropeptide hormones
such as bombesin, neuropeptide Y, neurotensin, neuromedin N, melanocortins,
opioids, galanin,
somatostatin, tachykinins, urotensin II and related peptides involved in
smooth muscle stimulation,
vasopressin, vasoactive intestinal peptide, and circulatory system-borne
signaling molecules such as
3 o angiotensin, complement, calcitonin, endothelins, formyl-methionyl
peptides, glucagon, cholecystokinin,
gastrin, and many of the peptide hormones discussed above. NP/VMs can
transduce signals directly,
modulate the activity or release of other neurotransmitters and hormones, and
act as catalytic enzymes
in signaling cascades. The effects of NP/VMs range from extremely brief to
long-lasting. (Reviewed
in Martin, C.R. et al. (1985) Endocrine Physiology, Oxford University Press,
New York NY, pp. 57-
35 62.)
53


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
~:ytokines
Cytokines comprise a family of signaling molecules that modulate the immune
system and the
inflammatory response. Cytokiues are usually secreted by leukocytes, or white
blood cells, in
response to injury or infection. Cytokines function as growth and
differentiation factors that act
s primarily on cells of the immune system such as B- and T-lymphocytes,
monocytes, macrophages, and
granulocytes. Like other signaling molecules, cytokines bind to specific
plasma membrane receptors
and trigger intracellular signal transduction pathways which alter gene
expression patterns. There is
considerable potential for the use of cytokines in the treatment of
inflammation and immune system
disorders.
so Cytokine structure and function have been extensively characterized in
vitro. Most cytokines
are small polypeptides of about 30 kilodaltons or less. Over 50 cytokines have
been identified from
human and rodent sources. Examples of cytokine subfamilies include the
interferons (IFN-a, -(3, and -
Y), the interleukins (IL1-IL13), the tumor necrosis factors (TNF-a and -(3),
and the chemokines.
Many cytokines have been produced using recombinant DNA techniques, and the
activities of
25 individual cytokines have been determined in vitro. These activities
include regulation of leukocyte
proliferation, differentiation, and motility.
The activity of an individual cytokine in vitro may not reflect the full scope
of that cytokine's
activity in vivo. Cytokines are not expressed individually in vivo but are
instead expressed in
combination with a multitude of other cytokines when the organism is
challenged with a stimulus.
z o Together, these cytokines collectively modulate the immune response in a
manner appropriate for that
particular stimulus. Therefore, the physiological activity of a cytokine is
determined by the stimulus
itself and by complex interactive networks among co-expressed cytokines which
may demonstrate
both synergistic and antagonistic relationships.
Chemokines comprise a cytokine subfamily with over 30 members. (Reviewed in
Wells, T.
N.C. and M.C. Peitsch (1997) J. Leukoc. Biol. 61:545-550.) Chemokines were
initially identified as
chemotactic proteins that recruit monocytes and macrophages to sites of
inflammation. Recent
evidence indicates that chemokines may also play key roles in hematopoiesis
and H1V-1 infection.
Chemokines are small proteins which range from about 6-15 kilodaltons in
molecular weight.
Chemokines are further classified as C, CC, CXC, or CX3C based on the number
and position of
3 o critical cysteine residues. The CC chemokines, for example, each contain a
conserved motif
consisting of two consecutive cysteines followed by two additional cysteines
which occur downstream
at 24- and 16-residue intervals, respectively (ExPASy PROSITE database,
documents PS00472 and
PDOC00434). The presence and spacing of these four cysteine residues are
highly conserved,
whereas the intervening residues diverge significantly. However, a conserved
tyrosine located about
3 5 15 residues downstream of the cysteine doublet seems to be important for
chemotactic activity. Most
54


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
of the human genes encoding CC chemokines are clustered on chromosome 17,
although there are a
few examples of CC chemokine genes that map elsewhere. Other chemokines
include lymphotactin
(C chemokine); macrophage chemotactic and activating factor (MCAF/MCP-1; CC
chemokine);
platelet factor 4 and IL-8 (CXC chemokines); and fractalkine and neurotractin
(CX3C chemokines).
s (Reviewed in Luster, A.D. (1998) N. Engl. J. Med. 338:436-445.)
Receptor Molecules
'The term receptor describes proteins that specifically recognize other
molecules. The
category is broad and includes proteins with a variety of functions. The bulk
of receptors are cell
so surface proteins which bind extracellular ligands and produce cellular
responses in the areas of
growth, differentiation, endocytosis, and immune response. Other receptors
facilitate the selective
transport of proteins out of the endoplasmic reticulum and localize enzymes to
particular locations in
the cell. The term may also be applied to proteins which act as receptors for
ligands with known or
unknown chemical composition and which interact with other cellular
components. For example, the
15 steroid hormone receptors bind to and regulate transcription of DNA.
Regulation of cell proliferation, differentiation, and migration is important
for the formation and
function of tissues. Regulatory proteins such as growth factors coordinately
control these cellular
processes and act as mediators in cell-cell signaling pathways. Growth factors
are secreted proteins
that bind to specific cell-surface receptors on target cells. The bound
receptors trigger intracellular
2o signal transduction pathways which activate various downstream effectors
that regulate gene
expression, cell division, cell differentiation, cell motility, and other
cellular processes.
Cell surface receptors are typically integral plasma membrane proteins. These
receptors
recognize hormones such as catecholamines; peptide hormones; growth and
differentiation factors;
small peptide factors such as thyrotropin-releasing hormone; galanin,
somatostatin, and tachykinins;
2s and circulatory system-borne signaling molecules. Cell surface receptors on
immune system cells
recognize antigens, antibodies, and major histocompatibility complex (MHC)
bound peptides. Other
cell surface receptors bind ligands to be internalized by the cell. This
receptor-mediated endocytosis
functions in the uptake of low density lipoproteins (LDL), transferrin,
glucose- or mannose-terminal
glycoproteins, galactose-terminal glycoproteins, immunoglobulins,
phosphovitellogenins, fibrin,
3o proteinase-inhibitor complexes, plasminogen activators, and thrombospondin
(Lodish, H. et al. (1995)
Molecular Cell Biology, Scientific American Books, New York NY, p. 723;
Mikhailenko, I. et al.
(1997) J. Biol. Chem. 272:6784-6791).
Receptor Protein Kinases
Many growth factor receptors, including receptors for epidermal growth factor,
35 platelet-derived growth factor, fibroblast growth factor, as well as the
growth modulator a-thrombin,


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
contain intrinsic protein kinase activities. When growth factor binds to the
receptor, it triggers the
autophosphorylation of a serine, threonine, or tyrosine residue on the
receptor. These phosphorylated
sites are recognition sites for the binding of other cytoplasmic signaling
proteins. These proteins
participate in signaling pathways that eventually link the initial receptor
activation at the cell surface to
s the activation of a specific intracellular target molecule. In the case of
tyrosine residue
autophosphorylation, these signaling proteins contain a common domain referred
to as a Src homology
(SH) domain. SH2 domains and SH3 domains are found in phospholipase C-y, PI-3-
K p85 regulatory
subunit, Ras-GTPase activating protein, and pp60°-Sr' (Lowenstein, E.J.
et al. (1992) Cell 70:431-442).
The cytokine family of receptors share a different common binding domain and
include
so trausmembrane receptors for growth hormone (GH), interleukins,
erythropoietin, and prolactin.
Other receptors and second messenger-binding proteins have intrinsic
serine/threonine protein
kinase activity. These include activin/TGF-~3/BMP-superfamily receptors,
calcium- and
diacylglycerol-activated/phospholipid-dependant protein kinase (PK-C), and RNA-
dependant protein
kinase (PK-R). In addition, other serine/threonine protein kinases, including
nematode Twitchin, have
15 fibronectin-like, immunoglobulin C2-like domains.
G-Protein Coupled Receptors
G-protein coupled receptors (GPCRs) are integral membrane proteins
characterized by the
presence of seven hydrophobic transmembrane domains which span the plasma
membrane and form a
bundle of antiparallel alpha (a) helices. These proteins range in size from
under 400 to ovex 1000
ao amino acids (Strosberg, A.D. (1991) Eur. J. Biochem. 196:1-10; Coughlin,
S.R. (1994) Curr. Opin.
Cell Biol. 6:191-197). The amino-terminus of the GPCR is extracellular, of
variable length and often
glycosylated; the carboxy-terminus is cytoplasmic and generally
phosphorylated. Extracellular loops
of the GPCR alternate with intracellular loops and link the transmembrane
domains. The most
conserved domains of GPCRs are the transmembrane domains and the first two
cytoplasmic loops.
25 The transmembrane domains account for structural and functional features of
the receptor. In most
cases, the bundle of a helices forms a binding pocket. In addition, the
extracellular N-tern~inal
segment or one or more of the three extracellular loops may also participate
in ligand binding. Ligand
binding activates the receptor by inducing a conformational change in
intracellular portions of the
receptor. The activated receptor, in turn, interacts with an intracellular
heterotrimeric guanine
3 o nucleotide binding (G) protein complex which mediates fuxther
intracellular signaling activities,
generally the production of second messengers such as cyclic AMP (cAMP),
phospholipase C, inositol
triphosphate, or interactions with ion channel proteins (Baldwin, J.M. (1994)
Curr. Opin. Cell Biol.
6:180-190).
GPCRs include those for acetylcholine, adenosine, epinephrine and
norepinephrine, bombesin,
3 5 bradykinin, chemokines, dopamine, endothelia, y-aminobutyric acid (GABA),
follicle-stimulating
56


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
hormone (FSI~, glutamate, gonadotropin-releasing hormone (GnRfl), hepatocyte
growth factor,
histamine, leukotrienes, melanocortins, neuropeptide Y, opioid peptides,
opsins, prostanoids, serotonin,
somatostatin, tachykinins, thrombin, thyrotropin-releasing hormone (TRITj,
vasoactive intestinal
polypeptide family, vasopressin and oxytocin, and orphan receptors.
GPCR mutations, which may cause loss of function or constitutive activation,
have been
associated with numerous human diseases (Coughlin, supra). For instance,
retinitis pigmentosa may
arise from mutations in the rhodopsin gene. Rhodopsin is the retinal
photoreceptor which is located
within the discs of the eye rod cell. Parma, J. et al. (1993, Nature 365:649-
651) report that somatic
activating mutations in the thyxotropin receptor cause hyperfunctioning
thyroid adenomas and suggest
z o that certain GPCRs susceptible to constitutive activation may behave as
protooncogenes.
Nuclear Rectors
Nuclear receptors bind small molecules such as hormones or second messengers,
leading to
increased receptor-binding affinity to specific chromosomal DNA elements. In
addition the affinity
for other nuclear proteins may also be altered. Such binding and protein-
protein interactions may
15 regulate and modulate gene expression. Examples of such receptors include
the steroid hormone
receptors family, the retinoic acid receptors family, and the thyroid hormone
receptors family.
Li~and-Gated Receptor Ion Channels
Ligand-gated receptor ion channels fall into two categories. The first
category, extracellular
ligand-gated receptor ion channels (ELGs), rapidly transduce neurotransmitter-
binding events into
z o electrical signals, such as fast synaptic neurotransmission. ELG function
is regulated by post-
trauslataonal modification. The second category, intracellular ligand-gated
receptor ion channels
(>I,Gs), are activated by many intracellular second messengers and do not
require post-translational
modifications) to effect a channel-opening xesponse.
ELGs depolarize excitable cells to the threshold of action potential
generation. In non-
25 excitable cells, ELGs permit a limited calcium ion-influx during the
presence of agonist. ELGs include
channels directly gated by neurotransmitters such as acetylcholine, L-
glutamate, glycine, ATP,
serotonin, GABA, and histamine. ELG genes encode proteins having strong
structural and functional
similarities. ILGs are encoded by distinct and unrelated gene families and
include receptors for
cAMP, cGMP, calcium ions, ATP, and metabolites of arachidonic acid.
3 o Macrophage Scavenger Receptors -
Macrophage scavenger receptors with broad ligand specificity may participate
in the binding
of low density lipoproteins (LDL) and foreign antigens. Scavenger receptors
types I and II are
trimeric membrane proteins with each subunit containing a small N-terminal
intracellular domain, a
transmembrane domain, a large extracellular domain, and a C-terminal cysteine-
rich domain. The
3 5 extracellular domain contains a short spacer domain, an a helical coiled-
coil domain, and a triple
57


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
helical collagenous domain. These receptors have been shown to bind a spectrum
of ligands, including
chemically modified lipoproteins and albumin, polyribonucleotides,
polysaccharides, phospholipids, and
asbestos (Matsumoto, A. et al. (1990) Proc. Natl. Acad. Sci. USA 87:9133-9137;
Elomaa, O. et al.
(1995) Cell 80:603-609). The scavenger receptors are thought to play a key
role in atherogenesis by
s mediating uptake of modified LDL in arterial walls, and in host defense by
binding bacterial
endotoxins, bacteria, and protozoa.
T-Cell Receptors
T cells play a dual role in the immune system as effectors and regulators,
coupling antigen
recognition with the transmission of signals that induce cell death in
infected cells and stimulate
so proliferation of other immune cells. Although a population of T cells can
recognize a wide range of
different antigens, an individual T cell can only recognize a single antigen
and only when it is presented
to the T cell receptor (TCR) as a peptide complexed with a major
histocompatibility molecule (MHC)
on the surface of an antigen presenting cell. The TCR on most T cells consists
of immunoglobuJin-like
integral membrane glycoproteins containing two polypeptide subunits, a and (3,
of similar molecular
15 weight. Both TCR subunits have an extracellular domain containing both
variable and constant
regions, a trausrnembrane domain that traverses the membrane once, and a short
intracellular domain
(Saito, H. et a1. (2984) Nature 309:757-762). The genes for the TCR subunits
are constructed through
somatic rearrangement of different gene segments. Interaction of antigen in
the proper MHC context
with the TCR initiates signaling cascades that induce the proliferation,
maturation, and function of
2o cellular components of the immune system (Weiss, A. (1991) Annu. Rev.
Genet. 25:487-510).
Rearrangements in TCR genes and alterations in TCR expression have been noted
in lymphomas,
leukemias, autoimmune disorders, and immunodeficiency disorders (Aisenberg,
A.C. et al. (1985) N.
Engl. T. Med. 313:529-533; Weiss, supra).
as Intracellular Signaling Molecules
Intracellular signaling is the general process by which cells respond to
extracellular signals
(hormones, neurotransmitters, growth and differentiation factors, etc.)
through a cascade of
biochemical reactions that begins with the binding of a signaling molecule to
a cell membrane receptor
and ends with the activation of an intracellular target molecule. Intermediate
steps in the process
3 o involve the activation of various cytoplasmic proteins by phosphorylation
via protein kinases, and their
deactivation by protein phosphatases, and the eventual translocation of some
of these activated
proteins to the cell nucleus where the transcription of specific genes is
triggered. The intracellular
signaling process regulates aIt types of cell functions including cell
proliferation, cell differentiation, and
gene transcription, and involves a diversity of molecules including protein
kinases and phosphatases,
3 5 and second messenger molecules, such as cyclic nucleotides, calcium-
calmodulin, inositol, and various
5~


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
mvogens, znai reguiaie proiem pnospnoryat~on.
Protein Phosphorylation
Protein kinases and phosphatases play a key role in the intracellular
signaling process by
controlling the phosphorylation and activation of various signaling proteins.
The high energy phosphate
s for this reaction is generally transferred from the adenosine triphosphate
molecule (ATP) to a
particular protein by a protein kinase and removed from that protein by a
protein phosphatase. Protein
kinases are roughly divided into two groups: those that phosphorylate tyrosine
residues (protein
tyrosine kinases, PTK) and those that phosphorylate serine or threonine
residues (serine/threonine
kinases, STK). A few protein kinases have dual specificity for
serine/threonine and tyrosine residues.
so Almost all kinases contain a conserved 250-300 amino acid catalytic domain
containing specific
residues and sequence motifs characteristic of the kinase family (Hardie, G.
and S. Hanks (1995) The
Protein Kinase Facts Books, Vol I:7-20, Academic Press, San Diego CA).
STKs include the second messenger dependent protein kinases such as the cyclic-
AMP
dependent protein kinases (PKA), involved in mediating hormone-induced
cellular responses;
is calcium-calmodulin (CaM) dependent protein kinases, involved in regulation
of smooth muscle
contraction, glycogen breakdown, and neurotransmission; and the mitogen-
activated protein kinases
(MAP) which mediate signal transduction from the cell surface to the nucleus
via phosphorylation
cascades. Altered PKA expression is implicated in a variety of disorders and
diseases including
cancer, thyroid disorders, diabetes, atherosclerosis, and cardiovascular
disease (Isselbacher, K.J. et al.
20 (1994) Harrison's Principles of Internal Medicine, McGraw-Hill, New York
NY, pp. 416-431, 1887).
PTKs are divided into transmembrane, receptor PTKs and nontransmembrane, non-
receptor
PTKs. Transmembrane PTKs are receptors for most growth factors. Non-receptor
PTKs lack
transmembrane regions and, instead, form complexes with the intracellular
regions of cell surface
receptors. Receptors that function through non-receptor PTKs include those for
cytokines and
2 s hormones (growth hormone and prolactin) and antigen-specific receptors on
T and B lymphocytes.
Many of these PTKs were first identified as the products of mutant oncogenes
in cancer cells in
which their activation was no longer subject to normal cellular controls. In
fact, about one third of the
known oncogenes encode PTKs, and it is well known that cellular transformation
(oncogenesis) is
often accompanied by increased tyrosine phosphorylation activity (Charbonneau,
H. and N.K. Tonks
so (1992) Annu. Rev. CellBiol. 8:463-493).
An additional family of protein kinases previously thought to exist only in
procaryotes is the
histidine protein kinase family (HPK). HPKs bear little homology with
mammalian STKs or PTKs but
have distinctive sequence motifs of their own (Davie, J.R. et al. (1995) J.
Biol. Chem.
270:19861-19867). A histidine residue in the N-terminal half of the molecule
(region I) is an
s s autophosphorylation site. Three additional motifs located in the C-
terminal half of the molecule include
59


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
an invariant asparagine residue in region II and two glycine-rich loops
characteristic of nucleotide
binding domains in regions III and IV. Recently a branched chain alpha-
ketoacid dehydrogenase
kinase has been found with characteristics of HPK in rat (Davie, su ra).
Protein phosphatases regulate the effects of protein kinases by removing
phosphate groups
from molecules previously activated by kinases. The two principal categories
of protein phosphatases
are the protein (serine/threonine) phosphatases (PPs) and the protein tyrosine
phosphatases (PTPs).
PPs dephosphorylate phosphoserine/threonine residues and are important
regulators of many
cAMP-mediated hormone responses (Cohen, P. (1989) Annu. Rev. Biochem. 58:453-
508). PTPs
reverse the effects of protein tyrosine kinases and play a significant role in
cell cycle and cell signaling
so processes (Charbonneau, supra). As previously noted, many PTKs are encoded
by oncogenes, and
oncogenesis is often accompanied by increased tyrosine phosphorylation
activity. It is therefore
possible that PTPs may prevent or reverse cell transformation and the growth
of various cancers by
controlling the levels of tyrosine phosphorylation in cells. This hypothesis
is supported by studies
showing that overexpression of PTPs can suppress transformation in cells, and
that specific inhibition
of PTPs can enhance cell transformation (Charbonneau, supra).
Phospholipid and Inositol-Pho~hate Si~nalin~
Inositol phospholipids (phosphoinositides) are involved in an intracellular
signaling pathway that
begins with binding of a signaling molecule to a G-protein linked receptor in
the plasma membrane.
This leads to the phosphorylation of phosphatidylinositol (PI) residues on the
inner side of the plasma
~ o membrane to the biphosphate state (PlPz) by inositol kinases.
Simultaneously, the G-protein linked
receptor binding stimulates a trimeric G-protein which in turn activates a
phosphoinositide-specific
phospholipase C-(3. Phospholipase C-(3 then cleaves P1P2 into two products,
inositol triphosphate (IP3)
and diacylglycerol. These two products act as mediators for separate signaling
events. IP3 diffuses
through the plasma membrane to induce calcium release from the endoplasmic
reticulum (ER), while
2 5 diacylglycerol remains in the membrane and helps activate protein kinase
C, an STK that
phosphorylates selected proteins in the target cell. The calcium response
initiated by IP3 is terminated
by the dephosphorylatiori of IP3 by specific inositol phosphatases. Cellular
responses that are
mediated by this pathway are glycogen breakdown in the liver in response to
vasopressin, smooth
muscle contraction in response to acetylcholine, and thrombin-induced platelet
aggregation.
3 o Cyclic Nucleotide Signaling
Cyclic nucleotides (CAMP and cGMP) function as intracellular second messengers
to
transduce a variety of extracellular signals including hormones, light, and
neurotransmitters. In
particular, cyclic-AMP dependent protein kinases (PKA) are thought to account
for all of the effects
of cAMP in most mammalian cells, including various hormone-induced cellular
responses. Visual
3 5 excitation and the phototransmission of light signals in the eye is
controlled by cyclic-GMP regulated,


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Caz+-specific channels. Because of the importance of cellular levels of cyclic
nucleotides in mediating
these various responses, regulating the synthesis and bxeakdown of cyclic
nucleotides is an important
matter. Thus adenylyl cyclase, which synthesizes cAMP from AMP, is activated
to increase cAMP
levels in muscle by binding of adrenaline to (3-andrenergic receptors, while
activation of guanylate
cyclase and increased cGMP levels in photoreceptors leads to reopening of the
Ca2+-specific channels
and recovery of the dark state in the eye. In contrast, hydrolysis of cyclic
nucleotides by cAMP and
cGMP-specific phosphodiesterases (PDEs) produces the opposite of these and
other effects mediated
by increased cyclic nucleotide levels. PDEs appear to be particularly
important in the regulation of
cyclic nucleotides, considering the diversity found in this family of
proteins. At least seven families of
1o mammalian PDEs (PDE1-7) have been identified based on substrate specificity
and affinity, sensitivity
to cofactors, and sensitivity to inhibitory drugs (Beavo, J.A. (1995)
Physiological Reviews 75:725-748).
PDE inhibitors have been found to be particularly useful in treating various
clinical disorders.
Rolipram, a specific inhibitor of PDE4, has been used in the treatment of
depression, and similar
inhibitors are undergoing evaluation as anti-inflammatory agents. Theophylline
is a nonspecific PDE
1s inhibitor used in the treatment of bronchial asthma and other respiratory
diseases (Banner, K.H. and
C.P. Page (1995) Eur. Respir. J. 8:996-1000).
G-Protein Si~nalin~
Guanine nucleotide binding proteins (G-proteins) are critical mediators of
signal transduction
between a particular class of extracellular receptors, the G-protein coupled
receptors (GPCR), and
2o intracellular second messengers such as cAMP and Caz+. G-proteins are
linked to the cytosolic side
of a GPCR such that activation of the GPCR by ligand binding stimulates
binding of the G-protein to
GTP, inducing an "active" state in the G-protein. In the active state, the G-
protein acts as a signal to
trigger other events in the cell such as the increase of CAMP levels or the
release of Ca2+ into the
cytosol from the ER, which, in turn, regulate phosphorylation and activation
of other intracellular
2 s proteins. Recycling of the G-protein to the inactive state involves
hydrolysis of the bound GTP to
GDP by a GTPase activity in the G-protein. (See Alberts, B. et al. (1994)
Molecular Biology of the
Cell, Garland Publishing, Inc., New York NY, pp.734-759.) Two structurally
distinct classes of G-
proteins are recognized: heterotrimeric G-proteins, consisting of three
different subunits, and
monomeric, low molecular weight (LMW), G-proteins consisting of a single
polypeptide chain.
3 o The three polypeptide subunits of heterotrimeric G-proteins are the a, [3,
and 'y subunits. The
a subunit binds and hydrolyzes GTP. The (3 and y subunits form a tight complex
that anchors the
protein to the inner side of the plasma membrane. The (3 subunits, also known
as G-[3 proteins or j3
transducins, contain seven tandem repeats of the WD-repeat sequence motif, a
motif found in many
proteins with regulatory functions. Mutations and variant expression of (3
trausducin proteins are
61


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
linked mth vanous disorders (Veer, F.J. et al. (1994) Nature 371:297-300;
Margottin, F. et al. (1998)
Mol. Cell 1:565-574).
LMW GTP-proteins are GTPases which regulate cell growth, cell cycle control,
protein
secretion, and intracellular vesicle interaction. They consist of single
polypeptides which, like the a
s subunit of the heterotrimeric G-proteins, are able to bind and hydrolyze
GTP, thus cycling between an
inactive and an active state. At least sixty members of the LMW G-protein
superfamily have been
identified and are currently grouped into the six subfamilies of ras, rho,
arf, sarl, ran, and rab.
Activated ras genes were initially found in human cancers, and subsequent
studies confirmed that ras
function is critical in determining whether cells continue to grow or become
differentiated. Other
so members of the LMW G-protein superfamily have roles in signal transduction
that vary with the
function of the activated genes and the locations of the G-proteins.
Guanine nucleotide exchange factors regulate the activities of LMW G-proteins
by
determining whether GTP or GDP is bound. GTPase-activating protein (GAP) binds
to GTP-ras and
induces it to hydrolyze GTP to GDP. In contrast, guanine nucleotide releasing
protein (GNRP) binds
s5 to GDP-ras and induces the release of GDP and the binding of GTP.
Other regulators of G-protein signaling (RGS) also exist that act primarily by
negatively
regulating the G-protein pathway by an unknown mechanism (Druey, K.M. et al.
(1996) Nature
379:742-746). Some 15 members of the RGS family have been identified. RGS
family members are
related structurally through similarities in an approximately 120 amino acid
region termed the RGS
2 0 domain and functionally by their ability to inhibit the interleukin
(cytokine) induction of MAP kinase in
cultured mammalian 293T cells (Druey, supra).
Calcium Si~naliu~ Molecules
Ca+2 is another second messenger molecule that is even more widely used as an
intracellular
mediator than cAMP. Two pathways exist by which Ca+2 can enter the cytosol in
response to
25 extracellular signals: One pathway acts primarily in nerve signal
transduction where Ca+2 enters a
nerve terminal through a voltage-gated Ca+2 channel. The second is a more
ubiquitous pathway in
which Ca+2 is released from the ER into the cytosol in response to binding of
an extracellular signaling
molecule to a receptor. Caz+ directly activates regulatory enzymes, such as
protein kinase C, which
trigger signal transduction pathways. Caz+ also binds to specific Ca~+ binding
proteins (CBPs) such as
s o calmodulin (CaM) which then activate multiple target proteins in the cell
including enzymes, membrane
transport pumps, and ion channels. CaM interactions are involved in a
multitude of cellular processes
including, but not limited to, gene regulation, DNA synthesis, cell cycle
progression, mitosis,
cytokinesis, cytoskeletal organization, muscle contraction, signal
transduction, ion homeostasis,
exocytosis, and metabolic regulation (Cello, M.R. et al. (1996) Guidebook to
Calcium-binding Proteins,
62


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Uxtord Uluversity Yress, Uxtbrd, UK, pp. 15-LU). Some C,liYs can serve as a
storage depot for ~:a'~
in an inactive state. Calsequestrin is one such CBP that is expressed in
isoforms specific to cardiac
muscle and skeletal muscle. It is suggested that calsequestrin binds Ca2+ in a
rapidly exchangeable
state that is released during Ca2+ -signaling conditions (Cello, M.R. et al.
(1996) Guidebook to
s Calcium-binding_Proteins, Oxford University Press, New York NY, pp. 222-
224).
C clips
Cell division is the fundamental process by which all living things grow and
reproduce. In
most organisms, the cell cycle consists of three principle steps; interphase,
mitosis, and cytokinesis.
Interphase, involves preparations for cell division, replication of the DNA
and production of essential
so proteins. In mitosis, the nuclear material is divided and separates to
opposite sides of the cell.
Cytokinesis is the final division and fission of the cell cytoplasm to produce
the daughter cells.
The entry and exit of a cell from mitosis is regulated by the synthesis and
destruction of a
family of activating proteins called cyclins. Cyclins act by binding to and
activating a group of
cyclin-dependent protein kinases (Cdks) which then phosphorylate and activate
selected proteins
15 involved in the mitotic process. Several types of cyclins exist.
(Ciechanover, A. (1994) Cell
79:13-21.) Twa principle types are mitotic cyclin, or cyclin B, which controls
entry of the cell into
mitosis, and G1 cyclin, which controls events that drive the °cell out
of mitosis.
Signal Complex Scaffolding Proteins
Ceretain proteins in intracellular signaling pathways serve to link or cluster
other proteins
2 o involved in the signaling cascade. A conserved protein domain called the
PDZ domain has been
identified in various membrane-associated signaling proteins. This domain has
been implicated in
receptor and ion channel clustering and in the targeting of multiprotein
signaling complexes to
specialized functional regions of the cytosolic face of the plasma membrane.
(For a review of PDZ
domain-containing proteins, see Pouting, C.P. et al. (1997) Bioessays 19:469-
479.) A large proportion
25 of PDZ domains are found in the eukaryotic MAGUK (membrane-associated
guanylate kinase)
protein family, members of which bind to the intracellular domains of
receptors and channels.
However, PDZ domains are also found in diverse membrane-localized proteins
such as protein
tyrosine phosphatases, serine/threonine kinases, G-protein cofactors, and
synapse-associated pxoteins
such as syntrophins and neuxonal nitric oxide synthase (nNOS). Generally,
about one to three PDZ
s o domains are found in a given protein, although up to nine PDZ domains have
been identified in a single
protein.
Membrane Transport Molecules
The plasma membrane acts as a barrier to most molecules. Transport between the
cytoplasm
3 s and the extracellular environment, and between the cytoplasm and lumenal
spaces of cellular
63


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
organelles requires specific transport proteins. Each transport protein
carries a particular class of
molecule, such as ions, sugars, or amino acids, and often is specific to a
certain molecular species of
the class. A variety of human inherited diseases are caused by a mutation in a
transport protein. For
example, cystinuria is an inherited disease that results from the inability to
transport cystine, the
disulfide-linked dimer of cysteine, from the urine into the blood.
Accumulation of cystine in the urine
leads to the formation of cystine stones in the kidneys.
Transport proteins are multi-pass transmembrane proteins, which either
actively transport
molecules across the membrane or passively allow them to cross. Active
transport involves
directional pumping of a solute across the membrane, usually against an
electrochemical gradient.
io Active transport is tightly coupled to a source of metabolic energy, such
as ATP hydrolysis or an
electrochemically favorable ion gradient. Passive transport involves the
movement of a solute down
its electrochemical gradient. Transport proteins can be further classified as
either carrier proteins or
channel proteins. Carrier proteins, which can function in active or passive
transport, bind to a specific
solute to be transported and undergo a conformational change which transfers
the bound solute across
the membrane. Channel proteins, which only function in passive transport, form
hydrophilic pores
across the membrane. When the pores open, specific solutes, such as inorganic
ions, pass through the
membrane and down the electrochemical gradient of the solute.
Carrier proteins which transport a single solute from one side of the membrane
to the other
are called uniporters. In contrast, coupled transporters link the transfer of
one solute with
2 o simultaneous or sequential transfer of a second solute, either in the same
direction (symport) or in the
opposite direction (antiport). For example, intestinal and kidney epithelium
contains a variety of
symporter systems driven by the sodium gradient that exists across the plasma
membrane. Sodium
moves into the cell down its electrochemical gradient and brings the solute
into the cell with it. The
sodium gradient that provides the driving force for solute uptake is
maintained by the ubiquitous
Na+IK+ ATPase. Sodium-coupled transporters include the mammalian glucose
transporter (SGLT1),
iodide transporter (NIS), and multivitamin transporter (SMVT). All three
transparters have twelve
putative transmembrane segments, extracellular glycosylation sites, and
cytoplasmically-oriented N-
and C-termini. NIS plays a crucial role in the evaluation, diagnosis, and
treatment of various thyroid
pathologies because it is the molecular basis for radioiodide thyroid-imaging
techniques and for
3 o specific targeting of radioisotopes to the thyroid gland (Levy, O. et al.
(1997) Proc. Natl. Acad. Sci.
USA 94:5568-5573). SMVT is expressed in the intestinal mucosa, kidney, and
placenta, and is
implicated in the transport of the water-soluble vitamins, e.g., biotin and
pantothenate (Prasad, P.D. et
al. (1998) J. Biol. Chem. 273:7501-7506).
Transporters play a major role in the regulation of pH, excretion of drugs,
and the cellular
K+/Na+ balance. Monocarboxylate anion transporters are proton-coupled
symporters with a broad
64


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
substrate specificity that includes L-lactate, pyruvate, and the ketone bodies
acetate, acetoacetate, and
beta-hydroxybutyrate. At least seven isoforms have been identified to date.
The isoforms are
predicted to have twelve transmembrane (TM) helical domains with a large
intracellular loop between
TM6 and TM7, and play a critical role in maintaining intracellular pH by
removing the protons that are
s produced stoichiometrically with lactate during glycolysis. The best
characterized
H(+)-monocarboxylate transporter is that of the erythrocyte membrane, which
transports L-lactate
and a wide range of other aliphatic monocarboxylates. Other cells possess H(+)-
linked
monocarboxylate transporters with differing substrate and
inhibitor.selectivities. In particular, cardiac
muscle and tumor cells have transporters that differ in their K"~ values for
certain substrates, including
so stereoselectivity for L- over D-lactate, and in their sensitivity to
inhibitors. There are
Na(+)-monocarboxylate cotransporters on the luminal surface of intestinal and
kidney epithelia, which
allow the uptake of lactate, pyruvate, and ketone bodies in these tissues. In
addition, there are specific
and selective transporters for organic cations and organic anions in organs
including the kidney,
intestine~and liver. Organic anion transporters are selective for hydrophobic,
charged molecules with
is electron-attracting side groups. Organic cation transporters, such as the
ammonium transporter,
mediate the secretion of a variety of drugs and endogenous metabolites, and
contribute to the
maintenance of intercellular pH. (Poole, R.C. and A.P. Halestrap (1993) Am. J.
Physiol.
264:C761-C782; Price, N.T. et al. (1998) Biochem. J. 329:321-328; and
Martinelle, K. and I.
Haggstrom (1993) J. Biotechuol. 30: 339-350.)
2o The largest and most diverse family of transport proteins known is the ATP-
binding cassette
(ABC) transporters. As a family, ABC transporters can transport substances
that differ markedly in
chemical structure and size, ranging from small molecules such as ions,
sugars, amino acids, peptides,
and phospholipids, to lipopeptides, large proteins, and complex hydrophobic
drugs. ABC proteins
consist of four modules: two nucleotide-binding domains (NBD), which hydrolyze
ATP to supply the
2 s energy required for transport, and two membrane-spanning domains (MSD),
each containing six
putative transmembrane segments. These four modules may be encoded by a single
gene, as is the
case for the cystic fibrosis trausmembrane regulator (CF'TR), or by separate
genes. When encoded
by separate genes, each gene product contains a single NBD and MSD. These
"half molecules" form
homo- and heterodimers, such as Tap1 and Tap2, the endoplasmic reticulum-based
major
3 o histocompatibility (MHC) peptide transport system. Several genetic
diseases are attributed to defects
in ABC transporters, such as the following diseases and their corresponding
proteins: cystic fibrosis
(CFTR, an ion channel), adrenoleukodystrophy (adrenoleukodystrophy protein,
ALDP), Zellweger
syndrome (peroxisomal membrane protein-70, PMP70), and hyperinsulinemic
hypoglycemia
(sulfonylurea receptor, SUR). Overexpression of the multidrug resistance (MDR)
protein, another
3 5 ABC transporter, in human cancer cells makes the cells resistant to a
variety of cytotoxic drugs used


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
in chemotherapy (Taglight, D. and S. Michaelis (1998) Meth. Enzymol. 292:131-
163).
Transport of fatty acids across the plasma membrane can occur by diffusion, a
high capacity,
low affinity process. However, under normal physiological conditions a
significant fraction of fatty
acid transport appears to occur via a high affinity, low capacity protein-
mediated transport process.
s Fatty acid transport protein (FATP), an integral membrane protein with four
transmembrane
segments, is expressed in tissues exhibiting high levels of plasma membrane
fatty acid flux, such as
muscle, heart, and adipose. Expression of FATP is upregulated in 3T3-L1 cells
during adipose
conversion, and expression in COS7 fibroblasts elevates uptake of long-chain
fatty acids (Hui, T.Y. et
al. (1998) J. Biol. Chem. 273:27420-27429).
s o Ion Channels
The electrical potential of a cell is generated and maintained by controlling
the movement of
ions across the plasma membrane. The movement of ions requires ion channels,
which form an ion-
selective pore within the membrane. There are two basic types of ion channels,
ion transporters and
gated ion channels. Ion transporters utilize the energy obtained from ATP
hydrolysis to actively
15 transport an ion against the ion's concentration gradient. Gated ion
channels allow passive flow of an
ion down the ion's electrochemical gradient under restricted conditions.
Together, these types of ion
channels generate, maintain, and utilize an electrochemical gradient that is
used in 1) electrical impulse
conduction down the axon of a nerve cell, 2) transport of molecules into cells
against concentration
gradients, 3) initiation of muscle contraction, and 4) endocrine cell
secretion.
2 o Ion transporters generate and maintain the resting electrical potential of
a cell. Utilizing the
energy derived from ATP hydrolysis, they transport ions against the ion's
concentration gradient.
These transmembrane ATPases are divided into three families. The
phosphorylated (P) class ion
transporters, including Na+-K+ ATPase, Ca2~-ATPase, and H+-ATPase, are
activated by a
phosphorylation event. P-class ion transporters are responsible for
maintaining resting potential
2s distributions such that cytosolic concentrations of Na+ and Caz+ are low
and cytosolic concentration of
K+ is high. The vacuolar (V) class of ion transporters includes H~ pumps on
intracellular organelles,
such as lysosomes and Golgi. V-class ion transporters are responsible for
generating the low pH
within the lumen of these organelles that is required for function. The
coupling factor (F) class
consists of H+ pumps in the mitochondria. F-class ion transporters utilize a
proton gradient to generate
3 o ATP from ADP and inorganic phosphate (P;).
The resting potential of the cell is utilized in many processes involving
carrier proteins and
gated ion channels. Carrier proteins utilize the resting potential to
transport molecules into and out of
the cell. Amino acid and glucose transport into many cells is linked to sodium
ion co-transport
(symport) so that the movement of Na+ down an electrochemical gradient drives
transport of the other
3 s molecule up a concentration gradient. Similarly, cardiac muscle links
transfer of Ca2+ out of the cell
66


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
with transport of Na+ into the cell (antiport).
Ion channels share common structural and mechanistic themes. The channel
consists of four
or five subunits or protein monomers that are arranged like a barrel in the
plasma membrane. Each
subunit typically consists of six potential transmembrane segments (S 1, S2,
S3, S4, S5, and S6). The
center of the barrel forms a pore lined by a-helices or (3-strands. The side
chains of the amino acid
residues comprising the a-helices or (3-strands establish the charge (cation
or anion) selectivity of the
channel. The degree of selectivity, or what specific ions are allowed to pass
through the channel,
depends on the diameter of the narrowest part of the pore.
Gated ion channels control ion flow by regulating the opening and closing of
pores. These
io channels are categorized according to the manner of regulating the gating
function. Mechanically-
gated channels open pores in response to mechanical stress, voltage-gated
channels open pores in
response to changes in membrane potential, and ligand-gated chancels open
pores in the presence of a
specific ion, nucleotide, or neurotransmitter.
Voltage-gated Na+ and K+ channels are necessary for the function of
electrically excitable
s5 cells, such as nerve and muscle cells. Action potentials, which lead to
neurotransmitter release and
muscle contraction, arise from large, transient changes in the permeability of
the membrane to Nay
and I~+ ions. Depolarization of the membrane beyond the threshold level opens
voltage-gated Nay
channels. Sodium ions flow into the cell, further depolarizing the membrane
and opening more
voltage-gated Na+ channels, which propagates the depolarization down the
length of the cell.
2 o Depolarization also opens voltage-gated potassium channels. Consequently,
potassium ions flow
outward, which leads to repolarization of the membrane. Voltage-gated channels
utilize charged
residues in the fourth transmembrane segment (S4) to sense voltage change. The
open state lasts
only about 1 millisecond, at which time the channel spontaneously converts
into an inactive state that
cannot be opened irrespective of the membrane potential. Inactivation is
mediated by the channel's
a 5 N-terminus, which acts as a plug that closes the pore. The transition from
an inactive to a closed state
requires a return to resting potential.
Voltage-gated Na+ channels are heterotrimeric complexes composed of a 260 kDa
pore
forming a subunit that associates with two smaller auxiliary subunits, (31 and
(32. The (32 subunit is an
integral membrane glycoprotein that contains an extracellular Ig domain, and
its association with a and
3 0 (31 subunits correlates with increased functional expression of the
channel, a change in its gating
properties, and an increase in whole cell capacitance due to an increase in
membrane surface area.
(Isom, L.L. et al. (1995) Cell 83:433-442.)
Voltage-gated Ca 2+ channels are involved in presynaptic neurotransmitter
release, and heart
and skeletal muscle contraction. The voltage-gated Ca 2+ channels from
skeletal muscle (L-type) and
3 5 brain (N-type) have been purified, and though their functions differ
dramatically, they have similar
67


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
subunit compositions. The channels are composed of three subunits. The al
subunit forms the
membrane pore and voltage sensor, while the a28 and (3 subunits modulate the
voltage-dependence,
gating properties, and the current amplitude of the channel. These subunits
are encoded by at least six
al, one a28, and four [3 genes. A fourth subunit, ~y, has been identified in
skeletal muscle. (Walker, D.
s et al. (1998) J. Biol. Chem. 273:2361-2367; and Jay, S.D. et a1. (1990)
Science 248:490-492.)
Chloride channels are necessary in endocrine secretion and in regulation of
cytosolic and
organelle pH. In secretory epithelial cells, Cl- enters the cell across a
basolateral membrane through
an Na+, K~/CI- cotransporter, accumulating in the cell above its
electrochemical equilibrium
concentration. Secretion of Cl- from the apical surface, in response to
hormonal stimulation, leads to
so flow of Na+ and water into the secretory lumen. The cystic fibrosis
transmembrane conductance
regulator (CFTR) is a chloride channel encoded by the gene for cystic
fibrosis, a common fatal genetic
disorder in humans. Loss of CFTR function decreases transepithelial water
secretion anl, as a result,
the layers of mucus that coat the respiratory tree, pancreatic ducts, and
intestine are dehydrated and
difficult to clear. The resulting blockage of these sites leads to pancreatic
insufficiency, "meconium
i5 ileus", and devastating "chronic obstructive pulmonary disease" (Al-Awqati,
Q. et al. (1992) J. Exp.
Biol. 172:245-266).
Many intracellular organelles contain H+-ATPase pumps that generate
transmembrane pH
and electrochemical differences by moving protons from the cytosol to the
organelle lumen. If the
membrane of the organelle is permeable to other ions, then the electrochemical
gradient can be
2 o abrogated without affecting the pH differential. In fact, removal of the
electrochemical barrier allows
more H+ to be pumped across the membrane, increasing the pH differential. Cl-
is the sole counterion
of H+ translocation in a number of organelles, including chromaffm granules,
Golgi vesicles,
lysosomes, and endosomes. Functions that require a low vacuolar pH include
uptake of small
molecules such as biogenic amines in chromaffin granules, processing of
vacuolar constituents such as
2 s pro-hormones by proteolytic enzymes, and protein degradation in lysosomes
(Al-Awqati, su ra).
Ligand-gated channels open their pores when an extracellular or intracellular
mediator binds to
the channel. Neurotransmitter-gated channels are channels that open when a
neurotransmitter binds
to their extracellular domain. These channels exist in the postsynaptic
membrane of nerve or muscle
cells. There are two types of neurotransmitter-gated channels. Sodium channels
open in response to
3 o excitatory neurotransmitters, such as acetylcholine, glutamate, and
serotonin. This opening causes an
influx of Na+ and produces the initial localized depolarization that activates
the voltage-gated channels
and starts the action potential. Chloride channels open in response to
inhibitory neurotransmitters,
such as y-aminobutyric acid (GABA) and glycine, leading to hyperpolarization
of the membrane and
the subsequent generation of an action potential.
35 Ligand-gated channels can be regulated by intracellular second messengers.
Calcium-
68


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
activated K+ channels are gated by internal calcium ions. In nerve cells, an
influx of calcium during
depolarization opens K+ channels to modulate the magnitude of the action
potential (Ishi, T.M. et al.
(1997) Proc. Natl. Acad. Sci. USA 94:11651-11656). Cyclic nucleotide-gated
(CNG) channels are
gated by cytosolic cyclic nucleotides. The best examples of these are the cAMP-
gated Na+ channels
involved in olfaction and the cGMP-gated cation channels involved in vision.
Both systems involve
Iigand-mediated activation of a G-protein coupled receptor which then alters
the level of cyclic
nucleotide within the cell.
Ion channels are expressed in a number of tissues where they are implicated in
a variety of
processes. CNG channels, while abundantly expressed in photoreceptor and
olfactory sensory cells,
so are also found in kidney, lung, pineal, retinal ganglion cells, testis,
aorta, and brain. Calcium-activated
K+ channels may be responsible for the vasodilatory effects of bradykinin in
the kidney and for
shunting excess K''- from brain capillary endothelial cells into the blood.
They are also implicated in
repolarizing granulocytes after agonist-stimulated depolarization (Ishi,
s_u~ra). Ion channels have been
the target for many drug therapies. Neurotransmitter-gated channels have been
targeted in therapies
5 for treatment of insomnia, anxiety, depression, and schizophrenia. Voltage-
gated channels have been
targeted in therapies for arrhythmia, ischemic stroke, head trauma, and
neurodegenerative disease
(Taylor, C.P. and L.S. Narasimhan (1997) Adv. Pharmacol. 39:47-98).
Disease Correlation
The etiology of numerous human diseases and disorders can be attributed to
defects in the
2 o transport of molecules across membranes. Defects in the trafficking of
membrane bound transporters
and ion channels are associated with several disorders, e.g. cystic fibrosis,
glucose-galactose
malabsorption syndrome, hypercholesterolemia, von Gierke disease, and certain
forms of diabetes
mellitus. Single-gene defect diseases resulting in an inability to transport
small molecules across
membranes include, e.g., cystinuria, iminoglycinuria, Hartup disease, and
Fanconi disease (van't Hoff,
25 W.G. (1996) Exp. Nephrol. 4:253-262; Talente, G.M. et al. (1994) Ann.
Intern. Med. 120:218-226;
and Chillon, M. et al. (1995) New Engl. J. Med. 332:1475-1480).
Protein Modification and Maintenance Molecules
The cellular processes regulating modification and maintenance of protein
molecules
3 o coordinate their conformation, stabilization, and degradation. Each of
these processes is mediated by
key enzymes or proteins such as proteases, protease inhibitors, transferases,
isomerases, and
molecular chaperones.
Proteases
Proteases cleave proteins and peptides at the peptide bond that forms the
backbone of the
3 s peptide and protein chain. Proteolytic processing is essential to cell
growth, differentiation,
69


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
remodeling, and homeostasis as well as W tlammation and immune response.
Typical protein half-lives
range from hours to a few days, so that within all living cells, precursor
proteins are being cleaved to
their active form, signal sequences proteolytically removed from targeted
proteins, and aged or
defective proteins degraded by proteolysis. Proteases function in bacterial,
parasitic, and viral invasion
and replication within a host. Four principal categories of mammalian
proteases have been identified
based on active site structure, mechanism of action, and overall three-
dimensional structure. (Beynon,
R.J. and J.S. Bond (1994) Proteol, is Enzymes: A Practical Ap roach, Oxford
University Press, New
York NY, pp. 1-5).
The serine proteases (SPs) have a serine residue, usually within a conserved
sequence, in an
so active site composed of the serine, an aspartate, and a histidine residue.
SPs include the digestive
enzymes trypsin and chymotrypsin, components of the complement cascade and the
blood-clotting
cascade, and enzymes that control extracellular protein degradation. The main
SP sub-families are
trypases, which cleave after arginine or lysine; aspartases, which cleave
after aspartate; chymases,
which cleave after phenylalanine or leucine; metases, which cleavage after
methionine; and serases
s5 wluch cleave after serine. Enterokinase, the initiator of intestinal
digestion, is a serine protease found
in the intestinal brush border, where it cleaves the acidic propeptide from
trypsinogen to yield active
trypsin (Kitamoto, Y. et al. (1994) Proc. Natl. Acad. Sci. USA 91:7588-7592).
Prolylcarboxypeptidase, a lysosomal serine peptidase that cleaves peptides
such as angiotensin II and
III and [des-Arg9] bradykinin, shares sequence homology with members of both
the serine
zo carboxypeptidase and prolylendopeptidase families (Tan, F. et al. (1993) J.
Biol. Chem. 268:16631-
16638).
Cysteine proteases (CPs) have a cysteine as the major catalytic residue at an
active site
where catalysis proceeds via an intermediate thiol ester and is facilitated by
adjacent histidine and
aspartic acid residues. CPs are involved in diverse cellular processes ranging
from the processing of
25 precursor proteins to intracellular degradation. Mammalian CPs include
lysosomal cathepsins and
cytosolic calcium activated proteases, calpains. CPs are produced by
monocytes, macrophages and
other cells of the immune system which migrate to sites of inflammation and
secrete molecules
involved in tissue repair. Overabundance of these repair molecules plays a
role in certain disorders.
In autoimmune diseases such as rheumatoid arthritis, secretion of the cysteine
peptidase cathepsin C
3 o degrades collagen, lami'~, elastin and other structural proteins found in
the extracellular matrix of
bones.
Aspartic proteases are members of the cathepsin family of lysosomal proteases
and include
pepsin A, gastricsin, chymosin, renin, and cathepsins D and E. Aspartic
proteases have a pair of
aspartic acid residues in the active site, and are most active in the pH 2 - 3
range, in which one of the
3 5 aspartate residues is ionized, the other un-ionized. Aspartic proteases
include bacterial penicillopepsin,


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
mammalian pepsin, renin, chymosin, and certain fungal proteases. Abnormal
regulation and
expression of cathepsins is evident in various inflammatory disease states. In
cells isolated from
inflamed synovia, the mRNA for stromelysin, cytokines, T)MP-1, cathepsin,
gelatinise, and other
molecules is preferentially expressed. Expression of cathepsins L and D is
elevated in synovial tissues
from patients with rheumatoid arthritis and osteoarthritis. Cathepsin L
expression may also contribute
to the influx of mononuclear cells which exacerbates the destruction of the
rheumatoid synovium.
(Keyszer, G.M. (1995) Arthritis Rheum. 38:976-984.) The increased expression
and differential
regulation of the cathepsins are linked to the metastatic potential of a
variety of cancers and as such
are of therapeutic and prognostic interest (Chambers, A.F. et al. (1993) Crit.
Rev. Oncog. 4:95-114).
so Metalloproteases have active sites that include two glutamic acid residues
and one histidine
residue that serve as binding sites for zinc. Carboxypeptidases A and B are
the principal mammalian
metalloproteases. Both are exoproteases of similar structure and active sites.
Carboxypeptidase A,
like chymotrypsin, prefers C-terminal aromatic and aliphatic side chains of
hydrophobic nature,
whereas carboxypeptidase B is directed toward basic arginine and lysine
residues. Glycoprotease
(GCP), or O-sialoglycoprotein endopeptidase, is a metallopeptidase which
specifically cleaves
O-sialoglycoproteins such as glycophorin A. Another metallopeptidase,
placental leucine
aminopeptidase (P-LAP) degrades several peptide hormones such as oxytocin and
vasopressin,
suggesting a role in maintaining homeostasis during pregnancy, and is
expressed in several tissues
(Rogi, T. et al. (1996) J. Biol. Chem. 271:56-61).
2o Ubiquitin proteases are associated with the ubiquitin conjugation system
(UCS), a major
pathway for the degradation of cellular proteins in eukaryotic cells and some
bacteria. The UCS
mediates the elimination of abnormal proteins and regulates the half lives of
important regulatory
proteins that control cellular processes such as gene transcription and cell
cycle progression. In the
UCS pathway, proteins targeted for degradation are conjugated to a ubiquitin,
a small heat stable
~ 5 protein. The ubiquitinated protein is then recognized and degraded by
proteasome, a large,
multisubunit proteolytic enzyme complex, and ubiquitin is released for
reutilization by ubiquitin
protease. The UCS is implicated in the degradation of mitotic cyclic kinases,
oncoproteins, tumor
suppressor genes such as p53, viral proteins, cell surface receptors
associated with signal txansduction,
transcriptional regulators, and mutated or damaged proteins (Ciechanover, A.
(1994) Cell 79:13-21).
3 o A murine proto-oncogene, Unp, encodes a nuclear ubiquitin protease whose
overexpression leads to
oncogenic transformation of N»i3T3 cells, and the human homolog of this gene
is consistently
elevated in small cell tumors and adenocarcinomas of the lung (Gray, D.A.
(1995) Oncogene 10:2179-
2183).
Signal Peptidases
3 5 The mechanism for the translocation process into the endoplasmic reticulum
(ER) involves the
71


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
recognition of an N-terminal signal peptide on the elongating protein. The
signal peptide directs the
protein and attached ribosome to a receptor on the ER membrane. The
polypeptide chain passes
through a pore in the ER membrane into the lumen while the N-terminal signal
peptide remains
attached at the membrane surface. The process is completed when signal
peptidase located inside the
s ER cleaves the signal peptide from the protein and releases the protein into
the lumen.
Protease Inhibitors
Protease inhibitors and other regulators of protease activity control the
activity and effects of
proteases. Protease inhibitors have been shown to control pathogenesis in
animal models of
proteolytic disorders (Murphy, G. (1991) Agents Actions Suppl. 35:69-76). Low
levels of the
o cystatins, low molecular weight inhibitors of the cysteine proteases,
correlate with malignant
progression of tumors. (Catkins, C. et al (1995) Biol. Biochem. Hoppe Seyler
376:71-80). Serpins are
inhibitors of mammalian plasma serine proteases. Many serpins serve to
regulate the blood clotting
cascade and/or the complement cascade in mammals. 5p32 is a positive regulator
of the mammalian
acrosomal protease, acrosin, that binds the proenzyme, proacrosin, and thereby
aides in packaging the
15 enzyme into the acrosomal matrix (Baba, T. et al. (1994) J. Biol. Chem.
269:10133-10140). The
Kunitz family of serine protease inhibitors are characterized by one or more
"Kunitz domains"
containing a series of cysteine residues that are regularly spaced over
approximately 50 amino acid
residues and form three intrachain disulfide bonds. Members of this family
include aprotinin, tissue
factor pathway inhibitor (TFPI-1 and TFPI-2), inter-a-trypsin inhibitor, and
bikunin. (Manor, C.W. et
ao al. (1997) J. Biol. Chem. 272:12202-12208.) Members of this family are
potent inhibitors (in the
nanomolar range) against serine proteases such as kallikxein and plasmin.
Aprotinin has clinical utility
in reduction of perioperative blood loss.
A major portion of all proteins synthesized in eukaryotic cells are
synthesized on the cytosolic
surface of the endoplasmic reticulum (ER). Before these immature proteins are
distributed to other
2 s organelles in the cell or are secreted, they must be transported into the
interior lumen of the ER where
post-translational modifications are performed. These modifications include
protein folding and the
formation of disulfide bonds, and N-linked glycosylations.
Protein Isomerases
Protein folding in the ER is aided by two principal types of protein
isomerases, protein disulfide
s o isomerase (PDI), and peptidyl-prolyl isomerase (PPI). PDI catalyzes the
oxidation of free sulfhydryl
groups in cysteine residues to form intramolecular disulfide bonds in
proteins. PPI, an enzyme that
catalyzes the isomerization of certain proliue imidic bonds in oligopeptides
and proteins, is considered
to govern one of the rate limiting steps in the folding of many proteins to
their final functional
conformation. The cyclophilins represent a major class of PPI that was
originally identified as the
72


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
major receptor for the sm_m__unosuppressW a drug cyclosporin A
(Handschumacher, R.E. et al. (1984)
Science 226: 544-547).
Protein Glycosylation
The glycosylation of most soluble secreted and membrane-bound proteins by
oligosaccharides
linked to asparagine residues in proteins is also performed in the ER. This
reaction is catalyzed by a
membrane-bound enzyme, oligosaccharyl transferase. Although the exact purpose
of this "N-linked"
glycosylation is unknown, the presence of oligosaccharides tends to make a
glycoprotein resistant to
protease digestion. In addition, oligosaccharides attached to cell-surface
proteins called selectins are
known to function in cell-cell adhesion processes (Alberts, B. et aI. (1994)
Molecular Biology of the
so Cell, Garland Publishing Co., New York NY, p.608). "O-linked" glycosylation
of proteins also occurs
in the ER by the addition of N-acetylgalactosamine to the hydroxyl group of a
serine or threonine
residue followed by the sequential addition of other sugar residues to the
first. This process is
catalysed by a series of glycosyltransferases each specific for a particular
donor sugar nucleotide and
acceptor molecule (Lodish, H. et al. (1995) Molecular Cell Biology, W.H.
Freeman and Co., New
York NY, pp.700-708). In many cases, both N- and O-linked oligosaccharides
appear to be required
for the secretion of proteins or the movement of plasma membrane glycoproteins
to the cell surface.
An additional glycosylation mechanism operates in the ER specifically to
target lysosomal
enzymes to lysosomes and prevent their secretion. Lysosomal enzymes in the ER
receive an N-linked
oligosaccharide, like plasma membrane arid secreted proteins, but are then
phosphorylated on one or
2 o two mannose residues. The phosphorylation of mannose residues occurs in
two steps, the first step
being the addition of an N-acetylglucosamine phosphate residue by N-
acetylglucosamine
phosphotransferase, and the second the removal of the N-acetylglucosamine
group by
phosphodiesterase. The phosphorylated mannose residue then targets the
lysosomal enzyme to a
mannose 6-phosphate receptor which transports it to a lysosome vesicle
(Lodish, supra, pp. 708-711).
Chaperones
Molecular chaperones are proteins that aid in the proper folding of immature
proteins and
refolding of improperly folded ones, the assembly of protein subunits, and in
the transport of unfolded
proteins across membranes. Chaperones are also called heat-shock proteins
(hsp) because of their
tendency to be expressed in dramatically increased amounts following brief
exposure of cells to
3 o elevated temperatures. This latter property most likely reflects their
need in the refolding of proteins
that have become denatured by the high temperatures. Chaperones may be divided
into several
classes according to their location, function, and molecular weight, and
include hsp60, TCP1, hsp70,
hsp40 (also called DnaJ), and hsp90. For example, hsp90 binds to steroid
hormone receptors,
represses transcription in the absence of the ligand, and provides proper
folding of the ligand-binding
domain of the receptor in the presence of the hormone (Burston, S.G. and A.R.
Clarke (1995) Essays
73


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Biochem. 29:125-136). Hsp60 and hsp70 chaperones aid in the transport and
folding of newly
synthesized proteins. Hsp70 acts early in protein folding, binding a newly
synthesized protein before it
leaves the ribosome and trausporting the protein to the mitochondria or ER
before releasing the folded
protein. Hsp60, along with hspl0, binds misfolded proteins and gives them the
opportunity to refold
s correctly. All chaperones share an affinity for hydrophobic patches on
incompletely folded proteins
and the ability to hydrolyze ATP. The energy of ATP hydrolysis is used to
release the hsp-bound
protein in its properly folded state (Alberts, su ra, pp 214, 571-572).
Nucleic Acid Synthesis and Modification Molecules
~.o Polymerases
DNA and RNA replication are critical processes for cell replication and
function. DNA and
RNA replication are mediated by the enzymes DNA and RNA polymerase,
respectively, by a
"templating" process in which the nucleotide sequence of a DNA or RNA strand
is copied by
complementary base-pairing into a complementary nucleic acid sequence of
either DNA or RNA.
15 However, there are fundamental differences between the two processes.
DNA polymerase catalyzes the stepwise addition of a deoxyribonucleotide to the
3'-OH end
of a polynucleotide strand (the primer strand) that is paired to a second
(template) strand. The new
DNA strand therefore grows in the 5' to 3' direction (Alberts, B. et al.
(1994) The Molecular Biology
of the Cell, Garland Publishing Inc., New York NY, pp. 251-254). The
substrates for the
~o polymerization reaction are the corresponding deoxynucleotide triphosphates
which must base-pair
with the correct nucleotide on the template strand in order to be recognized
by the polymerase.
Because DNA exists as a double-stranded helix, each of the two strands may
serve as a template for
the formation of a new complementaxy strand. Each of the two daughter cells of
the dividing cell
therefore inherits a new DNA double helix containing one old and one new
strand. Thus, DNA is said
25 to be replicated "semiconservatively" by DNA polymerase. Iu addition to the
synthesis of new DNA,
DNA polymerase is also involved in the repair of damaged DNA as discussed
below under "Ligases."
In contrast to DNA polymerase, RNA polymerase uses a DNA template strand to
"transcribe" DNA into RNA using ribonucleotide triphosphates as substrates.
Like DNA
polymerization, RNA polymerization proceeds in a 5' to 3' direction by
addition of a ribonucleoside
3 o monophosphate to the 3'-OH end of a growing RNA chain. DNA trauscription
generates messenger
RNAs (mRNA) that carry information for protein synthesis, as well as the
transfer, ribosomal, and
other RNAs that have structural or catalytic functions. In eukaryotes, three
discrete RNA
polymerases synthesize the three different types of RNA (Alberts, supra, pp.
367-368). RNA
polymerase I makes the large ribosomal RNAs, RNA polymerase II makes the mRNAs
that will be
3 5 translated into proteins, and RNA polymerase DI makes a variety of small,
stable RNAs, including 5S
74


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
ribosomal RNA and the transfer RNAs (tRNA). In all cases, RNA synthesis is
initiated by binding of
the RNA polymerase to a promoter region on the DNA arid synthesis begins at a
start site within the
promoter. Synthesis is completed at a broad, general stop or termination
region in the DNA where
both the polymerase and the completed RNA chain are released.
Ligases
DNA repair is the process by which accidental base changes, such as those
produced by
oxidative damage, hydrolytic attack, or uncontrolled methylation of DNA are
corrected before
replication or transcription of the DNA can occur. Because of the efficiency
of the DNA repair
process, fewer than one in one thousand accidental base changes causes a
mutation (Alberts, supra,
Zo pp. 245-249). The three steps common to most types of DNA repair are (1)
excision of the damaged
or altered base or nucleotide by DNA nucleases, leaving a gap; (2) insertion
of the correct nucleotide
in this gap by DNA polymerase using the complementary strand as the template;
and (3) sealing the
break left between the inserted nucleotides) and the existing DNA strand by
DNA ligase. In the last
reaction, DNA ligase uses the energy from ATP hydrolysis to activate the 5'
end of the broken
phosphodiester bond before forming the new bond with the 3'-OH of the DNA
strand. In Bloom's
syndrome, an inherited human disease, individuals are partially deficient in
DNA ligation and
consequently have an increased incidence of cancex (Alberts, s_u~ra, p. 247).
Nucleases
Nucleases comprise both enzymes that hydrolyze DNA (DNase) and RNA (RNase).
They
2 o serve different purposes in nucleic acid metabolism. Nucleases hydrolyze
the phosphodiester bonds
between adjacent nucleotides either at internal positions (endonucleases) or
at the terminal 3' or 5'
nucleotide positions (exonucleases). A DNA exonuclease activity in DNA
polymerase, for example,
serves to remove improperly paired nucleotides attached to the 3'-OH end of
the growing DNA strand
by the polymerase and thereby serves a "proofreading" function. As mentioned
above, DNA
2 s endonuclease activity is involved in the excision step of the DNA repair
process.
RNases also serve a variety of functions. For example, RNase P is a
ribonucleoprotein
enzyme which cleaves the 5' end of pre-tRNAs as part of their maturation
process. RNase H digests
the RNA strand of an RNA/DNA hybrid. Such hybrids occur in cells invaded by
retroviruses, and
RNase H is an important enzyme in the retroviral replication cycle. Pancreatic
RNase secreted by
3 o the pancreas into the intestine hydrolyzes RNA present in ingested foods.
RNase activity in serum
and cell extracts is elevated in a variety of cancers and infectious diseases
(Schein, C.H. (1997) Nat.
Biotechnol. 15:529-536). Regulation of RNase activity is being investigated as
a means to control
tumor angiogenesis, allergic reactions, viral infection and replication, and
fungal infections.
Methylases
3 5 Methylation of specific nucleotides occurs in both DNA and RNA, and serves
different


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
functions in the two macromolecules. Methylation of cytosine residues to form
5-methyl cytosine in
DNA occurs specifically at CG sequences which are base-paired with one another
in the DNA
double-helix. This pattern of methylation is passed from generation to
generation during DNA
replication by an enzyme called "maintenance methylase" that acts
preferentially on those CG
s sequences that are base-paired with a CG sequence that is already
methylated. Such methylation
appears to distinguish active from inactive genes by preventing the binding of
regulatory proteins that
"turn on" the gene, but permit the binding of proteins that inactivate the
gene (Alberts, su ra, pp. 448-
451). In RNA metabolism, "tRNA methylase" produces one of several nucleotide
modifications in
tRNA that affect the conformation and base-pairing of the molecule and
facilitate the recognition of
1o the appropriate mRNA codons by specific tRNAs. The primary methylation
pattern is the
dimethylation of guanine residues to form N,N-dimethyl guanine.
Helicases and Sin 1g e-Stranded Binding Proteins
Helicases are enzymes that destabilize and unwind double helix structures in
both DNA and
RNA. Since DNA replication occurs more or less simultaneously on both strands,
the two strands
is must first separate to generate a replication "fork" for DNA polymerase to
act on. Two types of
replication proteins contribute to this process, DNA helicases and single-
stranded binding proteins.
DNA helicases hydrolyze ATP and use the energy of hydrolysis to separate the
DNA strands.
Single-stranded binding proteins (SSBs) then bind to the exposed DNA strands
without covering the
bases, thereby temporarily stabilizing them for templating by the DNA
polymerase (Alberts, su ra, pp.
z o 255-256).
RNA helicases also alter and regulate RNA conformation and secondary
structure. Like the
DNA helicases, RNA helicases utilize energy derived from ATP hydrolysis to
destabilize and unwind
RNA duplexes. The most well-characterized and ubiquitous family of RNA
helicases is the DEAD-
box family, so named for the conserved B-type ATP-binding motif which is
diagnostic of proteins in
~ s this family. Over 40 DEAD-box helicases have been identified in organisms
as diverse as bacteria,
insects, yeast, amphibians, mammals, and plants. DEAD box helicases function
in diverse processes
such as translation initiation, splicing, ribosome assembly, and RNA editing,
transport, and stability.
Some DEAD box helicases play tissue- and stage-specific roles in
spermatogenesis and
embryogenesis. Overexpression of the DEAD box 1 protein (DDX1) may play a role
in the
s o progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout,
R. et al. (1998) J. Biol.
Chem. 273:21161-21168). These observations suggest that DDX1 may promote or
enhance tumor
progression by altering the normal secondary structure and expression levels
of RNA in cancer cells.
Other DEAD box helicases have been implicated either directly or indirectly in
tumorigenesis
(Discussed in Godbout, supra). For example, murine p68 is mutated in
ultraviolet light-induced tumors,
3 s and human DDX6 is located at a chromosomal breakpoint associated with B-
cell lymphoma.
76


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
l Similarly, a chimeric protein comprised of DDX10 and NUP98, a nucleoporin
protein, may be involved
in the pathogenesis of certaiu myeloid malignancies.
To_poisomerases
Besides the need to separate DNA strands prior to replication, the two strands
must be
'unwound" from one another prior to their separation by DNA helicases. This
function is performed
by proteins known as DNA topoisomerases. DNA topoisomerase effectively acts as
a reversible
nuclease that hydrolyzes a phosphodiesterase bond in a DNA strand, permitting
the two strands to
rotate freely about one another to remove the strain of the helix, and then
rejoins the original
phosphodiester bond between the two strands. Two types of DNA topoisomerase
exist, types I and
1o II. DNA Topoisomerase I causes a single-strand break in a DNA helix to
allow the rotation of the
two strands of the helix about the remaining phosphodiester bond in the
opposite strand. DNA
topoisomerase II causes a transient break in both strands of a DNA helix where
two double helices
cross over one another. This type of topoisomerase can efficiently separate
two interlocked DNA
circles (Alberts, su ra, pp.260-262). Type II topoisomerases are largely
confined to proliferatiug cells
in eukaryotes, such as cancer cells. For this reason they are targets for
auticancer drugs.
Topoisomerase II has been implicated in multi-drug resistance (MDR) as it
appears to aid in the repair
of DNA damage inflicted by DNA binding agents such as doxorubicin and
vincristine.
Recombinases
Genetic recombination is the process of rearranging DNA sequences within an
organism's
z o genome to provide genetic variation for the organism in response to
chauges in the environment. DNA
recombination allows variation in the particular combination of genes present
in an individual's
genome, as well as the timing and level of expression of these genes (see
Alberts, supra, pp. 263-273).
Two broad classes of genetic recombination are commonly recognized, general
recombination and
site-specific recombination. General recombination involves genetic exchange
between any
homologous pair of DNA sequences usually located on two copies of the same
chromosome. The
process is aided by enzymes called recombinases that "nick" one strand of a
DNA duplex more or
less randomly and permit exchange with the complementary strand of another
duplex. The process
does not normally change the arrangement of genes on a chromosome. In site-
specific recombination,
the recombinase recognizes specific nucleotide sequences present in one or
both of the recombining
s o molecules. Base-pairing is not involved in this form of recombination and
therefore does not require
DNA homology between the recombining molecules. Unlike general recombination,
this form of
recombination can alter the relative positions of nucleotide sequences in
chromosomes.
Splicing Factors
Various proteins are necessary for processing of trauscribed RNAs in the
nucleus. Pre-
3 5 mRNA processing steps include capping at the 5' end with methylguanosine,
polyadenylating the 3'
77


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
end, and splicing to remove iutrons. The primary RNA transcript from DNA is a
faithful copy of the
gene containing both exon and intron sequences, and the latter sequences must
be cut out of the RNA
transcript to produce an mRNA that codes for a protein. This "splicing" of the
mRNA sequence takes
place in the nucleus with the aid of a large, multicomponent ribonucleoprotein
complex known as a
spliceosome. The spliceosomal complex is composed of five small nuclear
ribonucleoprotein particles
(snRNPs) designated U1, U2, U4, U5, and U6, and a number of additional
proteins. Each snRNP
contains a single species of snRNA and about ten proteins. The RNA components
of some snRNPs
recognize and base pair with intron consensus sequences. The protein
components mediate
spliceosome assembly and the splicing reaction. Autoantibodies to snRNP
proteins are found in the
so blood of patients with systemic lupus erythematosus (Stryer, L. (1995)
Biochemistry, W.H. Freeman
and Company, New York NY, p. 863).
Adhesion Molecules
The surface of a cell is rich in transmembrane proteoglycans, glycoproteins,
glycolipids, and
1s receptors. These macromolecules mediate adhesion with other cells and with
components of the
extracellular matrix (ECM). The interaction of the cell with its surroundings
profoundly influences cell
shape, strength, flexibility, motility, and adhesion. These dynamic properties
are iutimately associated
with signal transduction pathways controlling cell proliferation and
differentiation, tissue construction,
and embryonic development.
2 o Cadherins
Cadherins comprise a family of calcium-dependent glycoproteins that function
in mediating
cell-cell adhesion in virtually all solid tissues of multicellular organisms.
These proteins share multiple
repeats of a cadheriu-specific motif, and the repeats form the folding units
of the cadherin
extracellular domain. Cadherin molecules cooperate to form focal contacts, or
adhesion plaques,
2s between adjacent epithelial cells. The cadherin family includes the
classical cadherius and
protocadherins. Classical cadherins include the E-cadherin, N-cadherin, and P-
cadherin subfamilies.
E-cadherin is present on many types of epithelial cells and is especially
important for embryonic
development. N-cadherin is present on nerve, muscle, and lens cells and is
also critical for embryonic
development. P-cadherin is present on cells of the placenta and epidermis.
Recent studies report that
s o protocadherins are involved in a variety of cell-cell interactions
(Suzuki, S.T. (1996) J. Cell Sci.
109:2609-2611). The intracellular auchorage of cadherins is regulated by their
dynamic association
with catenins, a family of cytoplasmic signal transduction proteins associated
with the actin
cytoskeleton. The anchorage of cadherins to the actin cytoskeleton appears to
be regulated by protein
tyrosine phosphorylation, and the cadherins are the target of phosphorylation-
induced functional
35 disassembly (Aberle, H. et al. (1996) J. Cell. Biochem. 61:514-523).
78


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
111te~rius
.Integrins are ubiquitous transmembrane adhesion molecules that link the ECM
to the internal
cytoskeleton. Iutegrins are composed of two noncovalently associated
transmembrane glycoprotein
subunits called a and J3. Integrins function as receptors that play a role in
signal transduction. For
s example, binding of integrin to its extracellular ligand may stimulate
changes in intracellular calcium
levels or protein kinase activity (Sjaastad, M.D. and W.J. Nelson (1997)
BioEssays 19:47-55). At
least ten cell surface receptors of the integrin family recognize the ECM
component fibronectin, which
is involved in many different biological processes including cell migration
and embryogenesis
(Johansson, S. et al. (1997) Front. Biosci. 2:D126-D146).
so Lectins
Lectins comprise a ubiquitous family of extracellular glycoproteins which bind
cell surface
carbohydrates specifically and reversibly, resulting in the agglutination of
cells (reviewed in
Drickamer, K, and M.E. Taylor (1993) Annu. Rev. Cell Biol. 9:237-264). This
function is particularly
important for activation of the immune response. Lectins mediate the
agglutination and mitogenic
25 stimulation of lymphocytes at sites of inflammation (Lasky, L.A. (1991) J.
Cell. Biochem. 45:139-146;
Paietta, E. et al. (1989) J. Tmrrmunol. 143:2850-2857).
Lectins are further classified into subfamilies based on carbohydrate-binding
specificity and
other criteria. The galectin subfamily, in particular, includes lectins that
bind J3-galactoside,
carbohydrate moieties in a thiol-dependent manner (reviewed in Hadari, Y.R. et
al. (1998) J. Biol.
2o Chem. 270:3447-3453). Galectins are widely expressed and developmentally
regulated. Because alt
galectins lack an N-terminal signal peptide, it is suggested that galectins
are externalized through an
atypical secretory mechanism. Two classes of galectins have been defined based
on molecular
weight and oligomerization properties. Small galectins form homodimers and are
about 14 to 16
kilodaltons in mass, while large galectins are monomeric and about 29-37
kilodaltons.
25 Galectins contain a characteristic carbohydrate recognition domain (CRD).
The CRD is
about 140 amino acids and contains several stretches of about 1 - 10 amino
acids which are highly
conserved among all galectins. A particular 6-amino acid motif within the CRD
contains conserved
tryptophan and arginine residues which are critical for carbohydrate binding.
The CRD of some
galectins also contains cysteine residues which may be important for disulfide
bond formation.
3 o Secondary structure predictions indicate that the CRD forms several (3-
sheets.
Galectins play a number of roles in diseases and conditions associated with
cell-cell and cell-
matrix interactions. For example, certain galectins associate with sites of
inflammation and bind to cell
surface immunoglobulin E molecules. In addition, galectins may play an
important role in cancer
metastasis. Galectin overexpression is correlated with the metastatic
potential of cancers in humans
35 and mice. Moreover, anti-galectin antibodies inhibit processes associated
with cell transformation,
79


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
such as cell aggregation and anchorage-independent growth (See, for example,
Su, Z.-Z. et al. (1996)
Proc. Natl. Acad. Sci. USA 93:7252-7257).
Selectins
Selectins, or LEC-CAMS, comprise a specialized lectin subfamily involved
primarily in
s inflammation and leukocyte adhesion (Reviewed in Lasky, su ra). Selectins
mediate the recruitment
of leukocytes from the circulation to sites of acute inflammation and are
expressed on the surface of
vascular endothelial cells in response to cytokine signaling. Selectins bind
to specific ligands on the
leukocyte cell membrane and enable the leukocyte to adhere to and migrate
along the endothelial
surface. Binding of selectin to its ligand leads to polarized rearraugement of
the actin cytoskeleton
so and stimulates signal transduction within the leukocyte (Brenner, B. et al.
(1997) Biochem. Biophys. .
Res. Commun. 231:802-807; Hidari, K.I. et al. (1997) J. Biol. Chem. 272:28750-
28756). Members of
the selectin family possess three characteristic motifs: a lectin or
carbohydrate recognition domain; an
epidermal growth factor-like domain; and a variable number of short consensus
repeats (scr or "sushi"
repeats) which are also pxesent in complement regulatory proteins. The
selectins include lymphocyte
15 adhesion molecule-1 (Lam-1 or L-selectin), endothelial leukocyte adhesion
molecule-1 (ELAM-1 or E-
selectin), and granule membrane protein-140 (GMP-140 or P-selectin) (Johnston,
G.I. et al. (1989)
Cell 56:1033-1044).
Antigen Recognition Molecules
2 o All vertebrates have developed sophisticated and complex immune systems
that provide
protection from viral, bacterial, fungal, and parasitic infections. A key
feature of the immune system is
its ability to distinguish foreign molecules, or antigens, from "self'
molecules. This ability is mediated
primarily by secreted and transmembrane proteins expressed by leukocytes
(white blood cells) such as
lymphocytes, granulocytes, and monocytes. Most of these proteins belong to the
immunoglobulin (Ig)
2 s superfamily, members of which contain one or more repeats of a conserved
structural domain. This
Ig domain is comprised of antiparallel (3 sheets joined by a disulfide bond in
an arrangement called the
Ig fold. Members of the Ig superfamily include T-cell receptors, major
histocompatibility (MHC)
proteins, antibodies, and immune cell-specific surface markers such as CD4,
CDB, and CD28.
MHC proteins are cell surface markers that bind to and present foreign
antigens to T cells.
s o MHC molecules are classified as either class I or class II. Class I MHC
molecules (MHC I) are
expressed on the surface of almost all cells and are involved in the
presentation of antigen to cytotoxic
T cells. For example, a cell infected with virus will degrade intracellular
viral proteins and express the
protein fragments bound to MHC I molecules on the cell surface. The MHC
T/antigen complex is
recognized by cytotoxic T-cells which destroy the infected cell and the virus
within. Class 1I MHC
s s molecules are expressed primarily on specialized antigen-presenting cells
of the immune system, such


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
as B-cells and macrophages. These cells ingest foreign proteins from the
extracellular fluid and
express MHC IT/antigen complex on the cell surface. This complex activates
helper T-cells, which
then secrete cytokines and other factors that stimulate the immune response.
MHC molecules also
play an important role in organ rejection following transplantation. Rejection
occurs when the
recipient's T-cells respond to foreign MHC molecules on the transplanted organ
in the same way as to
self MHC molecules bound to foreign antigen. (Reviewed in Alberts, B. et al.
(1994) Molecular
Biology of the Cell, Garland Publishing, New York NY, pp. 1229-1246.)
Antibodies, or i_m_m__unoglobulins, are either expressed on the surface of B-
cells or secreted by
B-cells into the circulation. Antibodies bind and neutralize foreign antigens
in the blood and other
1o extracellular fluids. The prototypical antibody is a tetramer consisting of
two identical heavy
polypeptide chains (H-chains) and two identical light polypeptide chains (L-
chains) interlinked by
disulfide bonds. This arrangement confers the characteristic Y-shape to
antibody molecules.
Antibodies are classified based on their H-chain composition. The five
antibody classes, IgA, IgD,
IgE, IgG and IgM, are defined by the a, 8, a, y, and ~, H-chain types. There
are two types of L-chains,
x and a,, either of which may associate as a pair with any H-chain pair. IgG,
the most common class
of antibody found in the circulation, is tetrameric; while the other classes
of antibodies are generally
variants or multimers of this basic structure.
H-chains and L-chains each contain an N-terminal variable region and a C-
terminal constant
region. The constant region consists of about 110 amino acids in L-chains and
about 330 or 440 amino
z o acids in H-chains. The amino acid sequence of the constant region is
nearly identical among H- or L-
chains of a particular class. The variable region consists of about 110 amino
acids in both H- and L-
chains. However, the amino acid sequence of the variable region differs among
H- or L-chains of a
particular class. Within each H- or L-chain variable region are three
hypervariable regions of
extensive sequence diversity, each consisting of about 5 to 10 amino acids. In
the antibody molecule,
2 5 the H- and L-chain hypervariable regions come together to form the antigen
recognition site.
(Reviewed in Alberts, supra, pp. 1206-1213 and 1216-1217.)
Both H-chains and L-chains contain repeated Ig domains. For example, a typical
H-chain
contains four Ig domains, three of which occur within the constant region and
one of which occurs
within the variable region and contributes to the formation of the antigen
recognition site. Likewise, a
3 o typical L-chain contains two Ig domains, one of which occurs within the
constant region and one of
which occurs within the variable region.
The immune system is capable of recognizing and responding to any foreign
molecule that
enters the body. Therefore, the immune system must be armed with a full
repertoire of antibodies
against all potential antigens. Such antibody diversity is generated by
somatic rearrangement of gene
3 5 segments encoding variable and constant regions. These gene segments are
joined together by site-
81


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
specific recombination which occurs between highly conserved DNA sequences
that flank each gene
segment. Because there are hundreds of different gene segments, millions of
unique genes can be
generated combinatorially. In addition, imprecise joining of these segments
and an unusually high rate
of somatic mutation within these segments further contribute to the generation
of a diverse antibody
s population.
T-cell receptors are both structurally and functionally related to antibodies.
(Reviewed in
Alberts, supra, pp. 1228-1229.) T-cell receptors are cell surface proteins
that bind foreign autigens
and mediate diverse aspects of the immune response. A typical T-cell receptor
is a heterodimer
comprised of two disulfide-linked polypeptide chains called a and (3. Each
chain is about 280 amino
io acids in length and contains one variable region and one constant region.
Each variable or constant
region folds into an Ig domain. The variable regions from the a and (3 chains
come together in the
heterodimer to form the antigen recognition site. T-cell receptor diversity is
generated by somatic
rearrangement of gene segments encoding the a and (3 chains. T-cell receptors
recognize small
peptide antigens that are expressed on the surface of antigen-presenting cells
and pathogen-infected
15 cells. These peptide antigens are presented on the cell surface in
association with major
histocompatibility proteins wluch provide the proper context for antigen
recognition.
Secreted and Extracellular Matrix Molecules
Protein secretion is essential for cellular function. Protein secretion is
mediated by a signal
2 o peptide located at the amino terminus of the protein to be secreted. The
signal peptide is comprised of
about ten to twenty hydrophobic amino acids which target the nascent protein
from the ribosome to
the endoplasmic reticulum (ER). Proteins targeted to the ER may either proceed
through the
secretory pathway or remain in any of the secretory organelles such as the ER,
Golgi apparatus, or
lysosomes. Proteins that transit through the secretory pathway are either
secreted into the
25 extracellular space or retained in the plasma membrane. Secreted proteins
are often synthesized as
.active precursors that are activated by post-translational processing events
during transit through the
secretory pathway. Such events include glycosylation, proteolysis, and removal
of the signal peptide
by a signal peptidase. Other events that may occur during protein transport
include chaperone-
dependent unfolding and folding of the nascent protein and interaction of the
protein with a receptor or
3 o pore complex. Examples of secreted proteins with amino terminal signal
peptides include receptors,
extracellular matrix molecules, cytokines, hormones, growth and
differentiation factors, neuropeptides,
vasomediators, ion channels, trausporters/pumps, and proteases. (Reviewed in
Alberts, B. et al.
(1994) Molecular Biology of The Cell, Garland Publishing, New York NY, pp. 557-
560, 582-592.)
The extracellular matrix (ECM) is a complex network of glycoproteins,
polysaccharides,
35 proteoglycans, and other macromolecules that are secreted from the cell
into the extracellular space.
82


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
lne m:m remains m close association wttn the cell surrace ana proviaes a
supportive mesnworK that
profoundly influences cell shape, motility, strength, flexibility, and
adhesion. In fact, adhesion of a cell
to its surrounding matrix is required for cell survival except in the case of
metastatic tumor cells, which
have overcome the need for cell-ECM anchorage. This phenomenon suggests that
the ECM plays a
s critical role in the molecular mechanisms of growth control and metastasis.
(Reviewed in Ruoslahti,
E. (1996) Sci. Am. 275:72-77.) Furthermore, the ECM determines the structure
and physical
properties of connective tissue and is particularly important for
morphogenesis and other processes
associated with embryonic development and pattern formation.
The collagens comprise a family of ECM proteins that provide structure to
bone, teeth, skin,
ligaments, tendons, cartilage, blood vessels, and basement membranes. Multiple
collagen proteins
have been identified. Three collagen molecules fold together in a triple helix
stabilized by interchain
disulfide bonds. Bundles of these triple helices then associate to form
fibrils. Collagen primary
structure consists of hundreds of (Gly-X-Y) repeats where about a third of the
X and Y residues are
Pro. Glycines are crucial to helix formation as the bulkier amino acid
sidechains cannot fold into the
1s triple helical conformation. Because of these strict sequence requirements,
mutations in collagen
genes have severe consequences. Osteogenesis imperfecta patients have brittle
bones that fracture
easily; in severe cases patients die in utero or at birth. Ehlers-Danlos
syndrome patients have
hyperelastic skin, hypermobile joints, and susceptibility to aortic and
intestinal rupture.
Chondrodysplasia patients have short stature and ocular disorders. Alport
syndrome patients have
2o hematuria, sensorineural deafness, and eye lens deformation. (Isselbacher,
K.J. ~et al. (1994)
Harrison's Principles of Internal Medicine, McGraw-Hill, Inc., New York NY,
pp. 2105-2117; and
Creighton, T.E. (1984) Proteins, Structures and Molecular Principles, W.H.
Freeman and Company,
New York NY, pp. 191-197.)
Elastin and related proteins confer elasticity to tissues such as skin, blood
vessels, and lungs.
2 s Elastin is a highly hydrophobic protein of about 750 amino acids that is
rich in proline and glycine
residues. Elastin molecules are highly cross-linked, forming an extensive
extracellular network of
fibers and sheets. Elastin fibers are surrounded by a sheath of microfibrils
which are composed of a
number of glycoproteins, including fibrillin. Mutations in the gene encoding
fibrillin are responsible for
Marfan's syndrome, a genetic disorder characterized by defects in connective
tissue. In severe cases,
3 o the aortas of afflicted individuals are prone to rupture. (Reviewed in
Alberts, su ra, pp. 984-986.)
Fibronectin is a large ECM glycoprotein found in all vertebrates. Fibronectin
exists as a dimer
of two subunits, each containing about 2,500 amino acids. Each subunit folds
into a rod-like structure
containing multiple domains. The domains each contain multiple repeated
modules, the most common
of which is the type III fibronectin repeat. The type III fibronectin repeat
is about 90 amino acids in
3 5 length and is also found in other ECM proteins and in some plasma membrane
and cytoplasmic
83


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
proteins. Furthermore, some type Bl fibronectin repeats contain a
characteristic trtpeptide consisting
of Arginine-Glycine-Aspartic acid (RGD). The RGD sequence is recognized by the
integrin family of
cell surface receptors and is also found in other ECM proteins. Disruption of
both copies of the gene
encoding fibronectin causes early embryonic lethality in mice. The mutant
embryos display extensive
s morphological defects, including defects in the formation of the notochord,
somites, heart, blood
vessels, neural tube, and extraembryonic structures. (Reviewed in Alberts,
supra, pp. 986-987.)
L~~tr,inin is a major glycoprotein component of the basal lamina which
underlies and supports
epithelial cell sheets. Laminin is one of the first ECM proteins synthesized
in the developing embryo.
Laminin is an 850 kilodalton protein composed of three polypeptide chains
joined in the shape of a
1o cross by disulfide bonds. Laminin is especially important for angiogenesis
and in particular, fox guiding
the formation of capillaries. (Reviewed in Alberts, su ra, pp. 990-991.)
There are many other types of proteinaceous ECM components, most of which can
be
classified as proteoglycans. Proteoglycans are composed of unbranched
polysaccharide chains
(glycosaminoglycans) attached to protein cores. Common proteoglycans include
aggrecan,
15 betaglycan, decorin, perlecan, serglycin, and syndecan-1. Some of these
molecules not only provide
mechanical support, but also bind to extracellular signaling molecules, such
as fibroblast growth factor
and transforming growth factor (3, suggesting a role for proteoglycans in cell-
cell communication and
cell growth. (Reviewed in Alberts, supra, pp. 973-978.) Likewise, the
glycoproteins tenascin-C and
tenascin-R are expressed in developing and lesioned neural tissue and provide
stimulatory and anti-
2o adhesive (inhibitory) properties, respectively, for axonal growth.
(Faissner, A. (1997) Cell Tissue Res.
290:331-341.)
Cytoskeletal Molecules
The cytoskeleton is a cytoplasmic network of protein fibers that mediate cell
shape, structure,
25 and movement. The cytoskeleton supports the cell membrane and forms tracks
along which
organelles and other elements move in the cytosol. The cytoskeleton is a
dynamic structure that
allows cells to adopt various shapes and to carry out directed movements.
Major cytoskeletal fibers
include the microtubules, the microfilaments, and the intermediate filaments.
Motor proteins, including
myosin, dynein, and kinesin, drive movement of or along the fibers. The motor
protein dynamin. drives
3 o the formation of membrane vesicles. Accessory or associated proteins
modify the structure or activity
of the fibers while cytoskeletal membrane anchors connect the fibers to the
cell membrane.
Tubulins
Microtubules, cytoskeletal fibers with a diameter of about 24 nm, have
multiple roles in the
cell. Bundles of microtubules form cilia and flagella, which are whip-like
extensions of the cell
s s membrane that are necessary for sweeping materials across an epithelium
and for swimming of
84


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
sperm, respectively. Marginal bands of mtcrotubules in red blood cells and
platelets are important for
these cells' pliability. Organelles, membrane vesicles, and proteins are
transported in the cell along
tracks of microtubules. For example, microtubules run through nerve cell
axons, allowing bi-
directional transport of materials and membrane vesicles between the cell body
and the nerve
s terminal. Failure to supply the nerve terminal with these vesicles blocks
the transmission of neural
signals. Microtubules are also critical to chromosomal movement during cell
division. Both stable and
short-lived populations of microtubules exist in the cell.
Microtubules are polymers of GTP binding tubulin protein subunits. Each
subunit is a
heterodimer of a- and (3- tubulin, multiple isoforms of which exist. The
hydrolysis of GTP is linked to
1 o the addition of tubulin subunits at the end of a microtubule. The subunits
interact head to tail to form
protofilaments; the protofilaments interact side to side to form a
microtubule. A microtubule is
polarized, one end ringed with a-tubulin and the other with (3-tubulin, and
the two ends differ in their
rates of assembly. Generally, each microtubule is composed of 13 protohlaments
although 11 or 15
protofilament-microtubules are sometimes found. Cilia and flagella contain
doublet microtubules.
15 Microtubules grow from specialized structures known as centrosomes or
microtubule-organizing
centers (MTOCs). MTOCs may contain one or two centrioles, which are pinwheel
arrays of triplet
microtubules. The basal body, the organizing center located at the base of a
cilium or flagellum,
contains one centriole. Gamma tubulin present in the MTOC is important for
nucleating the
polymerization of a- and [3- tubulin heterodimers but does not polymerize into
microtubules.
2o Microtubule-Associated Proteins
Microtubule-associated proteins (MAPS) have roles in the assembly and
stabilization of
microtubules. One major family of MAPS, assembly MAPS, can be identified in
neurons as well as
non-neuronal cells. Assembly MAPS are responsible for cross-linking
microtubules in the cytosol.
These MAPS are organized into two domains: a basic microtubule-binding domain
and an acidic
2 s projection domain. The projection domain is the binding site for
membranes, intermediate filaments, or
other microtubules. Based on sequence analysis, assembly MAPS can be further
grouped into two
types: Type I and Type lI. Type I MAPS, which include MAP1A and MAP1B, are
large, filamentous
molecules that co-purify with microtubules and are abundantly expressed in
brain. and testes. Type I
MAPS contain several repeats of a positively-charged amino acid sequence motif
that binds and
s o neutralizes negatively charged tubuliu, leading to stabilization of
microtubules. MAP1A and MAP1B
are each derived from a single precursor polypeptide that is subsequently
proteolytically processed to
generate one heavy chain and one light chain.
Another light chain, LC3, is a 16.4 kDa molecule that binds MAP1A, MAP1B, and
microtubules. It is suggested that LC3 is synthesized from a source other than
the MAP1A or
35 MAP1B transcripts, and that the expression of LC3 may be important in
regulating the microtubule


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
binding activity of MAP1A and MAP1B during cell proliferation (Mann, S.S. et
al. (1994) J. Biol.
Chem. 269:11492-11497).
Type II MAPS, which include MAP2a, MAP2b, MAP2c, MAP4, and Tau, are
characterized
by three to four copies of an 18-residue sequence in the microtubule-binding
domain. MAP2a,
MAP2b, and MAP2c are found only in dendrites, MAP4 is found in non-neuronal
cells, and Tau is.
found in axons and dendrites of nerve cells. Alternative splicing of the Tau
mRNA leads to the
existence of multiple forms of Tau protein. Tau phosphorylation is altered in
neurodegenerative
disorders such as Alzheimer's disease, Pick's disease, progressive
supranuclear palsy, corticobasal
degeneration, and familial frontotemporal dementia and Parkinsonism linked to
chromosome 17. The
so altered Tau phosphorylation leads to a collapse of the microtubule network
and the formation of
intraneuronal Tau aggregates (Spillantini, M.G. and M. Goedert (1998) Trends
Neurosci. 21:428-433).
The protein pericentrin is found in the MTOC and has a role in microtubule
assembly.
Actins
Microfilaments, cytoskeletal filaments with a diameter of about 7-9 am, are
vital to cell
5 locomotion, cell shape, cell adhesion, cell division, and muscle
contraction. Assembly and disassembly
of the microfilaments allow cells to change their morphology. Microfilaments
are the polymerized
form of actin, the most abundant intracellular protein in the eukaryotic cell.
Human cells contain six
isoforms of actin. The three a-actins are found in different kinds of muscle,
nonmuscle ~i-actin and
nonmuscle y-actin are found in nonmuscle cells, and another y-actin is found
in intestinal smooth
2 o muscle cells. G-actin, the monomeric form of actin, polymerizes into
polarized, helical F-actin
filaments, accompanied by the hydrolysis of ATP to ADP. Actin filaments
associate to form bundles
and networks, providing a framework to support the plasma membrane and
determine cell shape.
These bundles and networks are connected to the cell membrane. In muscle
cells, thin filaments
containing actin slide past thick filaments containing the motor protein
myosin during contraction. A
2 s family of actin-related proteins exist that are not part of the actin
cytoskeleton, but rather associate
with microtubules and dynein.
Actin-Associated Proteins
Actin-associated proteins have roles in cross-linking, severing, and
stabilization of actin
filaments and in sequestering actin monomers. Several of the actin-associated
proteins have multiple
3 o functions. Bundles and networks of actin filaments are held together by
actin cross-linking proteins.
These proteins have two actin-binding sites, one for each filament. Short
cross-linking proteins
promote bundle formation while longer, more flexible cross-linking proteins
promote network
formation. Calmodulin-like calcium binding domains in actin cross-linking
proteins allow calcium
regulation of cross-linking. Group I cross-linking proteins have unique actin-
binding domains and
s s include the 30 kD protein, EF-la, fascia, and scrum. Group II cross-
linking proteins have a 7,000-MW
86


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
actin-binding domain and include villin and dematin. Group 11 cross-linking
proteins have pairs of a
26,000-MW actin-binding domain and include hrmbrin, spectrin, dystrophin, ABP
120, and ~lamin.
Severing proteins regulate the length of actin filaments by breaking them into
short pieces or
by blocking their ends. Severing proteins include gCAP39, severin (fragmin),
gelsolin, and villin.
s Capping proteins can cap the ends of actin filaments, but cannot break
filaments. Capping proteins
include CapZ and tropomodulin. The proteins thymosin and profilin sequester
actin monomers in the
cytosol, allowing a pool of unpolymerized actin to exist. The actin-associated
proteins tropomyosin,
troponin, and caldesmon regulate muscle contraction in response to calcium.
Intermediate Filaments and Associated Proteins
~.o Intermediate filaments (1Fs) are cytoskeletal fibers with a diameter of
about 10 nm,
intermediate between that of microfilaments and microtubules. IFs serve
structural roles in the cell,
reinforcing cells and organizing cells into tissues. 1Fs are particularly
abundant in epidermal cells and
in neurons. IFs are extremely stable, and, in contrast to microfilaments and
microtubules, do not
function in cell motility.
15 Five types of IF proteins are known in mammals. Type I and Type II proteins
are the acidic
and basic keratins, respectively. Heterodimers of the acidic and basic
keratins are the building blocks
of keratin IFs. Keratins are abundant in soft epithelia such as skin and
cornea, hard epithelia such as
nails and hair, and in epithelia that line internal body cavities. Mutations
in keratin genes lead to
epithelial diseases including epidermolysis bullosa simplex, bullous
congenital ichthyosiform
~o erythroderma (epidermolytic hyperkeratosis), non-epidermolytic and
epidermolytic palmoplantar
keratoderma, ichthyosis bullosa of Siemens, pachyonychia congenita, and white
sponge nevus. Some
of these diseases result in severe skin blistering. (See, e.g., Wawersik, M.
et al. (1997) J. Biol. Chem.
272:32557-32565; and Corden L.D. and W.H. McLean (1996) Exp. Dermatol. 5:297-
307.)
Type 11I IF proteins include desmin, glial fibrillary acidic protein,
vimentin, and peripherin.
2s Desmin filaments in muscle cells link myofibrils into bundles and stabilize
sarcomeres in contracting
muscle. Glial fibrillary acidic protein filaments are found in the glial cells
that surround neurons and
astrocytes. Vimentin filaments are found in blood vessel endothelial cells,
some epithelial cells, and
mesenchymal cells such as fibroblasts, and are commonly associated with
microtubules. Vimentin
filaments may have roles in keeping the nucleus and other organelles in place
in the cell. Type IV IFs
3 o include the neurofilaments and nestin. Neuro~laments, composed of three
polypeptides NF-L, NF-M,
and NF-H, are frequently associated with microtubules in axons. Neurofilaments
are responsible for
the radial growth and diameter of an axon, and ultimately for the speed of
nerve impulse transmission.
Changes in phosphorylation and metabolism of neurofilaments are observed in
neurodegenerative
diseases including amyotrophic lateral sclerosis, Parkinson's disease, and
Alzheimer's disease (Julien,
35 J.P. and W.E. Mushynski (1998) Prog. Nucleic Acid Res. Mol. Biol. 61:1-23).
Type V lFs, the
87


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
lamins, are found in the nucleus where they support the nuclear membrane.
IFs have a central a-helical rod region interrupted by short nonhelical linker
segments. The
rod region is bracketed, in most cases, by non-helical head and tail domains.
The rod regions of
intermediate filament proteins associate to form a coiled-coil dimer. A highly
ordered assembly
process leads from the dimers to the TFs. Neither ATP nor GTP is needed for IF
assembly, unlike
that of microfilaments and microtubules.
IF-associated proteins (IFAPs) mediate the interactions of 1Fs with one
another and with
other cell structures. IFAPs cross-link IFs into a bundle, into a network, or
to the plasma membrane,
anal may cross-link IFs to the microfilament and microtubule cytoskeleton.
Microtubules and IFs are
so in particular closely associated. IFAPs include BPAG1, plakoglobin,
desmoplakin I, desmoplakin II,
plectin, ankyrin, filaggrin, and lamin B receptor.
Cytoskeletal-Membrane Anchors
Cytoskeletal fibers are attached to the plasma membrane by specific proteins.
These
attachments are important for maintaining cell shape and for muscle
contraction. In erythrocytes, the
s5 spectrin-actin cytoskeleton is attached to cell membrane by three proteins,
band 4.1, ankyrin, and
adducin. Defects in this attachment result in abnormally shaped cells which
are more rapidly
degraded by the spleen, leading to anemia. In platelets, the spectrin-actin
cytoskeleton is also linked to
the membrane by ankyrin; a second actin netwoxk is anchored to the membrane by
filamin. In muscle
cells the protein dystrophin links actin filaments to the plasma membrane;
mutations in the dystrophin
ao gene lead to Duchenne muscular dystrophy. In adherens junctions and
adhesion plaques the
peripheral membrane proteins a-actinin and vinculin attach actin filaments to
the cell membrane.
IFs are also attached to membranes by cytoskeletal-membrane anchors. The
nuclear lamina
is attached to the inner surface of the nuclear membrane by the lamin_ B
receptor. Vimentin IFs are
attached to the plasma membrane by ankyrin and plectin. Desmosome and
hemidesmosome
z5 membrane junctions hold together epithelial cells of organs and skin. These
membrane junctions allow
shear forces to be distributed across the entire epithelial cell layer, thus
providing strength and rigidity
to the epithelium. IFs in epithelial cells are attached to the desmosome by
plakoglobin and
desmoplakins. The proteins that link IFs to hemidesmosomes are not known.
Desmin 1Fs surround
the sarcomere in muscle and are linked to the plasma membrane by paranemin,
synemin, and ankyrin.
3 o Myosin-related Motor Proteins
Myosins are actin-activated ATPases, found in eukaryotic cells, that couple
hydrolysis of ATP
with motion. Myosin provides the motor function for muscle contraction and
intracellular movements
such as phagocytosis and rearrangement of cell contents during mitotic cell
division (cytokinesis). The
contractile unit of skeletal muscle, termed the sarcomere, consists of highly
ordered arrays of thin
3 s actin-containing filaments and thick myosin-containing filaments.
Crossbridges form between the thick
88


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
and thin filaments, and the ATP-dependent movement of myosin heads within the
thick filaments pulls
the thin filaments, shortening the sarcomere and thus the muscle fiber.
Myosins are composed of one or two heavy chains and associated Iight chains.
Myosin heavy
chains contain an amino-terminal motor or head domain, a neck that is the site
of light-chain binding,
and a carboxy-terminal tail domain. The tail domains may associate to foam an
a-helical coiled coil.
Conventional myosins, such as those found in muscle tissue, are composed of
two myosin heavy-chain
subunits, each associated with two light-chain subunits that bind at the neck
region and play a
regulatory role. Unconventional myosins, believed to function in intracellular
motion, may contain
either one or two heavy chains and associated light chains. There is evidence
for about 25 myosin
1 o heavy chain genes in vertebrates, more than half of them unconventional.
Dynein-related Motor Proteins
Dyneins are (-) end-directed motor proteins which act on microtubules. Two
classes of
dyneins, cytosolic and axonemal, have been identified. Cytosolic dyneins are
responsible for
translocation of materials along cytoplasmic microtubules, for example,
transport from the nerve
terminal to the cell body and transport of endocytic vesicles to lysosomes.
Cytoplasmic dyneins are
also reported to play a role in mitosis. Axonemal dyneins are responsible for
the beating of flagella
and cilia. Dynein on one microtubule doublet walks along the adjacent
microtubule doublet. This
sliding force produces bending forces that cause the flagellum or cilium to
beat. Dyneius have a
native mass between 1000 and 2000 kDa and contain either two or three force-
producing heads driven
2 o by the hydrolysis of ATP. The heads are linked via stalks to a basal
domain which is composed of a
highly variable number of accessory intermediate and light chains.
Kiuesin-related Motor Proteins
Kinesins are (+) end-directed motor proteins which act on microtubules. The
prototypical
kinesin molecule is involved in the transport of membrane-bound vesicles and
organelles. This
2s function is particularly important for axonal transport in neurons. Kinesin
is also important in all cell
types for the transport of vesicles from the Golgi complex to the endoplasmic
reticulum. This role is
critical for maintaining the identity and functionality of these secretory
organelles.
Kinesins define a ubiquitous, conserved family of over 50 proteins that can be
classified into at
least 8 subfamilies based on primary amino acid sequence, domain structure,
velocity of movement,
3 o and cellular function. (Reviewed in Moore, J.D. and S.A. Endow (1996)
Bioessays 18:207-219; and
Hoyt, A.M. (1994) Curr. Opin. Cell Biol. 6:63-68.) The prototypical kinesin
molecule is a
heterotetramer comprised of two heavy polypeptide chains (KHCs) and two light
polypeptide chains
(KLCs). 'The KHC subunits are typically referred to as "kinesin." KHC is about
1000 amino acids in
length, and KLC is about 550 amino acids in length. Two KHCs dimerize to form
a rod-shaped
3 s molecule with three distinct regions of secondary structure. At one end of
the molecule is a globular
89


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
motor domain that functions in A'1'Y hydrolyses and mtcrotubule binding.
Kinesm motor domains are
highly conserved and share over 70% identity. Beyond the motor domain is an a-
helical coiled-coil
region which mediates dimerization. At the other end of the molecule is a fan-
shaped tail that
associates with molecular cargo. The tail is formed by the interaction of the
KHC C-termini with the
s two KI,Cs.
Members of the more divergent subfamilies of kinesins are called kinesin-
related proteins
(KRPs), many of which function during mitosis in eukaryotes (Hoyt, su ra).
Some KRPs are
required for assembly of the mitotic spindle. In vivo and in vitro analyses
suggest that these KRPs
exert force on microtubules that comprise the mitotic spindle, resulting in
the separation of spindle
so poles. Phosphorylation of KRP is required for this activity. Failure to
assemble the mitotic spindle
results in abortive mitosis and chromosomal aneuploidy, the latter condition
being characteristic of
cancer cells. In addition, a unique KRP, centromere protein E, localizes to
the kinetochore of human
mitotic chromosomes and may play a role in their segregation to opposite
spindle poles.
Dynamin-related Motor Proteins
m Dynaii-tin is a large GTPase motor protein that functions as a "molecular
pinchase," generating
a mechanochemical force used to sever membranes. This activity is important in
forming clathrin-
coated vesicles from coated pits in. endocytosis and in the biogenesis of
synaptic vesicles in neurons.
Binding of dynamin to a membrane leads to dynamin's self assembly into spirals
that may act to
constrict a flat membrane surface into a tubule. GTP hydrolysis induces a
change in conformation of
2 0 the dynamin polymer that pinches the membrane tubule, leading to severing
of the membrane tubule
and formation of a membrane vesicle. Release of GDP and inorganic phosphate
leads to dynamin
disassembly. Following disassembly the dynamin may either dissociate from the
membrane or remain
associated to the vesicle and be transported to another region of the cell.
Three homologous dynamic
genes have been discovered, in addition to several dynamin-related proteins.
Conserved dynamin
2 s regions are the N-terminal GTP-binding domain, a central pleckstrin
homology domain that binds
membranes, a central coiled-coil region that may activate dynamin's GTPase
activity, and a C-
terminal proline-rich domain that contains several motifs that bind SH3
domains on other proteins.
Some dynamin-related proteins do not contain the pleckstrin homology domain or
the proline-rich
domain. (See McNiven, M.A. (1998) Cell 94:151-154; Scaife, R.M. and R.L.
Margolis (1997) Cell.
so Signa1.9:395-401.)
The cytoskeleton is reviewed in Lodish, H. et al. (1995) Molecular Cell
Biolo~y, Scientific
American Books, New York NY.
Ribosomal Molecules
3 5 Ribosomal RNAs (rRNAs) are assembled, along with ribosomal proteins, into
ribosomes,


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
which are cytoplasmic particles that translate messenger RNA into
polypeptides. The eukaryotic
ribosome is composed of a 60S (large) subunit and a 40S (small) subunit, which
together form the 80S
ribosome. In addition to the 185, 28S, SS, and 5.85 rRNAs, the ribosome also
contains more than fifty
proteins. The ribosomal proteins have a prefix which denotes the subunit to
which they belong, either
L (large) or S (small). Ribosomal protein activities include binding rRNA and
organizing the
conformation of the junctions between rRNA helices (Woodson, S.A. and N.B.
Leontis (1998) Curr.
Opin. Struct. Biol. 8:294-300; Ramakrishnan, V. and S.W. White (1998) Trends
Biochem. Sci. 23:208-
212.) Three important sites are identified on the ribosome. The aminoacyl-tRNA
site (A site) is
where charged tRNAs, (with the exception of the initiator-tRNA) bind on
arrival at the ribosome. The
1o peptidyl-tRNA site (P site) is where new peptide bonds are formed, as well
as where the initiator
tRNA binds. The exit site (E site) is where deacylated tRNAs bind prior to
their release from the
ribosome. (The ribosome is reviewed in Stryer, L. (1995) Biochemistry W.H.
Freeman and Company,
New York NY, pp. 888-908; and Lodish, H. et al. (1995) Molecular Cell Biolo~y
Scientific American
Books, New York NY. pp. 119-138.)
Chromatin Molecules
The nuclear DNA of eukaryotes is organized into chromatin. Two types of
chromatin are
observed: euchromatin, some of which may be transcribed, and heterochromatin
so densely packed
that much of it is inaccessible to transcription. Chromatin packing thus
serves to regulate protein
2o expression in eukaryotes. Bacteria lack chromatin and the chromatin-packing
level of gene regulation.
The fundamental unit of chromatin is the nucleosome of 200 DNA base pairs
associated with
two copies each of histories H2A, H2B, H3, and H4. Adjascent nucleosomes are
linked by another
class of histories, H1. Low molecular weight non-historic proteins called the
high mobility group
(HMG), associated with chromatin, may function in the unwinding of DNA and
stabilization of single-
stranded DNA. Chromodomain proteins function in compaction of chromatin into
its transcriptionally
silent heterochromatin form. m
During mitosis, all DNA is compacted into heterochromatin and transcription
ceases.
Transcription in interphase begins with the activation of a region of
chromatin. Active chromatin is
decondensed. Decondensation appears to be accompanied by changes in binding
coefficient,
s o phosphorylation and acetylation states of chromatin histories. HMG
proteins HMG23 and HMG17
selectively bind activated chromatin. Topoisomerases remove superhelical
tension on DNA. The
activated region decondenses, allowing gene regulatory proteins and
transcription factors to assemble
on the DNA.
Patterns of chromatin structure can be stably inherited, producing heritable
patterns of gene
3 5 expression. In mammals, one of the two X chromosomes in each female cell
is inactivated by
91


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
condensation to heterochromatin during zygote development. The inactive state
of this chromosome is
inherited, so that adult females are mosaics of clusters of paternal-X and
maternal-X clonal cell
groups. The condensed X chromosome is reactivated in meiosis.
Chromatin is associated with disorders of protein expression such as
thalassemia, a genetic
s anemia resulting from the removal of the locus control region (LCR) required
for decondensation of
the globin gene locus.
For a review of chromatin structure and function see Alberts, B. et al. (1994)
Molecular Cell
Biolo , third edition, Garland Publishing, Inc., New York NY, pp. 351-354, 433-
439.
~.o Electron Transfer Associated Molecules
Electron carriers such as cytochromes accept electrons from NADH or FADHZ and
donate
them to other electron carriers. Most electron-transfexring proteins, except
ubiquinone, are prosthetic
groups such as flavins, heme, FeS clusters, and copper, bound to inner
membrane proteins.
Adrenodoxin, for example, is au FeS protein that forms a complex with
NADPH:adrenodoxin
15 reductase and cytochrome p450. Cytochromes contain a heme prosthetic group,
a porphyrin ring
containing a tightly bound iron atom. Electron transfer reactions play a
crucial role in cellular energy
production.
Energy is produced by the oxidation of glucose and fatty acids. Glucose is
initially converted
to pyruvate in the cytoplasm. Fatty acids and pyruvate are transported to the
mitochondria for
~ o complete oxidation to COZ coupled by enzymes to the transport of electrons
from NADH and FADHa
to oxygen and to the synthesis of ATP (oxidative phosphorylation) from ADP and
P;.
Pyruvate is txansported into the mitochondria and converted to acetyl-CoA for
oxidation via
the citric acid cycle, involving pyruvate dehydrogenase components,
dihydrolipoyl transacetylase, and
dihydrolipoyl dehydrogenase. Enzymes involved in the citric acid cycle
include: citrate synthetase,
2s aconitases, isocitrate dehydrogenase, alpha-ketoglutarate dehydrogenase
complex including
transsuccinylases, succinyl CoA synthetase, succinate dehydrogenase,
fumarases, and malate
dehydrogenase. Acetyl CoA is oxidized to COZ with concomitant formation of
NADH, FADH2, and
GTP. In oxidative phosphorylation, the transfer of electrons from NADH and
FADH2 to oxygen by
dehydrogenases is coupled to the synthesis of ATP from ADP and P; by the FoF1
ATPase complex in
s o the mitochondrial inner membrane. Enzyme complexes responsible for
electron transport and ATP
synthesis include the F~F'1 ATPase complex, ubiquinone(CoQ)-cytochrome c
reductase, ubiquinone
reductase, cytochrome b, cytochrome c1, FeS protein, and cytochrome c oxidase.
ATP synthesis requires membrane transport enzymes including the phosphate
transporter and
the ATP-ADP antiport protein. The ATP binding casette (ABC) superfamily has
also been
35 suggested as belonging to the mitochondrial transport group (Rogue, D.L. et
al. (1999) J. Mol. Biol.
92


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
285:379-389). Brown fat uncoupling protein dissipates oxidative energy as
heat, and may be involved
the fever response to infection and trauma (Cannon, B. et al. (1998) Ann. NY
Acad. Sci. 856:171-
187).
Mitochondria are oval-shaped organelles comprising an outer membrane, a
tightly folded inner
membrane, an intermembrane space between the outer and inner membranes, and a
matrix inside the
inner membrane. The outer membrane contains many porin molecules that allow
ions and charged
molecules to enter the intermembrane space, while the inner membrane contains
a variety of transport
proteins that transfer only selected molecules. Mitochondria are the primary
sites of energy
production in cells.
2o Mitochondria contain a small amount of DNA. Human mitochondrial DNA encodes
13
proteins, 22 tRNAs, and 2 rRNAs. Mitochondrial-DNA encoded proteins include
NADH-Q
reductase, a cytocbrome reductase subunit, cytochrome oxidase subunits, and
ATP synthase subunits.
Electron-transfer reactions also occur outside the mitochondria in locations
such as the
endoplasmic reticulum, which plays a crucial role in lipid and protein
biosynthesis. Cytochrome b5 is a
25 central electron donor for various reductive reactions occurring on the
cytoplasmic surface of liver
endoplasmic reticulum. Cytochrome b5 has been found in Golgi, plasma,
endoplasmic reticulum ,(ER),
and microbody membranes.
For a review of mitochondrial metabolism and regulation, see Lodish, H. et al.
(1995)
Molecular Cell Biolo~y, Scientific American Books, New York NY, pp. 745-797
and Stryer (1995)
2o Biochemistry, W.H. Freeman and Co., San Francisco CA, pp 529-558, 988-989.
The majoxity of rnitochondrial proteins are encoded by nuclear genes, are
synthesized on
cytosolic ribosomes, and are imported into the mitochondria. Nuclear-encoded
proteins which are
destined for the mitochondrial matrix typically contain positively charged
amino terminal signal
sequences. Import of these preproteins from the cytoplasm requires a
multisubunit protein complex in
25 the outer membrane known as the translocase of outer mitochondrial membrane
(TOM; previously
designated MOM; Pfanner, N. et al. (1996) Trends Biochem. Sci. 21:51-52) and
at least three inner
membrane proteins which comprise the translocase of inner mitochondrial
membrane (TIM; previously
designated MIM; Pfanner, supra). An inside-negative membrane potential across
the inner
mitochondrial membrane is also required for preprotein import. Preproteins are
recognized by surface
3 o receptor components of the TOM complex and are translocated through a
proteinaceous pore formed
by other TOM components. Proteins targeted to the matrix are then recognized
by the import
machinery of the TIM complex. The import systems of the outer and inner
membranes can function
independently (Segui-Real, B. et al. (1993) EMBO J. 12:2211-2218).
Once precursor proteins are in the mitochondria, the leader peptide is cleaved
by a signal
3 5 peptidase to generate the mature protein. Most leader peptides are removed
in a one step process by
93


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
a protease termed mitochondrial processing peptidase (.MPP) (Paces, V. et al.
(1993) Proc. Natl.
Aced. Sci. USA 90:5355-5358). In some cases a two-step process occurs in which
MPP generates
au intermediate precursor form which is cleaved by a second enzyme,
mitochondria) intermediate
peptidase, to generate the mature protein.
s Mitochondria) dysfunction leads to impaired calcium buffering, generation of
free radicals that
may participate in deleterious intracellular and extracellular processes,
changes in mitochondria)
permeability and oxidative damage which is observed in several
neurodegenerative diseases.
Neurodegenerative diseases linked to mitochondria) dysfunction include some
forms of Alzheimer's
disease, Friedreich's ataxia, familial amyotrophic lateral sclerosis, and
Huntington's disease (Beal,
~o M.F. (1998) Biochim. Biophys. Acta 1366:211-213). The myocardium is heavily
dependent on
oxidative metabolism, so mitochondria) dysfunction often leads to heart
disease (DiMauro, S. and M.
Hirano (1998) Curr. Opiu. Cardiol 13:190-197). Mitochondria are implicated in
disorders of cell
proliferation, since they play an important role in a cell's decision to
proliferate or self destruct through
apoptosis. The oncoprotein Bcl-2, for example, promotes cell proliferation by
stabilizing mitochondria)
is membranes so that apoptosis signals are not released (Susin, S.A. (1998)
Biochim. Biophys. Acta
1366:151-165).
Transcription Factor Molecules
Multicellular organisms are comprised of diverse cell types that differ
dramatically both in
2 o structure and function. The identity of a cell is determined by its
characteristic pattern of gene
expression, and different cell types express overlapping but distinctive sets
of genes throughout
development. Spatial and temporal regulation of gene expression is critical
for the control of cell
proliferation, cell differentiation, apoptosis, and other processes that
contribute to organismal
development. Furthermore, gene expression is regulated in response to
extracellular signals that
25 mediate cell-cell communication and coordinate the activities of different
cell types. Appropriate gene
regulation also ensures that cells function efficiently by expressing only
those genes whose functions
are required at a given time.
Trauscriptional regulatory proteins are essential for the control of gene
expression. Some of
these proteins function as transcription factors that initiate, activate,
repress, or terminate gene
3 o transcription. Transcription factors generally bind to the promoter,
enhancer, and upstream regulatory
regions of a gene in a sequence-specific manner, although some factors bind
regulatory elements
within or downstream of a gene's coding region. Transcription factors may bind
to a specific region of
DNA singly or as a complex with other accessory factors. (Reviewed in Lewin,
B. (1990) Genes IV,
Oxford University Press, New York NY, and Cell Press, Cambridge MA, pp. 554-
570.)
3 s The double helix structure and repeated sequences of DNA create
topological and chemical
94


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
teatures whtch can be recognized by transcription factors. these features are
hydrogen bond donor
and acceptor groups, hydrophobic patches, major and minor grooves, and
regular, repeated stretches
of sequence which induce distinct bends in the helix. Typically, transcription
factors recognize
specific DNA sequence motifs of about 20 nucleotides in length. Multiple,
adjacent transcription
s factor-binding motifs may be required for gene regulation.
Many transcription factors incorporate DNA-binding structural motifs which
comprise either
cc helices or J3 sheets that bind to the major groove of DNA. Four well-
characterized structural motifs
are helix-turn-helix, zinc finger, leucine zipper, and helix-loop-helix.
Proteins containing these motifs
may act alone as monomers, or they may form homo- or heterodimers that
interact with DNA.
1o The helix-turn-helix motif consists of two cc helices connected at a fixed
angle by a short
chain of amino acids. One of the helices binds to the major groove. Helix-turn
helix motifs are
exemplified by the homeobox motif which is present in homeodomain proteins.
These proteins are
' critical for specifying the anterior-posterior body axis during development
and are conserved
throughout the animal kingdom. The Antennapedia and Ultrabithorax proteins of
DrOSOphila
15 melano~aster are prototypical homeodomain proteins (Patio, C.O. and R.T.
Sauer (1992) Annu. Rev.
Biochem. 61:1053-1095).
The zinc finger motif, which binds zinc ions, generally contains tandem
repeats of about 30
amino acids consisting of periodically spaced cysteine and histidine residues.
Examples of this
sequence pattern, designated C2H2 and C3HC4 ("RING" finger), have been
described (Lewin,
a o supra). Zinc finger proteins each contain an oc helix and an antiparallel
13 sheet whose proximity and
conformation are maintained by the zinc ion. Contact with DNA is made by the
arginine prece ding
the oc helix and by the second, third, and sixth residues of the a helix.
Variants of the zinc finger motif
include poorly defined cysteine-rich motifs which bind zinc or other metal
ions. These motifs may not
contain histidine residues and are generally nonrepetitive.
The leucine zipper motif comprises a stretch of amino acids rich in leucine
which can form an
amphipathic oc helix. This structure provides the basis for dimerization of
two leucine zipper proteins.
The region adjacent to the leucine zipper is usually basic, and upon protein
dimerization, is optimally
positioned for binding to the major groove. Proteins containing such motifs
are generally referred to as
bZIP transcription factors.
3 o The helix-loop-helix motif (HLH) consists of a short a helix connected by
a loop to a longer of
helix. The loop is flexible and allows the two helices to fold back against
each other and to bind to
DNA. The transcription factor Myc contains a prototypical HLH motif.
Most transcription factors contain characteristic DNA binding motifs, and
variations on the
above motifs and new motifs have been and are currentlybeing characterized
(Faisst, S. and S.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Meyer (1992) Nucleic Acids Res. 20:3-26).
Many neoplastic disorders in humans can be attributed to inappropriate gene
expression.
Malignant cell growth may result from either excessive expression of tumor
promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer
Surv. 15:89-104).
s Chromosomal translocations may also produce chimeric loci which fuse the
coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an
arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy.
In addition, the immune system responds to infection or trauma by activating a
cascade of
events that coordinate the progressive selection, amplification, and
mobilization of cellular defense
z o mechanisms. A complex and balanced program of gene activation and
repression is involved in this
process. However, hyperactivity of the immune system as a result of improper
or insufficient
regulation of gene expression may result in considerable tissue or organ
damage. This damage is well
documented in immunological responses associated with arthritis, allergens,
heart attack, stroke, and
infections (Isselbacher, K.J. et al. (1996) Harrison's Principles of Internal
Medicine, 13/e, McGraw
15 Hill, Inc. and Teton Data Systems Software).
Furthermore, the generation of multicellular organisms is based upon the
induction and
coordination of cell differentiation at the appropriate stages of development.
Central to this process is
differential gene expression, which confers the distinct identities of cells
and tissues throughout the
body. Failure to regulate gene expression during development can result in
developmental disorders.
2 o Human developmental disorders caused by mutations in zinc finger-type
transcriptional regulators
include: urogenenital developmental abnormalities associated with WT2; Greig
cephalopolysyndactyly,
Pallister-Hall syndrome, and postaxial polydactyly type A (GLI3); and Townes-
Brocks syndrome,
characterized by anal, renal, limb, and ear abnormalities (SALL1) (Engelkamp,
D. and V. van
Heyningen (1996) Curr. Opin. Genet. Dev. 6:334-342; Kohlhase, J. et al. (1999)
Am. J. Hum. Genet.
z5 64:435-445).
Cell Membrane Molecules
Eukaryotic cells are surrounded by plasma membranes which enclose the cell and
maintain an
environment iuside the cell that is distinct from its surroundings. In
addition, eukaryotic organisms are
s o distinct from prokaryotes in possessing many intracellular organelle and
vesicle structures. Many of
the metabolic reactions which distinguish eukaryotic biochemistry from
prokaryotic biochemistry take
place within these structures. The plasma membrane and the membranes
surrounding organelles and
vesicles are composed of phosphoglycerides, fatty acids, cholesterol,
phospholipids, glycolipids,
proteoglycans, and proteins. These components confer identity and
functionality to the membranes
96


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
with which they associate.
Integral Membrane Proteins
The majority of known integral membrane proteins are transmembrane proteins
(TM) which
are characterized by an extracellular, a transmembrane, and an intracellular
domain. TM domains are
s typically comprised of 1S to 25 hydrophobic amino acids which are predicted
to adopt an a-helical
conformation. TM proteins are classified as bitopic (Types I and II) and
polytopic (Types I:Q and IV)
(Singer, S.J. (1990) Annu. Rev. Cell Biol. 6:247-296). Bitopic proteins span
the membrane once while
polytopic proteins contain multiple membrane-spanning segments. TM proteins
function as cell-
surface receptors, receptor-interacting proteins, transporters of ions or
metabolites, ion channels, cell
so anchoring proteins, and cell type-specific surface antigens.
Many membrane proteins (MPs) contain amino acid sequence motifs that target
these
proteins to specific subcellular sites. Examples of these motifs include PDZ
domains, KDEL, RGD,
NGR, and GSL sequence motifs, von Willebrand factor A (vWFA) domains, and EGF-
like domains.
RGD, NGR, and GSL motif containing peptides have been used as drug delivery
agents in targeted
1s cancer treatment of tumor vasculature (Arap, W. et al. (1998) Science
279:377-380). Furthermore,
MPs may also contain amino acid sequence motifs, such as the carbohydrate
recognition domain
(CRD), that mediate interactions with extracellular or intracellular
molecules.
G-Protein Coupled Rece tots
G-protein coupled receptors (GPCR) are a superfamily of integral membrane
proteins which
20 transduce extracellular signals. GPCRs include receptors for biogenic
amines, lipid mediators of
inflammation, peptide hormones, and sensory signal mediators. The structure of
these
highly-conserved receptors consists of seven hydrophobic transmembrane
regions, an extracellular
N-terminus, and a cytoplasmic C-terminus. Three extracellular loops alternate
with three intracellular
loops to link the seven transmembrane regions. Cysteine disulfide bridges
connect the second and
25 third extracellular loops. The most conserved regions of GPCRs are the
transmembrane regions and
the first two cytoplasmic loops. A conserved, acidic-Arg-aromatic residue
triplet present in the
second cytoplasmic loop may interact with G proteins. A GPCR consensus pattern
is characteristic of
most proteins belonging to this superfamily (ExPASy PR~Sl'TE document PS00237;
and Watson, S.
and S. Arkinstall (1994) The G-protein Linked Receptox Facts Book, Academic
Press, San Diego CA,
3 o pp. 2-6). Mutations and changes in transcriptional activation of GPCR-
encoding genes have been
associated with neurological disorders such as schizophrenia, Parkinson's
disease, Alzheimer's
disease, drug addiction, and feeding disorders.
Scayen. eg r Receptors
Macrophage scavenger receptors with broad Iigand specificity may participate
in the binding
3 s of low density lipoproteins (LDL) and foreign antigens. Scavenger
receptors types I and II are
97


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
trimeric membrane proteins with each subunit containing a small N-terminal
intracellular domain, a
transmembrane domain, a large extracellular domain, and a C-terminal cysteine-
rich domain. The
extracellular domain contains a short spacer region, an a-helical coiled-coil
region, and a triple helical
collagen-like region. These receptors have been shown to bind a spectrum of
ligands, including
s chemically modified lipoproteins and albumin, polyribonucleotides,
polysaccharides, phospholipids, and
asbestos (Matsumoto, A. et al. (1990) Proc. Natl. Acad. Sci. USA 87:9133-9137;
and Elomaa, O. et
al. (1995) Cell 80:603-609). The scavenger receptors are thought to play a key
role iri atherogenesis
by mediating uptake of modified LDL in arterial walls, and in host defense by
binding bacterial
endotoxins, bacteria, and protozoa.
1 o Tetraspan Family Proteins
The transmembrane 4 superfamily (TM4SF) or tetraspan family is a multigene
family
encoding type III integral membrane proteins (Wright, M.D. and M.G. Tomlinson
(1994) Tmmunol.
Today 15:588-594). The TM4SF is comprised of membrane proteins which traverse
the cell
membrane four times. Members of the TM4SF include platelet and endothelial
cell membrane
15 proteins, melanoma-associated antigens, leukocyte surface glycoproteins,
colonal carcinoma antigens,
tumor-associated antigens, and surface proteins of the schistosome parasites
(Jankowski, S.A. (1994)
Oncogene 9:1205-1211). Members of the TM4SF share about 25-30% amino acid
sequence identity
with one another.
A number of TM4SF members have been implicated in signal transduction, control
of cell
2o adhesion, regulation of cell growth and proliferation, including
development and oncogenesis, and cell
motility, including tumor cell metastasis. Expression of TM4SF proteins is
associated with a variety of
tumors and the level of expression may be altered when cells are growing or
activated.
Tumor Antigens
Tumor antigens are cell surface molecules that are differentially expressed in
tumor cells
~ s relative to normal cells. Tumor antigens distinguish tumor cells
immunologically from normal cells and
provide diagnostic and therapeutic targets for human cancers (Takagi, S. et
al. (1995) Int. J. Cancer
61:706-715; Liu, E. et al. (1992) Oncogene 7:1027-1032).
Leukocyte Antigens
Other types of cell surface antigens include those identified on leukocytic
cells of the immune
3 o system. These antigens have been identified using systematic, monoclonal
antibody (mAb)-based
"shot gun" techniques. These techniques have resulted in the production of
hundreds of mAbs
directed against unknown cell surface leukocytic antigens. These antigens have
been grouped into
"clusters of differentiation" based on common immunocytochemical localization
patterns in various
differentiated and undifferentiated leukocytic cell types. Antigens in a given
cluster are presumed to
s s identify a single cell surface protein and are assigned a "cluster of
differentiation" or "CD"
98


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
designation. Some of the genes encoding proteins identified by CD antigens
have been cloned and
verified by standard molecular biology techniques. CD antigens have been
characterized as both
transmembrane proteins and cell surface proteins anchored to the plasma
membrane via covalent
attachment to fatty acid-containing glycolipids such as
glycosylphosphatidylinositol (GPI). (Reviewed
s in Barclay, A.N. et al. (1995) The Leucocyte Antigen Facts Book, Academic
Press, San Diego CA,
pp. 17-20.)
Ion Channels
Ion channels are found in the plasma membranes of virtually every cell in the
body. For
example, chloride channels mediate a variety of cellular functions including
regulation of membrane
s o potentials and absorption and secretion of ions across epithelial
membranes. Chloride channels also
regulate the pH of organelles such as the Golgi apparatus and endosomes (see,
e.g., Greger, R. (1988)
Anna. Rev. Physiol. 50:111-122). Electrophysiological and pharmacological
properties of chloride
channels, including ion conductance, current-voltage relationslups, and
sensitivity to modulators,
suggest that diffexent chloride channels exist in muscles, neurons,
fibroblasts, epithelial cells, and
15 lymphocytes.
Many ion channels have sites for phosphorylation by one or more protein
kinases including
protein kinase A, protein kinase C, tyrosine kinase, and casein kinase lI, all
of which regulate ion
channel activity in cells. Inappropriate phosphorylation of proteins in cells
has been linked to changes
in cell cycle progression and cell differentiation. Changes in the cell cycle
have been linked to
2 o induction of apoptosis or cancer. Changes in cell differentiation have
been linked to diseases and
disorders of the reproductive system, immune system, skeletal muscle, and
other organ systems.
Proton Pumas
Proton ATPases comprise a large class of membrane proteins that use the energy
of ATP
hydrolysis to generate an electrochenucal proton gradient across a membrane.
The resultant gradient
2 s may be used to transport other ions across the membrane (Na+, I~+, or Cf)
or to maintain organelle
pH. Proton ATPases are further subdivided into the mitochondrial F-ATPases,
the plasma membrane
ATPases, and the vacuolar ATPases. The vacuolar ATPases establish and maintain
an acidic pH
within various organelles involved in the processes of endocytosis and
exocytosis (Mellman, I. et al.
(1986) Anna. Rev. Biochem. 55:663-700).
3 o Proton-coupled, 12 membrane-spanning domain transporters such as PEPT 1
and PEPT 2 are
responsible for gastrointestinal absorption and for renal reabsorption of
peptides using an
electrochemical H+ gradient as the driving force. Another type of peptide
transporter, the TAP
transporter, is a heterodimer consisting of TAP 1 and TAP 2 and is associated
with antigen
processing. Peptide antigens are transported across the membrane of the
endoplasmic reticulum by
35 TAP so they can be expressed on the cell surface in association with MHC
molecules. Each TAP
99


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
protein consists of multiple hydrophobic membrane spanning segments and a
highly conserved
ATP-binding cassette (Boll, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:284-
289). Pathogenic
microorganisms, such as herpes simplex virus, may encode inhibitors of TAP-
mediated peptide
transport in order to evade immune surveillance (Marusina, K. and J.J Manaco
(1996) Curr. Opin.
s Hematol.3:19-26).
ABC Transporters
The ATP-binding cassette (ABC) transporters, also called the "traffic
ATPases", comprise a
superfamily of membrane proteins that mediate transport and channel functions
in prokaryotes and
eukaryotes (Higgins, C.F. (1992) Annu. Rev. Cell Biol. 8:67-113). ABC proteins
share a similar
overall structure and significant sequence homology. All ABC proteins contain
a conserved domain of
approximately two hundred amino acid residues which includes one or more
nucleotide binding
domains. Mutations in ABC transporter genes are associated with various
disorders, such as
hyperbilirubinemia IUDubin-Johnson syndrome, recessive Stargardt's disease, X-
linked
adrenoleukodystrophy, multidrug resistance, celiac disease, and cystic
fibrosis.
Peri heral and Anchored Membrane Proteins
Some membrane proteins are not membrane-spanning but are attached to the
plasma
membrane via membrane anchors or interactions with integral membrane proteins.
Membrane
anchors are covalently joined to a protein post-translationally and include
such moieties as prenyl,
myristyl, and glycosylphosphatidyl inositol groups. Membrane localization of
peripheral and anchored
2 o proteins is important for their function in processes such as receptor-
mediated signal transduction. For
example, prenylation of Ras is required for its localization to the plasma
membrane and for its normal
and oncogenic functions in signal transduction.
Vesicle Coat Proteins
Intercellular communication is essential for the development and survival of
multicellular
organisms. Cells communicate with one another through the secretion and uptake
of protein signaling
molecules. The uptake of proteins into the cell is achieved by the endocytic
pathway, in which the
interaction of extracellular signaling molecules with plasma membrane
receptors results in the
formation of plasma membrane-derived vesicles that enclose and transport the
molecules into the
cytosol. These transport vesicles fuse with and mature into endosomal and
lysosomal (digestive)
s o compartments. The secretion of pxoteins from the cell is achieved by
exocytosis, in which molecules
inside of the cell proceed through the secretory pathway. In this pathway,
molecules transit from the
ER to the Golgi apparatus and finally to the plasma membrane, where they are
secreted from the cell.
Several steps in the transit of material along the secretory and endocytic
pathways require the
formation of transport vesicles. Specifically, vesicles form at the
transitional endoplasmic reticulum
(tER), the rim of Golgi cisternae, the face of the Traits-Golgi Network (TGN),
the plasma membrane
100


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
(PM), and tubular extensions of the endosomes. Vesicle formation occurs when a
region of
membrane buds off from the donor organelle. The membrane-bound vesicle
contains proteins to be
transported and is surrounded by a proteinaceous coat, the components of which
are recruited from
the cytosol. Two different classes of coat protein have been identified.
Clathrin coats form on
s vesicles derived from the TGN and PM, whereas coatomer (COP) coats form on
vesicles derived
from the ER and Golgi. COP coats can be further classified as COPI, involved
in retrograde traffic
through the Golgi and from the Golgi to the ER, and COPII, involved in
anterograde traffic from the
ER to the Golgi (Mellman, supra).
In clathrin-based vesicle formation, adapter proteins bring vesicle cargo and
coat proteins
1o together at the surface of the budding membrane. Adapter protein-1 and -2
select cargo from the
TGN and plasma membrane, respectively, based on molecular information encoded
on the cytoplasmic
tail of integral membrane cargo proteins. Adapter proteins also recruit
clathrin to the bud site.
Clathrin is a protein complex consisting of three large and three small
polypeptide chains arranged in a
three-legged structure called a triskelion. Multiple triskelions and other
coat proteins appear to self
assemble on the membrane to form a coated pit. This assembly process may serve
to deform the
membrane into a budding vesicle. GTP-bound ADP-ribosylation factor (Arf) is
also incorporated into
the coated assembly. Another small G-protein, dynamin, forms a ring complex
around the neck of the
forming vesicle and may provide the mechanochemical force to seal the bud,
thereby releasing the
vesicle. The coated vesicle complex is then transported through the cytosol.
During the transport
2 o process, Arf bound GTP is hydrolyzed to GDP, and the coat dissociates from
the transport vesicle
(West, M.A. et al. (1997) J. Cell Biol. 138:1239-1254).
Vesicles which bud from the ER and the Golgi are covered with a protein coat
similar to the
clathrin coat of endocytic and TGN vesicles. The coat protein (COP) is
assembled from cytosolic
precursor molecules at specific budding regions on the organelle. The COP coat
consists of two
2s major components, a G-protein (Arf or Sar) and coat protomer (coatomer).
Coatomer is an equimolar
complex of seven proteins, termed alpha-, beta-, beta =, gamma-, delta-,
epsilon- and zeta-COP. The
coatomer complex binds to dilysine motifs contained on the cytoplasmic tails
of integral membrane
proteins. These include the KKXX retrieval motif of membrane proteins of the
ER and
dibasic/diphenylamine motifs of members of the p24 family. The p24 family of
type I membrane
°s o proteins represent the major membrane proteins of COPI vesicles
(Hatter, C. and F.T. Wieland
(1998) Proc. Natl. Acad. Sci. USA 95:11649-11654).
Organelle Associated Molecules
Eukaryotic cells are organized into various cellular organelles which has the
effect of
s s separating specific molecules and their functions from one another and
from the cytosol. Within the
101


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cell, various membrane structures surround and define these organelles while
allowing them to interact
with one another and the cell environment through both active and passive
transport processes.
Important cell organelles include the nucleus, the Golgi apparatus, the
endoplasmic reticulum,
mitochondria, peroxisomes, lysosomes, endosomes, and secretory vesicles.
s Nucleus
The cell nucleus contains all of the genetic information of the cell in the
form of DNA, and the
components and machinery necessary for replication of DNA and for
transcription of DNA into
RNA. (See Alberts, B. et al. (1994) Molecular Biolo~y of the Cell, Garland
Publishing Inc., New
York NY, pp. 335-399.) DNA is organized into compact structures in the nucleus
by interactions with
so various DNA-binding proteins such as lustones and non-histone chromosomal
proteins.
DNA-specific nucleases, DNAses, partially degrade these compacted structures
prior to DNA
replication or transcription. DNA replication takes place with the aid of DNA
helicases which unwind
the double-stranded DNA helix, and DNA polymerases that duplicate the
separated DNA strands.
Transcriptional regulatory proteins are essential for the control of gene
expression. Some of
these proteins function as transcription factors that initiate, activate,
repress, or terminate gene
transcription. Transcription factors generally bind to the promoter, enhancer,
and upstream regulatory
regions of a gene in a sequence-specific manner, although some factors bind
regulatory elements
within or downstream of a gene's coding region. Transcription factors may bind
to a specific region of
DNA singly or as a complex with other accessory factors. (Reviewed in Lewin,
B. (1990) Genes IV,
ao Oxford University Press, New York NY, and Cell Press, Cambridge MA, pp. 554-
570.) Many
transcription factors incorporate DNA binding structural motifs which comprise
either a helices or 13
sheets that bind to the major groove of DNA. Four well-characterized
structural motifs are helix-turn-
helix, zinc finger, leucine zipper, and helix-loop-helix. Proteins containing
these motifs may act alone
as monomers, or they may form horno- or heterodimers that interact with DNA.
2 s Many neoplastic disorders in humans can be attributed to inappropriate
gene expression.
Malignant cell growth may result from either excessive expression of tumor
promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer
Surv. 15:89-104).
Chromosomal translocations may also produce chimeric loci which fuse the
coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an
arrangement likely results in
3 o inappropriate gene transcription, potentially contributing to malignancy.
In addition, the immune system responds to infection or trauma by activating a
cascade of
events that coordinate the progressive selection, amplification, and
mobilization of cellular defense
mechanisms. A complex and balanced program of gene activation and repression
is involved in this
process. However, hyperactivity of the immune system as a result of improper
or insufficient
3 s regulation of gene expression may result in considerable tissue or organ
damage. 'This damage is well
102


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
documented in immunological responses associated with arthritis, allergens,
heart attack, stroke, and
infections (Isselbacher, I~.J. et al. (1996) Harrison's Principles of Internal
Medicine, 13/e, McGraw
Hill, Inc. and Teton Data Systems Software).
Transcription of DNA into RNA also takes place in the nucleus catalyzed by RNA
polymerases. Three types of RNA polymerase exist. RNA polymerase I makes large
ribosomal
RNAs, while RNA polymerase III makes a variety of small, stable RNAs including
5S ribosomal
RNA and the transfer RNAs (tRNA). RNA polymerase II transcribes genes that
will be translated
into proteins. The primary transcript of RNA polymerase II is called
hetexogenous nuclear RNA
(hnRNA), and must be further processed by splicing to remove non-coding
sequences called introns.
so RNA splicing is mediated by small nuclear ribonucleoprotein complexes, or
snRNPs, producing mature
messenger RNA (mRNA) which is then transported out of the nucleus for
translation into proteins.
Nucleolus
The nucleolus is a highly organized subcompartment in the nucleus that
contains high
concentrations of RNA and proteins and functions mainly in ribosomal RNA
synthesis and assembly
25 (Alberts, et a1. su ra, pp. 379-382). Ribosomal RNA (rRNA) is a structural
RNA that is complexed
with proteins to form ribonucleoprotein structures called ribosomes. Ribosomes
provide the platform
on which protein synthesis takes place.
Ribosomes are assembled in the nucleolus initially from a large, 45S rRNA
combined with a
variety of proteins imported from the cytoplasm, as well as smaller, SS rRNAs.
Later processing of
a o the immature ribosome results in formation of smaller ribosomal subunits
which are transported from
the nucleolus to the cytoplasm where they are assembled into functional
ribosomes.
Endoplasmic Reticulum
In eukaryotes, proteins are synthesized within the endoplasmic reticulum (ER),
delivered from
the ER to the Golgi apparatus for post-translational processing and sorting,
and transported from the
25 Golgi to specific intracellular and extracellular destinations. Synthesis
of integral membrane proteins,
secreted proteins, and proteins destined for the lumen of a particular
organelle occurs on the rough
endoplasmic reticulum (ER). The rough ER is so named because of the rough
appearance in electron
micrographs imparted by the attached ribosomes on which protein synthesis
proceeds. Synthesis of
proteins destined for the ER actually begins in the cytosol with the synthesis
of a specific signal
3 o peptide which directs the growing polypeptide and its attached ribosome to
the ER membrane where
the signal peptide is removed and protein synthesis is completed. Soluble
proteins destined for the ER
lumen, for secretion, or for transport to the lumen of other organelles pass
completely into the ER
lumen. Transmembrane proteins destined~fox the ER or for other cell membranes
are translocated
across the ER membrane but remain anchored in the lipid bilayer of the
membrane by one or more
35 membrane-spanning a-helical regions.
103


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Translocated polypeptide chains destined for other organelles or for secretion
also fold and
assemble in the ER lumen with the aid of certain "resident" ER proteins.
Protein folding in the ER is
aided by two principal types of protein isomerases, protein disulfide
isomerase (PDI), and peptidyl-
prolyl isomerase (PPI). PDI catalyzes the oxidation of free sulfhydryl groups
in cysteine residues to
foam intramolecular disulfide bonds in proteins. PPI, an enzyme that catalyzes
the isomerization of
certain proline imide bonds in oligopeptides and proteins, is considered to
govern one of the rate
limiting steps in the folding of many proteins to their final functional
conformation. The cyclophilins
represent a major class of PPI that was originally identified as the major
receptor for the
immunosuppressive drug cyclosporin A (Handschumacher, R.E. et al. (1984)
Science 226:544-547).
so Molecular "chaperones" such as BiP (binding protein) in the ER recognize
incorrectly folded proteins
as well as proteins not yet folded into their final form and bind to them,
both to prevent improper
aggregation between them, and to promote proper folding.
The "N-linked" glycosylation of most soluble secreted and membrane-bound
proteins by
oligosacchrides linked to asparagine residues in proteins is also performed in
the ER. This reaction is
25 catalyzed by a membrane-bound enzyme, oligosaccharyl transferase.
Gol~pparatus
The Golgi apparatus is a complex structure that lies adjacent to the ER in
eukaryotic cells and
serves primarily as a sorting and dispatching station for products of the ER
(Alberts, et al. sera, pp.
600-610). Additional posttranslational processing, principally additional
glycosylation, also occurs in
2 o the Golgi. Indeed, the Golgi is a major site of carbohydrate synthesis,
including most of the
glycosaminoglycans of the extracellular matrix. N-linked oligosaccharides,
added to proteins in the
ER, are also further modified in the Golgi by the addition of more sugar
residues to form complex N-
linked oligosaccharides. "O-linked" glycosylation of proteins also occurs in
the Golgi by the addition of
N-acetylgalactosamine to the hydroxyl group of a serine or threonine residue
followed by the
z 5 sequential addition of other sugar residues to the first. 'This process is
catalyzed by a series of
glycosyltransferases each specific for a particular donor sugar nucleotide and
acceptor molecule
(Lodish, H. et al. (1995) Molecular Cell Biology, W.H. Freeman and Co., New
Yoxk NY, pp.700-
708). In many cases, both N- and O-linked oligosaccharides appear to be
required for the secretion of
proteins or the movement of plasma membrane glycoproteins to the cell surface.
3 o The terminal compartment of the Golgi is the Trans-Golgi Network (TGN),
where both
membrane and lumenal proteins are sorted for their final destination.
Transport (or secretory) vesicles
destined for intracellular compartments, such as lysosomes, bud off of the
TGN. Other transport
vesicles bud off containing proteins destined for the plasma membrane, such as
receptors, adhesion
molecules, and ion channels, and secretory proteins, such as hormones,
neurotransmitters, and
3 5 digestive enzymes.
104


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
V acuoles
The vacuole system is a collection of membrane bound compartments in
eukaryotic cells that
functions in the processes of endocytosis and exocytosis. They include
phagosomes, lysosomes,
endosomes, and secretory vesicles. Endocytosis is the process in cells of
internalizing nutrients,
s solutes or small particles (pinocytosis) or large particles such as
internalized receptors, viruses,
bacteria, or bacterial toxins (phagocytosis). Exocytosis is the process of
transporting molecules to the
cell surface. It facilitates placement or localization of membrane-bound
receptors or other membrane
proteins and secretion of hormones, neurotransmitters, digestive enzymes,
wastes, etc.
A common property of all of these vacuoles is an acidic pH environment ranging
from
~. o approximately pH 4.5-5Ø This acidity is maintained by the presence of a
proton ATPase that uses
the energy of ATP hydrolysis to generate an electrochemical proton gradient
across a membrane
(Mellman, I. et al. (1986) Anna. Rev. Biochem. 55:663-700). Eukaryotic
vacuolar proton ATPase
(vp-ATPase) is a multimeric enzyme composed of 3-10 different subunits. One of
these subunits is a
highly hydrophobic polypeptide of approximately 16 kDa that is similar to the
proteolipid component of
15 vp-ATPases from eubacteria, fungi, and plant vacuoles (Mandel, M. et al.
(1988) Proc. Natl. Acad.
Sci. USA 85:5521-5524). The 16 kDa proteolipid component is the major subunit
of the membrane
portion of vp-ATPase and functions in the transport of protons across the
membrane.
Lysosomes
Lysosomes are membranous vesicles containing various hydrolytic enzymes used
for the
2 o controlled intracellular digestion of macromolecules. Lysosomes contain
some 40 types of enzymes
including proteases, nucleases, glycosidases, lipases, phospholipases,
phosphatases, and sulfatases, all
of which are acid hydrolases that function at a pH of about 5. Lysosomes are
surrounded by a unique
membrane containing transport proteins that allow the final products of
macromolecule degradation,
such as sugars, amino acids, and nucleotides, to be transported to the cytosol
where they may be
2 s either excreted or reutilized by the cell. A vp-ATPase, such as that
described above, maintains the
acidic environment necessary for hydrolytic activity (Alberts, su ra pp. 610-
611).
Endosomes
Endosomes are another type of acidic vacuole that is used to transport
substances from the
cell surface to the interior of the cell in the process of endocytosis. Like
lysosomes, endosomes have
3 o an acidic environment provided by a vp-ATPase (Alberts et al. su ra pp.
610-618). Two types of
endosomes are apparent based on tracer uptake studies that distinguish their
time of formation in the
cell and their cellular location. Eaxly endosomes are found near the plasma
membrane and appear to
function primarily in the recycling of internalized receptors back to the cell
surface. Late endosomes
appear later in the endocytic pxocess close to the Golgi apparatus and the
nucleus, and appear to be
3 s associated with delivery of endocytosed material to lysosomes or to the
TGN where they may be
105


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
recycled. Specific proteins are associated with particular transport vesicles
and their target
compartments that may provide selectivity in targeting vesicles to their
proper compartments. A
cytosolic prenylated GTP-binding protein, Rab, is one such protein. Rabs 4, 5,
and 11 are associated
with the early endosome, whereas Rabs 7 and 9 associate with the late
endosome.
s Mitochondria
Mitochondria are oval-shaped organelles comprising an outer membrane, a
tightly folded inner
membrane, an intermembrane space between the outer and inner membranes, and a
matrix inside the
inner membrane. 'The outer membrane contains many porin molecules that allow
ions and charged
molecules to enter the intermembrane space, while the inner membrane contains
a variety of transport
Zo ' proteins that transfer only selected molecules. Mitochondria are the
primary sites of energy
production in cells.
Energy is produced by the oxidation of glucose and fatty acids. Glucose is
initially converted
to pyruvate in the cytoplasm. Fatty acids and pyruvate are transported to the
mitochondria for
complete oxidation to COZ coupled by enzymes to the transport of electrons
from NADH and FADH2
15 to oxygen and to the synthesis of ATP (oxidative phosphorylation) from ADP
and Pi.
Pyruvate is transported into the mitochondria and converted to acetyl-CoA for
oxidation via
the citric acid cycle, involving pyruvate dehydrogenase components,
dihydrolipoyl transacetylase, and
dihydrolipoyl dehydrogenase. Enzymes involved in the citric acid cycle
include: citrate synthetase,
aconitases, isocitrate dehydrogenase, alpha-ketoglutarate dehydrogenase
complex including
2o transsuccinylases, succiuyl CoA synthetase, succinate dehydrogenase,
fumarases, and malate
dehydrogenase. Acetyl CoA is oxidized to COZ with concomitant formation of
NADH, FADH2, and
GTP. In oxidative phosphorylation, the transfer of electrons from NADH and
FADHa to oxygen by
dehydrogenases is coupled to the synthesis of ATP from ADP and Pi by the FoFl
ATPase complex in
the mitochondrial inner membrane. Enzyme complexes responsible for electron
transport and ATP
25 synthesis include the FoFl ATPase complex, ubiquinone(CoQ)-cytochrome c
reductase, ubiquinone
reductase, cytochrome b, cytochrome c1, FeS protein, and cytochrome c oxidase.
Peroxisomes
Peroxisomes, like mitochondria, are a major site of oxygen utilization. They
contain one or
more enzymes, such as catalase and urate oxidase, that use molecular oxygen to
remove hydrogen
3 o atoms from specific organic substrates in an oxidative reaction that
produces hydrogen peroxide
(Alberts, supra, pp. 574-577). Catalase oxidizes a variety of substrates
including phenols, formic acid,
formaldehyde, and alcohol and is important in peroxisomes of liver and kidney
cells for detoxifying
various toxic molecules that enter the bloodstream. Another major function of
oxidative reactions in
peroxisomes is the breakdown of fatty acids in a process called (3 oxidation.
(3 oxidation results in
106


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
shortening of the alkyl chain of fatty acids by blocks of two carbon atoms
that are converted to acetyl
CoA and exported to the cytosol for reuse in biosynthetic reactions.
Also like mitochondria, peroxisomes import their proteins from the cytosol
using a specific
signal sequence located near the C-terminus of the protein. The importance of
this import process is
s evident in the inherited human disease Zellweger syndrome, in which a defect
in importing proteins
into perixosomes leads to a perixosomal deficiency resulting in severe
abnormalities in the brain, liver,
and kidneys, and death soon after birth. One form of this disease has been
shown to be due to a
mutation in the gene encoding a perixosomal integral membrane protein called
peroxisome assembly
factor-1.
1o The discovery of new human molecules satisfies a need in the art by
providing new
compositions which are useful in the diagnosis, study, prevention, and
treatment of diseases associated
with, as well as effects of exogenous compounds on, the expression of human
molecules.
SUMMARY OF THE INVENTION
The present invention relates to nucleic acid sequences comprising human
diagnostic and
therapeutic polynucleotides (dithp) as presented in the Sequence Listing. The
dithp uniquely identify
genes encoding human structural, functional, and regulatory molecules.
The invention provides an isolated polynucleotide selected from the group
consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID
~ o NO:1-56; b) a polynucleotide comprising a naturally occurring
polynucleotide sequence at least 90%
identical to a polynucleotide sequence selected from the group consisting of
SEQ )D NO:1-56; c) a
polynucleotide complementary to the polynucleotide of a); d) a polynucleotide
complementary to the
polynucleotide of b); and e) an RNA equivalent of a) through d). In one
alternative, the polynucleotide
comprises a polynucleotide sequence selected from the group consisting of SEQ
a7 N0:1-56. In
2 s another alternative, the polynucleotide comprises at least 30 contiguous
nucleotides of a polynucleotide
selected from the group consisting of a) a polynucleotide comprising a
polynucleotide sequence
selected from the group consisting of SEQ )D N0:1-56; b) a polynucleotide
comprising a naturally
occurring polynucleotide comprising a polynucleotide sequence at Ieast 90%
identical to a
polynucleotide sequence selected from the group consisting of SEQ )D NO:1-56;
c) a polynucleotide
3 o complementary to the polynucleotide of a); d) a polynucleotide
complementary to the polynucleotide of
b); and e) an RNA equivalent of a) through d). In another alternative, the
polynucleotide comprises at
least 60 contiguous nucleotides of a polynucleotide selected from the group
consisting of a) a
polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ m
N0:1-56; b) a polynucleotide comprising a naturally occurring polynucleotide
comprising a
3 5 polynucleotide sequence at least 90% identical to a polynucleotide
sequence selected from the group
107


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
consisting of SE(2 7.D NU:1-56; c) a polynucleotide complementary to the
polynucleotide of a); d) a
polynucleotide complementary to the polynucleotide of b); and e) an RNA
equivalent of a) through d).
The invention further provides a composition for the detection of expression
of human diagnostic and
therapeutic polynucleotides comprising at least one isolated polynucleotide
comprising a polynucleotide
selected from the group consisting of a) a polynucleotide comprising a
polynucleotide sequence
selected from the group consisting of SEQ 1D N0:1-56; b) a polynucleotide
comprising a naturally
occurring polynucleotide sequence at least 90% identical to a polynucleotide
sequence selected from
the group consisting of SEQ ID N0:1-56; c) a polynucleotide complementary to
the polynucleotide of
a); d) a polynucleotide complementary to the polynucleotide of b); and e) an
RNA equivalent of a)
1o through d); and a detectable label.
The invention also provides a method for detecting a target polynucleotide in
a sample, said
target polynucleotide having a polynucleotide sequence of a polyneucleotide
selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence of a
polynucleotide selected
from the group consisting of SEQ m N0:1-56; b) a polynucleotide comprising a
naturally occurring
polynucleotide sequence at least 90% identical to a polynucleotide sequence
selected from the group
consisting of SEQ 1D N0:1-56; c) a polynucleotide complementary to the
polynucleotide of a); d) a
polynucleotide complementary to the polynucleotide of b); and e) an RNA
equivalent of a) through d).
The method comprises a) amplifying said target polynucleotide or fragment
thereof using polymerise
chain reaction amplification, and b) detecting the presence or absence of said
amplified target
2o polynucleotide or fragment thereof, and, optionally, if present, the amount
thereof.
The invention also provides a method for detecting a target polynucleotide in
a sample, said
target polynucleotide having a polynucleotide sequence of a polynucleotide
selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence
selected from the group
consisting of SEQ ID N0:1-56; b) a polynucleotide comprising a naturally
occurring polynucleotide
sequence at least 90% identical to a polynucleotide sequence selected from the
group consisting of
SEQ ID NO:1-56; c) a polynucleotide complementary to the polynucleotide of a);
d) a polynucleotide
complementary to the polynucleotide of b); and e) an RNA equivalent of a)
through d). The method
comprises a) hybridizing the sample with a probe comprising at least 20
contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the
sample, and which probe
3 o specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex
is formed between said probe and said target polynucleotide, and b) detecting
the presence or absence
of said hybridization complex, and, optionally, if present, the amount
thereof. In one alternative, the
invention provides a composition comprising a target polynucleotide of the
method, wherein said probe
comprises at least 30 contiguous nucleotides. In one alternative, the
invention provides a composition
comprising a target polynucleotide of the method, wherein said probe comprises
at least 60 contiguous
108


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
nucleotides.
The invention further provides a recombinant polynucleotide comprising a
promoter sequence
operably linked to an isolated polynucleotide selected from the group
consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ
m N0:1-56; b) a
s polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ ID NO:1-56;
c) a polynucleotide
complementary to the polynucleotide of a); d) a polynucleotide complementary
to the polynucleotide of
b); and e) an RNA equivalent of a) through d). In one alternative, the
invention provides a cell
transformed with the recombinant polynucleotide. In another alternative, the
invention provides a
so transgenic organism comprising the recombinant polynucleotide.
The invention also provides a method for producing a human diagnostic and
therapeutic
polypeptide, the method comprising a) culturing a cell under conditions
suitable for expression of the
human diagnostic and therapeutic polypeptide, wherein said cell is transformed
with a recombinant
polynucleotide, said recombinant polynucleotide comprising an isolated
polynucleotide selected from
15 the group consisting of i) a polynucleotide comprising a polynucleotide
sequence selected from the
group consisting of SEQ ID NO:1-56; ii) a polynucleotide comprising a
naturally occurring
polynucleotide sequence at least 90% identical to a polynucleotide sequence
selected from the group
consisting of SEQ m N0:1-56; iii) a polynucleotide complementary to the
polynucleotide of i); iv) a
polynucleotide complementary to the polynucleotide of ii); and v) an RNA
equivalent of i) through iv),
2 o and b) recovering the human diagnostic and therapeutic polypeptide so
expressed. The invention
additionally provides a method wherein the polypeptide has an amino acid
sequence selected from the
group consisting of SEQ ID NO:57-113.
The invention also provides an isolated human diagnostic and therapeutic
polypeptide
(DTIZ3P) encoded by at least one polynucleotide comprising a polynucleotide
sequence selected from
25 the group consisting of SEQ ID NO:1-56. The invention further provides a
method of screening for a
test compound that specifically binds to the polypeptide having an amino acid
sequence selected from
the group consisting of SEQ m NO:57-113. The method comprises a) combining the
polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID
N0:57-113 with at least
one test compound under suitable conditions, and b) detecting binding of the
polypeptide having an
3 o amino acid sequence selected from the group consisting of SEQ ID N0:57-113
to the test compound,
thereby identifying a compound that specifically binds to the polypeptide
having an amino acid
sequence selected from the gxoup consisting of SEQ m NO:57-113.
The invention further provides a microarray wherein at least one element of
the microarray is
an isolated polynucleotide comprising at least 30 contiguous nucleotides of a
polynucleotide selected
3 5 from the group consisting of a) a polynucleotide comprising a
polynucleotide sequence selected from
109


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
the group conslstmg of JUG lU 1~IU:1-Jb; b) a polynucleotide comprising a
naturally occurring
polynucleotide sequence at least 90% identical to a polynucleotide sequence
selected from the group
consisting of SEQ m N0:1-56; c) a polynucleotide complementary to the
polynucleotide of a); d) a
polynucleotide complementary to the polynucleotide of b); and e) an RNA
equivalent of a) through d).
s The invention also provides a method for generating a transcript image of a
sample which contains
polynucleotides. The method comprises a) labeling the polynucleotides of the
sample, b) contacting
the elements of the microarray with the labeled polynucleotides of the sample
under conditions suitable
for the formation of a hybridization complex, and c) quantifying the
expression of the polynucleotides
in the sample.
so Additionally, the invention provides a method for screening a compound for
effectiveness in
altering expression of a target polynucleotide, wherein said target
polynucleotide comprises a
polynucleotide selected from the group consisting of a) a polynucleotide
comprising a polynucleotide
sequence selected from the group consisting of SEQ D7 N0:1-56; b) a
polynucleotide comprising a
naturally occurring polynucleotide sequence at least 90% identical to a
polynucleotide sequence
15 selected from the group consisting of SEQ )D N0:1-56; c) a polynucleotide
complementary to the
polynucleotide of a); d) a polynucleotide complementary to the polynucleotide
of b); and e) an RNA
equivalent of a) through d). The method comprises a) exposing a sample
comprising the target
polynucleotide to a compound, b) detecting altered expression of the target
polynucleotide, anal c)
comparing the expression of the target polynucleotide in the presence of
varying amounts of the
2o compound and in the absence of the compound.
The invention further provides a method for assessing toxicity of a test
compound, said
method comprising a) treating a biological sample containing nucleic acids
with the test compound; b)
hybridizing the nucleic acids of the treated biological sample with a probe
comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting
of i) a polynucleotide
25 comprising a polynucleotide sequence selected from the group consisting of
SEQ 117 N0:1-56; ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical to a
polynucleotide sequence selected from the group consisting of SEQ 1D N0:1-56;
iii) a polynucleotide
complementary to the polynucleotide of i); iv) a polynucleotide complementary
to the polynucleotide
of ii); and v) an RNA equivalent of i) through iv). Hybridization occurs under
conditions whereby a
3 o specific hybridization complex is formed between said probe and a target
polynucleotide in the
biological sample, said target polynucleotide comprising a polynucleotide
sequence of a polynucleotide
selected from the group consisting of i) a polynucleotide comprising a
polynucleotide sequence
selected from the group consisting of SEQ 1D N0:1-56; ii) a polynucleotide
comprising a naturally
occurring polynucleotide sequence at least 90% identical to a polynucleotide
sequence selected from
3 5 the group consisting of SEQ )D N0:1-56; iii) a polynucleotide
complementary to the polynucleotide of
110


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an
RNA equivalent of i)
through iv), and alternatively, the target polynucleotide comprises a
polynucleotide sequence of a
fragment of a polynucleotide selected from the group consisting of i-v above;
c) quantifying the
amount of hybridization complex; and d) comparing the amount of hybridization
complex in the treated
biological sample with the amount of hybridization complex in an untreated
biological sample, wherein
a difference in the amount of hybridization complex in the treated biological
sample is indicative of
toxicity of the test compound.
The invention further provides an isolated polypeptide selected from the group
consisting of a)
a polypeptide comprising an amino acid sequence selected from the group
consisting of SEQ m
~o N0:57-113, b) a polypeptide comprising a naturally occurring amino acid
sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ
m N0:57-113, c) a
biologically active fragment of a polypeptide having an amino acid sequence
selected from the group
consisting of SEQ )D N0:57-113, and d) an immunogenic fragment of a
polypeptide having an amino
acid sequence selected from the group consisting of SEQ m N0:57-113. In one
alternative, the
invention provides an isolated polypeptide comprising an amino acid sequence
selected from the group
consisting of SEQ m N0:57-113.
The invention further provides an isolated polynucleotide encoding a
polypeptide selected from
the group consisting of a) a polypeptide comprising an amino acid sequence
selected from the group
consisting of SEQ D7 N0:57-113, b) a polypeptide comprising a naturally
occurring amino acid
2o sequence at least 90% identical to an amino acid sequence selected from the
group consisting of SEQ
ID NO:57-113, c) a biologically active fragment of a polypeptide having an
amino acid sequence
selected from the group consisting of SEQ )D N0:57-113, and d) an immunogenic
fragment of a
polypeptide having an. amino acid sequence selected from the group consisting
of SEQ )D N0:57-113.
In one alternative, the polynucleotide encodes a polypeptide comprising an
amino acid sequence
2 s selected from the group consisting of SEQ )D N0:57-113. In another
alternative, the polynucleotide
comprises a polynucleotide sequence selected from the group consisting of SEQ
m N0:1-56.
Additionally, the invention provides an isolated antibody which specifically
binds to a
polypeptide selected from the soup consisting of a) a polypeptide comprising
an amino acid sequence
selected from the group consisting of SEQ )D N0:57-113, b) a polypeptide
comprising a naturally
30 occurring amino acid sequence at least 90% identical to an amino acid
sequence selected from the
group consisting of SEQ m NO:57-113, c) a biologically active fragment of a
polypeptide having an
amino acid sequence selected from the group consisting of SEQ 1D N0:57-113,
and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group
consisting of SEQ 1D N0:57-113.
3 5 The invention further provides a composition comprising a polypeptide
selected from the group
111


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
consisting of a) a polypeptide comprising an amino acid sequence selected from
the group consisting
of SEQ ID N0:57-113, b) a polypeptide comprising a naturally occurring amino
acid sequence at least
90% identical to an amino acid sequence selected from the group consisting of
SEQ ID N0:57-113, c)
a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group
s consisting of SEQ ID N0:57-113, and d) an immunogenic fragment of a
polypeptide having an amino
acid sequence selected from the group consisting of SEQ TD N0:57-113, and a
pharmaceutically
acceptable excipient. In one embodiment, the composition comprises a
polypeptide having an amino
acid sequence selected from the group consisting of SEQ 1D N0:57-113. The
invention additionally
provides a method of treating a disease or condition associated with decreased
expression of
so functional DITHP, comprising administering to a patient in need of such
treatment the composition.
The invention also provides a method for screening a compound for
effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a
polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ m N0:57-113, b) a
polypeptide comprising
a naturally occurring amino acid sequence at least 90% identical to an amino
acid sequence selected
15 from the group consisting of SEQ >D N0:57-113, c) a biologically active
fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID
N0:57-113, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group
consisting of SEQ m N0:57-113. The method comprises a) exposing a sample
comprising the
polypeptide to a compound, and b) detecting agonist activity in the sample. In
one alternative, the
2o invention provides a composition comprising an agonist compound identified
by the method and a
pharmaceutically acceptable excipient. In another alternative, the invention
provides a method of
treating a disease or condition associated with decreased expression of
functional DITHP, comprising
administering to a patient in need of such treatment the composition.
Additionally, the invention provides a method for screening a compound for
effectiveness as
2 s an antagonist of a polypeptide selected from the group consisting of a) a
polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID N0:57-113, b)
a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical to
an amino acid
sequence selected from the group consisting of SEQ )D N0:57-113, c) a
biologically active fragment
of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ )D N0:57-
3 0 113, and d) an immunogenic fragment of a polypeptide having an amino acid
sequence selected from
the group consisting of SEQ )D N0:57-113. The method comprises a) exposing a
sample comprising
the polypeptide to a compound, and b) detecting antagonist activity in the
sample. In one alternative,
the invention provides a composition comprising an antagonist compound
identified by the method and
a pharmaceutically acceptable excipient. In another alternative, the invention
provides a method of
35 treating a disease or condition associated with overexpression of
functional DITHP, comprising
112


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
administering to a patient in need of such treatment the composition.
The invention further provides a method of screening for a compound that
modulates the
activity of a polypeptide selected from the group consisting of a) a
polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID N0:57-113, b) a
polypeptide comprising
a naturally occurring amino acid sequence at least 90% identical to an amino
acid sequence selected
from the group consisting of SEQ ID N0:57-113, c) a biologically active
fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID
N0:57-113, and d) an
immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group
consisting of SEQ ID N0:57-113. The method comprises a) combining the
polypeptide with at least
one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the
activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the
polypeptide in the pxesence of the test compound with the activity of the
polypeptide in the absence of
the test compound, wherein a change in the activity of the polypeptide in the
presence of the test
compound is indicative of a compound that modulates the activity of the
polypeptide.
DESCRIPTION OF THE TABLES
Table 1 shows the sequence identification numbers (SEQ ID NOa) and template
identification
numbers (template IDs) corresponding to the polynucleotides of the present
invention, along with the
sequence identification numbers (SEQ ID NOa) and open reading frame
identification numbers (ORF
2o IDs) corresponding to polypeptides encoded by the template D7.
Table 2 shows the sequence identification numbers (SEQ ID NOa) and template
identification
numbers (template IDs) corresponding to the polynucleotides of the present
invention, along with their
GenBank hits (GI Numbers), probability scores, and functional annotations
corresponding to the
GenBank hits.
Table 3 shows the sequence identification numbers (SEQ ID NOa) and template
identification
numbers (template IDs) corresponding to the polynucleotides of the present
invention, along with
polynucleotide segments of each template sequence as defined by the indicated
"start" and "stop"
nucleotide positions. The reading frames of the polynucleotide segments and
the Pfam hits, Pfam
descriptions, and E-values corresponding to the polypeptide domains encoded by
the polynucleotide
3 o segments are indicated.
Table 4 shows the sequence identification numbers (SEQ ~ NOa) and template
identification
numbers (template 1Ds) corresponding to the polynucleotides of the present
invention, along with
polynucleotide segments of each template sequence as defined by the indicated
"start" and "stop"
nucleotide positions. The reading frames of the polynucleotide segments are
shown, and the
113


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
polypeptides encoded by the polynucleotide segments constitute either signal
peptide (SP) or
transmembrane (TM) domains, as indicated. For TM domains, the membrane
topology of the encoded
polypeptide sequence is indicated as being transmernbrane or on the cytosolic
or non-cytosolic side of
the cell membrane or organelle.
s Table 5 shows the sequence identification numbers (SEQ 1D NOa) and template
identification
numbers (template 1Ds) corresponding to the polynucleotides of the present
invention, along with
component sequence identification numbers (component ll~s) corresponding to
each template. The
component sequences, which were used to assemble the template sequences, are
defined by the
indicated "start" and "stop" nucleotide positions along each template.
so Table 6 shows the tissue distribution profiles for the templates of the
invention.
Table 7 shows the sequence identification numbers (SEQ ID NOa) corresponding
to the
polypeptides of the present invention, along with the reading frames used to
obtain the polypeptide
segments, the lengths of the polypeptide segments, the "start" and "stop"
nucleotide positions of the
polynucleotide sequences used to define the encoded polypeptide segments, the
GenBank hits (GI
15 Numbers), probability scores, and functional annotations corresponding to
the GenBank hits.
Table 8 summarizes the bioinformatics tools which are useful for analysis of
the
polynucleotides of the present invention. The first column of Table 8 lists
analytical tools, programs,
and algorithms, the second column provides brief descriptions thereof, the
third column presents
appropriate references, all of which are incorporated by reference herein in
their entirety, and the
2o fourth column presents, where applicable, the scores, probability values,
and other parameters used to
evaluate the strength of a match between two sequences (the higher the score,
the greater the
homology between two sequences).
DETAILED DESCRIPTION OF THE INVENTION
2 s Before the nucleic acid sequences and methods are presented, it is to be
understood that this
invention is not limited to the particular machines, methods, and materials
described. Although
particular embodiments are described, machines, methods, and materials similar
or equivalent to these
embodiments may be used to practice the invention. The preferred machines,
methods, and materials
set forth are not intended to limit the scope of the invention which is
limited only by the appended
3 o claims.
The singular forms "a", "an", and "the" include plural reference unless the
context clearly
dictates otherwise. All technical and scientific terms have the meanings
commonly understood by one
of ordinary skill in the art. All publications are incorporated by reference
for the purpose of describing
and disclosing the cell lines, vectors, and methodologies which are presented
and which might be used
114


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
in connection with the invention. Nothing in the specification is to be
construed as an admission that
the invention is not entitled to antedate such disclosure by virtue of prior
invention.
Definitions
As used herein, the lower case "dithp" refers to a nucleic acid sequence,
while the upper case
"DTI'HP" refers to an amino acid sequence encoded by dithp. A "full-length"
dithp refers to a nucleic
acid sequence containing the entire coding region of a gene endogenously
expressed in human tissue.
"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum
hydroxide), and
surface active substances (lysolecithin, pluronic polyols, polyanions,
peptides, oil emulsions, keyhole
~o limpet hemocyanin, and dinitrophenol) which may be administered to increase
a host's immunological
response.
"Allele" refers to an alternative form of a nucleic acid sequence. Alleles
result from a
"mutation," a change or an alternative reading of the genetic code. Any given
gene may have none,
one, or many allelic forms. Mutations which give rise to alleles include
deletions, additions, or
substitutions of nucleotides. Each of these changes may occur alone, or in
combination with the
others, one or more times in a given nucleic acid sequence. The present
invention encompasses allelic
dithp.
An "allelic variant" is an alternative form of the gene encoding DTTHP.
Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result
in altered mRNAs or in
2 o polypeptides whose structure or function may or may not be altered. A gene
may have none, one, or
many allelic variants of its naturally occurring form. Common mutational
changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or
substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the
others, one or more times
in a given sequence.
"Altered" nucleic acid sequences encoding DTTHP include those sequences with
deletions,
insertions, or substitutions of different nucleotides, resulting in a
polypeptide the same as DITI-iP or a
polypeptide with at least one functional characteristic of DITHP. Included
within this definition are
polymorphisms which may or may not be readily detectable using a particular
oligonucleotide probe of
the polynucleotide encoding DTIZ3P, and improper or unexpected hybridization
to allelic variants, with
s o a locus other than the normal chromosomal locus for the polynucleotide
sequence encoding DITHP.
The encoded protein may also be "altered," and may contain deletions,
insertions, or substitutions of
amino acid residues which produce a silent change and result in a functionally
equivalent DTI'~lP.
Deliberate amino acid substitutions may be made on the basis of similarity in
polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues,
as long as the biological
3 5 or immunological activity of DITFiP is retained. For example, negatively
charged amino acids may
115


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
include aspartic acid and glutamic acid, and positively charged amino acids
may include lysine and
arginine. Amino acids with uncharged polar side chains having similar
hydrophilicity values may
include: asparagine and glutamine; and serine and threonine. Amino acids with
uncharged side chains
having similar hydrophilicity values may include: leucine, isoleucine, and
valine; glycine and alanine;
s and phenylalanine and tyrosine.
"Amino acid sequence" refers to a peptide, a polypeptide, or a protein of
either natural or
synthetic origin. 'The amino acid sequence is not limited to the complete,
endogenous amino acid
sequence and may be a fragment, epitope, variant, or derivative of a protein
expressed by a nucleic
acid sequence.
"Amplification" refers to the production of additional copies of a sequence
and is carried out
using polymerase chain reaction (PCR) technologies well known in the art.
"Antibody" refers to intact molecules as well as to fragments thereof, such as
Fab, F(ab')~,
and Fv fragments, which are capable of binding the epitopic determinant.
Antibodies that bind DITHP
polypeptides can be prepared using intact polypeptides or using fragments
containing small peptides of
is interest as the inununizing antigen. The polypeptide or peptide used to
immunize an animal (e.g., a
mouse, a rat, or a rabbit) can be derived from the translation of RNA, or
synthesized chemically, and
can be conjugated to a carrier protein if desired. Commonly used carriers that
are chemically coupled
to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet
hemocyanin (KLH). The
coupled peptide is then used to immunize the animal.
2 o The term "aptamer" refers to a nucleic acid or oligonucleotide molecule
that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary
process (e.g., SELEX
(Systematic Evolution of Ligands by EXponential Enrichment), described in U.S.
Patent No.
5,270,163), which selects for target-specific aptamer sequences from large
combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may
include
25 deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other
nucleotide-like molecules. The
nucleotide components of an aptamer may have modified sugar groups (e.g., the
2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NHz), which may improve a desired
property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be
conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from
the circulatory system.
s o Aptamers may be specifically cross-linked to their cognate ligands, e.g.,
by photo-activation of a
cross-linker. (See, e.g., Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-
13.)
The term "intramer" refers to an aptamer which is expressed in vivo. For
example, a vaccinia
virus-based RNA expression system has been used to express specific RNA
aptamers at high levels
in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl Acad. Sci.
USA 96:3606-3610).
3 5 The term "spiegelmer" refers to an aptamex which includes L-DNA, L-RNA, or
other left-
116


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
handed nucleotide derivatives or nucleotide-like molecules. Aptamers
containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which
normally act on
substrates containing right handed nucleotides,
"Antisense sequence" xefers to a sequence capable of specifically hybridizing
to a target
sequence. The antisense sequence may include DNA, RNA, or any nucleic acid
mimic or analog
such as peptide nucleic acid (PNA); oligonucleotides having modified backbone
linkages such as
phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides
having modified
sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or
oligonucleotides having
modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-
deoxyguanosine.
"Antisense technology" refers to any technology which relies on the specific
hybridization of
an antisense sequence to a target sequence.
A "bin" is a portion of computer memory space used by a computer program for
storage of
data, and bounded in such a manner that data stored in a bin may be retrieved
by the program.
"Biologically active" refers to an amino acid sequence having a structural,
regulatory, or
biochemical function of a naturally occurring amino acid sequence.
"Clone joining" is a process for combining gene bins based upon the bins'
containing sequence
information from the same clone. The sequences may assemble into a primary
gene transcript as well
as one or more splice variants.
"Complementary" describes the relationship between two single-stranded nucleic
acid
~o sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its
complement 3'-T-C-A-S').
A "component sequence" is a nucleic acid sequence selected by a computer
program such as
PHRED and used to assemble a consensus or template sequence from one or more
component
sequences.
A "consensus sequence" or "template sequence" is a nucleic acid sequence which
has been
assembled from overlapping sequences, using a computer program for fragment
assembly such as the
GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI)
or using a
relational database management system (RDMS).
"Conservative amino acid substitutions" are those substitutions that, when
made, least
interfere with the properties of the original protein, i.e., the structure and
especially the function of the
3 o protein is conserved and not significantly changed by such substitutions.
The table below shows amino
acids which may be substituted for an original amino acid in a protein and
which are regarded as
conservative substitutions.
117


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009


Original Residue Conservative Substitution


Ala Gly, Ser


Arg His, Lys


Asn Asp, Gln, His


s , Asp Asn, Glu


Cys Ala, Ser


Gln Asn, Glu, His


Glu Asp, Gln, His


Gly Ala


so His Asn, Arg, Gln, Glu


Ile Leu, Val


Leu Ile, Val


Lys Arg, Gln, Glu


Met Leu, Ile


Phe His, Met, Leu, Trp, Tyr


Ser Cys, Thr
~


Thr Ser, Val


Trp Phe, Tyr


Tyr His, Phe, Trp


~ o V al Ile, Leu, Thr


Conservative substitutions generally maintain (a) the structure of the
polypeptide backbone in
the area of the substitution, for example, as a beta sheet or alpha helical
conformation, (b) the charge
2 s or hydrophobicity of the molecule at the target site, or (c) the bulk of
the side chain.
"Deletion" refers to a change in either a nucleic or amino acid sequence in
which at least one
nucleotide or amino acid residue, respectively, is absent.
"Derivative" refers to the chemical modification of a nucleic acid sequence,
such as by
replacement of hydrogen by an alkyl, aryl, amino, hydroxyl, or other group.
3 0 "Differential expression" refers to increased or upregulated; or
decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two
different samples. Such
comparisons may be carried out between, for example, a treated and an
untreated sample, or a
diseased and a normal sample.
The terms "element" and "array element" refer to a polynucleotide,
polypeptide, or other
3 5 chemical compound having a unique and defined position on a microarray.
The term "modulate" refers to a change in the activity of D1THP. For example,
modulation
may cause an increase or a decrease in protein activity, binding
characteristics, or any other biological,
functional, or immunological properties of DT~IP.
"E-value" refers to the statistical probability that a match between two
sequences occurred by
4 o chance.
118


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
"Uxon shuttliug" refers to the recombmatlon of different coding regions
(exons). Since an
exon may represent a structural or functional domain of the encoded protein,
new proteins may be
assembled through the novel reassortment of stable substructures, thus
allowing acceleration of the
evolution of new protein functions.
s A "fragment" is a unique portion of dithp or DITHP which is identical in
sequence to but
shorter in length than the parent sequence. A fragment may comprise up to the
entire length of the
defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise
from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used
as a probe, primer,
antigen, therapeutic molecule, or for other purposes, may be at least 5, 10,
15, 16, 20, 25, 30, 40, 50, 60,
so 75, 100, 150, 250 or at least 500 contiguous amino acid residues or
nucleotides in length. Fragments
may be preferentially selected from certain regions of a molecule. Fox
example, a polypeptide
fragment may comprise a certain length of contiguous amino acids selected from
the first 250 or 500
amino acids (or first 25% or 50%) of a polypeptide as shown in a certain
defined sequence. Clearly
these lengths are exemplary, and any length that is supported by the
specification, including the
15 Sequence Listing and the figures, may be encompassed by the present
embodiments.
A fragment of dithp comprises a region of unique polynucleotide sequence that
specifically
identifies dithp, for example, as distinct from any other sequence in the same
genome. A fragment of
dithp is useful, for example, in hybridization and amplification technologies
and in analogous methods
that distinguish dithp from related polynucleotide sequences. The precise
length of a fragment of dithp
2 o and the region of dithp to which the fragment corresponds are routinely
determinable by one of
ordinary skill in the art based on the intended purpose for the fragment.
A fragment of DITHP is encoded by a fragment of dithp. A fragment of DTTHP
comprises a
region of unique amino acid sequence that specifically identifies DTTblP. For
example, a fragment of
DTTHP is useful as an immunogenic peptide for the development of antibodies
that specifically
2s recognize DTTHP. The precise length of a fragment of DITHP and the region
of DITHP to which
the fragment corresponds are routinely determinable by one of ordinary skill
in the art based on the
intended purpose for the fragment.
A "full length" nucleotide sequence is one containing at least a start site
for translation to a
protein sequence, followed by an open reading frame and a stop site, and
encoding a "full length"
3 o polypeptide.
"Hit" refers to a sequence whose annotation will be used to describe a given
template.
Criteria for selecting the top hit are as follows: if the template has one or
more exact nucleic acid
matches, the top hit is the exact match with highest percent identity. If the
template has no exact
matches but has significant protein hits, the top hit is the protein hit with
the lowest E-value. If the
119


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
template has no significant protein hits, but does have significant non-exact
nucleotide hits, the top hit
is the nucleotide hit with the lowest E-value.
"Homology" refers to sequence similarity either between a reference nucleic
acid sequence
and at least a fragment of a dithp or between a reference amino acid sequence
and a fragment of a
s DITHP.
"Hybridization" refers to the process by which a strand of nucleotides anneals
with a
complementary strand through base pairing. Specific hybridization is an
indication that two nucleic
acid sequences share a high degree of identity. Specific hybridization
complexes form under defined
annealing conditions, and remain hybridized after the "washing" step. The
defined hybridization
so conditions include the annealing conditions and the washing step(s), the
latter of which is particularly
important in determining the stringency of the hybridization process, with
more stringent conditions
allowing less non-specific binding, i.e., binding between pairs of nucleic
acid probes that are not
perfectly matched. Permissive conditions for annealing of nucleic acid
sequences are routinely
determinable and may be consistent among hybridization experiments, whereas
wash conditions may
15 be varied among experiments to achieve the desired stringency.
Generally, stringency of hybridization is expressed with reference to the
temperature under
which the wash step is carried out. Generally, such wash temperatures are
selected to be about 5°C
to 20°C lower than the thermal melting point (T"~ for the specific
sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength and pH) at
which 50% of the target
2 o sequence hybridizes to a per:Fectly matched probe. An equation for
calculating Tm and conditions for
nucleic acid hybridization is well known and can be found in Sambrook et al.,
1989, Molecular Cloning:
A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview
NY; specifically see
volume 2, chapter 9.
High stringency conditions for hybridization between polynucleotides of the
present invention
2s include wash conditions of 68°C in the presence of about 0.2 x SSC
and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, or 55°C
may be used. SSC concentration may be
varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents
are used to block non-specific hybridization. Such blocking reagents include,
for instance, denatured
salmon sperm DNA at about 100-200 ~tg/ml. Useful variations on these
conditions will be readily
s o apparent to those skilled in the art. Hybridization, particularly under
high stringency conditions, may be
suggestive of evolutionary similarity between the nucleotides. Such sinularity
is strongly indicative of a
similar role for the nucleotides and their resultant proteins.
Other parameters, such as temperature, salt concentration, and detergent
concentration may
be varied to achieve the desired stringency. Denaturants, such as formamide at
a concentration of
3 s about 35-50% v/v, may also be used under particular circumstances, such as
RNA:DNA
120


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
hybridizations. Appropriate hybridization conditions are routinely
determinable by one of ordinary skill
in the art.
"hnmunologically active" or "immunogenic" describes the potential for a
natural, recombinant,
or synthetic peptide, epitope, polypeptide, or protein to induce antibody
production in appropriate
s animals, cells, or cell lines.
"immune response" can refer to conditions associated with inflammation,
trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be
characterized by expression
of various factors, e.g., cytokiues, chemokines, and other signaling
molecules, which may affect
cellular and systemic defense systems.
~o An "immunogenic fragment" is a polypeptide or oligopeptide fragment of
Dithp which is
capable of eliciting an immune response when introduced into a living
organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or
oligopeptide fragment
of DITIE' which is useful in any of the antibody production methods disclosed
herein or known in the
art.
z5 "Insertion" or "addition" refers to a change in either a nucleic or amino
acid sequence in
which at least one nucleotide or residue, respectively, is added to the
sequence.
"Labeling" refers to the covalent or noncovalent joining of a polynucleotide,
polypeptide, or
antibody with a reporter molecule capable of producing a detectable or
measurable signal.
"Microarra~' is any arrangement of nucleic acids, amino acids, antibodies,
etc., on a
2~o substrate. The substrate may be a solid support such as beads, glass,
paper, nitrocellulose, nylon, or an
appropriate membrane.
"Linkers" are short stretches of nucleotide sequence which may be added to a
vectox or a
dithp to create restriction endonuclease sites to facilitate cloning.
"Polylinkers" are engineered to
incorporate multiple restriction enzyme sites and to provide for the use of
enzymes which leave 5' or 3'
25 overhangs (e.g., BamHI, EcoRI, and HindIlI) and those which provide blunt
ends (e.g., EcoRV,
SnaBI, and StuI).
"Naturally occurring" refers to an endogenous polynucleotide or polypeptide
that may be
isolated from viruses or prokaryotic or eukaryotic cells.
"Nucleic acid sequence" refers to the specific order of nucleotides joined by
phosphodiester
3 o bonds iu a linear, polymeric arrangement. Depending on the number of
nucleotides, the nucleic acid
sequence can be considered an oligomer, oligonucleotide, or polynucleotide.
The nucleic acid can be
DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or
synthetic origin, may be
either double-stranded or single-stranded, and can represent either the sense
or antisense
(complementary) strand.
121


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
"Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides
and as many as
about 60 nucleotides, preferably about 15 to 40 nucleotides, and most
preferably between about 20 and
30 nucleotides, that may be used in hybridization or amplification
technologies. Oligomers may be
used as, e.g., primers for PCR, and are usually chemically synthesized.
"Operably linked" refers to the situation in which a first nucleic acid
sequence is placed in a
functional relationship with the second nucleic acid sequence. For instance, a
promoter is operably
linked to a coding sequence if the promoter affects the transcription or
expression of the coding
sequence. Generally, operably linked DNA sequences may be in close proximity
or contiguous and,
where necessary to join two protein coding regions, in the same reading frame.
o "Peptide nucleic acid" (PNA) refers to a DNA mimic in which nucleotide bases
are attached
to a pseudopeptide backbone to increase stability. PNAs, also designated
antigene agents, can
prevent gene expression by targeting complementary messenger RNA.
The phrases "percent identity" and "% identity", as applied to polynucleotide
sequences, refer
to the percentage of residue matches between at least two polynucleotide
sequences aligned using a
i5 standardized algorithm. Such an algorithm may insert, in a standardized and
reproducible way, gaps in
the sequences being compared in order to optimize alignment between two
sequences, and therefore
achieve a more meaningful comparison of the two sequences.
Percent identity between polynucleotide sequences rnay be determined using the
default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN
version 3.12e
2 o sequence alignment program. 'This program is part of the LASERGENE
software package, a suite of
molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is
described in
Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et
al. (1992) CABIOS
8:189-191. For pairwise alignments of polynucleotide sequences, the default
parameters are set as
follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The
"weighted" residue
25 weight table is selected as the default. Percent identity is reported by
CLUSTAL V as the "percent
similarity" between aligned polynucleotide sequence pairs.
Alternatively, a suite of commonly used and freely available sequence
comparison algorithms
is provided by the National Center for Biotechnology Information (NCBI) Basic
Local Alignment
Search Tool (BLAST) (Altschul, S.F, et al. (1990) J. Mol. Biol. 215:403-410),
which is available from
3o several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.govBLAST/. The BLAST software suite includes various
sequence analysis
programs including "blastn," that is used to determine alignment between a
known polynucleotide
sequence and other sequences on a variety of databases. Also available is a
tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide
sequences. "BLAST 2
3 5 Sequences" can be accessed and used interactively at
http://www.ncbi.nlm.nih.gov/gorf/b12/. The
122


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
"BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed
below). BLAST
programs are commonly used with gap and other parameters set to default
settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2
Sequences" tool Version
2Ø9 (May-07-1999) set at default parameters. Such default parameters may be,
for example:
s Matrix: BLOSUM62
Reward for match: 1
Peraalty fo3~ mismatch: -2
Opeh Gap: S and Exteyasio~ Gap: 2 peyialties
Gap x drop-off. 50
1o Expect: l0
Word Size: 11
Filter: on
Percent identity may be measured over the length of an entire defined
sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter
length, for example,
is over the length of a fragment taken from a larger, defined sequence, for
instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or
at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any
fragment length supported
by the sequences shown herein, in figures or Sequence Listings, may be used to
describe a length over
which percentage identity may be measured.
2 o Nucleic acid sequences that do not show a high degree of identity may
nevertheless encode
similar amino acrd sequences due to the degeneracy of the genetic code. It is
understood that changes
in nucleic acid sequence can be made using this degeneracy to produce multiple
nucleic acid
sequences that all encode substantially the same protein.
The phrases "percent identity" and "% identity', as applied to polypeptide
sequences, refer to
25 the percentage of residue matches between at least two polypeptide
sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-
known. Some alignment
methods take into account conservative amino acid substitutions. Such
conservative substitutions,
explained in more detail above, generally preserve the hydrophobicity and
acidity of the substituted
residue, thus preserving the structure (and therefore function) of the folded
polypeptide.
3 o Percent identity between polypeptide sequences may be determined using the
default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN
version 3.12e
sequence alignment program (described and referenced above). For pairwise
alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as
follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=S. The PAM250 matrix is selected as
the default
123


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
residue weight table. As with polynucleotlde alignments, the percent identity
is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence
pairs.
Alternatively the NCBI BLAST software suite may be used. For example, for a
pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences"
tool Version 2Ø9
s (May-07-1999) with blastp set at default parameters. Such default parameters
may be, for example:
Matrix: BLOSUM6~
Opera Gap: 11 and Extension Gap: 1 pefialty
Gap x df~op-off. 50
Expect: 10
io Wor-d Size: 3
Filter: on
Percent identity may be measured over the length of an entire defined
polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over
a shorter length,
for example, over the length of a fragment taken from a larger, defined
polypeptide sequence, for
is instance, a fragment of at least 15, at least 20, at least 30, at least 40,
at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood
that any fragment
length supported by the sequences shown herein, in figures or Sequence
Listings, may be used to
describe a length over which percentage identity may be measured.
"Post-translational modification" of a DITHP may involve lipidation,
glycosylation,
2o phosphorylation, acetylation, racemization, proteolytic cleavage, and other
modifications known in the
art. These processes may occur synthetically or biochemically. Biochemical
modifications will vary
by cell type depending on the enzymatic milieu and the DITHP.
"Probe" refexs to dithp or fragments thereof, which are used to detect
identical, allelic or
related nucleic acid sequences. Probes are isolated oligonucleotides or
polynucleotides attached to a
as detectable label or reporter molecule. Typical labels include radioactive
isotopes, ligands,
chemiluminescent agents, and enzymes. "Primers" are short nucleic acids,
usually DNA
oligonucleotides, which may be annealed to a target polynucleotide by
complementary base-pairing.
The primer may then be extended along the target DNA strand by a DNA
polymerise enzyme.
Primer pairs can be used for amplification (and identification) of a nucleic
acid sequence, e.g., by the
3 o polymerise chain reaction (PCR).
Probes and primers as used in the present invention typically comprise at
least 15 contiguous
nucleotides of a known sequence. Iu order to enhance specificity, longer
probes and primers may also
be employed, such as probes and primers that comprise at least 20, 30, 40, 50,
60, 70, 80, 90, 100, or at
least 150 consecutive nucleotides of the disclosed nucleic acid sequences.
Probes and primers may be
124


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
considerably longer than these examples, and it is understood that any length
supported by the
specification, including the figures and Sequence Listing, may be used.
Methods for preparing and using probes and primers are described in the
references, for
example Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2"'~
ed., vol. 1-3, Cold
s Spring Harbor Press, Plainview NY; Ausubel et al., 1987, Current Protocols
in Molecular Biology,
Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis et al., 1990,
PCR Protocols, A
Guide to Methods and Applications, Academic Press, San Diego CA. PCR primer
pairs can be
derived from a known sequence, for example, by using computer programs
intended for that purpose
such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical
Research, Cambridge MA).
1 o Oligonucleotides for use as primers are selected using software known in
the art for such
purpose. Fox example, OLIGO 4.06 software is useful for the selection of PCR
primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger
polynucleotides of up to 5,000
nucleotides from an input polynucleotide sequence of up to 32 kilobases.
Similar primer selection
programs have incorporated additional features for expanded capabilities. For
example, the PrimOU
1s primer selection program (available to the public from the Genome Center at
University of Texas
South West Medical Center, Dallas TX) is capable of choosing specific primers
from megabase
sequences and is thus useful for designing primers on a genome-wide scope. The
Primer3 primer
selection program (available to the public from the Whitehead Institute/MTT
Center for Genome
Research, Cambridge MA) allows the user to input a "mispriming library," in
which sequences to
a o avoid as primer binding sites are user-specified. Primer3 is useful, in
particular, for the selection of
oligonucleotides for microarrays. (The source code for the latter two primer
selection programs may
also be obtained from their respective sources and modified to meet the user's
specific needs.) The
PrimeGen pxogram (available to the public from the UK Human Genome Mapping
Project Resource
Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing
2 s selection of primers that hybridize to either the most conserved or least
conserved regions of aligned
nucleic acid sequences. Hence, this program is useful for identification of
both unique and conserved
oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments
identified by any of the above selection methods are useful in hybridization
technologies, for example,
as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially
3 o complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are
not limited to those described above.
"Purified" refers to molecules, either polynucleotides or polypeptides that
are isolated or
separated from their natural environment and are at least 60% free, preferably
at least 75% free, and
most preferably at least 90% free from other compounds with which they are
naturally associated.
125


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
A "recombinant nucleic acid" is a sequence that is not naturally occurring or
has a sequence
that is made by an artificial combination of two or more otherwise separated
segments of sequence.
This artificial combination is often accomplished by chemical synthesis or,
more commonly, by the
artificial manipulation of isolated segments of nucleic acids, e.g., by
genetic engineering techniques
such as those described in Sambrook, su ra. The term recombinant includes
nucleic acids that have
been altered solely by addition, substitution, or deletion of a portion of the
nucleic acid. Frequently, a
recombinant nucleic acid may include a nucleic acid sequence operably linked
to a promoter sequence.
Such a recombinant nucleic acid may be part of a vector that is used, fox
example, to transform a cell.
Alternatively, such recombinant nucleic acids maybe part of a viral vector,
e.g., based on a
so vaccinia virus, that could be use to vaccinate a mammal wherein the
recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.
"Regulatory element" refers to a nucleic acid sequence from nontranslated
regions of a gene,
and includes enhancers, promoters, introns, and 3' untranslated regions, which
interact with host
proteins to carry out or regulate transcription or translation.
"Reporter" molecules are chemical or biochennical moieties used for labeling a
nucleic acid, an
amino acid, or an antibody. They include xadionuclides; enzymes; fluorescent,
chemiluminescent, or
chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in
the art.
An "RNA equivalent," in reference to a DNA sequence, is composed of the same
linear
2 o sequence of nucleotides as the reference DNA sequence with the exception
that all occurrences of
the nitrogenous base thymine are replaced with uracil, and the sugar backbone
is composed of ribose
instead of deoxyribose.
"Sample" is used in its broadest sense. Samples may contain nucleic or amino
acids,
antibodies, or other materials, and may be derived from any source (e.g.,
bodily fluids including, but not
limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes
isolated from a cell;
genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared
cells or tissues or blots
or imprints from such cells or tissues).
"Specific binding" or "specifically binding" refers to the interaction between
a protein or
peptide and its agonist, antibody, antagonist, or other binding partner. The
interaction is dependent
s o upon the presence of a particular structure of the protein, e.g., the
antigenic determinant or epitope,
recognized by the binding molecule. For example, if an antibody is specific
for epitope "A," the
presence of a polypeptide containing epitope A, or the presence of free
unlabeled A, in a reaction
containing free labeled A and the antibody will xeduce the amount of labeled A
that binds to the
antibody.
126


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
"Substitution" refers to the replacement of at least one nucleotide or amino
acid by a different
nucleotide or amino acid.
"Substrate" refers to any suitable xigid or semi-rigid support including,
e.g., membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing,
plates, polymers,
s microparticles or capillaries. The substrate can have a variety of surface
forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides
are bound.
A "transcript image" or "expression profile" refers to the collective pattern
of gene expression
by a particular cell type or tissue under given conditions at a given time.
"Transformation" refers to a process by which exogenous DNA enters a recipient
cell.
s o Transformation may occur under natural or artificial conditions using
various methods well known in
the art. Transformation may rely on any known method for the insertion of
foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method is selected
based on the host cell
being transformed.
"Transformants" include stably transformed cells in which the inserted DNA is
capable of
is replication either as an autonomously replicating plasmid or as part of the
host chromosome, as well as
cells which transiently express inserted DNA or RNA.
A "transgenic organism," as used herein, is any organism, including but not
limited to animals
and plants, in which one or more of the cells of the organism contains
heterologous nucleic acid
introduced by way of human intervention, such as by transgenic techniques well
known in the art. The
a o nucleic acid is introduced into the cell, directly or indirectly by
introduction into a precursor of the cell,
by way of deliberate genetic manipulation, such as by microinjection or by
infection with a
recombinant virus. The texm genetic manipulation does not include classical
cross-breeding, or in vitro
fertilization, but rather is directed to the introduction of a recombinant DNA
molecule. The transgenic
organisms contemplated in accordance with the present invention include
bacteria, cyanobacteria,
2s fungi, and plants and animals. The isolated DNA of the present invention
can be introduced into the
host by methods known in the art, for example infection, transfection,
transformation or
transconjugation. Techniques for transferring the DNA of the present invention
into such organisms
are widely known and provided in references such as Sambxook et al. (1989),
supra.
A "variant" of a particular nucleic acid sequence is defined as a nucleic acid
sequence having
s o at least 25% sequence identity to the particular nucleic acid sequence
over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool
Version 2Ø9 (May-07-
1999) set at default parameters. Such a pair of nucleic acids may show, for
example, at least 30%, at
least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least
91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or
at least 99%o or greater
3 5 sequence identity over a certain defined length. The variant may result in
"conservative" amino acid
127


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
changes which do not affect structural and/or chemical properties. A variant
may be described as, for
example, an "allelic" (as defined above), "splice," "species," or
"polymorphic" variant. A splice
variant may have significant identity to a reference molecule, but will
generally have a greater or
lesser number of polynucleotides due to alternate splicing of exons during
mRNA processing. The
s corresponding polypeptide may possess additional functional domains or lack
domains that are present
in the reference molecule. Species variauts are polynucleotide sequences that
vary from one species
to another. The resulting polypeptides generally will have significant amino
acid identity relative to
each other. A polymorphic variant is a variation in the polynucleotide
sequence of a particular gene
between individuals of a given species. Polymorphic variants also may
encompass "single nucleotide
1o polymorphisms" (SNPs) in which the polynucleotide sequence varies by one
base. 'The presence of
SNPs may be indicative of, for example, a certain population, a disease state,
or a propensity for a
disease state.
In an alternative, variants of the polynucleotides of the present invention
may be generated
through recombinant methods. One possible method is a DNA shuffling technique
such as
15 MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent
Number
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians,
F.C. et al. (1999) Nat.
Biotechuol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-
319) to alter or improve
the biological properties of DITHP, such as its biological or enzymatic
activity or its ability to bind to
other molecules or compounds. DNA shuffling is a process by which a library of
gene variants is
2o produced using PCR-mediated recombination of gene fragments. The library is
then subjected to
selection or screening procedures that identify those gene variants with the
desired properties. These
preferred variants may then be pooled and further subjected to recursive
rounds of DNA shuffling and
selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular
evolution. For example, fragments of a single gene containing random point
mutations may be
2 s recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively,
fragments of a given gene rnay be recombined with fragments of homologous
genes in the same gene
family, either from the same or different species, thereby maximizing the
genetic diversity of multiple
naturally occurring genes in a directed and controllable manner.
A "variant" of a particular polypeptide sequence is defined as a polypeptide
sequence having
s o at least 40% sequence identity to the particular polypeptide sequence over
a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool
Version 2Ø9 (May-07-
1999) set at default parameters. Such a pair of polypeptides may show, for
example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least
92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%
or greater identity over a
3 5 certain defined length of one of the polypeptides.
128


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
THE INVENTION
In a particular embodiment, cDNA sequences derived from human tissues and cell
lines were
aligned based on nucleotide sequence identity and assembled into "consensus"
or "template"
sequences which are designated by the template identification numbers
(template 1Ds) in column 2 of
s Table 2. The sequence identification numbers (SEQ 177 NOa) correspondilig to
the template )Ds are
shown in column 1. The template sequences have similarity to GenBank
sequences, or "hits," as
designated by the GI Numbers in column 3. The statistical probability of each
GenBank lut is
indicated by a probability score in column 4, and the functional annotation
corresponding to each
GenBank hit is listed in column 5.
o The invention incorporates the nucleic acid sequences of these templates as
disclosed in the
Sequence Listing and the use of these sequences in the diagnosis and treatment
of disease states
characterized by defects in human molecules. The invention further utilizes
these sequences in
hybridization and amplification technologies, and in particular, in
technologies which assess gene
expression patterns correlated with specific cells or tissues and their
responses in vivo or in vitro to
pharmaceutical agents, toxins, and other treatments. In this manner, the
sequences of the present
invention are used to develop a transcript image for a particular cell or
tissue.
Derivation of Nucleic Acid Se ug ences
cDNA was isolated from libraries constructed using RNA derived from norrrial
and diseased
2 o human tissues and cell lines. The human tissues and cell lines used for
cDNA library construction
were selected from a broad range of sources to provide a diverse population of
cDNAs representative
of gene transcription throughout the human body. Descriptions of the human
tissues and cell lines
used for cDNA library construction are provided in the LIFESEQ database
(Incyte Genomics, Inc.
(Incyte), Palo Alto CA). Human tissues were broadly selected from, for
example, cardiovascular,
2 s dermatologic, endocrine, gastrointestinal, hematopoietic/imrnune system,
musculoskeletal, neural,
reproductive, and urologic sources.
Cell lines used for cDNA library construction were derived from, for example,
leukemic cells,
teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and
endothelial cells. Such
cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI3 ~, HeLa, and
other cell lines
s o commonly used and available from public depositories (American Type
Culture Collection, Manassas
VA). Prior to mRNA isolation, cell lines were untreated, treated with a
pharmaceutical agent such as
5'-aza-2'-deoxycytidine, treated with an activating agent such as
lipopolysaccharide in the case of
leukocytic cell lines, or, in the case of endothelial cell lines, subjected to
shear stress.
129


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Sequencm~ of the cl~NAs
Methods for DNA sequencing are well known in the art. Conventional enzymatic
methods
employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S.
Biochemical Corporation, Cleveland OH), Taq polymerase (Applied Biosystems,
Foster City CA),
s thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham
Pharmacia Biotech),
Piscataway NJ), or combinations of polymerases and proofreading exonucleases
such as those found
in the ELONGASE amplification system (Life Technologies Inc. (Life
Technologies), Gaithersburg
MD), to extend the nucleic acid sequence from an oligonucleotide primer
annealed to the DNA
template of interest. Methods have been developed for the use of both single-
stranded and double-
so stranded templates. Chain termination reaction products may be
electrophoresed on urea-
polyacrylamide gels and detected either by autoradiography (for radioisotope-
labeled nucleotides) or
by fluorescence (for fluorophore-labeled nucleotides). Automated methods for
mechanized reaction
preparation, sequencing, and analysis using fluorescence detection methods
have been developed.
Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200
liquid transfer
15 system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler
(PTC200; MJ Research,
Inc. (MJ Research), Watertown MA), and ABI CATALYST 800 thermal cycler
(Applied
Biosystems). Sequencing can be carried out using, for example, the ABI 373 or
377 (Applied
Biosystems) or MEGABACE 1000 (Molecular Dynamics, Iuc. (Molecular Dynamics),
Sunnyvale
CA) DNA sequencing systems, or other automated and manual sequencing systems
well known in the
a o art.
The nucleotide sequences of the Sequence Listing have been prepared by
current, state-of
the-art, automated methods and, as such, may contain occasional sequencing
errors or unidentified
nucleotides. Such unidentified nucleotides are designated by an N. These
infrequent unidentified
bases do not represent a hindrance to practicing the invention for those
skilled in the art. Several
2 s methods employing standard recombinant techniques may be used to correct
errors and complete the
missing sequence information. (See, e.g., those described in Ausubel, F.M. et
al. (1997) Short
Protocols in Molecular Biology, John Wiley & Sons, New York NY; and Sambrook,
J. et al. (1989)
Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview
NY.)
3o Assembly of cDNA Sequences
Human polynucleotide sequences may be assembled using programs or algorithms
well known
in the art. Sequences to be assembled are related, wholly or in part, and may
be derived from a single
or many different transcripts. Assembly of the sequences can be performed
using such programs as
PHRAP (Phils Revised Assembly Program) and the GELV7EW fragment assembly
system (GCG),
3 s or other methods known in the art.
130


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Alternatively, cDNA sequences are used as "component" sequences that are
assembled into
"template" or "consensus" sequences as follows. Sequence chromatograms are
processed, verified,
and quality scores are obtained using PHRED. Raw sequences are edited using an
editing pathway
known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte
Genomics, Palo Alto, CA).
s A series of BLAST comparisons is performed and low-information segments and
repetitive elements
(e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or
masked, to prevent spurious
matches. Mitochondrial and ribosomal RNA sequences are also removed. The
processed sequences
are then loaded into a relational database management system (RDMS) which
assigns edited
sequences to existing templates, if available. When additional sequences are
added into the RDMS, a
2o process is initiated which modifies existing templates or creates new
templates from works in progress
(i.e., nonhnal assembled sequences) containing queued sequences or the
sequences themselves.
After the new sequences have been assigned to templates, the templates can be
merged into bins. If
multiple templates exist in one bin, the bin can be split and the templates
reannotated.
Once gene bins have been generated based upon sequence alignments, bins are
"clone joined"
1s based upon clone information. Clone joining occurs when the 5' sequence of
one clone is present in
one bin and the 3' sequence from the same clone is present in a different bin,
indicating that the two
bins should be merged into a single bin. Only bins which share at least two
different clones are
merged.
A resultant template sequence may contain either a partial or a full length
open reading frame,
2 0 or all or part of a genetic regulatory element. This variation is due in
part to the fact that the full
length cDNAs of many genes are several hundred, and sometimes several
thousand, bases in length.
With current technology, cDNAs comprising the coding regions of large genes
cannot be cloned
because of vector limitations, incomplete reverse transcription of the mRNA,
or incomplete "second
strand" synthesis. Template sequences may be extended to include additional
contiguous sequences
2 s derived from the parent RNA transcript using a variety of methods known to
those of skill in the art.
Extension may thus be used to achieve the full length coding sequence of a
gene.
Analysis of the cDNA Sequences
The cDNA sequences are analyzed using a variety of programs and algorithms
which are
3 o well known in the art. (See, e.g., Ausubel, 1997, supra, Chapter 7.7;
Meyers, R.A. (Ed.) (1995)
Molecular Biolo~y and Biotechnology, Wiley VCH, New York NY, pp. 8S6-853; and
Table 8.) These
analyses comprise both reading frame determinations, e.g., based on triplet
codon periodicity for
particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318);
analyses of potential
start and stop codons; and homology searches.
131


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Computer programs known to those of skill in the art for performing computer-
assisted
searches for amino acid and nucleic acid sequence similarity, include, for
example, Basic Local
Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300;
Altschul, S.F. et al.
( 1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in determining
exact matches and
s comparing two sequence fragments of arbitrary but equal lengths, whose
alignment is locally maximal
and for which the alignment score meets or exceeds a threshold or cutoff score
set by the user
(Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845). Using an
appropriate search tool
(e.g., BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM and other databases may
be
searched for sequences containing regions of homology to a query dithp or
DTTHP of the present
~o invention.
Other approaches to the identification, assembly, storage, and display of
nucleotide and
polypeptide sequences are provided in "Relational Database for Storing
Biomolecule Information,"
U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length
Biomolecular Sequence
Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database
and System for
15 Storing Information Relating to Biomolecular Sequences," U.S.S.N.
09/034,807, bled March 4, 1998,
all of which are incorporated by reference herein in their entirety.
Protein hierarchies can be assigned to the putative encoded polypeptide based
on, e.g., motif,
BLAST, or biological analysis. Methods for assigning these hierarchies are
described, for example, in
"Database System Employing Protein Function Hierarchies for Viewing
Biomolecular Sequence
zo Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by
reference.
Identification of Human Diasnostic and Therapeutic Molecules Encoded by dith
The identities of the DITHP encoded by the dithp of the present invention were
obtained by
analysis of the assembled cDNA sequences. .
2s SEQ ll~ NO:57 and SEQ ID N0:58, encoded by SEQ ID N0:1 and SEQ ID N0:2,
respectively, are, for example,,human enzyme molecules.
SEQ ID N0:59, SEQ ID NO:60, and SEQ ~ N0:61, encoded by SEQ ID N0:3, SEQ ID
N0:4, and SEQ ID NO:S, respectively, are, for example, receptor molecules.
SEQ ID N0:62 and SEQ ID N0:63, encoded by SEQ ll~ N0:6 and SEQ ID N0:7,
3 o respectively, are, for example, intracellular signaling molecules.
SEQ ID N0:64, SEQ m N0:65, SEQ TD NO:66, SEQ ID N0:67, SEQ ID N0:68, SEQ ID
N0:69, and SEQ ID N0:70, encoded by SEQ 117 NO:B, SEQ ID N0:9, SEQ ID N0:10,
SEQ ID
N0:11, SEQ ID NO:12, SEQ ID N0:13, and SEQ ID N0:14, respectively, are, for
example,
transcription factor molecules.
132


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
SEQ 1D NU:71, SEQ 1D NU:72, SEQ 1D N0:73, SEQ 1D NO:74, SEQ ID NO:75, SEQ 1D
N0:76, SEQ ID N0:77, SEQ ID N0:78, SEQ ID N0:79, SEQ ID N0:80, SEQ >D N0:81,
SEQ ID
N0:82, SEQ ID NO:83, SEQ ID N0:84, SEQ 117 N0:85, SEQ ID N0:86, SEQ ID N0:87,
and SEQ
ID N0:88, encoded by SEQ ID N0:15, SEQ ID NO:16, SEQ ID N0:17, SEQ ID N0:18,
SEQ ID
s N0:19, SEQ 1D N0:20, SEQ Iz7 N0:21, SEQ ID N0:22, SEQ ll~ N0:23, SEQ >D
N0:24, SEQ ID
N0:25, SEQ ID N0:26, SEQ ID N0:27, SEQ D7 N0:28, SEQ ID N0:29, SEQ )D N0:30,
SEQ ID
N0:31, and SEQ ID N0:32, respectively, are, for example, Zn forger-type
transcriptional regulators.
SEQ ID N0:89 and SEQ ll~ N0:90, encoded by SEQ 117 N0:33 and SEQ ll? N0:34,
respectively, are, for example, membrane transport molecules.
1o SEQ ID N0:91, SEQ TD N0:92, SEQ m N0:93, and SEQ ID N0:94, encoded by SEQ
ID
N0:35, SEQ ID N0:36, SEQ ll~ NO:37, and SEQ ID N0:38, respectively, axe, for
example, protein
modification and maintenance molecules.
SEQ ID N0:95, encoded by SEQ ID NO:39 is, for example, an adhesion molecule.
SEQ ID NO:96 and SEQ ID NO:97, encoded by SEQ ZD N0:40 and SEQ ID N0:41,
Z5 respectively, are, for example, antigen recognition molecules.
SEQ m N0:98, encoded by SEQ ID N0:42 is, for example, an electron transfer
associated
molecule.
SEQ ID N0:99 and SEQ ID N0:100, encoded by SEQ ID N0:43 and SEQ ID N0:44,
respectively, are, for example, cytoskeletal molecules.
2o SEQ m N0:101 and SEQ ID N0:102, encoded by SEQ ID N0:45 and SEQ ID NO:46,
respectively, are, fox example, human cell membrane molecules.
SEQ ID NO:103, SEQ ID N0:104, SEQ ID N0:105, SEQ ID N0:106, and SEQ ID N0:107,
encoded by SEQ ID NO:47, SEQ ID N0:48, SEQ ID NO:49, SEQ ID N0:50, and SEQ ID
NO:SO,
respectively, are, fox example, organelle associated molecules.
2s SEQ ID N0:108 and SEQ 1D NO:109, encoded by SEQ D7 N0:51 and SEQ ll~ N0:52,
respectively, are, for example, biochemical pathway molecules.
SEQ ID NO:110, SEQ ID N0:111, SEQ ID NO:112, and SEQ ID N0:113, encoded by SEQ
ID N0:53, SEQ II? NO:54, SEQ ID N0:55, and SEQ ID N0:56, respectively, are,
for example,
molecules associated with growth and development.
Sequences of Human Diagnostic and Therapeutic Molecules
The dithp of the present invention may be used for a variety of diagnostic and
therapeutic
purposes. For example, a dithp may be used to diagnose a particular condition,
disease, or disorder
3 5 associated with human molecules. Such conditions, diseases, and disorders
include, but are not limited
133


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
to, a cell prohterative dssorder, such as act~c keratosis, arteriosclerosis,
atherosclerosis, bursitis,
cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis,
paroxysmal nocturnal
hemoglobinuria, polycythemia vets, psoriasis, primary thrombocythemia, and
cancers including
adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma,
teratocarcinoma, and, in
particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain,
breast, cervix, gall
bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle,
ovary, pancreas, parathyroid,
penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and
uterus; an
autoimrnune/inflammatory disorder, such as inflammation, actinic keratosis,
acquired
immunodeficiency syndrome (AlDS), Addison's disease, adult respiratory
distress syndrome, allergies,.
~o ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma,
atherosclerosis, autoimmune
hemolytic anemia, autoimmune thyroiditas, bronchitis, bursitis, cholecystitis,
cirrhosis, contact dermatitis,
Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus,
emphysema, erythroblastosis
fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis,
Goodpasture's syndrome, gout,
Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal hemoglobinuria,
hepatitis,
1s hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with
lymphocytotoxins, mixed
connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis,
myocardial or pericardial
inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis,
polycythemia vets, polymyositis,
psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's
syndrome, systemic
anaphylaxis, systemic lupus erythematosus, systemic sclerosis, primary
thrombocythemia,
2o thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome,
complications of cancer,
hemodialysis, and extracorporeal circulation, trauma, and hematopoietic cancer
including lymphoma,
leukemia, and myeloma; an infection caused by a viral agent classified as
adenovirus, arenavirus,
bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, herpesvirus,
flavivirus, orthomyxovirus,
parvovirus, papovavirus, paramyxovirus, picornavirus, poxvirus, reovirus,
retrovirus, rhabdovirus, or
2 s togavirus; an infection caused by a bacterial agent classified as
pneumococcus, staphylococcus,
streptococcus, bacillus, corynebacterium, clostridium, meningococcus,
gonococcus, listeria, moraxella,
kingella, haemophilus, legionella, bordetella, gram-negative enterobacterium
including shigella,
salmonella, or campylobacter, pseudomonas, vibrio, brucella, francisella,
yersinia, bartonella,
norcardium, actinomyces, mycobacterium, spirochaetale, rickettsia, chlamydia,
or mycoplasma; an
3 o infection caused by a fungal agent classified as aspergillus, blastomyces,
dermatophytes, cryptococcus,
coccidioides, malasezzia, histoplasma, or other mycosis-causing fungal agent;
and an infection caused
by a parasite classified as plasmodium or malaria-causing, parasitic
entamoeba, leishmania,
trypanosoma, toxoplasrna, pneumocystis carinii, intestinal protozoa such as
giardia, trichomonas, tissue
nematode such as trichinella, intestinal nematode such as ascaris, lymphatic
filarial nematode,
35 trematode such as schistosoma, and cestrode such as tapeworm; a
developmental disorder such as
134


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic
dwariisiCri,"I7ucheriiie-arid Becker
muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor,
aniridia,
genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic
syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas,
hereditary neuropathies such
s as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure
disorders such as Syndenham's chorea and cerebral palsy, spine bifida,
anencephaly,
craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing
loss; an endocrine
disorder such as a disorder of the hypothalamus and/or pituitary resulting
from lesions such as a
primary brain tumor, adenoma, infarction associated with pregnancy,
hypophysectomy, aneurysm,
so vascular malformation, thrombosis, infection, immunological disorder, and
complication due to head
trauma; a disorder associated with hypopituitarism including hypogonadism,
Sheehan syndrome,
diabetes insipidus, Kallman's disease, Hand-Schuller-Christian disease,
Letterer-Siwe disease,
sarcoidosis, empty sella syndrome, and dwarfism; a disorder associated with
hyperpituitarism including
acromegaly, giantism, and syndrome of inappropriate antidiuretic hormone (ADH)
secretion (SIA.DH)
i s often caused by benign adenoma; a disorder associated with hypothyroidism
including goiter,
myxedema, acute thyroiditis associated with bacterial infection, subacute
thyroiditis associated with
viral infection, autoimmune thyroiditis (Hashimoto's disease), and cretinism;
a disorder associated with
hyperthyroidism including thyrotoxicosis and its various forms, Grave's
disease, pretibial myxedema,
toxic multinodular goiter, thyroid carcinoma, and Plummer's disease; a
disorder associated with
2 o hyperparathyroidism including Coon disease (chronic hypercalemia); a
pancreatic disorder such as
Type I or Type II diabetes mellitus and associated complications; a disorder
associated with the
adrenals such as hyperplasia, carcinoma, or adenoma of the adrenal cortex,
hypertension associated
with alkalosis, amyloidosis, hypokalemia, Cushing's disease, Liddle's
syndrome, and Arnold-Healy-
Gordon syndrome, pheochromocytoma tumors, and Addison's disease; a disorder
associated with
2s gonadal steroid hormones such as: in women, abnormal prolactin production,
infertility, endometriosis,
perturbation of the menstrual cycle, polycystic ovarian disease,
hyperprolactinemia, isolated
gonadotropin deficiency, amenorrhea, galactorrhea, hermaphroditism, hirsutism
and virilization, breast
cancer, and, in post-menopausal women, osteoporosis; and, in men, Leydig cell
deficiency, male
climacteric phase, and germinal cell aplasia, a hypergonadal disorder
associated with Leydig cell
3 o tumors, androgen resistance associated with absence of androgen receptors,
syndrome of 5 a-
reductase, and gynecomastia; a metabolic disorder such as Addison's disease,
cerebrotendinous
xanthomatosis, congenital adrenal hyperplasia, coumarin resistance, cystic
fibrosis, diabetes, fatty
hepatocirrhosis, fructose-1,6-diphosphatase deficiency, galactosemia, goiter,
glucagonoma, glycogen
storage diseases, hereditary fructose intolerance, hyperadrenalism,
hypoadrenalism,
3 5 hyperparathyroidism, hypoparathyroidism, hypercholesterolemia,
hyperthyroidism, hypoglycemia,
135


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
hypothyroidism, hyperlipidemia, hyperhpemia, lipid myopathies,
lipodystrophies, lysosomal storage
diseases, mannosidosis, neuraminidase deficiency, obesity, pentosuria
phenylketonuria, pseudovitarnin
D-deficiency rickets; disorders of carbohydrate metabolism such as congenital
type II
dyserythropoietic anemia, diabetes, insulin-dependent diabetes mellitus, non-
insulin-dependent diabetes
s mellitus, fructose-1,6-diphosphatase deficiency, galactosemia, glucagonoma,
hereditary fructose
intolerance, hypoglycemia, mannosidosis, neuraminidase deficiency, obesity,
galactose epimerase
deficiency, glycogen storage diseases, lysosomal storage diseases,
fructosuria, pentosuria, and
inherited abnormalities of pyruvate metabolism; disorders of lipid metabolism
such as fatty liver,
cholestasis, primary biliary cirrhosis, carnitine deficiency, carnitine
palmitoyltransferase deficiency,
so myoadenylate deaminase deficiency, hypertriglyceridemia, lipid storage
disorders such Fabry's
disease, Gaucher's disease, Niemann-Pick's disease, metachromatic
leukodystrophy,
adrenoleukodystrophy, GMZ gaugliosidosis, and ceroid lipofuscinosis,
abetalipoproteinemia, Tangier
disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy, lipomatoses,
acute panniculitis,
disseminated fat necrosis, adiposis dolorosa, lipoid adrenal hyperplasia,
minimal change disease,
lipomas, atherosclerosis, hypercholesterolemia, hypercholesterolemia with
hypertriglyceridemia,
primary hypoalphalipoproteinemia, hypothyroidism, renal disease, liver
disease, lecithin:cholesterol
acyltransferase deficiency, cerebrotendinous xanthornatosis, sitosterolemia,
hypocholesterolemia, Tay-
Sachs disease, Sandhoff's disease, hyperlipidemia, hyperlipemia, lipid
myopathies, and obesity; and
disorders of copper metabolism such as Menke's disease, Wilson's disease, and
Ehlers-Danlos
2 o syndrome type IX; a neurological disorder such as epilepsy, ischemic
cerebrovascular disease, stroke,
cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease,
dementia, Parkinson's
disease and other extrapyramidal disorders, arnyotrophic lateral sclerosis and
other motor neuron
disorders, progressive neural muscular atrophy, retinitis pigmentosa,
hereditary ataxias, multiple
sclerosis and other demyelinating diseases, bacterial and viral meningitis,
brain abscess, subdural
2s empyema, epidural abscess, suppurative intracranial thrombophlebitis,
myelitis and radiculitis, viral
central nervous system disease, priors diseases including kuru, Creutzfeldt-
Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional
and metabolic diseases
of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal
hemangioblastomatosis,
encephalotrigerninal syndrome, mental retardation and other developmental
disorder of the central
s o nervous system, cerebral palsy, a neuroskeletal disorder, an autonomic
nervous system disorder, a
cranial nerve disorder, a spinal cord disease, muscular dystrophy and other
neuromuscular disorder, a
peripheral nervous system disorder, dermatomyositis and polymyositis,
inherited, metabolic, endocrine,
and toxic myopathy, myasthenia gravis, periodic paralysis, a mental disorder
including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathesia,
amnesia, catatonia, diabetic
3 5 neuropathy, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, and Tourette's
136


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
disorder; a gastrointestinal disorder including ulcerative colitis, gastric
and duodenal ulcers, cystinuria,
dibasicaminoaciduria, hypercystinuria, lysinuria, hartnup disease, tryptophan
malabsorption, methionine
malabsorption, histidinuria, iminoglycinuria, dicarboxylicaminoaciduria,
cystinosis, renal glycosuria,
hypouricemia, familial hypophophatemic rickets, congenital chloridorrhea,
distal renal tubular acidosis,
Menkes' disease, Wilson's disease, lethal diarrhea, juvenile pernicious
anemia, folate malabsorption,
adrenoleukodystrophy, hereditary myoglobinuria, and Zellweger syndrome; a
transport disorder such
as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic
fibrosis, Becker's muscular
dystrophy, Bell's palsy, Charcot-Marie Tooth disease, diabetes mellitus,
diabetes insipidus, diabetic
neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis,
normokalemic periodic
so paralysis, Parkinson's disease, malignant hyperthermia, multidrug
resistance, myasthenia gravis,
myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral
neuropathy, cerebral
neoplasms, prostate cancer, cardiac disorders associated with transport, e.g.,
angina, bradyarrythmia,
tachyarrythmia, hypertension, Long QT syndrome, myocarditis, cardiomyopathy,
nemaline myopathy,
centronuclear myopathy, lipid myopathy, mitochondrial myopathy, thyrotoxic
myopathy, ethanol
s 5 myopathy, dermatomyositis, inclusion body myositis, infectious myositis,
and polymyositis, neurological
disordexs associated with transport, e.g., Alzheimer's disease, amnesia,
bipolar disorder, dementia,
depression, epilepsy, Tourette's disorder, paranoid psychoses, and
schizophrenia, and other disorders
associated with transport, e.g., neurofibromatosis, postherpetic neuralgia,
trigeminal neuropathy,
sarcoidosis, sickle cell anemia, cataracts, infertility, pulmonary artery
stenosis, sensorineural autosomal
2o deafness, hyperglycemia, hypoglycemia, Grave's disease, goiter, glucose-
galactose malabsorption
syndrome, hypercholesterolemia, Cushing's disease, and Addison's disease; and
a connective tissue
disorder such as osteogenesis imperfecta, Ehlers-Danlos syndrome,
chondrodysplasias, Marfan
syndrome, Alport syndrome, familial aortic aneurysm, achondroplasia,
mucopolysaccharidoses,
osteoporosis, osteopetrosis, Paget's disease, rickets, osteomalacia,
hyperparathyroidism, renal
25 osteodystrophy, osteonecrosis, osteomyelitis, osteoma, osteoid osteoma,
osteoblastoma, osteosarcoma,
osteochondroma, chondroma, chondxoblastoma, chondromyxoid fibroma,
chondrosarcoma, fibrous
cortical defect, nonossifying fibroma, fibrous dysplasia, fibxosarcoma,
malignant fibrous histiocytoma,
Ewing's sarcoma, primitive neuroectodermal tumor, giant cell tumor,
osteoarthritis, rheumatoid
arthritis, ankylosing spondyloarthritis, Reiter's syndrome, psoriatic
arthritis, enteropathic arthritis,
s o infectious arthritis, gout, gouty arthritis, calcium pyrophosphate crystal
deposition disease, ganglion,
synovial cyst, villonodular synovitis, systemic sclerosis, Dupuytren's
contracture, hepatic fibrosis, lupus
erythematosus, mixed connective tissue disease, epidermolysis bullosa simplex,
bullous congenital
ichthyosiform erythroderma (epidermolytic hyperkeratosis), non-epidermolytic
and epidermolytic
palmoplantar keratoderma, ichthyosis bullosa of Siemens, pachyonychia
congenita, and white sponge
3 5 nevus. The dithp can be used to detect the presence of, or to quantify the
amount of, a dithp-related
137


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
polynucleotide in a sample. 'This intbrmation is then compared to information
obtained from
appropriate reference samples, and a diagnosis is established. Alternatively,
a polynucleotide
complementary to a given dithp can inhibit or inactivate a therapeutically
relevant gene related to the
dithp.
Analysis of dithp E~ression Patterns
The expression of dithp may be routinely assessed by hybridization-based
methods to
determine, for example, the tissue-specificity, disease-specificity, or
developmental stage-specificity of
dithp expression. For example, the level of expression of dithp may be
compared among different cell
so types or tissues, among diseased and normal cell types or tissues, among
cell types or tissues at
different developmental stages, or among cell types or tissues undergoing
various treatments. This
type of analysis is useful, for example, to assess the relative levels of
dithp expxession in fully or
partially differentiated cells or tissues, to determine if changes in dithp
expression levels are correlated
with the development or progression of specific disease states, and to assess
the response of a cell or
15 tissue to a specific therapy, for example, in pharmacological or
toxicological studies. Methods for the
analysis of dithp expression are based on hybridization and amplification
technologies and include
membrane-based procedures such as northern blot analysis, high-throughput
procedures that utilize, for
example, microarrays, and PCR-based procedures.
~o Hybridization and Genetic Analysis
°The dithp, their fragments, or complementary sequences may be used to
identify the presence
of and/or to determine the degree of similarity between two (or more) nucleic
acid sequences. The
dithp may be hybridized to naturally occurring or recombinant nucleic acid
sequences under
appropriately selected temperatures and salt concentrations. Hybridization
with a probe based on the
25 nucleic acid sequence of at least one of the dithp allows for the detection
of nucleic acid sequences,
including genomic sequences, which are identical or related to the dithp of
the Sequence Listing.
Probes may be selected from non-conserved or unique regions of at least one of
the polynucleotides of
SEQ >D N0:1-56 and tested for their ability to identify or amplify the target
nucleic acid sequence
using standard protocols.
3 o Polynucleotide sequences that are capable of hybridizing, in particular,
to those shown in SEQ
117 N0:1-56 and fragments thereof, can be identified using various conditions
of stringency. (See,
e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407;
Kilzunel, A.R. (1987)
Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in
"Definitions."
A probe for use in Southern or northern hybridization may be derived from a
fragment of a
3 5 dithp sequence, or its complement, that is up to several hundred
nucleotides in length and is either
238


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
single-stranded or double-stranded. Such probes may be hybridized in solution
to biological materials
such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared
or sectioned tissues, or to
artificial substrates containing dithp. Microarrays axe particularly suitable
for identifying the presence
of and detecting the level of expression for multiple genes of interest by
examining gene expression
s correlated with, e.g., various stages of development, treatment with a drug
or compound, or disease
progression. An array analogous to a dot or slot blot may be used to arrange
and link polynucleotides
to the surface of a substrate using one or more of the following: mechanical
(vacuum), chemical,
thermal, or UV bonding procedures. Such an array may contain any number of
dithp and may be
produced by hand or by using available devices, materials, and machines.
~ o Microarrays may be prepared, used, and analyzed using methods known in the
art. (See, e.g.,
Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al.
(1996) Proc. Natl. Acad.
Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application
W095/251116; Shalon, D. et
al. (1995) PCT application W095/35505; Heller, R.A. et al. (1997) Proc. Natl.
Acad. Sci. USA
94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)
15 Probes may be labeled by either PCR or enzymatic techniques using a variety
of
commercially available reporter molecules. For example, commercial kits are
available for radioactive
and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline
phosphatase labeling
(Life Technologies). Alternatively, dithp may be cloned into commercially
available vectors for the
production of RNA probes. Such probes may be transcribed in the presence of at
Ieast one labeled
2 o nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).
Additionally the polynucleotides of SEQ ID N0:1-56 or suitable fragments
thereof can be
used to isolate full length cDNA sequences utilizing hybridization and/or
amplification procedures well
known in the art, e.g., cDNA library screening, PCR amplification, etc. The
molecular cloning of such
full length cDNA sequences may employ the method of cDNA library screening
with probes using the
25 hybridization, stringency, washing, and probing strategies described above
and in Ausubel, supra,
Chapters 3, 5, and 6. These procedures may also be employed with genomic
libraries to isolate
genomic sequences of dithp in order to analyze, e.g., regulatory elements.
Genetic Mapping
Gene identification and mapping are important in the investigation and
treatment of almost all
conditions, diseases, and disorders. Cancer, cardiovascular disease,
Alzheimer's disease, arthritis,
s o diabetes, and mental illnesses are of particular interest. Each of these
conditions is more complex than
the single gene defects of sickle cell anemia or cystic fibrosis, with select
groups of genes being
predictive of predisposition for a particular condition, disease, or disorder.
For example,
cardiovascular disease may result from malfunctioning receptor molecules that
fail to clear cholesterol
from the bloodstream, and diabetes may result when a particular individual's
immune system is
3 s activated by an infection and attacks the insulin-producing cells of the
pancreas. In some studies;
139


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Alzheimer s disease has been linked to a gene on chromosome 21; other studies
predict a dstterent
gene and location. Mapping of disease genes is a complex and reiterative
process and generally
proceeds from genetic linkage analysis to physical mapping.
As a condition is noted among members of a family, a genetic linkage map
traces parts of
s chromosomes that are inherited in the same pattern as the condition.
Statistics link the inheritance of
particular conditions to particular regions of chromosomes, as defined by RFLP
or other markers.
(See, for example, Larder, E. S. and Botstein, D. (1986) Proc. Natl. Aced.
Sci. USA 83:7353-7357.)
Occasionally, genetic markers and their locations are known from previous
studies. More often,
however, the markers are simply stretches of DNA that differ among
individuals. Examples of
io genetic linkage maps can be found in various scientific journals or at the
Online Mendelian T~heritance
in Man (OMIM) World Wide Web site.
In another embodiment of the invention, dithp sequences may be used to
generate
hybridization probes useful in chromosomal mapping of naturally occurring
genomic sequences. Either
coding or noncoding sequences of dithp may be used, anal in some instances,
noncoding sequences
15 maybe preferable over coding sequences. For example, conservation of a
dithp coding sequence
among members of a multi-gene family may potentially cause undesired cross
hybridization during
chromosomal mapping. The sequences may be mapped to a particular chromosome,
to a specific
region of a chromosome, or to artificial chromosome constructions, e.g., human
artificial chromosomes
(HA.Cs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes
(BACs), bacterial P1
zo constructions, or single chromosome cDNA libraries. (Seek e.g., Harrington,
J.J. et al. (1997) Nat.
Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J.
(1991) Trends Genet.
7:149-154.)
Fluorescent in situ hybridization (FISH) may be correlated with other physical
chromosome
mapping techniques and genetic map data. (See, e.g., Meyers, su re, pp. 965-
968.) Correlation
25 between the location of dithp on a physical chromosomal map and a specific
disorder, or a
predisposition to a specific disorder, may help define the region of DNA
associated with that disorder.
The dithp sequences may also be used to detect polymorphisms that are
genetically linked to the
inheritance of a particular condition, disease, or disorder.
In situ hybridization of chromosomal preparations and genetic mapping
techniques, such as
3 0 linkage analysis using established chromosomal markers, may be used for
extending existing genetic
maps. Often the placement of a gene on the chromosome of another mammalian
species, such as
mouse, may reveal associated markers even if the number or arm of the
corresponding human
chromosome is not known. These new marker sequences can be mapped to human
chromosomes
and may provide valuable information to investigators searching for disease
genes using positional
s s cloning or other gene discovery techniques. Once a disease or syndrome has
been crudely correlated
140


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
by genetic linkage with a particular genomic region, e.g., ataxia-
telangiectasia to 11q22-23, any
sequences mapping to that area may represent associated or regulatory genes
for further investigation.
(See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide
sequences of the subject
invention may also be used to detect differences in chromosomal architecture
due to translocation,
s inversion, etc., among normal, carrier, or affected individuals.
Once a disease-associated gene is mapped to a chromosomal region, the gene
must be cloned
in order to identify mutations or other alterations (e.g., translocations or
inversions) that may be
correlated with disease. Tlus process requires a physical map of the
chromosomal region containing
the disease-gene of interest along with associated markers. A physical map is
necessary for
~o determining the nucleotide sequence of and order of marker genes on a
particular chromosomal
region. Physical mapping techniques are well known in the art and require the
generation of
overlapping sets of cloned DNA fragments from a particular organelle,
chromosome, or genome.
These clones are analyzed to reconstruct and catalog their order. Once the
position of a marker is
determined, the DNA from that region is obtained by consulting the catalog and
selecting clones from
15 that region. The gene of interest is located through positional cloning
techniques using hybridization or
similar methods.
Dia ostic Uses
The dithp of the present invention may be used to design probes useful in
diagnostic assays.
2 o Such assays, well known to those skilled in the art, may be used to detect
or confirm conditions,
disorders, or diseases associated with abnormal levels of dithp expression.
Labeled probes developed
from dithp sequences are added to a sample under hybridizing conditions of
desired stringency. In
some instances, dithp, or fragments or oligonucleotides derived from dithp,
maybe used as primers in
amplification steps prior to hybridization. The amount of hybridization
complex formed is quantified
2 s and compared with standards for that cell or tissue. If dithp expression
varies significantly from the
standard, the assay indicates the presence of the condition, disorder, or
disease. Qualitative or
quantitative diagnostic methods may include northern, dot blot, or other
membrane or dip-stick based
technologies or multiple-sample format technologies such as PCR, enzyme-linked
immunosorbent
assay (ELISA)-like, pin, or chip based assays.
3 0 'The probes described above may also be used to monitor the progress of
conditions, disorders,
or diseases associated with abnormal levels of dithp expression, or to
evaluate the efficacy of a
particular therapeutic treatment. The candidate probe maybe identified from
the dithp that are
specific to a given human tissue and have not been observed in GenBank or
other genome databases.
Such a probe may be used in animal studies, preclinical tests, clinical
trials, or in monitoring the
s ~ treatment of an individual patient. In a typical process, standard
expression is established by methods
141


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
well known in the art for use as a basis of comparison, samples from patients
affected by the disorder
or disease are combined with the probe to evaluate any deviation from the
standard profile, and a
therapeutic agent is administered and effects are monitored to generate a
treatment profile. Efficacy
is evaluated by determining whether the expression progresses toward or
returns to the standard
s normal pattern. Treatment profiles may be generated over a period of several
days or several months.
Statistical methods well known to those skilled in the art may be use to
determine the significance of
such therapeutic agents.
The polynucleotides are also useful for identifying individuals from minute
biological samples,
fox example, by matching the RFLP pattern of a sample's DNA to that of an.
individual's DNA. The
1o polynucleotides of the present invention can also be used to determine the
actual base-by-base DNA
sequence of selected portions of an individual's genome. These sequences can
be used to prepare
PCR primers for amplifying and isolating such selected DNA, which can then be
sequenced. Using
this technique, an individual can be identified through a unique set of DNA
sequences. Once a unique
>D database is established for an individual, positive identification of that
individual can be made from
extremely small tissue samples.
In a particular aspect, oligonucleotide primers derived from the dithp of the
invention may be
used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions,
insertions and
deletions that are a frequent cause of inherited or acquired genetic disease
in humans. Methods of
SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and
2o fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived
from dithp are used to
amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived,
for example,
from diseased or normal tissue, biopsy samples, bodily fluids, and the like.
SNPs in the DNA cause
differences in the secondary and tertiary structures of PCR pxoducts in single-
stranded form, and
these differences are detectable using gel electrophoresis in non-denaturing
gels. In fSCCP, the
2 s oligonucleotide primers are fluorescently labeled, which allows detection
of the amplimers in high-
throughput equipment such as DNA sequencing machines. Additionally, sequence
database analysis
methods, termed in silico SNP (isSNP), are capable of identifying
polymorphisms by comparing the
sequences of individual overlapping DNA fragments which assemble into a common
consensus
sequence. These computer-based methods filter out sequence variations due to
laboratory preparation
s o of DNA and sequencing errors using statistical models and automated
analyses of DNA sequence
chromatograms. In the alternative, SNPs may be detected and characterized by
mass spectrometry
using, for example, the high throughput MASSARR.AY system (Sequenom, Inc., San
Diego CA).
DNA-based identification techniques are critical in forensic technology. DNA
sequences
taken from very small biological samples such as tissues, e.g., hair or skin,
or body fluids, e.g., blood,
s s saliva, semen, etc., can be amplified using, e.g., PCR, to identify
individuals. (See, e.g., Erlich, H.
142


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
(1992) PCR Technolo~y, Freeman and Co., New York, NY). Similarly,
polynucleotides of the present
invention can be used as polymorpluc markers.
There is also a need for reagents capable of identifying the source of a
particular tissue.
Appropriate reagents can comprise, for example, DNA probes or primers prepared
from the
sequences of the present invention that are specific for particular tissues.
Panels of such reagents can
identify tissue by species and/or by organ type. In a similar fashion, these
reagents can be used to
screen tissue cultures for contamination.
The polynucleotides of the present invention can also be used as molecular
weight markers on
nucleic acid gels or Southern blots, as diagnostic probes for the presence of
a specific mRNA in a
Zo particular cell type, in the creation of subtracted cDNA libraries which
aid in the discovery of novel
polynucleotides, in selection and synthesis of oligomers for attachment to an
array or other support,
and as an antigen to elicit an immune response.
Disease Model Systems Using dithp
s s The dithp of the invention or their mammalian homologs may be "knocked
out" in an animal
model system using homologous recombination in embryonic stem (ES) cells. Such
techniques are
well known in the art and are useful for the generation of animal models of
human disease. (See, e.g.,
U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For example,
mouse ES cells,
such as the mouse 129/SvJ cell line, are derived from the early mouse embryo
and grown in culture.
2 o The ES cells are transformed with a vector containing the gene of interest
disrupted by a marker gene,
e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292).
The vector integrates into the corresponding region of the host genome by
homologous recombination.
Alternatively, homologous recombination takes place using the Cre-loxP system
to knockout a gene of
interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) Clip. Invest. 97:1999-
25 2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are
identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain.
The blastocysts are surgically transferred to pseudopregnant dams, and the
resulting chimeric progeny
are genotyped and bred to produce heterozygous or homozygous strains.
Transgenic animals thus
generated may be tested with potential therapeutic or toxic agents.
3 o The dithp of the invention may also be manipulated in vitro in ES cells
derived from human
blastocysts. Human ES cells have the potential to differentiate into at least
eight separate cell lineages
including endoderm, mesoderm, and ectodermal cell types. These cell lineages
differentiate into, for
example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson,
J.A. et al. (1998)
Science 282:1145-1147).
3 5 The dithp of the invention can also be used to create "knockin" humanized
animals (pigs) or
143


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
transgemc antrnals (mice or rats) to model human disease. With knockiu
technology, a regton of dithp
is injected into animal ES cells, and the injected sequence integrates into
the animal cell genome.
Transformed cells are injected into blastulae, and the blastulae are implanted
as described above.
Transgenic progeny or inbred lines are studied and treated with potential
pharmaceutical agents to
obtain infomnation on treatment of a human disease. Alternatively, a mammal
inbred to overexpress
dithp, resulting, e.g., in the secretion of DITHP in its milk, may also serve
as a convenient source of
that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
Screening Assays
io DITIiP encoded by polynucleotides of the present invention may be used to
screen for
molecules that bind to or are bound by the encoded polypeptides. The binding
of the polypeptide and
the molecule may activate (agonist), increase, inhibit (antagonist), or
decrease activity of the
polypeptide or the bound molecule. Examples of such molecules include
antibodies, oligonucleotides,
proteins (e.g., receptors), or small molecules.
1s Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a ligand
or fragment thereof, a natural substrate, or a structural or functional
mimetic. (See, Coligan et al.,
(1991) Current Protocols in hnmunolo~y 1(2): Chapter 5.) Similarly, the
molecule can be closely
related to the natural receptor to which the polypeptide binds, or to at least
a fragment of the receptor,
e.g., the active site. In either case, the molecule can be rationally designed
using known techniques.
2o Preferably, the screening for these molecules involves producing
appropriate cells which express the
polypeptide, either as a secreted protein or on the cell membrane. Preferred
cells include cells from
mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or
cell membrane fractions
which contain the expressed polypeptide are then contacted with a test
compound and binding,
stimulation, or inhibition of activity of either the polypeptide or the
molecule is analyzed.
2 s An assay may simply test binding of a candidate compound to the
polypeptide, wherein binding
is detected by a fluorophore, radioisotope, enzyme conjugate, or other
detectable label. Alternatively,
the assay may assess binding in the presence of a labeled competitor.
Additionally, the assay can be carried out using cell-free preparations,
polypeptide/molecule
affixed to a solid support, chemical libraries, or natural product mixtures.
The assay may also simply
3 o comprise the steps of mixing a candidate compound with a solution
containing a polypeptide,
measuring polypeptide/molecule activity or binding, and comparing the
polypeptide/molecule activity or
binding to a standard. .
Preferably, an ELISA assay using, e.g., a monoclonal or polyclonal antibody,
can measure
polypeptide level in a sample. 'The antibody can measure polypeptide level by
either binding, directly
3 s or indirectly, to the polypeptide or by competing with the polypeptide for
a substrate.
144


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
All of the above assays can be used in a diagnostic or prognostic context. The
molecules
discovered using these assays can be used to treat disease or to bring about a
particular result in a
patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the
assays can discover agents which may inhibit or enhance the production of the
polypeptide from
s suitably manipulated cells or tissues.
Transcript Ima~in~ and Toxicological Testing
Another embodiment relates to the use of dithp to develop a transcript image
of a tissue or cell
type. A transcript image represents the global pattern of gene expression by a
particular tissue or cell
1o type. Global gene expression patterns are analyzed by quantifying the
number of expressed genes and
their relative abundance under given conditions and at a given time. (See
Seilhamer et al.,
"Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484,
expressly incorporated by
9
reference herein.) Thus a transcript image may be generated by hybridizing the
polynucleotides of the
present invention or their complements to the totality of transcripts or
reverse transcripts of a
z~ particular tissue or cell type. In one embodiment, the hybridization takes
place in high-throughput
format, wherein the polynucleotides of the present invention or their
complements comprise a subset
of a plurality of elements on a microarray. The resultant transcript image
would provide a profile of
gene activity pertaining to human molecules for diagnostics and therapeutics.
Transcript images which profile dithp expression may be generated using
transcripts isolated
a o from tissues, cell lines, biopsies, or other biological samples. The
transcript image may thus reflect
dithp expression in vivo, as in the case of a tissue or biopsy sample, or in
vitro, as in the case of a cell
line.
Transcript images which profile dithp expression may also be used in
conjunction with in vitro
model systems and preclinical evaluation of pharmaceuticals, as well as
toxicological testing of
2 s industrial and naturally-occurring environmental compounds. All compounds
induce characteristic
gene expression patterns, frequently termed molecular fingerprints or toxicant
signatures, which are
indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999)
Mol. Carcinog. 24:153-
159; Steiner, S. and Anderson, N.L. (2000) Toxicol. Lett. 112-113:467-71,
expressly incorporated by
reference herein). If a test compound has a signature similar to that of a
compound with known
3 o toxicity, it is likely to share those toxic properties. These fingerprints
or signatures are most useful and
refined when they contain expression information from a large number of genes
and gene families.
Ideally, a genome-wide measurement of expression provides the highest quality
signature. Even
genes whose expression is not altered by any tested compounds are important as
well, as the levels of
expression of these genes are used to normalize the rest of the expression
data. The normalization
3 5 procedure is useful for comparison of expression data after treatment with
different compounds.
145


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
While the assignment of gene function to elements of a toxicant signature aids
in interpretation of
toxicity mechanisms, knowledge of gene function is not necessary for the
statistical matching of
signatures which leads to prediction of toxicity. (See, for example, Press
Release 00-02 from the
National Institute of Environmental Health Sciences, released February 29,
2000, available at
s http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important
and desirable in
toxicological screening using toxicant signatures to include all expressed
gene sequences.
In one embodiment, the toxicity of a test compound is assessed by treating a
biological sample
containing nucleic acids with the test compound. Nucleic acids that are
expressed in the treated
biological sample are hybridized with one or more probes specific to the
polynucleotides of the present
i o invention, so that transcript levels corresponding to the polynucleotides
of the present invention may be
quantified. The transcript levels in the treated biological sample are
compared with levels in an
untreated biological sample. Differences in the transcript levels between the
two samples are
indicative of a toxic response caused by the test compound in the treated
sample.
Another particular embodiment relates to the use of DITHP encoded by
polynucleotides of
15 the present invention to analyze the proteome of a tissue ox cell type. The
term proteome refers to the
global pattern of protein expression in a particular tissue or cell type. Each
protein component of a
proteome can be subjected individually to further analysis. Proteome
expression patterns, or profiles,
are analyzed by quantifying the number of expressed proteins and their
relative abundance under
given conditions and at a given time. A profile of a cell's proteome may thus
be generated by
2 o separating and analyzing the polypeptides of a particular tissue or cell
type. In one embodiment, the
separation is achieved using two-dimensional gel electrophoresis, in which
proteins from a sample are
separated by isoelectric focusing in the first dimension, and then according
to molecular weight by
sodium dodecyl sulfate slab gel electrophoresis in the second dimension
(Steiner and Anderson,
supra). The proteins are visualized in the gel as discrete and uniquely
positioned spots, typically by
2 s staining the gel with an agent such as Coomassie Blue or silver or
fluorescent stains. The optical
density of each protein spot is generally proportional to the level of the
protein. in the sample. The
optical densities of equivalently positioned protein spots from different
samples, for example, from
biological samples either treated or untreated with a test compound or
therapeutic agent, are
compared to identify any changes in protein spot density related to the
treatment. The proteins in the
3 o spots are partially sequenced using, for example, standard methods
employing chemical or enzymatic
cleavage followed by mass spectrometry. The identity of the protein in a spot
may be determined by
comparing its partial sequence, preferably of at least 5 contiguous amino acid
residues, to the
polypeptide sequences of the present invention. In some cases, further
sequence data may be
obtained for definitive protein identification. °
35 A proteomic profile may also be generated using antibodies specific for
DITHP to quantify
146


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
the levels of DITHP expression. In one embodiment, the antibodies are used as
elements on a
microaxray, and protein expression levels are quantified by exposing the
microarray to the sample and
detecting the levels of protein bound to each array element (Lueking, A. et
al. (1999) Anal. Biochem.
270:103-11; Mendoze, L.G. et al. (1999) Biotechniques 27:778-88). Detection
maybe performed by a
variety of methods known in the art, for example, by reacting the proteins in
the sample with a thiol- or
amino-reactive fluorescent compound and detecting the amount of fluorescence
bound at each array
element.
Toxicant signatures at the proteome level are also useful for toxicological
screening, and
should be analyzed in parallel with toxicant signatures at the transcript
level. There is a poor
1o correlation between transcript and protein abundances for some proteins in
some tissues (Anderson,
N.L. and Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome
toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the
transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body
fluids is difficult, due to rapid
degradation of mRNA, so proteomic profiling may be more reliable and
informative in such cases.
~.5 In another embodiment, the toxicity of a test compound is assessed by
treating a biological
sample containing proteins with the test compound. Proteins that are expressed
in the treated
biological sample are separated so that the amount of each protein can be
quantified. The amount of
each protein is compared to the amount of the corresponding protein in an
untreated biological sample.
A difference in the amount of protein between the two samples is indicative of
a toxic response to the
2 o test compound in the treated sample. Individual proteins are identified by
sequencing the amino acid
residues of the individual proteins and comparing these partial sequences to
the DITHP encoded by
polynucleotides of the present invention.
In another embodiment, the toxicity of a test compound is assessed by treating
a biological
sample containing proteins with the test compound. Proteins from the
biological sample are incubated
25 with antibodies specific to the D1THP encoded by polynucleotides of the
present invention. The
amount of protein recognized by the antibodies is quantified. The amount of
protein in the treated
biological sample is compared with the amount in an untreated biological
sample. A difference in the
amount of protein between the two samples is indicative of a toxic response to
the test compound in
the treated sample.
3 o Transcript images may be used to profile dithp expression in distinct
tissue types. 'This
process can be used to determine human molecule activity in a particular
tissue type relative to this
activity in a different tissue type. Transcript images may be used to generate
a profile of dithp
expression characteristic of diseased tissue. Transcript images of tissues
before and after treatment
may be used for diagnostic purposes, to monitor the progression of disease,
and to monitor the efficacy
3 5 of drug treatments for diseases which affect the activity of human
molecules.
147


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Transcript images of cell lines can be used to assess human molecule activity
and/or to
identify cell lines that lack or misregulate this activity. Such cell lines
may then be treated with
pharmaceutical agents, and a transcript image following treatment may indicate
the efficacy of these
agents in restoring desired levels of this activity. A similar approach may be
used to assess the
s toxicity of pharmaceutical agents as reflected by undesirable changes in
human molecule activity.
Candidate pharmaceutical agents may be evaluated by comparing their associated
transcript images
with those of pharmaceutical agents of known effectiveness.
Antisense Molecules
so The polynucleotides of the present invention are useful in antisense
technology. Antisense
technology ox therapy relies on the modulation of expression of a target
protein through the specific
binding of an antisense sequence to a target sequence encoding the target
protein or directing its
expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana
Press Inc., Totawa
NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997)
Adv. Pharmacol.
15 40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063;
and Lavrosky, Y. et
al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a
polynucleotide sequence
capable of specifically hybridizing to at least a portion of the target
sequence. Antisense sequences
bind to cellular mRNA and/or genomic DNA, affecting translation and/or
transcription. Antisense
sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g.,
Rossi, J.J. et al. (1991)
2o Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry
37(3):900-1010; Pardridge,
W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen,
P. E. and Haaima, G.
(1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in
modulation of expression
occurs through hybridization or binding of complementary base pairs. Antisense
sequences can also
bind to DNA duplexes through specific interactions in the major groove of the
double helix.
2s The polynucleotides of the present invention and fragments thereof can be
used as antisense
sequences to modify the expression of the polypeptide encoded by dithp. The
antisense sequences
can be produced ex vivo, such as by using any of the ABI nucleic acid
synthesizer series (Applied
Biosystems) or other automated systems known in the art. Antisense sequences
can also be produced
biologically, such as by transforming an appropriate host cell with an
expression vector containing the
3o sequence of interest. (See, e.g., Agrawal, su ra.)
In therapeutic use, any gene delivery system suitable for introduction of the
antisense
sequences into appropriate target cells can be used. Antisense sequences can
be delivered
intraceIlularly in the form of an expression plasmid which, upon
transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the
target protein. (See, e.g.,
35 Slater, J.E., et al. (1998) J. Allergy Clip. I_mmunol. 102(3):469-475; and
Scanlon, K.J., et al. (1995)
148


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
y(13):1~5~-12y6.) Antisense sequences can also be introduced mtracellularly
through the use of viral
vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g.,
Miller, A.D. (1990) Blood
76:271; Ausubel, F.M. et al. (1995) Current Protocols in Molecular Biolo~y,
John Wiley & Sons, New
York NY; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.)
Other gene delivery
s mechanisms include uposome-derived systems, artificial viral envelopes, and
other systems known in
the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado,
R.J. et al. (1998) J.
Pharm. Sci. 87(11):1308-1315; and Morns, M.C. et al. (1997) Nucleic Acids Res.
25(14):2730-2736.)
Expression
so In order to express abiologically active DITHP, the nucleotide sequences
encoding DTTHP or
fragments thereof may be inserted into an appropriate expression vector, i.e.,
a vector which contains
the necessary elements for transcriptional and translational control of the
inserted coding sequence in
a suitable host. Methods which are well known to those skilled in the art may
be used to construct
expression vectors containing sequences encoding DITHP and appropriate
transcriptional and
txanslational control elements. These methods include in vitro recombinant DNA
techniques, synthetic
techniques, and in vivo genetic recombination. (See, e.g., Sambrook, sutra,
Chapters 4, 8, 16, and 17;
and Ausubel, sera, Chapters 9, 10, 13, and 16.)
A variety of expression vector/host systems may be utilized to contain and
express sequences
encoding DTTHP. These include, but are not limited to, microorganisms such as
bacteria transformed
2 o with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with
yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus);
plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or
tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or
pBR322 plasmids); or
animal (mammalian) cell systems. (See, e.g., Sambrook, su ra; Ausubel, 1995,
su ra, Van Heeke, G.
5 and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Bitter, G.A. et al.
(1987) Methods Enzymol.
153:516-544; Scorer, C.A. et al. (1994) Bio/Technology 12:181-184; Engelhard,
E.K. et al. (1994)
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene
Ther. 7:1937-1945;
Takamatsu, N. (1987) EMBO J. 6:307-311; Coruzzi, G. et al. (1984) EMBO J.
3:1671-1680; Brogue,
R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl.
Cell Differ. 17:85-105;
s o The McGraw Hill Yearbook of Science and Technolo~y (1992) McGraw Hill, New
York NY, pp.
191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-
3659; and Harrington,
J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from
retroviruses,
adenoviruses, or herpes or vaccinia viruses, or from various bacterial
plasmids, may be used for
deuvery of nucleotide sequences to the targeted organ, tissue, or cell
population. (See, e.g., Di Nicola,
35 M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al., (1993)
Proc. Natl. Acad. Sci. USA
149


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815;
McCTregor,1~.Y. et al. (1994)
Mol. Tmmunol. 31(3):219-226; and Verma, LM. and N. Somia (1997) Nature 389:239-
242.) The
invention is not limited by the host cell employed.
For long term production of recombinant proteins in mammalian systems, stable
expression of
s DITHP in cell lines is preferred. For example, sequences encoding DTTHP can
be transformed into
cell lines using expression vectors which may contain viral origins of
replication and/or endogenous
expression elements and a selectable marker gene on the same or on a separate
vector. Any number
of selection systems may be used to recover transformed cell lines. (See,
e.g., Wigler, M. et al.
(1977) Cell 11:223-232; Lowy, I. et aI. (1980) CeII 22:817-823.; Wigler, M. et
al. (1980) Proc. Natl.
1o Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol.
Biol. 150:1-14; Hartman, S.C.
and R.C.Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A.
(1995) Methods
Mol. Biol. 55:121-131.)
Therapeutic Uses of dithp
15 The dithp of the invention may be used for somatic or germline gene
therapy. Gene therapy
may be performed to (i) correct a genetic deficiency (e.g., in the cases of
severe combined
immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance
(Cavazzana-Calvo, M. et
al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome
associated with an
inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995)
Science 270:475-480;
zo Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner,
J. et al. (1993) Cell 75:207-
216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et
al. (1995) Hum. Gene
Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and
hemophilia resulting from Factor
VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410;
Verma, LM. and Somia, N.
(1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product
(e.g., in the case of
2 s cancers which result from unregulated cell proliferation), or (iii)
express a protein which affords
protection against intracellular parasites (e.g., against human retroviruses,
such as human
immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396;
Poeschla, E. et aI. (1996)
Proc. Natl. Acad. Sci. USA. 93:11395-11399), hepatitis B or C virus (HBV,
HCV); fungal parasites,
such as Candida albicans and Paracoccidioides brasiliensis; and protozoan
parasites such as
3 o Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic
deficiency in dithp
expression or regulation causes disease, the expression of dithp from an
appropriate population of
transduced cells may alleviate the clinical manifestations caused by the
genetic deficiency.
In a further embodiment of the invention, diseases or disorders caused by
deficiencies in dithp
are treated by constructing mammalian expression vectors comprising dithp and
introducing these
3 s vectors by mechanical means into dithp-deficient cells. Mechanical
transfer techtiologies for use with
150


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cells in vivo or ex vitro include (i) direct DNA microinjection into
individual cells, (ii) ballistic gold
particle delivery, (iii) liposome-mediated transfection, (iv) receptor-
mediated gene transfer, and (v) the
use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev.
Biochem. 62:191-
217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998)
Curr. Opin. Biotechnol.
s 9:445-450).
Expression vectors that may be effective for the expression of dithp include,
but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX. vectors (Invitrogen,
Carlsbad CA),
PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF,
PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). The dithp of the
invention
1o maybe expressed using (i) a constitutively active promoter, (e.g., from
cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or (3-actin genes),
(ii) an inducible promoter
(e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992)
Proc. Natl. Acad. Sci.
U.S.A. 89:5547-5551; Gossen, M. et al., (1995) Science 268:1766-1769; Rossi,
F.M.V. and Blau,
H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the
T-REX plasmid
15 (Invitrogen); the ecdysone-inducible promoter (available in the plasmids
PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone
inducible promoter
(Rossi, F.M.V. and Blau, H.M. su ra), or (iii) a tissue-specific promoter or
the native promoter of the
endogenous gene encoding DITHP from a normal individual.
Commercially available liposome transformation kits (e.g., the PERFECT LIPID
2o TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill
in the art to deliver
polynucleotides to target cells in culture and require minimal effort to
optimize expeximental
parameters. In the alternative, transformation is performed using the calcium
phosphate method
(Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation
(Neumann, E. et aI.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires
modification of these
~ s standardized mammalian transfection protocols.
In another embodiment of the invention, diseases or disorders caused by
genetic defects with
respect to dithp expression are treated by constructing a retrovirus vector
consisting of (i) dithp under
the control of an independent promoter or the retrovirus long terminal repeat
(LTR) promoter, (ii)
appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE)
along with additional
3 o retrovirus cis-acting RNA sequences and coding sequences required for
efficient vector propagation.
Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available
(Stratagene) and are based
on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A.
92:6733-6737), incorporated
by reference herein. The vector is propagated in an appropriate vector
producing cell line (VPCL)
that expresses an envelope gene with a tropism for receptors on the target
cells or a promiscuous
3 s envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol.
61:1647-1650; Bender, M.A. et
151


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J.
V~rol. 62:3802-3806; Dull, T.
et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol.
72:9873-9880). U.5. Patent
Number 5,910,434 to Rigg ("Method for obtaining retrovirus packaging cell
lines producing high
transducing efficiency retroviral supernatant") discloses a method for
obtaining retrovirus packaging
s cell lines and is hereby incorporated by reference. Propagation of
retrovirus vectors, transduction of
a population of cells (e.g., CD4+ T-cells), and the return of trausduced cells
to a patient are
procedures well known to persons skilled in the art of gene therapy and have
been well documented
(Range, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood
89:2259-2267; Bonyhadi,
M.L. (1997) J. Virol. 71:4707-4716; Range, U. et aI. (1998) Proc. Natl. Aced.
Sci. U.S.A. 95:1201-
so 1206; Su, L. (1997) Blood 89:2283-2290).
In the alternative, an adenovirus-based gene therapy delivery system is used
to deliver dithp to
cells which have one or more genetic abnormalities with respect to the
expression of dithp. The
construction and packaging of adenovirus-based vectors are well known to those
with ordinary skill in
the art. Replication defective adenovirus vectors have proven to be versatile
for importing genes
encoding immunoregulatory proteins into intact islets in the pancreas (Csete,
M.E. et al. (1995)
Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent
Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby
incorporated by
reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999)
Annu. Rev. Nutr. 19:511-544
and Verma, LM. and Somia, N. (1997) Nature 18:389:239-242, both incorporated
by reference herein.
2 o Iu another alternative, a herpes based, gene therapy delivery system is
used to deliver dithp to
target cells which have one or more genetic abnormalities with respect to the
expression of dithp. The
use of herpes simplex virus (HSV)-based vectors may be especially valuable for
introducing dithp to
cells of the central nervous system, for which HSV has a tropism. The
construction and packaging of
herpes-based vectors are well known to those with ordinary skill in the art. A
replication-competent
2 s herpes simplex virus (HSV) type 1-based vector has been used to deliver a
reporter gene to the eyes
of primates (Liu, X. et al. (1999) Exp. Eye Res.169:385-395). The construction
of a HSV-1 virus
vector has also been disclosed in detail in U.S. Patent Number 5,804,413 to
DeLuca ("Herpes simplex
virus strains for gene transfer"), which is hereby incorporated by reference.
U.5. Patent Number
5,804,413 teaches the use of recombinant HSV d92 which consists of a genome
containing at least
3 0 one exogenous gene to be transferred to a cell under the control of the
appropriate promoter for
purposes including human gene therapy. Also taught by tbis patent are the
construction and use of
recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors,
see also Goins, W.
F. et al. 1999 J. Virol. 73:519-532 and Xu, H. et al., (1994) Dev. Biol.
163:152-161, hereby
incorporated by reference. The manipulation of cloned herpesvirus sequences,
the generation of
3 s recombinant virus following the transfection of multiple plasmids
containing different segments of the
152


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
large herpesvirus genomes, the growth and propagation of herpesvirus, and the
infection of cells with
herpesvirus are techniques well known to those of ordinary skill in the art.
In another alternative, an alphavirus (positive, single-stranded RNA virus)
vector is used to
deliver dithp to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has
been studied extensively and gene trausfer vectors have been based on the SFV
genome (Garoff, H.
and Li, K-J. (1998) Curr. Opin. Biotech. 9:464-469). During alphavirus RNA
replication, a
subgenomic RNA is generated that normally encodes the viral capsid proteins.
This subgenomic RNA
replicates to higher levels than the full-length genomic RNA, resulting in the
overproduction of capsid
proteins relative to the viral proteins with enzymatic activity (e.g.,
protease and polymerase).
~o Similarly, inserting dithp into the alphavirus genome in place of the
capsid-coding region results in the
production of a large number of dithp RNAs and the synthesis of high levels of
DITHP in vector
transduced cells. While alphavirus infection is typically associated with cell
lysis within a few days,
the ability to establish a persistent infection in hamster normal kidney cells
(BHK-21) with a variant of
Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can
be altered to suit the needs of
5 the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83).
The wide host range of
alphaviruses will allow the introduction of dithp into a variety of cell
types. The specific transduction
of a subset of cells in a population may require the sorting of cells prior to
transduction. The methods
of manipulating infectious cDNA clones of alphaviruses, performing alphavirus
cDNA and RNA
transfections, and performing alphavirus infections, are well known to those
with ordinary skill in the
z o art.
Antibodies
Anti-DII~iP antibodies may be used to analyze protein expression levels. Such
antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single
chain, and Fab fragments. For
2s descriptions of and protocols of antibody technologies, see, e.g., Pound
J.D. (1998) T_m_m__unochemical
Protocols, Humana Press, Totowa, NJ.
The amino acid sequence encoded by the dithp of the Sequence Listing may be
analyzed by
appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to
determine
regions of high immunogenicity. The optimal sequences for immunization are
selected from the C-
3 o terminus, the N-terminus, and those~intervening, hydrophilic regions of
the polypeptide which are likely
to be exposed to the external environment when the polypeptide is in its
natural conformation.
Analysis used to select appropriate epitopes is also described by Ausubel
(1997, su ra, Chapter 11.7).
Peptides used for antibody induction do not need to have biological activity;
however, they must be
antigenic. Peptides used to induce specific antibodies may have an amino acid
sequence consisting of
3 s at least five amino acids, preferably at least 10 amino acids, and most
preferably at least 15 amino
153


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
acids. A peptide which mimncs an antigenic i~agment of the natural polypeptide
may be fused with
another protein such as keyhole limpet hemocyanin. (KLH; Sigma, St. Louis MO)
for antibody
production. A peptide encompassing an antigenic region may be expressed from a
dithp, synthesized
as described above, or purified from human cells.
s Procedures well known in the art may be used for the production of
antibodies. Various hosts
including mice, goats, and rabbits, may be immunized by injection with a
peptide. Depending on the
host species, various adjuvants may be used to increase immunological
response.
In one procedure, peptides about 15 residues in length may be synthesized
using an ABI 431A
peptide synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to
KLH (Sigma) by
1o reaction with M-maleimidobenzoyl-N hydroxysuccinimide ester (Ausubel, 1995,
supra). Rabbits are
immunized with the peptide-KLH complex in complete Freund's adjuvant. The
resulting antisera are
tested for antipeptide activity by binding the peptide to plastic, blocking
with 1 % bovine serum albumin
(BSA), reacting with rabbit autisera, washing, and reacting with
radioiodinated goat anti-rabbit IgG.
Antisera with antipeptide activity are tested for anti-DTTHP activity using
protocols well known in the
15 art, including ELISA, radioimmunoassay (RTA), and immunoblotting.
In another procedure, isolated and purified peptide may be used to immunize
mice (about 100
~,g of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide
is radioiodinated and used
to screen the immunized animals' B-lymphocytes for production of antipeptide
antibodies. Positive
cells are then used to produce hybridomas using standard techniques. About 20
mg of peptide is
2 o sufficient for labeling and screening several thousand clones. Hybridomas
of interest are detected by
screening with radioiodinated peptide to identify those fusions producing
peptide-specific monoclonal
antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-
Dickinson, Palo Alto, CA)
are coated with affinity-purified, specific rabbit-anti-mouse (or suitable
anti-species IgG) antibodies at
mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to
supernatants from
2 s hybridomas. After incubation, the wells are exposed to radiolabeled
peptide at 1 mg/mI.
Clones producing antibodies bind a quantity of labeled peptide that is
detectable above
background. Such clones are expanded and subjected to 2 cycles of cloning.
Cloned hybridomas are
injected into pristane-treated mice to produce ascites, and monoclonal
antibody is purified from the
ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia
Biotech). Several
3 o procedures for the production of monoclonal antibodies, including in vitro
production, are described in
Pound su ra). Monoclonal antibodies with antipeptide activity are tested for
anti-DTTIiP activity
using protocols well known in the art, including ELISA, RIA, and
i_r!mmunoblotting.
Antibody fragments containing specific binding sites for an epitope may also
be generated.
. For example, such fragments include, but are not limited to, the F(ab')2
fragments produced by pepsin
3 5 digestion of the antibody molecule, and the Fab fragments generated by
reducing the disulfide bridges
154


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
of the r(ab~2 fragments. Alternatively, construction of Fab expression
libraries in tilamentous
bacteriophage allows rapid and easy identification of monoclonal fragments
with desired specificity
(Pound, supra, Chaps. 45-47). Antibodies generated against polypeptide encoded
by dithp can be used
to purify and characterize full-length DITHP protein and its activity, binding
partners, etc.
Assays Using Antibodies
Anti-DITHP antibodies may be used in assays to quantify the amount of DITHP
found in a
particular human cell. Such assays include methods utilizing the antibody and
a label to detect
expression level under normal or disease conditions. The peptides and
ant'bodies of the invention may
so be used with or without modification or labeled by joining them, either
covalently or noncovalently,
with a reporter molecule.
Protocols for detecting and measuring protein expression using either
polyclonal or monoclonal
antibodies are well known in the art. Examples include ELISA, RIA, and
fluorescent activated cell
sorting (FACS). Such immunoassays typically involve the formation of complexes
between the
DITHP and its specific antibody and the measurement of such complexes. These
and other assays
are described in Pound su ra).
Without further elaboration, it is believed that one skilled in the art can,
using the preceding
description, utilize the present invention to its fullest extent. The
following preferred specific
embodiments are, therefore, to be construed as merely illustrative, and not
limitative of the remainder
' of the disclosure in any way whatsoever.
The disclosures of all patents, applications, and publications mentioned above
and below,
including U.S. Ser. No. 60/261,865, U.S. Ser. No. 60/262,599, U.S. Ser. No.
60/263,102, U.S. Ser.
No. 60/262,662, U.S. Ser. No. 60/263,064, U.S. Ser. No. 60/263,330, U.S. Ser.
No. 60/263,065, U.S.
Ser. No. 601263,329, U.S. Ser. No. 60/262,207, U.S. Ser. No. 60/262,209, U.S.
Ser. No. 60/262,208,
U.S. Ser. No. 60/262,164, U.S. Ser. No. 60/262,215, U.S. Ser. No. 60/263,063,
U.S. Ser. No.
60/261,864, U.S. Ser. No. 60/262,760, U.S. Ser. No. 60/261,622, U.S. Ser. No.
60/263,077, and U.S.
Ser. No. 60/263,069 are hereby expressly incorporated by reference.
EXAMPLES
s o I. Construction of cDNA Libraries
RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated
from
various tissues. Some tissues were homogenized and lysed in guanidinium
isothiocyauate, while others
were homogenized and lysed in phenol or in a suitable mixture of denaturants,
such as TRIZOL (Life
Technologies), a monophasic solution of phenol and guanidine isothiocyanate.
The resulting lysates
s s were centrifuged over CsCl cushions or extracted with chloroform. RNA was
precipitated with either
155


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
isopropanol or sodium acetate and ethanol, or by other routine methods.
Phenol extraction and precipitation of RNA were repeated as necessary to
increase RNA
purity. In most cases, RNA was treated with DNase. For most libraries,
poly(A+) RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega Corporation
(Pxomega), Madison WI),
OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX
mRNA
purification kit (QIAGEN). Alternatively, RNA was isolated directly from
tissue lysates using other
RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc.,
Austin TX).
In some cases, Stratagene was provided with RNA and constructed the
corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed
with the
so UNIZAP vector system (Stratagene Cloning Systems, Inc. (Stratagene), La
Jolla CA) or
SUPERSCRIPT plasmid system (Life Technologies), using the recommended
procedures or similar
methods known in the art. (See, e.g., Ausubel, 1997, sutra, Chapters 5.1
through 6.6.) Reverse
transcription was initiated using oligo d(T) or random primers. Synthetic
oligonucleotide adapters
were ligated to double stranded cDNA, and the cDNA was digested with the
appropriate restriction
enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000
bp) using
SEPHACRYL S 1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography
(Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs
were ligated into
compatible restriction enzyme sites of the polyliuker of a suitable plasmid,
e.g., PBLUESCR1PT
plasmid (Stratagene), PSPORT1 plasrnid (Life Technologies), PCDNA2.1 plasmid
(Invitrogen,
~o Carlsbad CA), PBI~-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid
(Invitrogen), PCMV-ICIS
plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto CA), pRARE (Incyte
Genomics), or
pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were
transformed into
competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from
Stratagene or DHSa,
DH10B, or ElectroMAX DH10B from Life Technologies.
II. Isolation of cDNA Clones
Plasmids were recovered from host cells by in vivo excision using the UNIZAP
vector system
(Stratagene) or by cell lysis. Plasmids were purified using at least one of
the following: the Magic or
WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep
purification kit (Edge
s o BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and
QIAWELL 8 Ultra
plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit
(QIAGEN). Following
precipitation, plasmids were resuspended in 0.1 ml of distilled water and
stored, with or without
Iyophilization, at 4 °C.
Alternatively, plasmid DNA was amplified from host cell lysates using direct
link PCR in a
high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell
lysis and thermal
156


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cycling steps were carried out in a single reaction mixture. Samples were
processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified
fluorometrically using
PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a
FLUOROSKAN
1I fluorescence scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis
cDNA sequencing reactions were processed using standard methods or high-
throughput
instrumentation such as the ABI CATALYST 800 thermal cycler (Applied
Biosystems) or the PTC-
200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser
(Robbins
so Scientific Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system
(Hamilton). cDNA
sequencing reactions were prepared using reagents provided by Amersham
Pharmacia Biotech or
supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle
sequencing
ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA
sequencing reactions
and detection of labeled polynucleotides were carried out using the MEGABACE
1000 DNA
sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing
system (Applied
Biosystems) in conjunction with standard ABI protocols and base calling
software; or other sequence
analysis systems known in the art. Reading frames within the cDNA sequences
were identified using
standard methods (reviewed in Ausubel, 1997, su ra, Chapter 7.7). Some of the
cDNA sequences
were selected for extension using the techniques disclosed in Example V111.
ao
IV. Assembly and Analysis of Sequences
Component sequences from chromatograms were subject to PHRED analysis and
assigned a
quality score. The sequences having at least a required quality score were
subject to various pre-
processing editing pathways to eliminate, e.g., low quality 3' ends, vector
and linker sequences, polyA
25 tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial
contamination sequences, and
sequences smaller than 50 base pairs. In particular, low-information sequences
and repetitive
elements (e.g., dinucleotide repeats, Alu repeats, etc.) were replaced by
"n's", or masked, to prevent
spurious matches.
Processed sequences were then subject to assembly procedures in which the
sequences were
3 o assigned to gene bins (bins). Each sequence could only belong to one bin.
Sequences in each gene
bin were assembled to produce consensus sequences (templates). Subsequent new
sequences were
added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate
pairs were
identified as all BLAST hits having a quality score greater than or equal to
150. Alignments of at least
82% local identity were accepted into the bin. The component sequences from
each bin were
s s assembled using a version of PHRAP. Bins with several overlapping
component sequences were
157


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
assembled using DEEP PHRAP. The orientation (sense or antisense) of each
assembled template
was determined based on the number and orientation of its component sequences.
Template
sequences as disclosed in the sequence listing correspond to sense strand
sequences (the "forward"
reading frames), to the best determination. The complementary (antisense)
strands are inherently
s disclosed herein. The component sequences which were used to assemble each
template consensus
sequence are listed in Table 5, along with their positions along the template
nucleotide sequences.
Bins were compared against each other and those having local similarity of at
least 82% were
combined and reassembled. Reassembled bins having templates of insufficient
overlap (less than 95%
local identity) were re-split. Assembled templates were also subject to
analysis by
so STITCHER/EXON MAPPER algorithms which analyze the probabilities of the
presence of splice
variants, alternatively spliced eXOns, splice junctions, differential
expression of alternative spliced
genes across tissue types or disease states, etc. These resulting bins were
subject to several rounds
of the above assembly procedures.
Once gene bins were generated based upon sequence alignments, bins were clone
joined
15 based upon clone information. If the 5' sequence of one clone was present
in one bin and the 3'
sequence from the same clone was present in a different bin, it was likely
that the two bins actually
belonged together in a single bin. The resulting combined bins underwent
assembly procedures to
regenerate the consensus sequences.
The final assembled templates were subsequently annotated using the following
procedure.
2o Template sequences were analyzed using BLASTn (v2.0, NCBI) versus gbpri
(GenBank version
126). "Hits" were defined as an exact match having from 95 % local identity
over 200 base pairs
through 100% local identity over 100 base pairs, or a homolog match having an
E-value, i.e. a
probability score, of s 1 x 10'$. The hits were subject to frameshift FASTx
versus GENPEPT
(GenBank version 126). (See Table 8). In this analysis, a homolog match was
defined as having an
2 s E-value of _< 1 x 10'8. The assembly method used above was described in
"System and Methods for
Analyzing Biomolecular Sequences," U.S.S.N. 09/276,534, filed March 25, 1999,
and the LIFESEQ
Gold user manual (Incyte) both incorporated by reference herein.
Following assembly, template sequences were subjected to motif, BLAST, and
functional
analyses, and categorized in protein hierarchies using methods described in,
e.g., "Database System
3 o Employing Protein Function Hierarchies for Viewing Biomolecular Sequence
Data," U.S.S.N.
08/812,290, filed March 6, 1997; "Relational Database for Storing Biomolecule
Information," U.S.S.N.
08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular
Sequence Database,"
U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System
for Storing
Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed
March 4, 1998, all of
158


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
which are incorporated by reference herein.
The template sequences were further analyzed by translating each template in
all three
forward reading frames and searching each translation against the Pfam
database of hidden Markov
model-based protein families and domains using the I~~IMER software package
(available to the
s public from Washington University School of Medicine, St. Louis MO). Regions
of templates which,
when translated, contain similarity to Pfam consensus sequences are reported
in Table 3, along with
descriptions of Pfam protein domains and families. Only those Pfam hits with
an E-value of s 1 x 10-3
are reported. (See also World Wide Web site http://pfam.wustl.edu/ for
detailed descriptions of Pfam
protein domains and families.)
so Additionally, the template sequences were translated in all three forward
reading frames, and
each translation was searched against hidden Markov models for signal peptides
using the »MER
software package. Construction of hidden Markov models and their usage in
sequence analysis has
been described. (See, for example, Eddy, S.R. (1996) Curr. Opin. Str. Biol.
6:361-365.) Only those
signal peptide hits with a cutoff score of 11 bits or greater are reported. A
cutoff score of 11 bits or
15 greater corresponds to at least about 91-94% true-positives in signal
peptide prediction. Template
sequences were also translated in all three forward reading frames, and each
translation was
searched against T1VIHIVI1VIER, a program that uses a hidden Markov model
(HMM) to delineate
transmembrane segments on protein sequences and determine orientation
(Sonnhammex, E.L. et al.
(1998) Proc. Sixth Intl. Conf. On Intelligent Systems for Mol. Biol., Glasgow
et al., eds., The Am.
zo Assoc. for Artificial Intelligence (AAAI) Press, Memo Park, CA, and MIT
Press, Cambridge, MA,
pp. 175-182.) Regions of templates which, when translated, contain similarity
to signal peptide or
transmembrane consensus sequences are reported in Table 4.
The results of I~VIMER analysis as reported in Tables 3 and 4 may support the
results of
BLAST analysis as reported in Table 2 or may suggest alternative or additional
properties of template-
z 5 encoded polypeptides not previously uncovered by BLAST or other analyses.
Template sequences are further analyzed using the bioinformatics tools listed
i_n Table 8, or
using sequence analysis software known in the art such as MACDNASIS PRO
software (Hitachi
Software Engineering, South San Francisco CA) and LASERGENE software
(DNASTAR).
Template sequences may be farther queried against public databases such as the
GenBank rodent,
3 o mammalian, vertebrate, prokaryote, and eukaryote databases.
The template sequences were translated to derive the corresponding longest
open reading
frame as presented by the polypeptide sequences as reported in Table 7.
Alternatively, a polypeptide
of the invention may begin at any of the methionine residues within the full
length translated
polypeptide. Polypeptide sequences were subsequently analyzed by querying
against the GenBank
3 s protein database (GENPEPT, (GenBank version 126)). Full length
polynucleotide sequences are also
159


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
analyzed using MAC:DNAS15 YRU software (Httachi Software Engineering, youth
San r'rancisco
CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence
alignments
are generated using default parameters specified by the CLUSTAL algorithm as
incorporated into the
MEGALIGN multisequence alignment program (DNASTAR), which also calculates the
percent
s identity between aligned sequences.
Table 7 shows sequences with homology to the polypeptides of the invention as
identified by
BLAST analysis against the GenBank protein (GENPEPT) database. Column 1 shows
the
polypeptide sequence identification number (SEQ ID NO:) for the polypeptide
segments of the
invention. Column 2 shows the reading frame used in the translation of the
polynucleotide sequences
so encoding the polypeptide segrnents. Column 3 shows the length of the
translated polypeptide
segments. Columns 4 and 5 show the start and stop nucleotide positions of the
polynucleotide
sequences encoding the polypeptide segments. Column 6 shows the GenBank
identification number
(GI Number) of the nearest GenBank homolog. Column 7 shows the probability
score for the match
between each polypeptide and its GenBank homolog. Column 8 shows the
annotation of the GenBank
15 homolog.
V. Analysis of Polynucleotide Expression
Northern analysis is a laboratory technique used to detect the presence of a
transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a
membrane on which RNAs
from a particular cell type or tissue have been bound. (See, e.g., Sambrook,
supra, ch. 7; Ausubel,
ao 1995, su ra, ch. 4 and 16.)
Analogous computer techniques applying BLAST were used to search for identical
or related
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This
analysis is
much faster than multiple membrane-based hybridizations. In addition, the
sensitivity of the computer
search can be modified to determine whether any particular match is
categorized as exact or similar.
25 The basis of the search is the product score, which is defined as:
BLAST Score x Percent Identity
x minimum {length(Seq. 2), length(Seq. 2)}
s o The product score takes into account both the degree of similarity between
two sequences and the
length of the sequence match. The product score is a normalized value between
0 and 100, and is
calculated as follows: the BLAST score' is multiplied by the percent
nucleotide identity and the
product is divided by (5 times the length of the shorter of the two
sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-
scoring segment pair
160


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP
(separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST
score is used to calculate
the product score. The product score represents a balance between fractional
overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100%
identity over the
entire length of the shorter of the two sequences being compared. A product
score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and
100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50%
overlap at one end, or 79%
identity and 100% overlap.
so VI. Tissue Distribution Profiling
A tissue distribution profile is determined for each template by compiling the
cDNA library
tissue classifications of its component cDNA sequences., Each component
sequence, is derived from
a cDNA library constructed from a human tissue. Each human tissue is
classified into one of the
following categories: cardiovascular system; connective tissue; digestive
system; embryonic
z5 structures; endocrine system; exocrine glands; genitalia, female;
genitalia, male; germ cells; heroic and
immune system; liver; musculoskeletal system; nervous system; pancreas;
respiratory system; sense
organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
Template sequences,
component sequences, and cDNA library/tissue information are found in the
LIFESEQ GOLD
database (Incyte Genomics, Palo Alto CA).
2o Table 6 shows the tissue distribution profile for the templates of the
invention. For each
template, the three most frequently observed tissue categories are shown in
column 3, along with the
percentage of component sequences belonging to each category. Only tissue
categories with
percentage values of >_ 10% are shown. A tissue distribution of "widely
distributed" in column 3
indicates percentage values of <10% in all tissue categories.
VII. Transcript Innage Analysis
Transcript images are generated as described in Seilhamer et al., "Comparative
Gene
Trauscript Analysis," U.S. Patent Number 5,840,484, incorporated herein by
reference.
3 o VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length
cDNA
Oligonucleotide primers designed using a dithp of the Sequence Listing are
used to extend the
nucleic acid sequence. One primer is synthesized to initiate 5' extension of
the template, and the other
primer, to initiate 3' extension of the template. The initial primers may be
designed using OLIGO 4.06
software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or
another appropriate
program, to be about 22 to 30 nucleotides in length, to have a GC content of
about 50% or more, and
161


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
to anneal to the target sequence at temperatures of about 68 °C to
about 72 °C. Any stretch of
nucleotides which would result in hairpin structures and primer-primer
dimerizations are avoided.
Selected human cDNA libraries are used to extend the sequence. If more than
one extension is
necessary or desired, additional or nested sets of primers are designed.
High fidelity amplification is obtained by PCR using methods well known in the
art. PCR is
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research).
The reaction mix
contains DNA template, 200 nmol of each primer, reaction buffer containing
Mga+, (NHq)ZSOø, and 13-
mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia B.iotech), ELONGASE
enzyme (Life
Technologies), and Pfu DNA polymerase (Stratagene), with the following
parameters for primer pair
so PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec;
Step 3: 60°C, 1 min; Step 4: 68°C, 2
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min;
Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1:
94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min;
Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.
15 The concentration of DNA in each well is determined by dispensing 100 ~l
PICOGREEN
quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA
(TE) and 0.5 ~1 of
undiluted PCR product into each Well of an opaque fluorimeter plate (Corning
Incorporated (Corning),
Corning NY), allowing the DNA to bind to the reagent. The plate is scanned in
a FLUOROSKAN II
(Labsystems Oy) to measure the fluorescence of the sample and to quantify the
concentration of
2 o DNA. A 5 ~.1 to 10 ~.l aliquot of the reaction mixture is analyzed by
electrophoresis on a 1 % agarose
mini-gel to determine which reactions are successful in extending the
sequence.
The extended nucleotides are desalted and concentrated, transferred to 384-
well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research,
Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham
Pharmacia Biotech). For
25 shotgun sequencing, the digested nucleotides are separated on low
concentration (0.6 to 0.8%)
agarose gels, fragments are excised, and agar digested with AGAR ACE
(Promega). Extended
clones are religated using T4 ligase (New England Biolabs, Inc., Beverly MA)
into pUC 18 vector
(Amersham Phaimacia Biotech), treated with Pfu DNA polymerase (Stxatagene) to
fill-in restriction
site overhangs, and transfected into competent E. coli cells. Transformed
cells are selected on
3 o antibiotic-containing media, individual colonies are picked and cultured
overnight at 37 °C in 384-well
plates in LB/2x carbenicilliu liquid media.
The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase
(Amersham
Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following
parameters: Step 1:
94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min;
Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4
3 s repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at
4°C. DNA is quantified by PICOGREEN
162


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
reagent (Molecular Probes) as described above. samples with low DNA recoveries
are reamplifxed
using the same conditions as described above. Samples are diluted with 20%
dimethysulfoxide (1:2,
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the
DYENAMIC
DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator
cycle
s sequencing ready reaction kit (Applied Biosystems).
In like manner, the dithp is used to obtain regulatory sequences (promoters,
introns, and
enhancers) using the procedure above, oligonucleotides designed for such
extension, and an
appropriate genomic library.
~.o IX. Labeling of Probes and Southern Hybridization Analyses
Hybridization probes derived from the dithp of the Sequence Listing are
employed for
screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides
between 100 and
1000 nucleotides in length is specifically described, but essentially the same
procedure may be used
with larger cDNA fragments. Probe sequences are labeled at room temperature
for 30 minutes using
15 a T4 polynucleotide kinase, ~y3zP-ATP, and 0.5X One-Phor-All Plus (Amersham
Pharmacia Biotech)
buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia
Biotech). The
probe mixture is diluted to 10' dpm/~,g/ml hybridization buffer and used in a
typical membrane-based
hybridization analysis.
The DNA is digested with a restriction endonuclease such as Eco RV and is
electrophoresed
z o through a 0.7 % agarose gel. The DNA fragments are transferred from the
agarose to nylon
membrane (NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures
specified by the
manufacturer of the membrane. Prehybridization is carried out for three or
more hours at 68 °C, and
hybridization is carried out overnight at 68 °C. To remove non-specific
signals, blots are sequentially
washed at room temperature under increasingly stringent conditions, up to 0.1x
saline sodium citrate
2s (SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a
PHOSPHORJMAGER
cassette (Molecular Dynamics) or are exposed to autoradiography film,
hybridization patterns of
standard and experimental lanes are compared. Essentially the same procedure
is employed when
screening RNA.
3 o X. Chromosome Mapping of dithp
The cDNA sequences which were used to assemble SEQ ID NO:1-56 are compared
with
sequences from the Incyte LIFESEQ database and public domain databases using
BLAST and other
implementations of the Smith-Watexman algorithm. Sequences from these
databases that match SEQ
ID NO:l-56 are assembled into clusters of contiguous and overlapping sequences
using assembly
3 s , algorithms such as PHRAP (Table 8). Radiation hybrid and genetic mapping
data available from
163


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
public resources such as the ~tantord Human Cienome :enter (SHCrC:), Whitehead
Institute for
Genome Research (WIGR), and Genethon are used to determine if any of the
clustered sequences
have been previously mapped. Inclusion of a mapped sequence in a cluster will
result in the
assignment of all sequences of that cluster, including its particular SEQ ID
NO:, to that map location.
s The genetic map locations of SEQ ID N0:1-56 are described as ranges, or
intervals, of human
chromosomes. The map position of an interval, in centiMorgans, is measured
relative to the terminus
of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement
based on
recombination frequencies between chromosomal markers. On average, 1 cM is
roughly equivalent to
1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and
cold spots of
so recombination.) The cM distances are based on genetic markers mapped by
Genethon which provide
boundaries for radiation hybrid markers whose sequences were included in each
of the clusters.
XI. Microarray Analysis
Probe Preparation from Tissue or Cell Samples
Zs Total RNA is isolated from tissue samples using the guanidinium thiocyanate
method and
polyA+ RNA is purified using the oligo (dT) cellulose method. Each polyA+ RNA
sample is reverse
transcribed using MMLV reverse-transcriptase, 0.05 pg/~,I oligo-dT primer
(2lmer), 1X first strand
buffer, 0.03 units/pl RNase inhibitor, 500 ~,M dATP, 500 ~.M dGTP, 500 ~,M
dTTP, 40 ACM dCTP, 40
~.M dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). 'The reverse
transcription
a o reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with
GEMBRIGHT kits
(Iucyte). Specific control polyA+ RNAs are synthesized by in yitro
transcription from non-coding
yeast genomic DNA (W. Lei, unpublished). As quantitative controls, the control
mRNAs at 0.002 ng,
0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at
ratios of 1:100,000, 1:10,000,
1:1000, 1:100 (w/w) to sample mRNA respectively. The control mRNAs are diluted
into reverse
2 s transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w)
to sample mRNA differential
expression patterns. After incubation at 37° C for 2 hr, each reaction
sample (one with Cy3 and
another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and
incubated for 20
minutes at 85° C to the stop the reaction and degrade the RNA. Probes
are purified using two
successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories,
Inc.
3 0 (CLONTECH), Palo Alto CA) and after combining, both reaction samples are
ethanol precipitated
using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100%
ethanol. The probe is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook
NY) and resuspended
in 14 ~,15X SSC/0.2% SDS.
3 5 Microarray Preparation
164


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Sequences of the present invention are used to generate array elements. Each
array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts.
PCR amplification uses
primers complementary to the vector sequences flanking the cDNA insert. Array
elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a
final quantity greater than 5 ~.g.
s Amplified array elements are then purifted using SEPHACRYL-400 (Amersham
Pharmacia Biotech).
Purified array elements are immobilized on polymer-coated glass slides. Glass
microscope
slides (Corning) are cleaned by ultrasound in 0.1 % SDS and acetone, with
extensive distilled water
washes between and after treatments. Glass slides are etched in 4%
hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester, PA), washed extensively
in distilled water, and
so coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides
are cured in a 110°C
oven.
Array elements are applied to the coated glass substrate using a procedure
described in US
Patent No. 5,807,522, incorporated herein by reference. 1 ~.l of the array
element DNA, at an average
concentration of 100 ngl~,l, is loaded into the open capillary printing
element by a high-speed robotic
apparatus. The apparatus then deposits about 5 n1 of array element sample per
slide.
Microarrays are W-crosslinked using a STRATALINKER UV-crosslinker
(Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in
distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2%
casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°
C followed by washes in
ao 0.2% SDS and distilled water as before.
Hybridization
Hybridization reactions contain 9 ~Cl of probe mixture consisting of 0.2 ~Cg
each of Cy3 and Cy5
labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The
probe mixture is
z5 heated to 65°C for 5 minutes and is aliquoted onto the microarray
surface and covered with an 1.8
cm2 coverslip. The arrays are transferred to a waterproof chamber having a
cavity just slightly larger
than a microscope slide. The chamber is kept at 100% humidity internally by
the addition of 140 p1 of
5x SSC in a corner of the chamber. The chamber containing the arrays is
incubated for about 6.5
hours at 60° C. The arrays are washed for 10 min at 45° C in a
first wash buffer (1X SSC, 0.1 %
3 o SDS), three times for 10 minutes each at 45° C in a second wash
buffer (0.1X SSC), and dried.
Detection
Reporter-labeled hybridization complexes are detected with a microscope
equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of
generating spectral lines
3 s at 488 nm for excitation of Cy3 and at 632 rim for excitation of CyS. The
excitation laser light is
165


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
focused on the array using a 20X microscope objective (Nikon, Inc., Melville
NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the
microscope and raster-
scanned past the objective. The 1.8 cm x 1.8 cm array used in the present
example is scanned with a
resolution of 20 micrometers.
In two separate scans, a mixed gas multiline laser excites the two
fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube
detectors (PMT 81477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two
fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to
filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for
CyS. Each array is
Zo typically scanned twice, one scan per fluorophore using the appropriate
filters at the laser source,
although the apparatus is capable of recording the spectra from both
fluorophores simultaneously.
The sensitivity of the scans is typically calibrated using the signal
intensity generated by a
cDNA control species added to the probe mix at a known concentration. A
specific location on the
array contains a complementary DNA sequence, allowing the intensity of the
signal at that location to
1~ be correlated with a weight ratio of hybridizing species of 1:100,000. When
two probes from different
sources (e.g., representing test and control cells), each labeled with a
different fluorophore, are
hybridized to a single array for the purpose of identifying genes that are
differentially expressed, the
calibration is done by labeling samples of the calibrating cDNA with the two
fluorophores and adding
identical amounts of each to the hybridization mixture.
2o The output of the photomultiplier tube is digitized using a 12-bit RTI-835H
analog-to-digital
(A/D) conversion boaxd (Analog Devices, Inc., Norwood, MA) installed in an IBM-
compatible PC
computer. The digitized data are displayed as an image where the signal
intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low
signal) to red (high
signal). The data is also analyzed quantitatively. Where two different
fluorophores are excited and
z5 measured simultaneously, the data are first corrected for optical crosstalk
(due to overlapping emission
spectra) between the fluorophores using each fluorophore's emission spectrum.
A grid is superimposed over the fluorescence signal image such that the signal
from each spot
is centered in each element of the grid. The fluorescence signal within each
element is then
integrated to obtain a numerical value corresponding to the average intensity
of the signal. The
3 o software used for signal analysis is the GEMTOOLS gene expression analysis
program (Incyte).
XII. Complementary Nucleic Acids
Sequences complementary to the dithp are used to detect, decrease, or inhibit
expression of
the naturally occurring nucleotide. The use of oligonucleotides comprising
from about 15 to 30 base
35 pairs is typical in the art. However, smaller or larger sequence fragments
can also be used.
166


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Appropriate ohgonucleotides are designed ii~om the dithp using OLICiU 4.U6
software (National
Biosciences) or other appropriate programs and are synthesized using methods
standard in the art or
ordered from a commercial supplier. To inhibit transcription, a complementary
oligonucleotide is
designed from the most unique 5' sequence and used to prevent transcription
factor binding to the
promoter sequence. To inhibit translation, a complementary oligonucleotide is
designed to prevent
ribosomal binding and processing of the transcript.
XIII. Expression of DITHP
Expression and purification of DITHP is accomplished using bacterial or virus-
based
io expression systems. For expression of DITHP in bacteria, cDNA is subcloned
into an appropriate
vector containing an antibiotic resistance gene and an inducible promoter that
directs high levels of
cDNA transcription. Examples of such promoters include, but are not limited
to, the trp-lac (tac)
hybrid promoter and the TS or T7 bacteriophage promoter in conjunction with
the lac operator
regulatory element. Recombinant vectors are transformed into suitable
bacterial hosts, e.g.,
BL21(DE3). Antibiotic resistant bacteria express DITHP upon induction with
isopropyl beta-D-
thiogalactopyranoside (IPTG). Expression of DITHP in eukaryotic cells is
achieved by infecting
insect or mammalian cell lines with recombinant Autographica californica
nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of
baculovirus is
replaced with cDNA encoding DTI'I3P by either homologous recombination or
bacterial-mediated
2 o transposition involving transfer plasmid intermediates. Viral infectivity
is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant
baculovirus is used to
infect ~odoptera fru ig,,perda (Sf9) insect cells in most cases, or human
hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to
baculovirus. (See e.g., Engelhard,
su ra; and Sandig, supra.)
In most expression systems, D1THP is synthesized as a fusion protein with,
e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His,
permitting rapid, single-step,
afbnity-based purification of recombinant fusion protein from crude cell
lysates. GST, a 26-kilodalton
enzyme from Schistosoma-japonicum, enables the purification of fusion proteins
on immobilized
glutathione under conditions that maintain protein activity and antigenicity
(Amersham Pharmacia
3 o Biotech). Following purification, the GST moiety can be proteolytically
cleaved from DITHP at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables
i_mmunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies
(Eastman Kodak
Company, Rochester NY). 6-His, a stretch of six consecutive histidine
residues, enables purification
on metal-chelate resins (QIAGEN). Methods for protein expression and
purification are discussed in
167


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
Ausubel (lyy5, su ra, Chapters 10 and 16). Yuni'ted Dli'HL' obtatued by these
methods can be used
directly in the following activity assay.
XIV. Demonstration of DITHP Activity
s DTTHP activity is demonstrated through a variety of specific assays, some of
which are
outlined below.
Oxidoreductase activity of DTIWP is measured by the increase in extinction
coefficient of
NAD(P)H coenzyme at 340 nmfor the measurement of oxidation activity, or the
decrease in
extinction coefficient of NAD(P)H coenzyme at 340 mnfor the measurement of
reduction activity
so (Dalziel, K. (1963) J. Biol. Chem. 238:2850-2858). One of three substrates
may be used: Asn-~3Gal,
biocytidine, or ubiquinone-10. The respective subunits of the enzyme reaction,
for example,
cytochtome ci b oxidoreductase and cytochrome c, are reconstituted. The
reaction mixture contains
a)1-2 mg/ml DITHP; and b) 15 mM substrate, 2.4 mM NAD(P)+ in 0.1 M phosphate
buffer, pH 7.1
(oxidation reaction), or 2.0 mM NAD(P)H, in 0.1 M NazHP04 buffer, pH 7.4 (
reduction reaction); in
15 a total volume of 0.1 ml. Changes in absorbance at 340 nm (A34o) are
measured at 23.5 ° C using a
recording spectrophotometer (Shimadzu Scientific Instruments, Inc., Pleasanton
CA). The amount of
NAD(P)H is stoichiometrically equivalent to the amount of substrate initially
present, and the change
in A34o is a direct measure of the amount of NAD(P)H produced; DA3do =
6620[NADH].
Oxidoreductase activity of DITHP activity is proportional to the amount of
NAD(P)H present in the
2o assay.
Transferase activity of DITHP is measured through assays such as a methyl
transferase
assay in which the transfer of radiolabeled methyl groups between a donor
substrate and an. acceptor
substrate is measured (Bokar, J.A. et al. (1994) J. Biol. Chem. 269:17697-
17704). Reaction mixtures
(50 ~.1 final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgClz, 10 mM
dithiothreitol, 3 %
2s polyvinylalcohol, 1.5 ~,Ci [rnethyl-3H]AdoMet (0.375 ~.M AdoMet) (DuPont-
NEN), 0.6 ~,g DITHP,
and acceptor substrate (0.4 ~.g [355]RNA or 6-mercaptopurine (6-MP) to 1 mM
final concentration).
Reaction mixtures are incubated at 30 °C for 30 minutes, then 65
°C for 5 minutes. The products are
separated by chromatography or electrophoresis and the level of methyl
trausferase activity is
determined by quantification of methyl-3H recovery.
3 o DI'THP hydrolase activity is measured by the hydrolysis of appropriate
synthetic peptide
substrates conjugated with various chromogenic molecules in which the degree
of hydrolysis is
quantified by spectrophotometric (or fluorometric) absorption of the released
chromophore. (Beynon,
R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford
University Press, New
York NY, pp. 25-55) Peptide substrates are designed according to the category
of protease activity
3 s as endopeptidase (serine, cysteine, aspartic proteases), animopeptidase
(leucine aminopeptidase), or
168


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
carboxypeptiaase (~:arboxypeptiaase ~ ana ~, procouagem:-protemase~.
DITIlP isomerase activity such as peptidyl prolyl cisltrans isomerase activity
can be assayed
by an enzyme assay described by Rahfeld, J.U., et al. (1994) (FEBS Lett. 352:
180-184). The assay
is performed at 10°C in 35 mM HEPES buffer, pH 7.8, containing
chymotrypsin (0.5 mg/ml) and
DITHP at a variety of concentrations. Under these assay conditions, the
substrate, Suc-Ala-Xaa-Pro-
Phe-4-NA, is in equilibrium with respect to the prolyl bond, with 80-95% in
tt~ans and 5-20% in cis
conformation. An aliquot (2 u1) of the substrate dissolved in dimethyl
sulfoxide (10 mg/ml) is added to
the reaction mixture described above. Only the cis isomer of the substrate is
a substrate for cleavage
by chymotrypsin. Thus, as the substrate is isomerized by DITFIP, the product
is cleaved by
so chymotrypsin to produce 4-nitroanilide, which is detected by it's
absorbance at 390 nm. 4-Nitroanilide
appears in a time-dependent and a DTI'HP concentration-dependent manner.
An assay for DITHP activity associated with growth and development measures
cell
proliferation as the amount of newly initiated DNA synthesis in Swiss mouse
3T3 cells. A plasmid
containing polynucleotides encoding DITHI' is transfected into quiescent 3T3
cultured cells using
z s methods well known in the art. The transiently transfected cells are then
incubated in the presence of
[3H]thymidine, a radioactive DNA precursor. Where applicable, varying amounts
of DITHP ligand
are added to the transfected cells. Incorporation of [3H]thymidine into acid-
precipitable DNA is
measured over an appropriate time interval, and the amount incorporated is
directly proportional to the
amount of newly synthesized DNA.
ao Growth factor activity of DITHP is measured by the stimulation of DNA
synthesis in Swiss
mouse 3T3 cells (McKay, I. and I. Leigh, eds. (1993) Growth Factors: A
Practical Approach, Oxford
University Press, New York NY). Initiation of DNA synthesis indicates the
cells' entry into the
mitotic cycle and their commitment to undergo later division. 3T3 cells are
competent to respond to
most growth factors, not only those that are mitogenic, but also those that
are involved in embryonic
2 s induction. This competence is possible because the in yivo specificity
demonstrated by some growth
factors is not necessarily inherent but is determined by the responding
tissue. In this assay, varying
amounts of DITHP are added to quiescent 3T3 cultured cells in the presence of
[3H]thymidine, a
radioactive DNA precursor. DITHP for this assay can be obtained by recombinant
means or from
biochemical preparations. Incorporation of [3H]thymidine into acid-
precipitable DNA is measured
3 0 over an appropriate time interval, and the amount incorporated is directly
proportional to the amount of
newly synthesized DNA. A linear dose-response curve over at least a hundred-
fold DTTIlP
concentration range is indicative of growth factor activity. One unit of
activity per milliliter is defined
as the concentration of DTITIP producing a 50% response level, where 100%
represents maximal
incorporation of [3H]thymidine into acid-precipitable DNA.
s 5 Alternatively, an assay for cytokine activity of DITHP measures the
proliferation of
169


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
leukocytes. In this assay, the amount oi' trltlated thymldlne Incorporated
Into newly synthesized 1~NA
is used to estimate proliferative activity. Varying amounts of DITHP are added
to cultured
leukocytes, such as granulocytes, monocytes, or lymphocytes, in the presence
of [3H]thymidine, a
radioactive DNA precursor. DITHP for this assay can be obtained by recombinant
means or from
s biochemical preparations. Incorporation of [3H]thymidine into acid-
precipitable DNA is measured
over an appropriate time interval, and the amount incorporated is directly
proportional to the amount of
newly synthesized DNA. A linear dose-response curve over at least a hundred-
fold DITHP
concentration range is indicative of DTfHP activity. One unit of activity per
milliliter is conventionally
defined as the concentration of DTTHP producing a 50% response level, where
100% represents
io maximal incorporation of [3H]thymidine into acid-precipitable DNA.
An alternative assay for DITHP cytokine activity utilizes a Boyden micro
chamber
(Neuroprobe, Cabin John MD) to measure leukocyte chemotaxis (Vicari, supra).
In this assay, about
10s migratory cells such as macrophages or monocytes are placed in cell
culture media in the upper
compartment of the chamber. Varying dilutions of DITHP are placed in the lower
compartment. The
two compartments are separated by a 5 or 8 micron pore polycarbonate filter
(Nucleopore, Pleasanton
CA). After incubation at 37 °C for 80 to 120 minutes, the filters are
fixed in methanol and stained with
appropriate labeling agents. Cells which migrate to the other side of the
filter are counted using
standard microscopy. The chemotactic index is calculated by dividing the
number of migratory cells
counted when DfI:HP is present in the lower compartment by the number of
migratory cells counted
2o when only media is present in the lower compartment. The chemotactic index
is proportional to the
activity of DITI1P.
Alternatively, cell lines or tissues transformed with a vector containing
dithp can be assayed
for DTfHP activity by immunoblotting. Cells are denatured in SDS in the
presence of (3-
mercaptoethanol, nucleic acids removed by ethanol precipitation, and proteins
purified by acetone
2s precipitation. Pellets are resuspended in 20 mM tris buffer at pH 7.5 and
incubated with Protein G-
Sepharose pre-coated with an antibody specific for DTfIiP. After washing, the
Sepharose beads are
boiled in electrophoresis sample buffer, and the eluted proteins subjected to
SDS-PAGE. The SDS-
PAGE is transferred to a nitrocellulose membrane for immunoblotting, and the
DI TIiP activity is
assessed by visualizing and quantifying bands on the blot using the antibody
specific for Dl~'HP as the
3 o primary antibody and lzsl-labeled IgG specific for the primary antibody as
the secondary antibody.
DTTHP kinase activity is measured by phosphorylation of a protein substrate
using y-labeled
[saP]-ATP and quantitation of the incorporated radioactivity using a
radioisotope counter. DITHP is
incubated with the protein substrate, [32P]-ATP, and an appropriate kinase
buffer. The [3zp]
incorporated into the product is separated from free [32P]-ATP by
electrophoresis and the
170


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
incorporated [szP] is counted. The amount of [3'P] recovered is proportional
to the kinase activity of
DTI'FiP in the assay. A determination of the specific amino acid residue
phosphorylated is made by
phosphoamino acid analysis of the hydrolyzed protein.
In the alternative, DTI'IiP activity is measured by the increase in cell
proliferation resulting
s from transformation of a mammalian cell line such as COS7, HeLa or CHO with
an eukaryotic
expression vector encoding DITHP. Eukaryotic expression vectors are
commercially available, and
the techniques to introduce them into cells are well known to those skilled in
the art. The cells are
incubated for 48-72 hours after transformation under conditions appropriate
for the cell line to allow
expression of DITHP. Phase microscopy is then used to compare the mitotic
index of transformed
1o versus control cells. An increase in the mitotic index indicates DITHP
activity.
In a further alternative, an assay for DITHP signaling activity is based upon
the ability of
GPCR family proteins to modulate G protein-activated second messenger signal
transduction
pathways (e.g., cAMP; Gaudin, P. et al. (1998) J. Biol. Chem. 273:4990-4996).
A plasmid encoding
full length DITHP is transfected into a mammalian cell line (e.g., Chinese
hamster ovary (CHO) or
15 human embryonic kidney (HEK-293) cell lines) using methods well-known in
the art. Transfected
cells are grown in 12-well trays in culture medium for 48 hours, then the
culture medium is discarded,
and the attached cells are gently washed with PBS. The cells are then
incubated in culture medium
with or without ligand for 30 minutes, then the medium is removed and cells
lysed by treatment with 1
M perchloric acid. The CAMP levels in the lysate are measured by
radioimmunoassay using methods
2 o well-known in the art. Changes in the levels of cAMP in the lysate from
cells exposed to ligand
compared to those without ligand are proportional to the amount of DITHP
present in the transfected
cells.
Alternatively, an assay for DITTI=IP protein phosphatase activity measures the
hydrolysis of P
nitrophenyl phosphate (PNPP). DTIZ3P is incubated together with PNPP in HEPES
buffer pH 7.5, in
2 s the presence of 0.1 °~o (3-mercaptoethanol at 37 °C for 60
min. The reaction is stopped by the addition
of 6 ml of 10 N NaOH, and the increase in light absorbance of the reaction
mixture at 410 nm
resulting from the hydrolysis of PNPP is measured using a spectrophotometer.
The increase in light
absorbance is proportional to the phosphatase activity of DITHP in the assay
(Diamond, R.H. et al
(1994) Mol Cell Biol 14:3752-3762).
3 o An alternative assay measures DT)?I3P-mediated G-protein signaling
activity by monitoring the
mobilization of Ca++ as an indicator of the signal transduction pathway
stimulation. (See, e.g.,
Grynkievicz, G. et al. (1985) J. Biol. Chem. 260:3440; McColl, S. et al.
(1993) J. Tm_m__unol.
150:4550-4555; and Aussel, C. et al. (1988) J. Tmmunol. 140:215-220). The
assay requires preloading
neutrophils or T cells with a fluorescent dye such as FURA-2 or BCECF
(Universal Imaging Corp,
3 5 Westchester PA) whose emission characteristics are altered by Ca**
binding. When the cells are
171


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
exposed to one or more activating stimuli artificially (e.g., anti-CD3
antibody ligation of the T cell
receptor) or physiologically (e.g., by allogeneic stimulation), Ca** flux
takes place. This flux can be
observed and quantified by assaying the cells in a fluorometer or fluorescent
activated cell sorter.
Measurements of Ca++ flux are compared between cells in their normal state and
those transfected
with DITHP. Increased Ca++ mobilization attributable to increased DITHP
concentration is
proportional to DTI'I3P activity.
DTTHP transport activity is assayed by measuring uptake of labeled substrates
into Xenopus
laevis oocytes. Oocytes at stages V and VI are injected with DTTHP mRNA (10 ng
per oocyte) and
incubated for 3 days at 18°C in OR2 medium (82.5mM NaCl, 2.5 mM KCl,
1mM CaCl2, 1mM MgClz,
so 1mM Na2HP04, 5 mM Hepes, 3.8 mM NaOH, 50~.g1m1 gentamycin, pH 7.8) to allow
expression of
DITIIP protein. Oocytes are then transferred to standard uptake medium (100mM
NaCl, 2 mM KCl,
1mM CaClz, 1mM MgClz, 10 mM Hepes/Tris pH 7.5). Uptake of various substrates
(e.g., amino
acids, sugars, drugs, ions, and neurotransmitters) is initiated by adding
labeled substrate (e.g.
radiolabeled with 3H, fluorescently labeled with rhodamine, etc.) to the
oocytes. After incubating for
30 minutes, uptake is terminated by washing the oocytes three times in Na+-
free medium, measuring
the incorporated label, and comparing with controls. DITHP transport activity
is proportional to the
level of internalized labeled substrate.
DTI'HP transferase activity is demonstrated by a test for
galactosyltransferase activity. This
can be determined by measuring the transfer of radiolabeled galactose from UDP-
galactose to a
2o GlcNAc-terminated oligosaccharide chain (Kolbinger, F. et al. (1998) J.
Biol. Chem. 273:58-65). The
sample is incubated with 14 ~Cl of assay stock solution (180 mM sodium
cacodylate, pH 6.5, 1 mglml
bovine serum albumin, 0.26 mM UDP-galactose, 2 p1 of UDP-[3H]galactose), 1 ~.1
of MnCl2 (500
mM), and 2.5 ~,1 of GlcNAc[30-(CHZ)$ COZMe (37 mg/ml in dimethyl sulfoxide)
for 60 minutes at
37 °C. The reaction is quenched by the addition of 1 ml of water and
loaded on a C18 Sep-Pak
2 s cartridge (Waters), and the column is washed twice with 5 ml of water to
remove unreacted UDP-
[3H]galactose. The ['H]galactosylated GlcNAc~iO-(CHz)$ C02Me remains bound to
the column
during the water washes and is eluted with 5 ml of methanol. Radioactivity in
the eluted material is
measured by liquid scintillation counting and is proportional to
galactosyltransferase activity in the
starting sample.
3 0 . In the alternative, DTTHP induction by heat or toxins may be
demonstrated using primary
cultures of human fibroblasts or human cell lines such as CCL-13, HEK293, or
HEP G2 (ATCC). To
heat induce DTIZ3P expression, aliquots of cells are incubated at 42 °C
for 15, 30, or 60 minutes.
Control aliquots are incubated at 37 °C for the same time periods. To
induce DITHP expression by
toxins, aliquots of cells are treated with 100 p.M arsenite or 20 mM azetidine-
2-carboxylic acid for 0, 3,
3 5 6, or 12 hours. After exposure to heat, arsenite, or the amino acid
analogue, samples of the treated
172


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
cells are harvested and cell lysates prepared for analysis by western blot.
Cells are lysed in lysis
buffer containing 1 % Nonidet P-40, 0.15 M NaCl, 50 mM Tris-HCl, 5 mM EDTA, 2
mM
N-ethylmaleimide, 2 mM phenylmethylsulfonyl fluoride, 1 mg/ml leupeptin, and 1
mg/ml pepstatin.
Twenty micrograms of the cell lysate is separated on an 8% SDS-PAGE gel and
transferred to a
s membrane. After blocking with 5% nonfat dry milk/phosphate-buffered saline
for 1 h, the membrane
is incubated overnight at 4°C or at room temperature for 2-4 hours with
a 1:1000 dilution of
anti-DTIZ3P serum in 2% nonfat dry milk/phosphate-buffered saline. The
membrane is then washed
and incubated with a 1:1000 dilution of horseradish peroxidase-conjugated goat
anti-rabbit IgG in 2°l0
dry milklphosphate-buffered saline. After washing with 0.1 % Tween 20 in
phosphate-buffered saline,
1o the DITHP protein is detected and compared to controls using
chemiluminescence.
Alternatively, DITHP protease activity is measured by the hydrolysis of
appropriate synthetic
peptide substrates conjugated with various chromogenic molecules in which the
degree of hydrolysis is
quantified by spectrophotometric (or fluorometric) absorption of the released
chromophore (Beynon,
R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach, Oxford
University Press, New
15 York, NY, pp.25-55). Peptide substrates are designed according to the
category of protease activity
as endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases),
aminopeptidase (leucine
aminopeptidase), or carboxypeptidase (carboxypeptidases A and B, procollagen C-
proteinase).
Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and furylacrylic
acid. Assays are
performed at ambient temperature and contain an aliquot of the enzyme and the
appropriate substrate
2 o in a suitable buffer. Reactions are carried out in an optical cuvette, and
the increase/decrease in
absorbance of the chromogen released during hydrolysis of the peptide
substrate is measured. The
change in absorbance is proportional to the DITHP protease activity in the
assay.
In the alternative, an assay for DTfHP protease activity takes advantage of
fluorescence
resonance energy transfer (FRET) that occurs when one donor and one acceptor
fluorophore with an
2s appropriate spectral overlap are in close proximity. A flexible peptide
linker containing a cleavage site
specific for PRTS is fused between a red-shifted variant (RSGFP4) and a blue
variant (BFPS) of
Green Fluorescent Protein. This fusion protein has spectral properties that
suggest energy transfer is
occurring from BFPS to RSGFP4. When the fusion protein is incubated with
DTTHP, the substrate is
cleaved, and the two fluorescent proteins dissociate. This is accompanied by a
marked decrease in
3 o energy transfer which is quantified by comparing the emission spectra
before and after the addition of
DITHP (Mitra, R.D. et al (1996) Gene 173:13-17). This assay can also be
performed in living cells.
In this case the fluorescent substrate protein is expressed constitutively in
cells and DTTHP is
introduced on an inducible vector so that FRET can be monitored in the
presence and absence of
DITHI' (Sagot, I. et al (1999) FEBS Lett. 447:53-57).
3 5 A method to determine the nucleic acid binding activity of DI'fHP involves
a polyacrylamide
173


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
gel mobility-shift assay. In preparation for this assay, D1THP is expressed by
transforming a
mammalian cell line such as COS7, HeLa or CHO with a eukaryotic expression
vector containing
DTTFiP cDNA. The cells are incubated for 48-72 hours after transformation
under conditions
appropriate for the cell line to allow expression and accumulation of DTTHP.
Extracts containing
s solubilized proteins can be prepared from cells expressing DITHP by methods
well known in the art.
Portions of the extract containing DITHP are added to [3zp]_labeled RNA or
DNA. Radioactive
nucleic acid can be synthesized in vitro by techuiques well known in the art.
The mixtures are
incubated at 25 °C in the presence of RNase- and DNase-inhibitors under
buffered conditions for 5-10
minutes. After incubation, the samples are analyzed by polyacrylamide gel
electrophoresis followed
so by autoradiography. The presence of a band on the autoradiogram indicates
the formation of a
complex between DITHP and the radioactive transcript. A band of similar
mobility will not be present
in samples prepared using control extracts prepared from untransformed cells.
In the alternative, a method to determine the methylase activity of a DITHP
measures
transfer of radiolabeled methyl groups between a donor substrate and an
acceptor substrate. Reaction
15 mixtures (50 ~,l final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgCl2,
10 mM dithiothreitol,
3 % polyvinylalcohol, 1.5 p,Ci [methyl-3H]AdoMet (0.375 p.M AdoMet) (DuPont-
NEN), 0.6 p,g
DITHP, and acceptor substrate (e.g., 0.4 ~,g [355]RNA, or 6-mercaptopurine (6-
MP) to 1 mM final
concentration). Reaction mixtures are incubated at 30°C for 30 minutes,
then 65°C for 5 minutes.
Analysis of [methyl-3H]RNA is as follows: 1) 50 ~.l of 2 x loading buffer (20
mM Tris-HCl, pH 7.6, 1
2o M LiCl, 1 mM EDTA, 1% sodium dodecyl sulphate (SDS)) and 50 ~.l oligo d(T)-
cellulose (10 mg/ml in
1 x loading buffer) are added to the reaction mixture, and incubated at
ambient temperature with
shaking for 30 minutes. 2) Reaction mixtures are transferred to a 96-well
filtration plate attached to a
vacuum apparatus. 3) Each sample is washed sequentially with three 2.4 ml
aliquots of 1 x oligo d(T)
loading buffer containing 0.5% SDS, 0.1% SDS, or no SDS. and 4) RNA is eluted
with 300 ~,l of
2s water into a 96-well collection plate, transferred to scintillation vials
containing liquid scintillant, and
radioactivity determined. Analysis of [methyl-3H]6-MP is as follows: 1) 500
x,10.5 M borate buffer,
pH 10.0, and then 2.5 ml of 20% (v/v) isoamyl alcohol in toluene are added to
the reaction mixtures.
2) The samples mixed by vigorous vortexing for ten seconds. 3) After
centrifugation at 7008 for 10
minutes, 1.5 ml of the organic phase is transferred to scintillation vials
containing 0.5 ml absolute
3 o ethanol and liquid scintillant, and radioactivity determined. and 4)
Results are corrected for the
extraction of 6-MP into the organic phase (approximately 41%).
An assay for adhesion activity of DITHP measures the disruption of
cytoskeletal filament
networks upon overexpression of DITHP in cultured cell lines (Rezniczek, G.A.
et al. (1998) J. Cell
Biol. 141:209-225). cDNA encoding DITHP is subcloned into a mammalian
expression vector that
35 drives high levels of cDNA expression. This construct is transfected into
cultured cells, such as rat
174.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
kangaroo YtlSL or rat bladder carcinoma tSU4Ci cells. Actin filaments and W
termediate filaments such
as keratin and vimentin are visualized by immunofluorescence microscopy using
antibodies and
techniques well known in the art. The configuration and abundance of
cytoskeletal filaments can be
assessed and quantified using confocal imaging techniques, In particular, the
bundling and collapse of
cytoskeletal filament networks is indicative of DITHP adhesion activity.
Alternatively, an assay for DITHP activity measures the expression of DITHP on
the cell
surface. cDNA encoding DITIE' is transfected into a non-leukocytic cell line.
Cell surface proteins
are labeled with biotin (de la Fuente, M.A. et al. (1997) Blood 90:2398-2405).
Tmmunoprecipitations
are performed using DITHP-specific antibodies, and immunoprecipitated samples
are analyzed using
Zo SDS-PAGE and immunoblotting techniques. The ratio of labeled
i_m_m__unoprecipitant to unlabeled
immunoprecipitant is proportional to the amount of DITHP expressed on the cell
surface.
Alternatively, an assay for DITHI' activity measures the amount of cell
aggregation induced
by overexpression of DITHP. In this assay, cultured cells such as N1H3T3 are
transfected with
cDNA encoding DTITIP contained within a suitable mammalian expression vector
under control of a
i5 strong promoter. Cotransfection with cDNA encoding a fluorescent marker
protein, such as Green
Fluorescent Protein (CLONTECIi), is useful for identifying stable
transfectants. The amount of cell
agglutination, or clumping, associated with transfected cells is compared with
that associated with
untransfected cells. The amount of cell agglutination is a direct measure of
DITHP activity.
DITHP may recognize and precipitate antigen from serum. This activity can be
measured by
2o the quantitative precipitin reaction (Golub, E.S. et al. (1987) Immunology:
A S thesis, Sinauer
Associates, Sunderland MA, pages 113-115). DITHP is isotopically labeled using
methods known in
the art. Various serum concentrations are added to constant amounts of labeled
DITHP. DTIT3P-
antigen complexes precipitate out of solution and are collected by
centrifugation. The amount of
precipitable DTI'IiP-antigen complex is proportional to the amount of
radioisotope detected in the
25 precipitate. The amount of precipitable D1THP-antigen complex is plotted
against the serum
concentration. For various serum concentrations, a characteristic
precipitation curve is obtained, in
which the amount of precipitable DITHP-antigen complex initially increases
proportionately with
increasing serum concentration, peaks at the equivalence point, and then
decreases proportionately
with further increases in serum concentration. Thus, the amount of
precipitable DITHP-antigen
3 o complex is a measure of DITHP activity which is characterized by
sensitivity to both limiting and
excess quantities of antigen.
A microtubule motility assay for DIfIiP measures motor protein activity. In
this assay,
recombinant DITHP is immobilized onto a glass slide or similar substrate.
Taxol-stabilized bovine
. brain microtubules (commercially available) in a solution containing ATP and
cytosolic extract are
3 5 perfused onto the slide. Movement of microtubules as driven by DITHP motor
activity can be
175


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
visualized and quantified using video-enhanced light microscopy and image
analysis techniques.
DTI~iP motor protein activity is directly proportional to the frequency and
velocity of microtubule
movement.
Alternatively, an assay for DITHP measures the formation of protein filaments
in vitro. A
s solution of DITHP at a concentration greater than the "critical
concentration" for polymer assembly is
applied to carbon-coated grids. Appropriate nucleation sites may be supplied
in the solution. The
grids are negative stained with 0.7% (w/v) aqueous uranyl acetate and examined
by electron
microscopy. The appearance of filaments of approximately 25 71m
(microtubules), 8 nm (actin), or 10
nm (intermediate filaments) is a demonstration of protein activity.
io DITHP electron transfer activity is demonstrated by oxidation or reduction
of NADP.
Substrates such as Asn-(3Gal, biocytidine, or ubiquinone-10 may be used. The
reaction mixture
contains 1-2 mg/ml HO1ZP, 15 rnM substrate, and 2.4 mM NAD(P)+ in 0.1 M
phosphate buffer, pH
7.1 (oxidation reaction), or 2.0 mM NAD(P)H, in 0.1 M Na2HP04 buffer, pH 7.4
(reduction
reaction); in a total volume of 0.1 ml. FAD may be included with NAD,
according to methods well
15 known in the art. Changes in absorbance are measured using a recording
spectrophotometer. The
amount of NAD(P)H is stoichiometrically equivalent to the amount of substrate
initially present, and
the change in A3ao is a direct measure of the amount of NAD(P)H produced;
~A3~o = 6620[NADH].
DTI~3P activity is proportional to the amount of NAD(P)H present in the assay.
The increase in
extinction coefficient of NAD(P)H coenzyme at 340 nm is a measure of oxidation
activity, or the
2o decrease in extinction coefficient of NAD(P)H coenzyme at 340 nm is a
measure of reduction
activity (Dalziel, K. (1963) J. Biol. Chem. 238:2850-2858).
DTTHP trauscription factor activity is measured by its ability to stimulate
transcription of a
reporter gene (Liu, H.Y. et al. (1997) EMBO J. 16:5289-5298). The assay
entails the use of a well
characterized reporter gene construct, LexAoP LacZ, that consists of LexA DNA
transcriptional
z5 control elements (LexAop) fused to sequences encoding the E. coli LacZ
enzyme. The methods for
constructing and expressing fusion genes, introducing them into cells, and
measuring LacZ enzyme
activity, are well known to those skilled in the art. Sequences encoding DITHP
are cloned into a
plasmid that directs the synthesis of a fusion protein, LexA-DTTHP, consisting
of DITHP and a DNA
binding domain derived from the LexA transcription factor. The resulting
plasmid, encoding a LexA-
3 o DITHP fusion protein, is introduced into yeast cells along with a plasmid
containing the LexAoP LacZ
reporter gene. The amount of LacZ enzyme activity associated with LexA-DITHP
trausfected cells,
relative to control cells, is proportional to the amount of transcription
stimulated by the DTIT3P.
Chromatin activity of DITHP is demonstrated by measuring sensitivity to DNase
I (Dawson,
B.A. et al. (1989) J. Biol. Chem. 264:12830-12837). Samples are treated with
DNase I, followed by
s s insertion of a cleavable biotinylated nucleotide aualog, 5-[(N-
biotinamido)hexanoamido-ethyl-1,3-
176


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
thiopropionyl-3-aminoallyl]-2 =deoxyuridine 5'-triphosphate using nick-repair
techniques well known to
those skilled in the art. Following purification and digestion with EcoRI
restriction endonuclease,
biotinylated sequences are affinity isolated by sequential binding to
streptavidin and biotincellulose.
Another specific assay demonstrates the ion conductance capacity of DITHP
using an
s electrophysiological assay. DITHP is expressed by transforming a mammalian
cell line such as
COS7, HeLa or CHO with a eukaryotic expression vector encoding DITHP.
Eukaryotic expression
vectors are commercially available, and the techniques to introduce them into
cells are well known to
those skilled in the art. A small amount of a second plasmid, which expresses
any one of a number of
marker genes such as [3-galactosidase, is co-transformed into the cells in
order to allow rapid
1 o identification of those cells which have taken up and expressed the
foreign DNA. The cells are
incubated for 48-72 hours after transformation under conditions appropriate
for the cell line to allow
expression and accumulation of D1THP and (3-galactosidase. Transformed cells
expressing (3-
galactosidase are stained blue when a suitable colorimetric substrate is added
to the culture media
under conditions that are well known in the art. Stained cells are tested for
differences in membrane
1s conductance due to various ions by electrophysiological techniques that are
well known in the art.
Untransformed cells, andlor cells transformed with either vector sequences
alone or (3-galactosidase
sequences alone, are used as controls and tested in parallel. The contribution
of DITHP to canon or
anion conductance can be shown by incubating the cells using antibodies
specific for either DITHP.
The respective antibodies will bind to the extracellular side of DTIT3P,
thereby blocking the pore in the
2 o ion channel, and the associated conductance.
XV. )~nctional Assays
DITHP function is assessed by expressing dithp at physiologically elevated
levels in
mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing
2 s a strong promoter that drives high levels of cDNA expression. Vectors of
choice include pCMV
SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA),
both of which
contain the cytomegalovirus promoter. 5-10 ~,g of recombinant vector are
transiently transfected into
a human cell line, preferably of endothelial or hematopoietic origin, using
either liposome formulations
or electxoporation. 1-2 ~Cg of an additional plasmid containing sequences
encoding a marker protein
3 o are co-transfected.
Expression of a marker protein provides a means to distinguish transfected
cells from
nontransfected cells and is a reliable predictor of cDNA expression from the
recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP;
CLONTECH), CD64, or a
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based
technique, is
177


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
used to identify transfected cells expressing GFP or CD64-GFP and to evaluate
the apoptotic state of
the cells and other cellular properties.
FCM detects and quantifies the uptake of fluorescent molecules that diagnose
events
preceding or coincident with cell death. These events include changes in
nuclear DNA content as
measured by staining of DNA with propidium iodide; changes in cell size and
granularity as measured
by forward light scatter and 90 degree side light scatter; down-regulation of
DNA synthesis as
measured by decrease in bromodeoxyuridine uptake; alterations in expression of
cell surface and
intracellular proteins as measured by reactivity with specific antibodies; and
alterations in plasma
membrane composition as measured by the binding of fluorescein-conjugated
Annexin V protein to the
1o cell surface. Methods in flow cytometry are discussed in Ormerod, M. G.
(1994) Flow Cytometry,
Oxford, New York NY.
The influence of DITHP on gene expression can be assessed using highly
purified populations
of cells transfected with sequences encoding DITHP and either CD64 or CD64-
GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to
conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from
nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL,
Inc., Lake
Success NY). mRNA can be purified from the cells using methods well known by
those of skill in the
art. Expression of mRNA encoding DIT'HP and other genes of interest can be
analyzed by northern
analysis or microarray techniques.
XVI. Production of Antibodies
DITHP substantially purified using polyacrylamide gel electrophoresis (PAGE;
see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification
techniques, is used to
immunize rabbits and to produce antibodies using standard protocols.
Alternatively, the DITHP amino acid sequence is analyzed using LASERGENE
software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding
peptide is synthesized
and used to raise antibodies by means known to those of skill in the art.
Methods for selection of
appropriate epitopes, such as those near the C-terminus or in hydrophilic
regions are well described in
the art. (See, e.g., Ausubel, 1995, s_~ra, Chapter 11.)
3 o Typically, peptides 15 residues in length are synthesized using an ABI
431A peptide
synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH
(Sigma) by reaction with
N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase
immunogenicity. (See, e.g.,
Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in
complete Freund's
adjuvant. Resulting antisera are tested for antipeptide activity by, for
example, binding the peptide to
plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and
reacting with radio-
178


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
iodinated goat anti-rabbit lgCi. Antisera with antlpeptide activity are tested
for anti-D1THY activity
using protocols well known in the art, including ELISA, RIA, and
immunoblotting.
XVII. Purification of Naturally Occurring DITHP Using Specific Antibodies
Naturally occurring or recombinant DTIT3P is substantially purified by
immunoaffinity
chromatography using antibodies specific for DITHP. An immunoaffinity column
is constructed by
covalently coupling anti-DITHP antibody to an activated chromatographic resin,
such as
CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the
resin is
blocked and washed according to the manufacturer's instructions.
1o Media containing DTIHP are passed over the immunoaffinity column, and the
column is
washed under conditions that allow the preferential absorbance of DI1'IIP
(e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions
that disrupt
antibody/DITHP binding (e.g., a buffer of pH 2 to pH 3, or a high
concentration of a chaotrope, such
as urea or thiocyanate ion), and DTTIIP is collected.
l5
XVIII. Identification of Molecules Which Interact with DITHP
DITHP, or biologically active fragments thereof, are labeled with 1~I Bolton-
Hunter reagent.
(See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.)
Candidate molecules
previously arrayed in the wells of a multi-well plate are incubated with the
labeled DITHP, washed,
ao and any wells with labeled DITHP complex are assayed. Data obtained using
different
concentrations of DITHP are used to calculate values for the number, affinity,
and association of
DITHP with the candidate molecules.
Alternatively, molecules interacting with DTI'I3P are analyzed using the yeast
two-hybrid
system as described in Fields, S. and O. Song (1989) Natuxe 340:245-246, or
using commercially
25 available kits based on the two-hybrid system, such as the MATCHMAKER
system (CLONTECH).
DTTHP may also be used in the PATHCALLING process (CuraGen Corp., New Haven
CT)
which employs the yeast two hybrid system in a high-throughput manner to
determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K.
et al. (2000) U.S.
Patent No. 6,057,101).
All publications and patents mentioned in the above specification are herein
incorporated by
reference. Various modifications and variations of the described method and
system of the invention
will be apparent to those skilled in the art without departing from the scope
and spirit of the invention.
Although the invention has been described in connection with specific
preferred embodiments, it
179


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
should be understood that the invention as claimed should not be unduly
limited to such specific
embodiments. Indeed, various modifications of the above-described modes for
carrying out the
invention which are obvious to those skilled in the field of molecular biology
or related fields are
intended to be within the scope of the following claims.
180


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N N N N N N N N N N N N N N N
r N N N N N N N N N N ,-, ,-, N N ,-, N ,_ ,_ N ,-. ,- N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
QQQQQQQQQQaQaQQQQQQaQQaQQaQaQQQQ
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 ~ ~ o ~ o o ° o 0 0 0 0 0 0 0 0 ~ ~
Q N O O O O O O O O O O N N N O O N O N N O N N N N N N N N N O O
N N N N N N N N N N .. .. N N .. N -. N -- -- -- -. N N
O O O O O O O O O O ~ ~ ~ O O ~ p O O ~ O O O O O O O O O p O
~fj r- ~- ~- ~- r r- r- ~rj ~ ~fj ~ ~- ~- ~, r- ~- ~~ r- ~- t(j ~- .- .
a~00~UNa~O~aMO~00~~~a0~MpM~o~O~NN~~_~~Un_NN
N a0 ~ ~l' N ~ ~_ U N ~ ~ ~ N ~O ~ ~' N oM0 ~ ~ N O ~ ~ V U ON ~
Mc~c~~O~Ot~O~~~IwoO~~NNf~oO~oQOOI~IwNI~IwN~
c~ c~ ~t c~ O N N V V O ~' O O M N ~N d. ~ ~ a0 - - N ~ ~ N ~ i~ ~
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
J
~ I~ 00 U O N c~ 'd' ~ ~O I~ a0 O~ O N c~ ~ ~ ~O I~. 00 U O N ch ~l' ~ ~O W o~
~ ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O 1~ Iw Iw IW Iw Iw I~ Iw Iw o0 a0 a0 a0 a0 00
00 0O o0
w
tn
N N N N N N N N N N N N N ~ N N N N N N N ~ N ~ N ~ N N N ~ N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
oQQQaQQaQaQaQQQQQQQQaQaQaQaQQQaQQ
- o 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
~ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ° ° o ° o o ° 0 0 0
0 0 0 0 0 0 ° o
-~ N O O O O O O O O O O N N N O O N O N N O N N N N N N N N N O O
_O .. N N N N N N N N N N -. .. N ~ ~ ~ _ _ N .- r ~ ~ ~ ~ ~ -- -- N N
~ ~_ c~ ~O a~ O a0 Is Is U c'~ ~ ~ ~O ~O p r ~rj ~.f~ U ~j ~O Is c'~ ch a0 CV
M ~ ~i' ~j
~O a0 O Iw U N o0 c0 ~O ~ ~ a0 ~ M U M ~ N N ~ ~ Iw U Iw N N d.
N a0 ~ V N ~ ~_ U N ~ O O Iw ,p a0 ~ 00 c~ d. d' a0 ~_ U_ N O N ~ 00
~ ~ N c~ 0O Iw o0 ~ N c~ N ~ ~ ~ ~ ~ N N ~ ~ ~ 0~0 ONO ~ I~ N n ~ N ~ M Iw
c~ c~ tw O ~ O c~ O ~O I~ c~ N u7 c~
c~ c~ d M O N N '~' 'fit O ~' O O c,.) N ~ ~t ~ ~ 00
J J J J J J J J J J J J J J J J J J ~ J J J J J J J J J J J J J
Z
N C'~ d' ~ ~O 1~ 0O O~ O r N C~ ~f' ~ ~O 1~ 00 U O ~ N M d' ~ ~O Iw 00 U O N
- ~ ~ ~ ~ r- r- ~ ~ ~ ~- N N N N N N N N N N c~ c~ c~
UJ
M
Igl


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N_ N_
N N N N N N N N N N N ~ N N N N N Z Z N N N
QQaaQQQaaQQ~QaQaQaQQQaQQQ
oo~ooo~oo~oooo~~ooooo0
Q O O O ~OV O N ~ 0 0 ~ 0 'N O O ~ ~ N -N~- ~ O O O N O O
~ 0 0 0 0 0 ~ ~ o o ~ oo; 0 0 0 ~ o 0 0 0 0 0 0 0 0
r CV CV ~- U ~ r ~- ~- '- r N r- r U '-' ~rj O O ~j c~ CV ~ c''~ ~t
U U O °~ O N ~ N ~ c~7 O ~ ~ ~ O O~ N ~ N ~'
U ~ O t1 ~i' V N Q, d. O N p, c~ ~ ~ op N N N O~ O ~ Iw ~
O O I~ a0 ~ O ~ ~O O CO O °~ N N W lyy. ~ tn °'7 ~O O
c~ c ~ O N U ~ ~ ~ ~ N ~ N N ~ ~"~ c~j ~ n ~ ~ O o0 ~ ~
N ~ ~"~ a0 r ~ c~ N O O O N ~ ~ V ~ ~ 'd' O O ,_
J J J J J J J J J J J r_ J J J J J r r J J J J J J
J J
W
z U 0 N c'~ V ~ ~O (~ OO U O r N c~ ~1' ~f7 ~O I~ 00 U 0 '-' rN-
Q ~ 0p O' U U U U U U U U O' O O O O O O O O O O r '-' r r
r r r
W
N N N N N N N N N N N N N N N N CV N N N N N N N N
zzzzzzzzzzzQzzzzzQQzzzzzz
OQQQQQQQQQQQo4QQQQooQQQQQQ
~ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+- O O O O O O O O O O O N O O O O O N N O O O O O O
N N N N N N N N N N N N N N N N N N N N N N
Q r N N .-- U c~ '-- r r r r N r- ~- U ~- ~;.~ O O N C'~ N ~ c'~ ~1'
Iw O~ O~ O ~ N O N o N ~ c~') O ~ ~ ~ N N O U C~! ~
U ~ O h d' O~ ~_i' N Om-- c~ 00 I~ 1~ N O~ O
O O h o~ ~ ~ O ~ ~O O o0 O N N N I~ 1~ 'd'
~ O N U ~ ~ ~ N ~ N Is N N ~ M cN ~ n n ~ ~ °~ ~O ~O
t~ N V c'~7 a0 ~ ~ c_'~ N O O r O N ~ ~ V ~ r ~I' O O
J J J J J J J J J J J ~~~ ...1 J J J J ~~~ ~~~ J J J
J J J
Z
c~~ c~ c~ c~ c ~ ~ M OV ~I' NV V due' ~ ~ ~ V ~ ~ ~.c~
UJ
182 _ -- _


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 2
SEQ ID Template ID GI Number Annotation
NO: Probability


Score


1 LI:1983416.1;2001JAN12829098603.00E-20NADH-ubiquinone oxidoreductase


subunit CI-KFYI (Homo sapiens)


2 LI:3322b3.1:2001JAN12858172445.00E-42dJ20N2.1 (novel protein
similar to


yeast and bacterial cytosine


deaminase) (Homo sapiens)


3 LI;333886.4:2001JAN12831207 3.00E-27put.thyroid hormone receptor
(Homo


sapiens)


4 LI;478508.1;200iJANi28128400554.00E-49putative (Mus musculus)


LI:307470.1;2001JAN128104375692.00E-23unnamed protein product
(Homo


sapiens)


b LI;058298.1:2001JAN1283b615 1.00E-102serine/threonine protein
kinase


(Homo sapiens)


7. LI:205527.5;2001JAN12815617181.00E-49Human 14-3-3 epsilon mRNA,


complete cds.


8 LI;231587.1;2001 g 18722006.00E-13alternatively spliced product
JAN 12 using


exon 13A (Homo sapiens)


9 LI;402919.1:2001JAN128152777064.00E-73homeobox protein GBX-2b
(Xenopus


laevis)


LI;4b3283.1:2001 g 104374855.00E-31unnamed protein product
JAN 12 (Homo


sapiens)


11 LI:072560.1:2001 g 18722004.00E-23alternatively spliced product
JAN 12 using


exon 13A (Homo sapiens)


12 LI:1953096.1:2001JAN128104375692.00E-18unnamed protein product
(Homo


sapiens)


13 LI:1076016.1:2001JAN12873413722.00E-33retinoblastoma-binding
protein 1-


related protein (Rattus
norvegicus)


14 LI;2082796.1:2001JAN12843231523.00E-30Ets-protein Spi-C (Mus
musculus)


LI:335681.3:2001 g 144566312.00E-44dJ54B20.4 (novel KRAB box
JAN 12


containing C2H2 type zinc
finger


protein} (Homo sapiens)


16 LI;214150.1;2001JANi2810201455.00E-20DNA binding protein (Homo
sapiens)


17 LI:322783.15;2001JAN12871597992.00E-13dJ351 K20.1.1 (novel C3HC4
type Zinc


finger (RING finger) protein
(isoform


1 )) (Homo sapiens)


18 LI:422993.1;2001JANi28104401364.00E-84unnamed protein product
(Homo


sapiens)


19 LI:1172885.1:2001JAN128468708 1.00E-28zinc finger protein (Homo
sapiens)


LI:1088359.1:2001g 13361584.00E-46pancreas only zinc finger
JAN 12 protein


(Rattus norvegicus)


21 LI:813422.1;2001JAN12870232161.00E-16unnamed protein product
(Homo


sapiens)


22 LI:1186426.1:2001JAN128506502 1.00E-141NK10 (Mus musculus)


23 LI:1182817.1:2001JAN12810201450 DNA binding protein (Homo
sapiens}


24 LI;1170153.9:2001870232167.00E-68unnamed protein product
JAN 12 (Homo


sapiens)


LI;1171553.1:2001860881001.00E-29zinc finger protein (ZFD25)
JAN 12 (Homo


sapiens)


26 LI:2121978.1;2001JAN12861183832.00E-14zinc finger protein ZNF223
(Homo


sapiens)


1~3


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 2
SEQ Template ID GI Number Annotation
ID Probability
NO:


Score


27 LI:1174292.5;2001JAN12g79592070 KIAA1473 protein (Homo
sapiens)


28 LI;1179173.1:2001JAN12g108352840 Zinc finger protein ZNF223
(amino


acids 82-482) (Homo sapiens)


29 LI:2122025.1:2001JAN12g26894441.00E-93ZNF134 (Homo sapiens)


30 LI:2049224.1:2001 g70232163.00E-55unnamed protein product
JAN 12 (Homo


sapiens)


31 LI;758541.1:2001JAN12g340444 3.00E-61zinc finger protein 41
(Homo sapiens)


32 LI:137815.1:2001JAN12g69841720 zinc finger protein ZNF226
(Homo


sapiens)


33 LI:335097.1:2001 g70204401.00E-24unnamed protein product
JAN 12 (Homo


sapiens)


34 LI:232059.2:2001 g92801526.00E-23unnamed portein product
JAN 12 (Macaca


fascicularis)


35 LI:400109.2:2001 g 126981825.00E-29hypothetical protein (Macaca
JAN 12


fascicularis)


36 ~LI;329770.1:2001JAN12g104375693.00E-26unnamed protein product
(Homo


sapiens)


37 LI;898841.9:2001JAN12g75424904.00E-07FK506 binding protein
precursor


(Homo sapiens)


38 LI:1183848.3:2001 g32063 0 Human hepatoma mRNA for
JAN 12 serine


protease hepsin.


39 LI;2037121.1:2001JAN12g10420821.00E-27laminin alpha 4 chain
(Homo


40 LI:35b090.1:2001JAN12g104374852.00E-13unnamed protein product
(Homo


sapiens)


41 LI:212142.1:2001 g89806673.00E-12PADI-H protein (Homo sapiens)
JAN 12


42 LI:1096706.1:2001 g 126981822.00E-20hypothetical protein (Macaca
JAN 12


fascicularis)


43 LI;012b22.1:2001 g37241413.00E-62myosin I (Rattus norvegicus)
JAN 12


44 LI:1171095.29:2001g56054 2.00E-22D 100 (Rattus norvegicus)
JAN 12


45 LI;023813.1:2001JAN12g66902483.00E-24PR00657 (Homo sapiens)


46 ~ LI:229030.1:2001g 104374851.00E-20unnamed protein product
JAN 12 (Homo


sapiens)


47 LI:1072894.9:2001JAN12g65237974.00E-08adrenal gland protein
AD-002


(Homo sapiens)


48 LI:2031263.1:2001 g95884081.00E-82dJ 1184F4.4 (novel protein
JAN 12 similar to


nucleolar protein 4 (NOL4)
(NOLP))


(Homo sapiens)


49 LI:432285.3:2001 g386987 6.00E-47ornithine aminotransferase
JAN 12 (Homo


sapiens)


50 LI:1177772.30:2001JAN12g81010700 Homo sapiens golgin-like
protein


(GLP) gene, complete cds.


51 LI:475420.2;2001JAN12g46892800 retinoid-binding protein
IRBP (Mus


musculus)


52 LI:017599.3;2001JAN12g135591796.00E-11dJ1049G16.2.2 (continued
from


bA456N23.2 in Em;AL353777
and


dJ237J2.1 in Em:AL021394)
(Homo


sapiens)


53 LI;030502.2:2001JAN12g70204405.00E-24unnamed protein product
(Homo


sapiens)


___ __ -_ ,
184




CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 2
SEQ ID NO; Template ID GI Number Probability Annotation
Score
54 LI:1181337.3:2001 JAN 12 g 1196425 1.00E-31 envelope protein (Homo sapiens)
55 LI:1164672.3:2001 JAN 12 g 11078529 6.00E-38 putative gag-pro-pol
polyprotein
(DG-75 Murine leukemia virus)
56 LI:1167059.4;2001JAN12 g7688657 0 septin 10 (Homo sapiens)
185 _ __


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
~ o ~ ~ o°'0 0 0 ~ o o ~ o o ~''~ o ~ o ~' o ~' o ~ ~''~ o' ~ '°
~ ~r
~ c~ d' O O O ~ N O '~ O ~, O ~';~ O i p N O ; O N N O ~ ~ ~ ~-
O O ~ ~ ~ O O O O w O w O w O W O ~ O ~ O ~ O ~ ~ ~ ~ i ~ ~-
O ~ ~ .p O O O O ~ O ,- O N O p, O p, O ~ O ~ O .p N ,p w uJ O
O O O O O O O O O O - - ~ 00-
LLI O ~ 'd~ ~ O O O O N O N O ~ ~ '-' O N ~ ~ O r O ~''~ ~ ~ r O N N
O O O O O O O ~ O O O
Q Q Q
_C ~ '~ .O
U C C C
O U a U
O ~ ~ ~ O O O
N N N N N N N N U N N 'a 'a '~ ~ N N Q
L L L L L L
Q E ~ -~ -~ -~ -~ -~ -~' -~ -~' -~' -~.' .~ +-O O O p O O Q C
N N N N N N N N N N N O O O
'+- ~ O N N N N N N N N N N N ~ ~ ~ O O O
~ U ~ ~ U U U U U U U U U U U 'a 'a ~
U .= ~ ~ ~ ._ ~: ~: L: L: ~: ~: L L ~ U U U p O O ~ O
N x N U U N x ~7 x N x U x N x U x N x N U U U ± .,._ .~ O .
~ O ~ ~ ~ U7 O ~ O ~ O ~ O ~ O ~ O ~ O ~ -C ~ -C O O O Q -"'
D C C O ~ C C C C .Q C -O C .Q C ~ C ~ C .Q C .Q C C C C ~ ~ -C >
'v=-. :,. :,..- 'v=-. m 'v= m y..- m y.... m :,_- m 'v= m 'v= ~p ~ 'W 'W 'W Q
Q Q
U 'i- ~ Q U U U U Q U Q U Q U Q U Q U Q U Q U 0 O O O O O +-
C 0 O ~ C C C C C C ~ C C C C C ~ C ~ C ~ C ~' ~' ~' +- +- +- O U
d N d Z Y N fV IV IV Y N ~ N ~ N ~ N Y N ~ N Y N
X
N O N N N N N N N N N N N N N N Q U
U O O Q N N N N Q N Q N Q N Q N Q N Q N Q N ~I ~I ~I a' ~'
m ~.~~~~UUUU~U~U~U~U~U~U~U
J ~ N~ ~Y,~i .~.~.~Yvi Yvi Yvi Yv'-Y.~'-Y,~Yv'- N cWn.-- Od
Q. Q. ~ N N N N N N N N N N N
I--
N ~ r- c~ c~ c~ r- N c~ r- c~ c~ N N N N ~ c~ ~ c'~ c'~ o'~ c~'~ r- N c~7 r- N
crJ M
a U U U a U U U 0 U U a U U U U U D U U U U U a a U U a U U
°~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Q O ~O ~ N O o0 Iw o0 ~_O ~' cr7 ~ N N ~O ~ ~O O~ O ~ ~ ~ I~ U c~ c~ ~ a0
O N ~O ~ Iw ~ ~ r7 ~I7 Iw N ~ O O ~ U ~O c1' ~l' U O N a~ U °Wt7 O
o0
c0 U ~ ~ ~ '~!' N ~O '~I' Iw ~ c~ ~ ~Y ~ c'~ a0 N c'~ ~O N ~ ~O U ~ ~ c~ - O
t U CO ~ ~l' N O U O Wit' ~O ~ °~ '~T ~ ~ U ch N ~ ~ I~ O O ~ ~ Iw
N °~ ~ O c0 ~O ~ ~ O~ N ~ ~O O N N N o00 U N ~ N O ~ U ~ N N U N
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
~QQQQQQ~QQ~QQQQQQQQQQQQQQQQQQQQ
~ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
.~- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
_U N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
_ .- ilj ~fj r ~ ~- ~- ~- ~_ ~ ~- ~ ,- M V'
~ ~O a0 U ~ ~ U U ~j ~O Iv Iw N N M ch ~ ~ ~ tn tn N N N O O O N U
00 ~ ~ N O~ O~ 1_~ i~ N N N I~ ~
a0 U e0 00 N ~1' N N N N N N
O a0 N U ~O .O °~ c°7 c~ d. ~ 00 a0 N N O O N ~ 00 a0 ~O ~O
~O d' ~t V ~ O
~' c~ a0 N u7 ~ N e0 00 ~,.~ ~O N N V' '~I' U U N N U ap ~ Iw N_ N N
c~ ~ O o'~ c~ ~ CO 00 00 00 00 I~ I~ I~ i~ N_ N_ V ~ C,'~ c,~ r' f~ I~ fw ~ ~O
c~ O V ~ c~ ~ O O op r ~ N N ON ~ '- '-' O O O d: ~I' wt
J J J J J ~ J J -1 J J J J J J J J J J J J J J J J J _I J ~ J
Z
D c~ ~O U ~ ~ O' O O N c~ c~ IW a0 c0 U U O N N c~ c~ c~ ~ .p
- ~-- ~- ~ N N N N N N N N N N N N c~ c~ c~ c~ 'd' d' '~' ~ ~ ~ ~ tf7
W
N
1~6


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
O N o N N N N N N N N O o
c c cU c cU C cU C c c c cU c
U U U U U
U U


_ ~ _ ~ _ ~ ~ ~ _ ~ _ _ ~
_ U ~ U ~ -O U -O U __ U ~ _ U
__ ~ -O ~ -O ~ ~ ~ _ ~ O ~ -O
~ U ~ ~ ~ ~ ~ fn - N ~ ~ ~ U -O
O O fn 47 O ~ N fn
V7 C~ O fn
!n -
Cn


_p O o o O O ~ O O ~ ON


o N N O O N O N O N N N N O
U U O U o U U N U O U O o U U O
O U U
~ ~


~ ~ ~ ~
C C C ~
O


tn cn t~ tn tn ts~N f~ tn tn fn fn Cn
''- U c c c C c C c C c C c c
c o U o U C U c U C U c C U
o c o o c o c o
o o U o
o


azz o oz a azz a oz a ozz a az ozz o
= =


~ ~ W ~ ~ E= ~ ~ ~ ~ E=


c
o
N N N ~-- .- .- r- .- r- ~ r- ~ N N N N N N N N N M M M M M M M M M M M
~ a a ~ a ~ ~ a a a ~ a ~ a a a a a a ~ a ~ a a a ~ a a a a a ~ a
"- o 0 o 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 o O o 0 0 0 0 0 0 0
O ~ ~ ON N V ~ ~ U ~ ~ ~ n P ~ ~ n ~ O O ~ ~ n U O ~ 1~ o 0
r r ~O ~O ~O ~O ~O n n n ~ ~O ~O ~O n n n n a0 ~ n n n n ~ M ~ M ~
Q 't ~ a_0 M ~O Iw O V Iw0 U d' d' I_w ~ 00 V ~t U f~ M ~O ~ M V'
O ~ r- N ~l' ~ ao U r u7 Iw ~ ~ I~ U ~f7 Iw o0 O ~ ~ ~O I~ U ~- ~- M ~
~O ~O ~O ~O ~O n n n ~O ~O ~O n n I~ n a0 n n n n M ~ V ~I'
N ~ N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzz~zzzzzzzzzzzzzz
oQy~QQaQQQaQQQQaQaQQaQQQQQQaQQQQQ
o °o °o °o ° 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
-~ N N N O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
_a - -- N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
M M M M c~ M M M M M M M c~ M M c'~ M M M M M M O O O o0 00 00
O d' V ~ ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O
n n n U U U
.~ M N N N N N N N N N N N N N N N N N N N N N N N ~J' ~J' V N N N
M M M N N N N N N N N N N N N N N N N N N N N N N N I~ I~ I~ 00 00 00
O, ~ Q, M M M M M M M M M M M M M M M M M M M M M M M O O O ~
M M M M M M M M M M M M M M ~ M M M M M M M M M M O O O
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
z
D_ r- r ~ N N N N N N N N N N N N N N N N N N N N N N N tn M ~ ~O ~O ~O
W
M
__


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N N N N ~ N N N N ~ N
C U C c U c U c U c c U C U c C C c c
U U U U


~~~O.O ~_ _ ~ .O~U _ ~ ~ .O~ _ ~U _ ~U _ ~
U ~.00 U O U U O ~O _U ~O U


O O _ O O _ O O O _ O _ O O O _ O _
O O O 47 O ~-O fn O '~- O
fn N fn N O N
N
V1


U U ~ ~ U ~ U ~ U ~ ~ U ~ U ~ ~ ~ ~ ~
~ U ~ ~ ~ ~ ~ U ~~ U


c c ~, ~, ~.,c ~, c ~, ~ c ~, c ~, ~ ~, ~, ~,
c o U c V o U o U c o U o U c U c U
o c c c c c c c c U c c
c o o c o
o


azz ~ azz a z a z ~ ~ z ~ z a az a az a


N- ~- H- ~- ~- ~- ~- ~- E- ~- s-


_c
0
r r r r N N N c~ c~ c~ r r r r r r r c~ c~ c~ cY7 c~ c~ r7 ~- r r r r
~ a a a a a ~ ~ a a a ~ ~ a ~ a ~ a a a a a ~ a a a a a a v ~ ~ a
" o o 0 o O O o 0 0 0 o 0 0 o O o 0 0 0 o O O o 0 o O o o O O O o
4.- y.- .~... .N 4- y.- 4- 4- 4- Y- 4-- 4-. .H w- .H 4- ~F- 4-- ~N 4-. y- 4-
~N ~ ~H ~H 4- 4- 4- 4- ~H ~N
I~ O O ~ I~ U V I~ U ~ U .gyp U O N ~J' I~ U ~ ~ O cr7 V a~0 0 N ~ ~
~ c~ 'd' Iw O~ c~ a0 c~ 00 ~ ~f7 ~O ~O 1~ Iw Iw ~ ~ ~O ~O Iw I~ I~ N ch c~ c~
c~
~ ,_ tl~ ~O o0 ,_ ~ 00 r ~ 00 r ~ U ~ U O N r n U ~ O O ~ r a~0 O N V
c~ ~1' Iw .- c~ r c~ ~ u7 ~O ~O iw I~ u7 ~ ~O f~ t~ I~ N c~ c~ t~
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
~ Q Q Q d Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q
~ °o °o °o °o °o °o °o
°o °o o °o °o °o °o °o
°o °o °o °o °o °o °o
°o °o °o °o °o °o °o
°o °o °o
O N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
y- r iCj uj i.fj tCj ~Cj iCj ~Cj iCj iCj ifj t1j ~- r r ~ r r r ~ r r r r r r
r r r r r
00 00 I~ I~ t~ I~ I~ I~ I~ I~ I~ I~ 1~ U_ U_ U_ U_ U_ O_~ U_ U_ U_ U_ ~_ U_ U_
U_ O O O O O
U U N N N N N N N N N N N
O N N ~ ~ u~ ~ ~ ~ ~ ~ ~ ~ ~ U U U U U U U U U O~ U U U U ~ ~ ~ O u7
~ 00 00 O ~ ~ ~ ~ ~ ~(7 ~ ~l7 ~ ~ N N N N N N N N N N N N N N N N N N N
~ O O O O O O O O O O O O O O O O O O O O O O O O O I~ Iw I~ f~ i~
O O N N N N N N N N N N N <t' V ~ d' V ~f V ~ ~' V d' V' ~t '~I' O O O O O
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
z
~_ 'D ~O I~ 1~ I~ I~ i~ I~ I~ I~ I~ I~ i~ Q' U U U U U U U U U U U U U r r r r
r
UJ
188


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N O N N N N N N N N O N
C _U C C U_ U_ C C U_ C _U C _U _U_ C C _U_ C _U C C _U _U C
O ~ U ~ .O ~O ~ U ~ ~O U ~ O U ~ O O ~ U ~ O ~ U O ~ U ~ O O ~ U
O) Q N .Q = ~ N N .Q = .Q N --- .Q N - ,Q v~ vo ~ - ~ v~ ,Q - ~n ~ _ ,Q m cn ~
o ~ o ~ o ~,~o ~ o ~'~o ~'~o ~'~.~~ o E,~~ o-°~~ o ~,~-°5,E o
0 0 ~0 0 0 'So 0 0 0 0 0 0 0 0 0 0 0 0 0
U U U +- U U U U U U U U
c o c U c o o c U c o U c o U c o o c ~ c o c U o c U c o o c ~
~z a azz a az az azz a az ~ z ~ azz a
W= E= ~ H ~ E'- H ~ H W-'
c
o
N N N N N N N N N N N ~-- r ~ ~ ~- r- ~ N N N N N r7 cr7 c~
~'c3'a'a'a~'a'~'~'a~~'o'o~'~~'a~~'a'~~~-o~~'a'~'a'a~~
L L L L L L L L L L L L L L L L L Y. L L L Y~. L L L L L L L L L L
°~'~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
v- y... .~-. ..- ~ .~-. .~ y- 4- ...-. .v-, .~ .~- .~-. .a-. y- 4-. 4- .~- .~
y... .~ .~-. y- 4- .i
d' Q N ~ a0 O O ~ 00 U N ~ ~O oo ~t O M N N ~O ~O o0
J ~ ~ V ~ ~ a~0 c~7 c~ ~ c~~ 0~0 ~ N ~ N ~.c~ N ~ ~ ~ c~ M I~ ~ ~ c j ~ Is ~ ~
c j O
.- r
Q "~ O c~ ~O U ~O U O M O U N ~ ~ M '- ~1' c~ c~ r ~I7 00 U N ,_ ~ 00
a ~7 i~ 00 O N ~ ~ N 'dwf? Iw- Iw- N ~c7 ~- c~ ~ Iw
~ j M M d' ~ ~C7 ~ M M M M M ~ r- M .- 1- r- ~ ~ M V n M
N N N N N N N N N N N ~ ~ N ~ N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
oQQaQaQQQaQQQQQQaQQaQQQaQaQaQQQQQ
0 0 o O o 0
00000ooooopopppooooo00000000000
-i- O O O O O O O O O O O ~J N N N N N p O O O O O O O O O O O O O O
O N N N N N N N N N N N . ~ ~ ~ ~ N N N N N N N N N N N N N N N
Q r- ~ r- r- r- r- r- r- ~ ~ ~ r ~ _ _ _ r- .- r- ~ ~ ~- ~ ~ ~- r- ~ ~- .- r-
Q'UUMcrJC~~UUUUUUUUUUUUUU
~ u7 ~ ~ ~ ~ W.f~ u7 ~ ~c7 u7 N N N ~ onO o~o ~ p, U O, O~ O~ U O~ O~ U O- O~
U O~ O
N N N N N N N N N N N ap op pp N N N N N N N N N N N N N N N N N N
I~ I~ I~ I~ i~ I~ I~ I~ I~ I~ i~ N N N N N N N N N N N N N N N
O O O O O O O O O O O O O O N N N ~. ,~. d. ~ d. ~. ~. ~ d. d.
CV N N c?~ c'7 ~? _
J J J J J J J J J -1 J J J J J J J -~ J J J J J J J J J J J J J J
z
d' I~ I~ I~ 00 OO 0D OD 00 0D 00 00 00 00 CO 00 00 00 ~O
W
189


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
C _U_ U_ C C _U C _U_ C C _U_ C _U _U C C _U_ U_ C C _U C U_ C
O -O ~ U ~ O ~ U O ~ U ~ O U ~ O O ~ U ~ O -O ~ U ~ O U ~ -O ~ U
-Q ~ -Q -- ~ -Q -- -Wn -- -Q v~ vn ,Q - ,~ u> w ,Q -- ,yn -- ,Q v~ ~
o ~'~~~ o ~~~ o ~~ o ~~o ~~~~ o ~~.~.~ o ~'~o ~~~ o
c o o c U c o c U o c U c o U ~ o o c U c o o c U ~ o U c o c U
~zz ~ ~z a z a ~z azz ~ azz a ~z az a
i. 1. L L L L L L Y. Y~ Y~ Y~. L
D
c'~ c'~ r .-- ~- ~- ~- r r N N N N N r r r N N N N N c~ c~ c~ c~ c~ N N N N N
~ 'a 'a ~ ~ ~ ~ ~ -a 'O '~ 'O 'O -O '~ -O -O -a -O -O 'O 'a 'a ~ '~ ~ -D 'O 'O
~ 'a ~ ~
L L L L L L L L L L L L L L L L L L L L L L L L L L L L L L L
O a O O a O 0 O D O O O O O O O a O O a D O O O a O O O O O O a
°~~~~~~~~~~~~~~~~~~~~'~~~~~~~~~~~~
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
y- y- .H 4- y- y-. ..-. .H y. .H y- .H. y. .F... y. ..... .~-.. .H y. 4- .~...
y_ ~. y- y. y_ .~-. .~-. y- .y- a~ .~
d' Q- ~O ~ N ~ ~ 00 ~ a0 O ~ o_O O r7 U ~ a0 O ~, N N ~ U ,Mp a~0 n U N ~O U
c'~ ~ O
N ~ ~ O ~ a-0 ~ a~0 U ~ ~O ~ ~ U N N ~ r ~ c ) c~ ~ O O ~ ~ ~ ~ Iw o0 O ~l'
r r r '-'
t U Iw M ~O ~O U ~O U ~O U_ ~ ~O U O ~,.~ t~ ~O .gyp a-0 n ~ I~ O d'
O O N r 00 O ~O o0 c~ ~t7 r U c~ ~ r I~ O~ c~ ~ O
r ~O IW IW I~ a0 00 Lf~ ~O ~O ~O N N r N Cl' p.~ w~ r O O ~ r r ~.f~ 00 00 ,
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
r r r
QaQaQQQQaQQQaQQQaQQQQaQQQaQQaaQa
r r r
O O O O O O O O O O O O O O O O O O O O O O O O O O O
O O O O O O O O O O O O O O O O O O O O O ~ O O O O O O O O O O O
O O N N N N N N N N N N N N ~ O O N N N N N N N N N N N N N N N
r r r r r ~ r r r r r r ~ r r r r r r r r r r r r Q. p, p. p. p.
M~~~C?~~~~~~~C?~~~~C?~NNNNNNNN~Islvlvl_~~_~_~M_C?~MC?~_
c'~ c~ c~ c~ c~ c~ c'~ c~ c~ c~ c~ c'~ N N N ~. ~ ~ Wit' V' 00 00 00 ap a0
I- ~' ~ a0 a0 a0 00 00 a0 00 00 00 00 CO o0 ~ ~ ~ ~O ~O ~O ~O N N N N N O O O
O O
N N op a0 a0 00 a0 00 00 00 00 00 c0 00 M M M c0 00 a0 00 a0 a0 00 a0 4O a0 Iw
Iw Iw I~ Iw
~ O_ O_ O_ O_ O_ O_ O_ O_ O_ O O_ O_ ~ ~ ~ ~ _ _ _ _ _ _ _ _ _ _ _ _ _ _
J J J J J J J J J J J J J J -~ J 'J J J J J J J J J J J J J J J J
0
z
0 a0 a0 O O O O O O O O O O O O N N N N N c~ r7 r7 M M ~ V ~ d' ~1
- r r N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
UJ
M
190 - _ ____


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N ~ N ~ ~ N N ~ N ~ ~ N N
c U C c c U c U c C c c c C c C c
U U U U ~
U


~ _ ~ ~ ~ _ ~ -O~ _ ~ _ ~ _ ~ _ ~
U O U .O U O U fnU ~ U ~ U ~ U ~ U
~ f/7~ -O ~ fn~ ~ O ~ O ~ O ~ O
~ ~ ~ ~ - ~ ~ ~ !A ~ ~
fn fn N (A
fn


p0 ~00N oo ON o O o O o ON o Q ~ ON o N0
p o


o ~'N N N O N ~ ~ ~ ~ N ~ ~ ~
N o o ~ o o o o o
o



c o c c c o c o c c c c c c c c c
U o U U U o U o U o U o U
o


a z a azz a z ~ z a ~z a az ~ ~z a az a


f- ~ ~ ~- ~- m -


c
0
M M M M M M M ~- r r N N N ~- ~- ~ r r r r r r- .- .- r r r- r- r- .- r r
a a a a ~ ~ ~ a ~ ~ ~ ~ ~ ~ ~ ~ D ~ ~ ~ a ~ a ~ ~ D ~ ~ a a ~ ~
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
pOMI~O~U,~pM~~~~~NV ~n d'~~a~ON~c~~~U~~oN00~
~ N ~ ~ o~ N N V W v I~ ~ ~ Iw ~ ~t7 ~ ~ ~O ~O ~O ~O ~O IW s 1 I~ I~ o~ c0 00
U U
Q t ~t ~O O, N ~ I~ ' ~t7 00 N ~ U_ N a0 O M I~ O O M ~O O~ M ~O
N~~oONN'-nn ~ ~~t~L~~O~~~~lvnn~a~OO~QaMO~
N
N N N N N N ~ N N N N N N N N N N N N N N N N N N N N N N N N
r
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz~z
QQQQQQQaQQQQQQQQQQQQaQaQQQQaQQaQ
- 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N N N N N N N N N N N N N
p . .. .. .. .. .. .. N N N N N N N N N N N N N N N N N N N
Q U U U O~ U U U .- r r r- .- r ,_ ~- r r- ~- .- .- ~ ~ r r ~ ~ ~ r r- a
c'~ M M M M M M M M M M M M - _. . . . _
~_ u7 ~_ ~_ ~ ~ ~_ I_~ n I_w n I_~ n d. ~ Wit' ~i' ~ ~ ~f' d' d' ~T d' d' ~t ~
d' '~f ~ ~' V
OOOOOOOUO~UUUU~o~oo~oa~oo~0o~oa~0a~oa~o~aMOO~oo~o~o~0a~oo~oa~oo~o
rr- r- r ~ ~ I~ I~ n I~ I~ n f~ I~ n I~ t~ I~ n I~ n I~ f~
J J J J J J J J J J J J J -~ 'J J J J J J J J J J J J J J J J J J
z
N N N N N N N N N N N N N M M M M M M M M M M M M M M M M M M M
w
N
191-


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N N N N N N N N O N O N N O
C C U C C C C C C C C C C C C
U C U U U U U U


_ ~ _ _ ~ _ ~ _ ~ _ ~ U _ ~ _ ~
~ U _ _ U ~ U ~ U _ U _ U _ U
~ Q O U = O = O = ~ ~ . ~ = ~
.O = ~ ~ ~ N O = O O
~ ~ O ~ Q ~
Q = N
~ N


. . .Q .Q .Q .Q .Q ~ ~ .Q ~ O . .Q .Q ~
O ~ O O ~ ~ ~ ~ ~ ~ O ~ ~ ~ O
~ O ~ ~ O O O O O O O O O
O ~ O ~ ~ ~ ~ ~
~ ~


O p p O N N N N N p N O N N N Q7
Q7 O N O O ' O O O O



c c o U c ~ c c c c c U c c c c
o U c c U o U o ~ o U o U o U
o


az a z az a ~z ~ az a az ~ az ~ ~z a
~


W =


c
o
r ~ ~ r- N N N N N N N N N N N N N N N N N N N M M M M M M M M M
N 'a 'a '~ 'D 'O 'O 'O 'O 'O 'D 'D 'a 'O 'a 'a ~ ~ ~ ~ ~ ~ ~ ~ 'a 'O 'O 'O 'O
'O 'O ~ ~
~ a a a ~ a a ~ ~ ~ o ~ a a a a a a ~ a a a a a a a ~ ~ a a a a ~
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
~-. v.- 4- .~ .~ .~ .v... .~-. .i-. ..-. ..-. .~ .~. .~ .~ .~ .~- .~-. .a-. y-
v- 4- v- ~~-. .~- y
Q- O I~ I~ U N d' N I~ ~O U M ~O ~O U I~ O M ~O ~ ~ I~ ~ a0 I~ ~- O O M
O '~t' ~ ~O Iw .- 'd' ~ I~ ~ ~O N 'd' ~O o0 O N ~O O~ N V ~ Iw Iw ~ I~ ~ ~O N
~ O N ~O
N U U U U M M d' d' ~ ~ ~O ~O ~O ~O n n n n o0 00 a0 00 U ~ ~ ~ ~ ~O ~O n n n
t a0 ~.t~ a0 O M LI] M ~ 00 I~ O 'd' I~ I~ O o0 CI' Iw0 ~O ~O O~ LC? 00 N Cl'
a ~- d' 'd' ~O r N d' ~ Iw V ~O N vO ~O c0 O M ~O U N ~l' ~C? 1w- ~ Iw V ~O N
~ O N
~ j U U U U M M ~ ~ ~f7 ~ ~O ~O r0 ~O n n n n a0 a0 00 a0 Wit' ~ ~ ~ ~O ~O Iw
n
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
oQQQaQQQQaQaQQaQQQaQaQQQaQaQQQaQQ
0000000000000000000000000000
+- o 0 0 0 0 o O o O O O o o O O o 0 0 o O O o o O O O O O O O O O
p N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
d' d' V' ~' '~' '~' d' d' ~' ~' V 'd' d' ~ V V ~ V V ~ ~ d' ~ 'd' d'
W.c~ ~ ~C7 ~ ~ ~ ~ ~ ~ ~ ~.f~ ~c7 u7 ~ ~c7 ~ u7 ~ ~ ~c7
M M M M M M M M M M M M M M M M M M M M ~ M M M ~ M M M M M M M
~ n n n n n n n ~ n n n ~ n n ~ n n n n ~ n ~ n n n n n ~ ~ ~ n
J J J -1 J J J J J J J J J J J J J J J J J J J J J J J J J J J J
z
M M M M M M M M M M M M M M M M M M M M M M M M M M M M a7 M M M
W
M
__ __ 19~ ____ _ _ .


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
C _U C C U_ C _U_ C C _U_ _U C C _U_ C U_ C C _U C C _U_ C C
O ~ U ~ ~O ~ U O ~ U ~ O O ~ U ~ O ~ U .O ~ U ~ O ~ U ~ O ~ U
~ .Q N ~ = .O N .Q = ~ ~ = ~ ~n v~ ~ - ,Q ~n ,Q -. ~n ,Q - ~ v~ ~ - Q ~n ~ - ~
o ~ o ~ o ~ o ~ o o ~ o ~ o o ~ o ~ o ~ o o ~ o ~ o ~ o ~ o ~ o ~
o ~ ~ ~~~ ~ E~~ E~~ ~ ~ ~'~E ~ ~ ~.~ ~~~ ~ ~'~~ ~ ~'~~
~ c o c U c o c U o c U c o o c U c o c U o c U c o c U c o c U c
~ Z ~ ~ Z ~ Z D a Z z ~ ~ Z D Z D a Z ~ a Z t~ a
c
0
M M M M M M M M r r r .- r N N N N N N N M M M M M M M M M M M M
L Y.. L L 1.. L L S.. L L 1~. Y.. L S~ Y~ 1.. 1.. L 1~ f-. L L Y~ L 1.. L L
Y.. L~ L 1~. Y
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
- y- ~ .~... y- v- .~- y-. 4- .~- y-. .~ 4-. v- .~ .E-. ~ 4-- .i-. .~ .~- .~-
w- y- 4-. .~- .~, .r-. .~
d' Q- 'd' V I~ Iw O M ~O ~O ~ ~O U Iw ~ M O~ N M_ ~O ~O O M 'd' Iw N ~_ ~ ~O
~.f~ a0 M ~_O
O o0 M ~ 1~ O V ~O I~ ~ ~ ~ ,~ ~ N N N N ~O ~O ~ O O ~ ~ ~ N N N N N ~ ~O
r r r .- r r .- r- .- r- r .- r r r r r r r ~ r r r r
m
Q '~ N M I~ O ~O ~i' O M d' Iw d' ~ a0 M ~O ~O I~ ~O U ~
N M ~ 00 00 ~ .p a0 U N M ~ ~O oO M ~ I~ Wit' ~O U M ~ ~O a0 U
~O a0 M ~ I~ O ~ ~O r ~ ~ ~ ,O ~- N N N N ~O ~O r O O ~- r ~- N N N N N ~
(n ~ ~ ~ ~ ~ ~ ~ ~ r r r r r r r r r r r r r r r r r r r r r
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzz~zzzzzzzzzzzz
oQQaQQaQQaQaQQQQQQaQQQaQaQQQaQQQQ
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
-.- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
_UNNNNN__NNN__NNNN_ NNN__NNNNNNNNNNNN_NNNNN
Q r r r r
Wr ~ ~' 'd' V ~ ~lWr '~ W.ci ~ci Sri Wci ui Wci ~ti ui Lei Sri Wci ~ci u'i Wti
ui Sri of ~ci
M M M M M M M M n n ~ n n n n n n n n n n n n n ~ n n n n
u7 ~ ~ ~ ~ ~ ~ ~ M M M M M M M M M M M M M M M M M M M M M M M M
r r- r r r r r r r ~- r .- r r r r r r r r r .- r r
J J J J J J J J J ~ J J J J J J J J J J J J J J J J J J J J J J
Z
r N N N N N N N N N N N N N N N N N N N N N N N N
M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M
W
193 ____


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N O N N O N N O N O O O N N
U U CU C C C C CU C C C C CU C U
C U U U U C


_ _ _ _ ~ _ ~ _ ~ _ _ ~ _ ~ _
~.O.O U ~ ~ U _ U ~ U ~ _ U ~ ~ .O
N ~ ~O O ~ ~ ~ .O ~ O ~ ~ .O ~ N
N U ~ - O - Q ~ U O - ~ -
~ - N ~ N ~ Q N
N - N N N
-


OO O ~ O ON O O _ ~ ~OO _ ON O O O
ON OO O O O


O U O N N N O O N O O O N N N ~
V N V O U O U O U O U U O U O N
O C C C C C O C C C
C C


N N N N N N N N N N N N N N
'- o ~ c c c c c c c c c c c c o
o c o U o U o U o U o o U o U c
U


z z ~z ~z a az ~ ~z ~ az az a az a z
a = a


f= ~ ~ ~ m = . m


c
0
M M M M M M ~ ~ ~ ~- ~- r- ~ ~ r ~- ~- ~- r- r r N N N N N N N N N M M
~ ~ a a a a a ~ a a ~ ~ a a a a a a v a ~ a a a ~ ~ a a a ~ a a a
°~~'~~~~~~~~~~~~~~~~'~~~~~~~~~~~~~~
" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d' Q ~ ~ N ~ ~ N 'u7 a0 ~f I~ O C_T N ~O '~1' O M I~ O_ ~O
N r- u~ I~ O ~ O M U r ~ M M ~ Iw O d' M N O N '~' O Iw op M Wit' N
J ~ ~ ~ r- N N ~ ~ ~ M M ~ r ~ N N N M M M Ct' M N N N N N M ~ M
m
M ~O M ~O r U N ~ O M ~O U N ~ a0 O M r M ~O N O ~I' a0
O .- O Iw ~ M M M I~ O N 'w' N d' ~ Iw o0
CV N '- ~ ~ M ~ - ~ r N N N M M M r M N N N N N M '-
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
oQQQQQQaQQQQaQQQQQaQQQQQaQQQQQaQQ
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
U N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
N N N N N N N N N N N N N N N N N N N N N N N N N N
OU~UUU~~~~~~~~u~'~~~~u~'»
O o0 O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
~ t~ W(7 ~l7 tC7 O N N N N N N N N N N N N N N N N N N N N N N N N N N
M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M
r- M M M M M N N N N N N N N N N N N N N N N N N N N N N N N N N
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
z
N M M M M M 'd' 'd' d' 'd' 'd' c1' V ~ ~ ~ ~ d' ~t d' 'd' ti' V ~ d' 'd' V' V
V ~ d' V'
- M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M
UJ
M
194


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N N N ~ ~ N N ~ N N ~ N
C U C C U C C U C C C C C C C U N
U U U U U C
U C


~ U ~ _ ~ ~ _ ~ _ ~ _ ~ _ ~ _ ~
~ .O U ~ .O U ~ _ ~ ~ U ~ U ~ U O U
fn O ~ O ~ O . O ~ O ~ f/7
N tA O ~ ~ ~ O ~ ~ ~
fn fn ~ ~ !A
V7 fn
t!)


~ ~ ~
-


O o N ~ N N ~' N N ~ N N ~ ~ N
N o U U ~ U o U ~ U o U o U o
U - - U - - U -


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
S S S S S ~


v~ C v n C v~ enC v~ c~ cn in v~ N v~ C ~n
c o ~ C c c C c C c C c C c o en
U c o U o U ~ U C U c U c
o o o c o U
o c
o


az a azz ~ fizz a ~z ~ fizz a az ~ z a



c
0
c~ c'~7 c~ c~ cY7 cr7 c~7 r- ~ r ~- ~ N N N N N N N N N r- ~ ~- r r ~- r- N N
N N
~ a ~ v a a a a ~ a a ~ a a ~ ~ a a a a ~ a a a a ~ a ~ a a a a ~
" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
oNDU 'O~c~ ~~nN~~~o~0 ~~~ON~~aODO~c~~U
N N V d' '~T V ~ r ~ ~ N N ~ d' ~1' ~ ~O ~O ~O ~O N c~'~ c~ c'~
m
Q '~ d' N ~ U_ U O ch ~ 00 U N ~ op d' op 00 U O_, a0
t- ~ c~ 0 0'7 N ,- ,p a0 r ch u'7 f O O c~ ~0 ~1' ~O c~'? ~O ~- c~ a0 0 c~ ~ U
~O
c~ ~O o0 U r N N d' V V V r N ~' ~O a0 ~ ,- N N 'd' d' ~ ~O ~O ~O N c'~ c~
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
~QaQQQQQQaQQaQQaQQQQQaQQQaQaQQaQQ
0000000000000000000
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
Q N N N N N N N N N N N N N N N N N N N N N r
p. p. Q. Q. p. p; p: U U U O~ U Q~ U U O~ O~ U U U O O O O O O O O O O O
Ltd LI~ ~ lp u7 ~ O_ O_ O_ O_ O_ O_ O_ O_ O_ O_ O_ O_ O O_ I~ 1~ 1~ I~ I~ I~
I~ I~ I~ I~ 1~
O O O O O O O ~ ~ n ~ n n n
N N N N N N N O O O O O O O O O O O O O O U U U U U U U U U U U
N N N N N N N '~ OV ~ ~ ~ OV ~ '~I' V V ~ ~ ~ ~' c~ e'~ c~ c~ <''~ M c'') ch
c~ c~ M
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
d' V V d' d' d' d' tf? ~ ~ ~ ~ ~ ~ ~ ~f5 ~ ~ ~ ~ ~ ~O ~O ~O ~O ~O ~O ~O ~O ~O
~O 'O
- c'~ c~ c~ c~ e~ c'7 c'~ c~ c'~ c~'~ c~ n7 e'7 c'~ c'~ c~ c'~ c'~ c''~ cr7 c~
c~ c'') M c~ c'~ c''J c'~ c~ c~ c~ r~
W
N
_. _---___ __. _.
195


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N (U N N N ~ N ~ N N ~ ~ N N
_U c c _U c c U _U c c _U c c _U G c _U c c _U c C _U
~ O ~ U ~ O ~ U ~ .O O ~ U ~ O ~ U ~ O ~ U ~ O ~ U U ~ O ~ U U ~ O
o'~~ o ~'~~ o ~'S~~ o ~~~ o ~~~ o ~~~ o o ~'S~ o o ~~
Q U ~ -S ~ U ~ ..~ ~ U U ~ -S ~ U ~ ..S ~ U ~ -S ~ U ~ .S -~. ~ U ~ -~. -~ ~ U
c ~ U ~ c c ~ U ~ c ~ U ~ c ~ U ~ c ~ U U ~ c N U U N c
o c c o c c o o c c o c c o c c o c c o c c o
z a ~z ~ fizz a ~z ~ ~z ~ az a ~z a az
E-
_c
0
N N N N N N N N N c~ a7 0'~ c~ c'~ c~ c'~ c~ r7 c~ c~7 c~ c~ c~ c~ N N N N N
c~ c~ c~
~ a ~ a $ a a a a a a a a ~ a a a a a a ~ a a a a ~ a ~ a a a ~ a
°~~~~~~~~~~~~~~~~~~~~'~~~~~~~~~~~~
." 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
d' Q- ~l' t~ c~ 00 O ch op U N N_ O O c~ N_ ~ t~ ~O U N ao U N
~ V N~ ~ ~ ~ ~ ~ ~ N ~ ~ M N N c ~ M ~l' ~f ~ ~ ~ ~ ~~p - ~I' ~O a0 O ~ ~ oNp
.- r-
aUONN~~~o00 M~~oMO~M~nMc~~no000~NO~~o~O
c j M d' V ~ ~ ~ ~ ~ ~O r N ~T ,- ,-. N N c~ ch d' ~ ~ ~f7 ~ ~O ~ ~- ~ ~ r ~i'
~O
N N N N N N N N N N N N N N N N N N N N N N N N ~ N N ~ ~ N N N
zz~~zzzzzzzzzzzzzzzzzzzzzzzzzzzz
~QQQQaQaQQQQQQQQQaQaQQaQaQQQQQQQQ
~000000000000000000000000000
-i- O O O O O O O O O O O O O O O O O O O O O O O O N N N N N ~ O O
N N N N N N N N N N N N N N N N N N N N N N N N - .- .- . N N N
O O O O O O O O O O O O O O O O O O O O O O O O o p O O O O O O
I~ I~ I~ I~ I~ I~ I~ I~ I~ i~ I~ 1~ I~ I~ I~ 1~ I~ I~ I~ I~ I~ I~ I~ I~ M Ch
c'~
I~ I~ I~ I~ I~ I~ I~ I~ I~ I~ i~ I~ I~ I~ I~ I~ I~ I~ I~ I~ I~ I~ I~ I~ ,~p ~
~ ~ ~ O O O
U U U U U U U U U U O~ U U U U U P U U U U U U U ~, U U U O~ ~ O~ U
N N N N N N N N N N N N N N N N N N N N N N N N N N N
c~ c~ c~ r'~ c~ c~ c~ c~'~ c~ M c~') c~ c~ c~ c~ o'~ e'~ a7 c'') c~ c'~ ch c'~
M ~ O O O O N N N
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J -~ J J
z
°~ °~~~
w
196


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
N N N U N N U N N N N U U
C C C C C C C U C C U C C U C C
U U U U U U


U U U U U U U ._ U U ._ U U ._ U U
~ ~ U U ._ ~ .- ~ .- ~ U ~ ._ ~ ._ ~ ._
_U _U ~ U ~ U ~ O U ~ O U ~ O U ~
O O O O O O


~ O O O ~,~~ ~~ ~ E-S o ~ E-S-~ ~ ~-S-S E E-~
O E -S O


~~'S'~'~'~ ~ E'~ E ~ ~ ~~~ ~'~E ~ E~ ~ ~ ~ E
U ~ >. ~ c ~ ~ >. U
c c c c c c c ~ N
c


~ NUUUU N ~U ~ ~U ~ ~U ~U ~ ~U ~ U c
c c c c c c c o c c o c c o c o
o o o o o


a az a az a az a z a azz a azz a az



c
0
M M ~ N r- r r- r r- r r r ~- ~- ~- r- r- N N N N N M M M M M
N 'a '~ 'D '~ 'U 'U 'U 'U 'U 'a 'U 'a 'a '~ ~ ~ ~ ~ ~ 'a ~ ~ 'a 'a 'a ~ 'a ~ ~
~ ~ ~
~ ~ ~ a a ~ v a ~ ~ a ~ ~ ~ a a a a a ~ a a a a a ~ a a a a a ~ a
a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
" 0 0 o O o o O O o o O o 0 0 0 0 0 0 o O o 0 0 0 0 0 0 0 0 0 0 0
4- .E... 4- .~.- .~. .~. .~. ..... ..-. .~. ,~- .~... 4-. ..- y-. ..- .~ ..- 4-
..- ..-. ..-. .~ ..-. ..-. ..-. r-. y-. ..-. ..-. ..-. ..-.
~1 Q- ~t7 I~ ~O N ~ 00 ~_- Iw O ~f Iw ~ ~ ~ N ~ I_~ O N
(O O I~ ~ ~ V~ ~ aMp o ~ M ~ I~ I~ n n n a-0 M N VN' ~ ~ ~ c~~ ~ ~ ~ V ~ ~
Q U M O r r ~ o d. i~ M ~O U N o_0 ~ a0 ~ M ~O r ~ ~ V aM0 O
M '- d' n M ~ r ~ M ~t ~ n r V N d' ~O M V ~O '- ~t ~7 ~ ~l7
~ n n n n n
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz
~ Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q
~ o o °o °0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0
+- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
M M M M M M M M M M M M M M M <'') « M M M M M M CV CV N CV N
0~00~0o~0a~0a~0o~0o~0a~0o~0o~0a~0a~0a~0o~0oMOO~0o~0o~0a~0oM0a~0a~0a~ONNNONO
O O M M N N N N N N N N N N N N N N N N N N N N N N N Vii' V ~ 'd' V
~NN~~~c'N~cNcN~c~c'N~c'N~c~c'N~c'N~c'N~cNcNrJC'N~cNcNc'N~McN~cNcNcN~c~~n~nn
N N ~ ~ ~1' V ~Y 'Ct d' ~t V V ~ V d' ~' d' ~Y d' ~' V ~l' V V ~ ~ ~ d' d' d'
~f' V
J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J J
z
~O ~O I~ I~. U U U U U U O' U O' U U U ~ O' U U U U O~ O' O~ U U ~- r r- r
d' ,d.
UJ
I _ __-197


CA 02434677 2003-07-09
WO 021079473 PCTlUS02101009
C _U U C C _U C _U C C U C _U_ C C U U C G U C
~o~ o~ o.o.o o-Q o o-Q o~ o o~ o-° o o ~ ° E °I° E
° E.-°~E
O >~ ~ O N ~~' ~ O N ~ O N ~ ~ O O N U N O U N O O U U N O O U N O
o °'c ~'~~ ~ ~ ~U ~ ~U ~ ~, ~UU ~ c ~U c ~U ~ ~ ~ ~U ~ c ~U
~- o c U c o o c c o c o c c o c o c c o o c c o c
z a azz a az ~z a az a z a °zz ~ ~z a
_c
N N N N N e'~ o'~ cr? c~ G'~ r r r~ r r c'~ c~ M M c'~ c~ c'~ c~ th <'') r ~-'
r r
OOO~O~OO~O~,O,X000,~,00~0~0,~00,~0~,000~0,~,00000~000
1.1
Q, ~l' h h O ~ ~T O M r _ p o'~. ~0 00 r ~D 00 00 O N M M ~_O ~ c~ lO
O o0 O N ~ h U N d' h N N O O '0 N N N ~ r d' N ~ ~ '~d' dM' ~ h n h n
e1 ~ ~ ~ ~ V ~ ~ W.C~ r M r r
d 'y.t~ a0 00 ~- N ~_f7 ~-- ,~ d' U N U N CT N M ~O ~ h ~O
r V ~ ~ ~ r ~V ~ y.f7 r M N N O ,-- h O N N r r ~ h O r ~O 00 U N
r C''~ M r e-- N M d' d' ~~ 1~ h h
r r r r N N_ N N_ N N_ N_ N N N_ N N
N N N N N N N N N N N N N N N N N N_ N N ,- r r
r r r r r r r r r r r r
Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z Z d d d d ,~ d d d d Q d d
dddddddddddddddddaCdd-_-,-_,~-~-~~_-_>-->-_~~_-~-~
,'7-. '_~ ~_ '_~ '7 ~ ~_ '_'> '7 ~_ ~ ~_ r r r r r r
O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
+- O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O
p N N N N N N N. N N N N N N N N N N N N N N N N N N N N N N N N N
~" CV CV CV N CV CV CV N CV N ~ <''~ ~"~ ~''~ ~''~ M M
Q O O O O O O O O O U U U O~ ~'
N N N N N N N N N N O~ O~ O~ O' a' Q' 0' Q
O Wit' ~t d' d' dwi' d' ~ d' ~IwC) ~ ~ ~ ~ ~ ~ ~.c~ M ~ M c'~ c~ c~ c~ O O O O
O O O
f"' LQ ~ lQ lp ~ ~ LQ Ln ~ L(7 h I_'~ (_~ h I_~. h h I~ f_~. h '- r .- h h h h
!~~ h h
h h h. I'~ h h h h h h _ _ _ _ _ 00 00 00 4_O 00 ~O ~O ~O ~O ~_O ~_O ~O
C? d' 'CI' d' V ~t ~' ~' '~ d' O O O O O O O O ~ ~ r r r r r r- r- r .- r ~ r
..J J J ~ .:~ _I ~ J ..-.l J ~ J J J J J J J J J ~ J J J J J J J J J J J
_ N N N N N N N N N N V d' d' d' d' 'fl ~
~ ~.c~ tf7 ~ ~ ~ ~c? t(? ~ ~ ~ ~ ~ M ~ M ~ ~c7 ~ ~ ~l7 ~ ~ tI~ ~ ~ Wf7 i.f~
W
M
198


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
C _U C C U C _U_ C C _U_ C C U_ C
O ~ U ~ ~O ~ U O ~ U ~ O ~ U ~ .O ~ U
.Q
o ~ o ~ o ~ o ~ o o ~ o ~ o ~ o ~ o ~ o
0 0~0 0 0 ~a~ o~0 0 0~0 0 0~0 0
c o c U c o c U o c ~ c o c U c o c U
D Z ~ ~ Z D Z ~ ~ Z D a Z
i- i- i- ~
_c
0
.- m- r- r- ~ m- M M M M M C~ M O7 C''~ M
N 'a '~ ~ ~ '~ 'a 'a ~ ~ ~ 'a '~ ~ ~ 'a ~ ~ '~ T3
~ a a a a o a a o a a a a o v a a a a a
°~~~~~~~~~~~~~~~~~~~
" 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
'd' Q- ~ a0 'd' N c'~ ~O o0 c'~ ~O ~ t!) ~O U a0
J ~ j 1~ 00 00 a0 U O' O' O' I~ 00 00 00 0Q OO OD op U U U
m
t ~O ~O U N N ~ M V' U N '_d' I~ ~O ~O I~ O U N
U ~ I~ c~ ~O a0 O ~'7 ~ r- I~ O M ~ ~O I~ O N
I~ n 00 M 00 U U U IW a0 M M a0 00 M O~ U U
N N N N N N N N N N N N N N N N N N N
zzzzzzzzzzzzzzzzzzz
aQaQQQQaQQaQQQaQQaQ
- 0 0 0 0 0 0 0 0 0 0 0 ~ 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N N N N N N N N N N N N N N N N N N N
Q d; ~ d' d' 'd' 'd' ~t ~I' ~t ~ V 'd' ~t ~ 'd' ~1 V V Wit'
O O O O O O O O O O O O O O O O O O O
~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O
J J J J J J J J J J J J J J J J J J J
Z
~O ~O ~O 'O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O ~O
~ ~ ~C7 i.(7 ~ ~l7 ~(7 ~ ~ ~ ~ M tp ~
LU
199


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component Stop
NO; IC Start


i LI:1983416.1; 814031971 1 363
2001 JAN 12


1 LI: T 9834 i 7976079N 22 435
6,1;2001 JAN i ~
12


2 LI:332263.1:200145842376 1542 2088
JAN 12


2 LI;332263, i 7185040 1544 2090
:2001 JAN 12 i V 1


2 LI;332263,1;2001g3896002 1549 1997
JAN 12


2 LI;332263.1:2001JAN12g7374740 1910 2260


2 LI:3322b3.1;200 70312653D 465 944
i JAN 12 1


2 LI;332263,1;2001JAN1270314846D1465 874


2 LI:332263,1:200171852779V 1707 2560
JAN i 2 1


2 LI;3322b3. i 71852887V 1020 1925
:2001 JAN 12 1


2 LI:3322b3,1;200170313794D1992 1621
JAN l 2


2 LI;3322b3,1:200170311926D 991 1426
JAN 12 1


2 LI:332263.1:200171851150V 1024 i
JAN 12 1 885


2 LI;3322b3.1;20015463866H 1037 1234
JAN 12 1


2 LI:332263,1;200170313280D11098 1667
JAN 12


2 LI :332263,1; 703 i 39 1098 1533
2001 JAN 12 i 9D 1


2 LI;332263.1:2001JAN1270312150D11098 1405


2 LI;332263.1:200 70312922D 1099 1614
i JAN 12 i


2 LI:332263.1:200170313561 1098 1464
JAN 12 D 1


2 LI;332263.1;20016709051 110 i
JAN 12 N 1 i 689


2 L(:332263.1:2001JAN1271852837V11125 2056


2 LI:3322b3.1.200170312621 1115 1727
JAN 12 D 1


2 L1:332263.1; g 7279826 1119 )
2001 JAN i 2 580


2 LI;332263,1:2001JAN1270314358D11126 1688


2 LI:332263,1:20013838795H 1624 1927
JAN 12 1


2 Li:3322b3.1:200170312119D 542 1078
JAN T 2 1


2 LI :332263,1; 70314905D 544 936
2001 JAN 12 1


2 LI;3322b3.1:2001JAN124045920H1 545 843


2 LI;332263,1:200 7206449H 572 1142
i JAN 12 1


2 LI;332263.1:2001JAN121379201 595 760
H1


2 LI:3322b3.1:200171851568V 636 1476
JAN 12 1


2 LI;332263, i 6292760H 1447 1667
:2001 JAN 12 1


2 LI;332263,1:200171854502V 1444 2128
JAN 12 1


2 LI:332263,1; 71853944V 1463 2097
2001 JAN 12 1


2 LI:332263.1;200171852987V 1548 2374
JAN 12 i


2 LI;332263,1:20015094039H 1506 1774
JAN 12 1


2 LI:332263.1:2001g3922068 1536 2000
JAN 12


2 LI:3322b3.1:20012792057H 973 1290
JAN 12 1


2 LI;332263,1:2001JAN1271853948V11619 2269


2 LI:332263.1;200171853977V11622 2375
JAN 12


2 LI:3322b3,1:200171791379V 1614 1966
JAN 12 1


2 LI:332263.1:200171852744V 1617 2535
JAN 12 1


2 LI:332263.1:200171852463V 1606 2613
JAN i 2 1


2 LI;332263.1;2001JAN1271851278V11612 2325


2 LI:3322b3.1:200171852263V 1552 2473
JAN 12 1


2 LI:3322b3. i 70314220D 1407 1859
:2001 JAN 12 1


2 LI:332263.1:200171853922V 1494 2529
JAN 12 1


2 LI:332263.1:200170314489D i 1835
JAN 12 1 431


2 LI;332263.1:2001343561476 1433 i
JAN i 2 944


2 LI:332263.1:200171853327V 1433 2106
JAN 12 1


_ _ ._ ._ 200 __..


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Start
Stop


2 LI;3322b3,1;20012699409H 1434
JAN 12 1 1726


2 LI:332263,1;20016546955H 1441
JAN 12 1 1992


2 LI:332263.1;200171853660V11441
JAN 12 197
3


2 LI;3322b3,1;2001g2716813 1441
JAN 12 1992


2 LI;332263, i 6294741 i 447
:2001 JAN 12 N 1 1786


2 LI:332263,1:200 7 i 854495V1488
T JAN 12 1 2361


2 LI;332263.1;2001JAN1271853577V11598
2510


2 LI;332263,1;2001JAN1271850908V11607
2470


2 LI:3322b3,1:2001g3147676 T 584
JAN 12 T 993


2 LI:332263.1:200171850788V 1595
JAN 12 1 2362


2 LI;332263,1;200172333294V 1589
JAN 12 1 2170


2 LI:332263,1:2001703117 465 989
JAN 12 i 1 D
1


2 LI;332263,1:2001JAN1270312113D1465 936


2 LI:332263.1;200170313820D 465 936
JAN 12 1


2 LI;332263. i 70313450D 465 930
;200 i JAN 12 1


2 LI :332263.1:200170314581 465 913
JAN 12 D 1


2 LI :332263,1:200170313195D 465 856
JAN 12 1


2 LI:332263.1;2001JAN1271853150V11231
1647


2 LI:3322b3,1:200171854540V 1264
JAN 12 1 2359


2 LI;332263.1:20014067866H 1254
JAN 12 1 1555


2 LI;332263.1:200171854017V11259
JAN 12 21 Ob


2 LI:332263.1; 71854180V 1276
200 T JAN 12 1 2074


2 LI:332263.1;200171852017V 1271
JAN 12 1 2193


2 LI;332263.1;2001JAN1271852613V11273
1952


2 LI:332263, i 1783069N 1299
:2001 JAN 12 1 1550


2 LI:332263.1:20017 7 854868V1304
JAN 12 7 1854


2 LI;332263.1;200171854722V11310
JAN 12 2040


2 LI; 332263.1:200171854441 13 i
JAN 12 V 1 6 2052


2 LI:3322b3.1:200171852902V 1318
JAN 12 1 2084


2 LI;332263.1:2001JAN12g5591690 1332
1638


2 LI;3322b3.1:2001689587H 1335
JAN 12 i 1587


2 LI:3322b3.1:200171850440V 1364
JAN 12 1 2213


2 LI;3322b3.1;2001JAN1271852044V11390
2459


2 LI;332263.1:2001g3988317 T 882
JAN 12 1992


2 LI;3322b3,1:20015852582H 1883
JAN 12 1 1992


2 LI;332263.1:2001JAN1271852773V11976
2805


2 LI:332263.1:2001g5544052 2019
JAN 12 2259


2 LI:3322b3.1.200171851901 2081
JAN i 2 V 1 2539


2 LI;332263.1:20016882753H 2112
JAN 12 1 2643


2 LI;3322b3.1;2001246477176 2148
JAN 12 2207


2 LI;3322b3,1;2001JAN1271855066V12148
2738


2 LI:332263.1:2001956688H 2203
JAN 12 1 2259


2 LI;3322b3.1:2001JAN122050914H1 2372
2632


2 LI;3322b3,1;20016275080N2 2429
JAN 12 2907


2 LI:3322b3.1:2001463120876 2496
JAN 12 2611


2 LI:332263.1;20014631208F6 2503
JAN 12 2640


2 LI:332263.1:20014631208H 2504
JAN 12 1 2668


2 LI;3322b3.1:2001267505576 2801
JAN 12 3292


2 LI:332263,1; 6882753) 2851
2001 JAN 12 1 3454


2 LI;332263.1:2001JAN12g3675996 2881
3330


201


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ iD Template ID Component
NO; 1C Start
Stop


2 LI:332263.1:2001JAN12g4738663 2987 3330


2 LI:3322b3.1:2001g5540864 3200 3330
JAN 12


2 LI;3322b3.1:2001JAN1271851008V11 1785
740


2 LI:332263.1; g 1933806 1159 1471
2001 JAN 12


2 LI:332263.1:2001g7456875 1179 1578
JAN 12


2 LI:332263.1;200171851136V 1177 2042
JAN 12 1


2 LI;3322b3.1:20017185 i 1183 2132
JAN 12 292V 1


2 LI:332263.1; 71854201 1205 1881
2001 JAN 12 V 1


2 LI :332263.1; 71516539V 1195 1890
2001 JAN 12 1


2 LI;3322b3.1:2001JAN122356586F6 1227 1487


2 LI;332263.1:20012356586H 1227 i
JAN 12 1 476


2 LI:3322b3.1;2001g6713066 832 1290
JAN 12


2 _ LI;3322b3.1:2001JAN125990242H1 835 1128


2 LI:332263.1:200171514509V 864 1666
JAN 12 1


2 LI:332263.1:20015803858H 909 1231
JAN 12 1


2 LI;3322b3.1:20014669412H 913 1126
JAN 12 1


2 LI:3322b3.1:2001744596271 925 1465
JAN 12


2 LI;3322b3.1:2001JAN1270314212D1939 1369


2 LI:332263.1;200171854789V 933 1798
JAN 12 1


2 LI:332263.1;200171853078V 1005 1905
JAN 12 1


2 LI;3322b3. i 3435614F6 76b 1303
;2001 JAN 12


2 LI:3322b3, l 3435614H 766 1
;2001 JAN 12 1 O
15


2 LI;3322b3.1:20012541284H 767 1020
JAN 12 1


2 LI :332263.1:2001g5878326 782 1244
JAN 12


2 LI:3322b3.1;2001g287834 822 1195
JAN 12 i


2 L1:3322b3.1:20011647392H 86 258
JAN 12 1


2 LI:3322b3.1:20018114707H 128 628
JAN 12 1


2 LI:332263.1:200145842386 213 63
JAN i 2 i


2 LI;332263.1:2001JAN12458423H1 213 495


2 LI:332263.1:20013166541 229 529
JAN 12 H 1


2 LI:3322b3.1:20011256781 251 527
JAN 12 H 1


2 LI:332263.1;20012464771 286 704
JAN i 2 F6


2 LI:3322b3.1:20012464771 286 540
JAN 12 H 1


2 LI:332263.1:2001g746839 347 536
JAN 12


2 LI:3322b3.1:20012495063F6 395 778
JAN 12


2 LI:332263.1:20012495063H 395 644
JAN 12 1 '


2 LI:3322b3.1;2001JAN1270313857D1435 936


2 LI:3322b3.1;200170313094D1435 884
JAN 12


2 LI:3322b3.1:200170313906D 465 838
JAN 12 i


2 LI:3322b3.1:200170311631 465 1079
JAN 12 D 1


2 LI:3322b3.1;200170314994D1465 988
JAN 12


2 LI :332263.1; 71854192V i 2375
200 T JAN 12 1 677


2 LI:3322b3.1:200171851234V 1644 2306
JAN 12 1


2 LI:3322b3.1:200171856454V 1875 2556
JAN 12 1


2 LI:3322b3.1:2001g3934367 1837 2269
JAN 12


2 LI:332263.1:2001JAN1271855032V11534 2237


2 LI:3322b3.1:200171851502V 1561 2209
JAN 12 1


2 LI:332263.1:200170313005D11126 1721
JAN 12


2 LI:3322b3.1:200 5327304H 540 825
i JAN 12 1


2 LI;3322b3.1;2001JAN1270311672D1542 986


__ -202 _ _-


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component Stop
NO: IC Start


2 LI;332263. T 70313719D 542 936
;2001 JAN 12 1


2 LI:332263.1;200170314533D1542 936
JAN 12


2 LI;332263.1:20016080132H 715 1204
JAN 12 1


2 LI;332263.1:2001JAN124591037H1 736 1000


2 LI:332263.1:2001466711 745 936
JAN 12 OH 1


2 LI:3322b3.1;20015289580H 759 1026
JAN 12 1


2 LI;3322b3.1:2001JAN1270313461 676 1096
D1


2 LI :332263.1:200170314649D 685 1096
JAN 12 1


2 LI;3322b3.1:2001JAN1270314563D1692 1096


2 LI;332263.1;2001JAN1270313370D1692 1095


2 LI;332263.1:2001JAN1270315127D1710 1096


2 LI;332263.1:2001JAN1270312983D1709 936


2 LI:3322b3.1:20017 i 85629113812033
JAN 12 V 1


2 LI:3322b3.1;200170315216D 14071978
JAN 12 T


2 LI;3322b3.1;2001JAN122495063T6 15501946


2 L1;332263.1:2001JAN12g3701266 15531989


2 LI;332263.1:2001g3896306 15602002
JAN 12


2 LI;3322b3.1:2001JAN12g3896000 15781998


2 LI :332263.1:2001458423 15282231
JAN 12 F 1


2 LI:3322b3.1:200171852936V 7 781
JAN 12 1


2 LI:332263.1;2001JAN126022701 1 596
F7


2 L1:332263.1:20016022701 1 190
JAN 12 H 1


2 LI:332263.1;200171850609V111 601
JAN T 2


2 LI;332263.1;2001JAN122675055F6 11 403


2 LI:332263.1;20012675055H 11 276
JAN 12 1


2 LI:332263.1;20012677085H 11 251
JAN 12 1


2 LI:332263.1:20017062834H 30 594
JAN 12 1


2 LI;3322b3.1:20011647392F6 75 511
JAN 12


2 LI :332263.1:2001g6452151 17962260
JAN 12


2 LI :332263.1:2001g6470559 18452259
JAN 12


2 LI:332263.1;200171853046V 15432264
JAN 12 1


2 LI;332263.1:2001JAN1270315117D111261677


2 LI;332263. T 70313508D 11261405
:2001 JAN 12 1


2 LI;332263. T 70314381 11261404
:2001 JAN 12 D i


2 LI:332263.1:2001703 i 3034D465 7b8
JAN 12 1


2 LI:332263.1:20013033484H 465 752
JAN 12 1


2 LI:332263.1:20017031 T 466 856
JAN 12 789D 1


2 LI;332263.1:2001JAN1270313425D1466 842


2 LI:332263.1;200170313899D1470 936
JAN 12


2 LI:3322b3.1;2001JAN1270311446D1465 909


2 LI:332263.1:200170313753D1472 855
JAN 12


2 LI;332263.1:20015039112H2 488 732
JAN 12


2 LI :332263.1:200170311908 536 936
JAN 12 D 1


2 LI;332263.1:200 g2820485 539 760
T JAN T 2


2 LI:332263. i g 1933745 17762260
:200 T JAN 12


2 LI:332263.1:200171853831 17002148
JAN 12 V 1


2 LI;332263.1:2001g4332511 17011992
JAN 12


2 LI:332263. T 4666990H 17101998
:2001 JAN 12 1


2 LI;332263.1;2001JAN123267078H1 17112035


2 LI:332263.1;200171850938V 17532686
JAN 12 1


_ _ 203____ _-_


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component StartStop
NO: IC


2 LI;332263.1;200171856550V 16282425
JAN 12 1


2 LI:332263.1:20017 i 8546 16622017
JAN 12 i 8V 1


2 LI:332263.1:200171854970V 17292251
JAN 12 1


2 LI :332263.1:200171853990V 19072079
JAN 12 1


3 LI:333886.4;2001JAN1255098527H1 1 795


3 LI;333886.4:200155138862H 57 475
JAN 12 1


3 LI;333886.4:200155138870) 56 850
JAN 12 1


3 LI:333886.4:200155138962H 57 727
JAN 12 1


3 LI:333886.4; 55138986H 58 751
2001 JAN 12 1


3 LI:333886.4;200155138954H 92 777
JAN 12 1


3 LI;333886.4:200155138878H 92 862
JAN i 2 1


3 LI:333886.4:200155138886H 92 811
JAN 7 2 1


3 LI;333886.4:200155138970) 106737
JAN 12 1


3 LI;333886.4;2001JAN122932976H1 116386


3 LI;333886.4:2001g6704745 150703
JAN 12


3 LI:333886.4:200 7077452H 232815
i JAN i 2 1


3 LI:333886.4:200155138894H 329522
JAN 12 1


3 LI:333886.4:2001g4734742 477952
JAN 12


3 LI;333886.4:2001JAN121549488H1 505717


4 LI;478508, i 4614387H 175420
.200 i JAN 12 1


4 LI:478508.1:20015692056H 180256
JAN 12 1


4 LI:478508.1:20016247423H 124592
JAN 12 1


4 LI:478508.1:20016737711 195275
JAN 12 T8


4 LI;478508.1:20018031730)2 1 707
JAN 12


4 LI:478508.1:20012532826H 76 256
JAN 12 1


4 LI:478508.1:20018031592)2 112709
JAN 12


4 LI:478508.1:2001200796586 286471
JAN 12


4 LI;478508.1:2001200796576 331778
JAN 12


4 LI:478508.1:2001g5850995 373797
JAN 12


4 Ll:478508.1:2001g2953720 405798
JAN 12


4 LI:478508.1:2001g7280553 435794
JAN 12


4 LI:478508.1:2001g5446115 609800
JAN 12


4 LI;478508.1:20012007965H 212413
JAN 12 1


4 LI;478508.1:200iJAN124617844F6 264769


4 LI;478508.1:200146 i 7844H 264516
JAN 12 1


4 LI :478508.1:20015449985H 197256
JAN 12 1


LI:307470.1:200155025674H 448963
JAN 12 1


5 LI; 307470.1:200155025674) 445962
JAN 12 1


5 LI:307470. i g2881602 443626
:2001 JAN 12


5 LI:307470. 7 6885575) 431544
:2001 JAN 12 1


5 LI:307470.1:20018053491 1 523
JAN 12 J 1


b LI :058298.1:200171870273V 10831678
JAN 12 1


b ~ LI:058298.1:2001531491 OH 10831331
JAN 12 1


b LI; 058298.1:20011698381 6701228
JAN 12 T6


6 LI:058298.1:20011698381 417915
JAN 12 Fb


6 LI:058298.1:2001g2539246 558852
JAN 12


6 LI:058298.1:200155067990H 389682
JAN 12 1


6 LI;058298.1;200155067990) 389681
JAN 12 i


6 LI:058298.1:200155068290) 1 681
JAN 12 1


6 LI:058298.1:2001550682~96J 32 680
JAN 12 1


204 -. _ .


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO. IC Start
Stop


b LI;058298.1;200155068289H 114 681
JAN 12 1


6 LI;058298.1;200155068294J 23 679
JAN 12 1


6 LI;058298.1;200155068295J 43 679
JAN 12 1


6 LI:058298.1:200155067993J 22 679
JAN 12 1


b LI:058298.1;200155068292H 1 679
JAN 12 1 Ob


b LI;058298.1:200155067989J 5b 679
JAN 12 1


6 LI:058298.1; 55068293J 1 677
2001 JAN 12 1


6 LI:058298.1;200155068291 1 674
JAN 12 J 1


6 LI; 058298.1:20071698381 417 633
JAN 12 H 1


7 LI;205527.5;2001JAN122111743H1 1 270


8 LI :231587.1:2001817376H 445 621
JAN 12 1


8 LI:231587.1:2001147 4044H 1 127
JAN 12 1


8 LI;231587.1;20011414796H 1 225
JAN T 2 1


8 LI;231587.1:20011414796F6 1 127
JAN 12


8 LI;231587.1;20015773911 145 670
JAN 12 T8


8 LI:231587.1:2001141479676 441 667
JAN 12


9 LI;402919.1;2001g2896813 1 2223
JAN 12


9 LI;402919.1:2001g5106977 384 1595
JAN 12


9 LI;402919.1;20016424763F8 436 965
JAN 12


9 LI:402919.1;20015758845F8 519 904
JAN 12


9 LI :402919.1.20017159011 819 1161
J AN 12 H 1


9 LI;402919.1;2001g 1957910 977 1428
JAN 12


9 LI:402919.1;20016767163) 1099 1717
JAN 12 1


9 LI:402919.1:2001g3090058 1343 1763
JAN 12


9 LI;402919.1;2001JAN125450865H1 1361 1616


9 LI :402919.1:200171581490V 1361 2062
JAN 12 1


9 LI;402919.1;200171475285V 1361 1697
JAN 12 1


9 LI :402919.1:200171603157V 1361 1543
JAN 12 1


9 LI;402919.1;2001JAN1271582280V11361 2033


9 LI:402919.1;200171558462V 1361 1814
JAN 12 1


9 LI;402919.1:20015450865F6 1361 1658
JAN 12


9 LI;402919.1;200171583969V11450 1610
JAN 12


9 LI;402919.1;200171583436V 1454 i
JAN 12 1 636


9 LI:402919.1:2001545086576 1580 2083
JAN 12


9 LI:402919.1:200171582366V 1593 2105
JAN 12 1


9 LI:402919.1;2001JAN125310987H1 1596 1855


9 LI;402919.1;200171582549V11611 2118
JAN 12


9 LI:402919.1:200171582121 1609 2107
JAN 12 V 1


9 LI :402919.1; 71584255V 1658 2105
2001 JAN 12 1


9 LI:402919.1:2001g 1757400 1883 2213
JAN T 2


9 LI;402919.1;20017239306H 1923 2169
JAN 12 1


9 LI:402919.1:2001575884578 1961 2049
JAN 12


i 0 LI;463283.1:200 g67 i 2118T 355
i JAN i 2


LI:463283.1; 4211831 260 393
2001 JAN 12 H 1


10 LI;463283.1;20014211831 261 934
JAN T 2 F8


10 LI:463283.1:20014211831 286 860
JAN 12 T8


11 LI :072560.1; 544903 262 499
2001 JAN 12 H 1


11 LI:072560.1;2001707 i 5970V1478 i
JAN 12 i 594


11 LI:0725b0.1:200170715782V 1473 1594
JAN 12 1


1 i LI:072560.1;200170720015V 1083 1590
JAN 12 1


_ -~OS_____


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SE6~ Template ID Component
ID NO: IC Start
Stop


11 LI;072560.1:2001JAN127383278H1 1160 1563


11 LI:072560.1:2001JAN1270720947V11054 1522


11 LI:072560.1:20013461875H 1445 1502
JAN 12 1


11 LI;072560.1;2001JAN12707183b1V1882 1213


i i LI;072560. i 6888979H 885 1140
:200 i JAN i 1
2


11 LI:0725b0.1:20017071858 8b0 1140
JAN 12 7 V 1


11 LI:0725b0.1:200170721033V 877 1140
JAN 12 1


11 LI:072560.1:200170715974V 897 1140
JAN 12 1


11 LI:0725b0.1:200170716831 498 1130
JAN 7 2 V 1


11 LI:072560.1;200170719723V 484 1128
JAN 12 1


11 LI:072560.1:200170716832V 478 1088
JAN 12 1


11 LI:072560.1:2001729056588 879 1011
JAN 12


11 LI;072560.1:2001729056586 797 1011
JAN 12


11 LI:072560.1;2001g45b5346 2337 2584
JAN 12


11 LI:072560.1; 6078609H 2428 2584
2001 JAN 12 1


11 LI;072560.1:2001g2556167 2200 2558
JAN 12


11 LI;072560.1:200182223594 2201 2480
JAN 12


11 LI;072560.1:20018051578) i 2442
JAN 12 1 917


11 LI:0725b0.1:200170b47879V 2216 2387
JAN 12 1


11 LI:072560.1;2001215347276 1780 2362
JAN 12


11 LI;0725b0.1:2001JAN12282861876 1777 2350


1 i LI:072560.1;20012153472Fb 1701 2201
JAN 12


11 LI;0725b0.1:2001JAN1270717413V11793 2198


11 LI;072560.1;200170720711 1698 2084
JAN 12 V 1


11 LI;0725b0.1:200170715804V 1477 2072
JAN 12 7


11 LI:072560.1;200i5182174H2 1955 2055
JAN 12


11 LI;072560.1;200170720554V 1456 2013
JAN 12 1


11 LI:072560.1;200 70717671 1683 1912
i JAN 12 V 1


11 LI;072560.1;2001JAN1282000573 1573 1901


11 LI:072560.1;200170717070V 1464 1899
JAN 12 1


1 i LI:072560, i 70716090V 1683 1895
;200 i JAN 12 1


11 LI:072560.1:200170711495V 1734 1859
JAN 12 1


11 LI:072560.1;20012153472H 1701 1810
JAN 12 1


1 T LI:072560.1;200170647864V 1439 1746
JAN 12 1


71 LI:0725b0.1;200170715717V i 1746
JAN 12 1 441


11 LI :072560.1;200170712315V 913 1633
JAN 12 1


11 LI;072560.1:200170720508V11052 1620
JAN 12


11 LI:0725b0.1; 70718736V 1439 1608
2001 JAN 12 1


11 LI:072560.1:200170715911 1473 1594
JAN 12 V 1


11 LI:0725b0.1:200170716279V 1384 1594
JAN 12 1


11 LI:0725b0.1:2001729076588 880 1008
JAN 12


11 LI:072560.1:200170716585V 372 877
JAN 12 1


11 LI:072560.1:200170716866V 372 828
JAN 12 1


11 LI:072560.1:2001JAN122828b18Fb 372 859


11 LI:072560.1:20012828618H 372 674
JAN 12 1


11 LI:072560.1:2001550130586 1 440
JAN 12


12 LI: i 953096.1:2001639 i 308781 620
JAN 12


13 LI:1076016.1:20018186651 298 966
JAN 12 H 1


13 LI:1076016.1:200171934981 362 811
JAN 12 V 1


13 LI: T 07601 b.1;20017 i 924015V362 825
JAN 12 i


_ _ ._ _. ___ ______,
206


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SE6~ Template ID Component Stop
ID NO: IC Start


13 LI:1076016.1:20017463085H 1 592
JAN 12 1


13 LI;1076016.1;2001g5100484 32 221
JAN 12


T 3 LI:107b016.1;20018581 Ob96 42 222
JAN 12


13 LI;107b01 b. 8015729) 223 820
T ;2001 JAN 1
12


13 LI:107601 b. 8054154) 223 699
7 :2001 JAN 1
12


13 LI:1076016.1:20017608984) 225 782
JAN 12 1


14 LI:2082796.1:2001827317771 1 591
JAN 12


T 5 LI; 335681, 3:200171652050V 1 652
JAN 12 1


15 LI:335681.3:200171657465V 2 583
JAN 12 1


15 LI:335681.3:20013171186F6 1 420
JAN 12


15 LI:335681.3;2001JAN123168910H1 1 198


15 LI;335681,3;2001JAN122820819H1 50 380


15 LI:335681,3:20015027433H 346 555
JAN 12 1


15 LI:335681.3;20013171 T 375 862
JAN 12 8676


15 LI:335b81, 3; 2013111 405 621
2001 JAN 12 N T


15 LI;335681.3:2001g4306593 452 901
JAN 12


15 LI;335681,3:2001g680951 487 724
JAN 12


15 LI .335681.3:20013934551 521 801
JAN 12 H 1


15 LI;335681.3:20013934519H 522 808
JAN T 2 1


T 5 LI:335b81, 3:200128 T 1041 583 857
JAN 12 Tb


15 LI :335681.3:200170623951 1 531
JAN 12 V 1


15 LI:335681.3;200171657365V 1 580
JAN 12 1


15 LI;335681.3:2001JAN1271653874V11 514


15 LI;335681,3:200170621934V 1 437
JAN T 2 1


15 LI;335b81,3;2001JAN123171186N7 1 198


15 LI:335b81.3:20017 T 597437VT 639
JAN 12 1


15 LI:335681.3:20012811041 590 896
JAN 12 F6


15 LI :335681.3;20012811041 590 873
JAN 12 H 1


15 LI:335681.3:20015870171 632 917
JAN 12 H 1


15 LI:335681.3:2001g3922387 734 1165
JAN 12


15 LI:335681,3;2001g3785451 796 11
JAN 12 T
5


16 LI:214150.1;2001g 1733190 602 718
JAN 12


16 LI:214150.1:20016890523H 605 703
JAN 12 1


16 LI :214150,1:200171984167 1 595
JAN T 2 V 1


16 LI:2 T 4150.1:200171984038V 159 703
JAN T 2 1


16 LI:214150. T 4646294F6 422 703
:2001 JAN 12


16 LI;214150.1;20014646294H 422 689
JAN 12 1


16 LI:214150.1:20013703252F6 495 703
JAN 12


16 LI:214150.1;20013703252H 495 717
JAN 12 1


i 7 LI:322783.15:20017208772H T 1
JAN 12 1 T
5


17 LI:322783.15:20017609888) 131 685
JAN 12 1


17 LI;322783.15:20012650602H 184 425
JAN 12 1


17 LI:322783.15; 2763906H 195 433
2001 JAN 12 1


17 LI;322783,15:2001JAN12720785878 317 765


17 LI:322783.15;20017154586H 1 527
JAN 12 1


T7 LI:322783.i5;2001JAN127206356Hi T 490


17 LI:322783.15:20017207535H 3 379
JAN T 2 1


17 LI;322783.15:2001JAN127209140H1 1 622


17 LI;322783,15;2001JAN127209489H1 1 618


17 LI:322783.15;20017210871 1 603
JAN 12 H 1


207 -__ .




CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Start
Stop


17 LI:322783. 7 7b85079H T 550
5:2001 JAN 12 1


17 LI:322783.15;200169b4009H 1 560
JAN 12 1


18 LI;422993.1;20016310043H 1 415
JAN 12 1


18 LI;422993.1:200163 i 0340F81 394
JAN 12


18 LI:422993.1:20014722590H 52 213
JAN 12 1


18 LI:422993.1:2001631004378 205 700
JAN 12


18 LI:422993.1:2001809424bH 542 1031
JAN 12 i


18 LI:422993.1:20017594841 808 1359
JAN 12 H 1


18 LI;422993.1;200TJAN12g663999i 1013
1429


18 LI:422993.1:2001g7457143 1021
JAN 12 1425


18 LI:422993.1:2001g7320380 i 064
JAN i 2 1434


19 LI:1172885.1:2001192348876 1 O 1
JAN 12 575


19 LI;1172885.1;2001476809476 42 599
JAN 12


19 LI:1172885.1:20014874348H 194 468
JAN 12 1


19 LI: i 172885.1:2001192329676 203 574
JAN i 2


19 LI:1 T 72885.1;20015085341 204 638
JAN 12 F6


19 LI:1172885.1:2001192348886 1 438
JAN 12


19 LI:1172885.1:2001192329bRb T 389
JAN 12


19 Li:1 T 72885.1.2001i 92329bH i 299
JAN 12 1


19 LI:1172885.1;20011923488H 1 280
JAN 12 1


19 LI:1172885.1:2001576090878 24 486
JAN 12


19 LI:1172885.1:20015085341 182 384
JAN 12 H 1


19 LI:1172885.1:200119141 ObH 242 502
JAN 12 1


20 LI:1088359.1:20013956181 1975
JAN 12 H 1 2272


20 LI:1088359.1:20016779755F6 1953
JAN 12 2564


20 LI:1088359.1:20013b03043H 197 i
JAN 12 1 2277


20 LI;1088359.1:20015309011 1972
JAN 12 H 1 2212


20 LI:1088359.1:2001220139676 1496
JAN 12 2090


20 LI:1088359.1:20071250004V 1516
T JAN 12 1 2177


20 LI :1088359.1.2001710b3424V 1205
JAN 12 1 1871


20 LI: i 088359.1:200171 Obb7b9V1166
JAN 12 1 1771


20 LI:1088359.1:200171063893V 1146
JAN 12 1 1697


20 LI:1088359.1:20013169092H 1948
JAN 12 1 2220


20 LI:1088359.1:20071834740V 1198
T JAN 12 1 2149


20 LI:1088359.1:200b532b50H 1247
T JAN 12 i 1746


20 LI:1088359.1:200171249332V 193 T
JAN 12 1 2620


20 LI:1088359.1;200171064441 1780
JAN 12 V 1 2297


20 L!:1088359.1:200171249705V 1822
JAN 12 1 2360


20 Li:1088359.1:2005203628 184 i
T JAN 12 H 1 2141


20 LI:1088359.1;2001JAN12710b5750V11710
2238


20 , LI:1088359.1;20012503778F6 1732
JAN 12 22b4


20 LI:1088359.1:20012503778H 1732
JAN 12 1 2000


20 LI:1088359.1:200186192bH 1484
JAN 12 1 1742


20 LI:1088359.1:20017 7 83b277V1133
JAN 12 1 2130


20 LI:1088359.1:20017 T 835532V1705
JAN i 2 1 2334


20 LI:1088359.1:20017183488bV 1627
JAN 12 T 2541


20 LI;1088359.1;20016702383H 1135
JAN 12 1 17b3


20 LI:1088359.1:2001JAN1271249827V11084
1780


20 LI:1088359.1:2006023487H 1453
i JAN 12 1 1758


20 LI:1088359.1;20014597863H 1455
JAN 12 1 1710


_ _ _ _ 208 _ _


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Start
Stop


20 LI;1088359.1:20017 T 834883V1410 2328
JAN 12 1


20 LI;1088359.1:20011 b 17721 1424 1890
JAN 12 F6


20 LI:1088359.1:20011 b 17721 1424 1597
JAN 12 H 1


20 LI ;1088359.1:200171835736V 1479 2201
JAN 12 1


20 LI:1088359.1;20014550412H 1600 1865
JAN 12 1


20 LI: T 088359. 5216519H 1621 1769
T ; 2001 JAN 1
12


20 LI:1088359.1:200171250431 1632 2303
JAN 12 V 1


20 LI;1088359.1:200171835228V 1039 1291
JAN 12 1


20 LI;1088359.1:2001JAN122762282H1 1069 1323


20 LI.1088359.1;200171063570V 934 1567
JAN 12 1


20 LI :1088359.1:200171837048V 917 1919
JAN 12 1


20 LI;1088359.1;200171063612V11342 1951
JAN i 2


20 LI;1088359.1:200171837212V 1390 2090
JAN 12 1


20 LI:1088359.1:200171839966V 1279 1781
JAN 12 1


20 LI:1088359.1:20012790964H 1303 T
JAN 12 1 627


20 LI;1088359.1:200171063820V 1328 1970
JAN 12 1


20 LI:1088359.1;200171063338V i 2064
JAN 12 1 340


20 LI; T 088359.1.200171838280V 1326 1764
JAN 12 1


20 LI:1088359.1;200171838522V 2027 2636
JAN 12 1


20 LI:1088359.1;20014539127H 2042 2325
JAN 12 1


20 LI;1088359.1:200171065280V 1566 2201
JAN 12 1


20 LI:1088359.1:2001220138876 1573 2089
JAN 7 2


20 LI:1088359.1:200171248778V 1555 2067
JAN 12 1


20 LI;1088359.1:20013346980H 1272 1572
JAN 12 1


20 LI :1088359.1; 71065 i 1260 1972
2001 JAN 12 b 1 V
T


20 LI:1088359.1:20016423133H 889 1467
JAN 12 1


20 LI:1088359.1;200171249545V 849 1498
JAN 12 1


20 LI:1088359, i 71 Obb585Vi 602
:2001 JAN 12 1


20 LI:1088359.1:20014059690H 1 256
JAN 12 1


20 LI:1088359.1:20014059690F6 1 211
JAN 12


20 LI:1088359.1:2001649075486 89 647
JAN 12


20 LI:1088359.1; 649075489 109 644
2001 JAN 12


20 LI:1088359.1:2001721 b2bbH 158 715
JAN 12 1


20 LI:1088359. T 71835294V 165 1068
:2001 JAN 12 1


20 LI;1088359.1;20014791776F8 1 790
JAN 12 b5


20 LI :1088359.1:2001.4791776H 1 442
JAN 12 1 b5


20 LI :1088359.1:200171834608V 165 324
JAN 12 1


20 LI:1088359.1:200171835333V 199 1082
JAN 12 1


20 LI :1088359.1;20015923409H 259 577
JAN 12 1


20 LI;1088359.1;200171066229V 279 969
JAN 12 1


20 LI: i 088359. 6472286H 338 968
i :2001 JAN 1
12


20 LI ;1088359.1; 71 Ob5422V341 972
2001 JAN 12 1


20 LI:1088359.1;200171064427V 358 900
JAN 12 1


20 Lf:1088359.1;200171836729V 447 861
JAN 12 1


20 LI;1088359.1:200171835921 462 1128
JAN 12 V 1


20 LI:1088359.1:200171838277V 463 900
JAN 12 1


20 LI:1088359.1;200171064989V 476 1171
JAN 12 1


20 LI:1088359.1:200171 Ob487bV504 1115
JAN 12 1


20 LI:1088359.1:200171837042V 507 1363
JAN 12 1


20 LI:1088359.1:200171835214V 531 1286
JAN 12 1


_ __ _ -209 - _


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEES Template ID Component
ID NO; IC Start
Stop


20 LI;1088359.1:200171063437V 520 1234
JAN 12 1


20 LI;1088359.1:200171065765V 551 1211
JAN 12 1


20 LI;1088359.1:200171063360V 577 1282
JAN 12 T


20 LI: 1 088359.1:20017 l 837470V599 1244
JAN 12 1


20 LI:1088359.1;200171835745V1625 1324
JAN 12


20 LI:1088359.1:200171 Ob6606V633 1393
JAN 12 1


20 LI:1088359.1;200171066167V1638 1330
JAN 12


20 LI:1088359.1; 71834968V 645 1276
2001 JAN 12 1


20 LI:1088359.1:200171835067V1645 1284
JAN 12


20 LI:1088359.1;200171834934V 647 1346
JAN 12 1


20 LI;1088359.1;2001JAN1271249028V1672 1340


20 LI;1088359.1:200171835549V 664 1305
JAN 12 1


20 LI;1088359.1:20071835748V 668 1328
1 JAN 12 1


20 LI;1088359. 1 3778 T 684 997
;2001 JAN 12 42H 1


20 LI:1088359.1:200171837689V 691 1508
JAN 1 2 1


20 LI;1088359.1;200171834711 692 1224
JAN 12 V 1


20 LI;1088359.1;200171835154V 694 1329
JAN 12 1


20 LI;1088359.1:200171836394V 698 1369
JAN 12 1


20 LI ;1088359.1.200171834554V 699 1328
JAN 12 1


20 LI;1088359.1;20015379413H 703 968
JAN 12 1


20 LI;1088359.1:2001712493 716 1449
JAN 1 2 1 8V 1


20 LI: T 088359.1; 71064614V 729 1417
2001 JAN 12 1


20 LI ;1088359.1:200171836396V 729 1554
JAN 12 1


20 LI;1088359.1;200170868279V1828 1528
JAN 12


20 LI;1088359.1:20012201396F6 811 1250
JAN 1 2


20 LI:1088359.1:20012201396H 811 1104
JAN 12 1


20 LI:1088359.1;20014725513H 2024
JAN 12 1 2310


20 LI:1088359.1:20016779755H 2013
JAN 12 1 2582


20 L1:1088359.1:20015512450H 1252
JAN 12 1 1497


20 LI:1088359.1; 7183711 1 O 10
2001 JAN 12 OV 1 1782


20 LI:1088359.1; 71064542V 1 548
2001 JAN 12 1


20 LI:1088359.1:2005335469H 2005
1 JAN 12 1 2242


20 LI:1088359.1; 5335451 2005
2001 JAN 12 H 1 2244


20 LI:1088359.1:200171063814V 971 1667
JAN 12 1


20 LI:1088359.1; 71249963V 15 T
2001 JAN T 2 1 6 2117


20 LI;1088359. T 71063801 936 1572
; 2001 JAN 12 V 1


20 LI:1088359.1:2001g 1297935 930 1156
JAN 12


20 LI ;1088359.1:200171063143V 936 1615
JAN 12 1


20 LI; 1 088359.1:20017 1 836240V1200
JAN 12 1 2060


20 LI:1088359.1:200171835981 2106
JAN 12 V 1 2665


20 LI:1088359.1:20017100805H 2108
JAN 12 1 2581


20 LI :1088359.1:200145504127 2113
JAN 12 1 2637


20 LI: 1 088359.1:200116 1 772 2123
JAN 12 1 Tb 2672


20 LI:1088359.1:2001250377876 2164
JAN 12 2661


20 LI:1088359.1:2001405969076 2170
JAN 12 2633


20 LI:1088359.1:20015897856H 2170
JAN 12 1 2444


20 LI:1088359.1:2001JAN125897857H1 2170
2442


20 LI:1088359.1:2001g6300913 2202
JAN 12 2633


20 LI:1088359.1:200186576959 2222
JAN 12 2633


20 Li:1088359.1:2001848521 2226
JAN 12 b9 2633


___ _ -210 _____


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Start
Stop


20 LI:1088359.1:20012230817F6 2259
JAN 12 2765


20 LI:1088359.1:20012230817H 2259
JAN 12 1 2511


20 LI:1088359.1;20012585714F6 2262
JAN 12 2817


20 LI .1088359.1:20012585714H 2262
JAN 12 1 2537


20 LI;1088359.1;20015691260H 2263
JAN 12 T 2530


20 LI;1088359.1;2007350573H 2263
T JAN 12 1 2709


20 LI ;1088359.1; 5084 T 2267
200 7 JAN 12 56H 1 252
T


20 LI:1088359.1:2001258571476 2309
JAN 12 2811


20 LI :1088359.1:20017259781 2314
JAN 12 T6 2519


20 LI;1088359.1:2001g 1267511 2327
JAN 12 2705


20 LI :1088359.1:20011422186H 2348
JAN 12 1 2595


20 LI:1088359.1;2001421986H 2348
T JAN T 2 1 2559


20 LI:1088359.1:20013599245H 2353
JAN 12 1 2633


20 LI:1088359.1:2001g6462695 2370
JAN 12 2849


20 LI:1088359.1:2001g4970568 2371
JAN 12 2846


20 LI;1088359.1:20011311825H 2369
JAN 12 1 2628


20 LI:1088359.1;2001g3770489 2387
JAN 12 2852


20 LI:1088359. T g4264897 2429
:2001 JAN T 2848
2


20 LI:1088359.1;2001g2969595 2434
JAN 12 2633


20 L1:1088359.1:2001g3430805 2437
JAN 12 2849


20 LI:1088359.1;2001g3758006 2452
JAN 12 2828


20 L1:1088359.1:2001g3755236 2458
JAN 12 2846


20 LI:1088359.1;2001g727891 2461
JAN 12 b 2848


20 LI:1088359.1:2001g3298627 2466
JAN 12 2625


20 Li:1088359.1:20013424273H 2493
JAN 12 1 2633


20 LI;1088359.1;2001g3191621 2514
JAN 12 2847


20 Ll: T 088359.1:200182910683 2519
JAN 12 2770


20 LI;1088359.1:2001241452H 2543
JAN 12 1 2637


20 LI:1088359.1.200183134539 2641
JAN 12 2850


20 LI:1088359.1:200186035581 2726
JAN 12 2846


21 LI:813422.1:2001894136H 24 202
JAN 12 1


21 LI;813422.1;2001JAN3685359F6 46 515
12


21 LI:813422.1:20012497517F6 93 592
JAN 12


21 LI :813422.1:20012497517 93 413
JAN 12 H 1


21 LI:813422.1:2001701 67792V98 520
JAN 12 1


21 LI :813422.1:20017992231 104 554
JAN 12 H 1


2 T LI:813422.1:200170168921 131 65
JAN T 2 V T T


21 LI;813422.1:20018124465H 11 671
JAN 12 1


21 LI; 813422.1:200170164598V 1430
JAN 12 1 1890


21 LI:813422.1:2001g 1485371 1771
JAN 12 1911


21 LI;813422.1:2001JAN126593962F8 1 169


21 LI;813422.1:2001JAN126593962H1 1 169


21 LI :813422.1:20017978969H 11 650
JAN 12 1


21 LI:8 T 3422, 89359 T 25 315
i :2001 JAN H i
T 2


21 LI:813422.1:20018102680H 45 624
JAN 12 1


21 LI:813422. T 3685359H 46 358
:2001 JAN 12 1


21 LI;813422.1;2001JAN1284689970 203 625


21 LI;813422.1:2001JAN12368535976 278 556


21 LI:813422.1:2001249751776 380 534
JAN 12


21 LI:813422.1:200 70165351 407 888
T JAN 12 V 1


_ _ -21 i _ - ____


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO: IC Start
Stop


21 Li;813422, i 70169649V1392 534
:2001JAN 12


21 Li:8 i 3422.1; 70165894V 426 932
2001 JAN i 2 1


21 LI:813422.1; g4074539 413 534
2001 JAN 12


21 LI : 813422.1; 70166959V 491 1
2001 JAN 12 1 O
10


21 LI ; 813422.1; 7016631 505 984
2001 JAN 12 OV 1


21 LI;813422,i;2001JAN1270169334V1600 1080


21 LI:813422.1;200170168807V1731 1199
JAN 12


21 LI:813422.1:200170165598V 822' 1303
JAN 12 1


21 LI ; 813422. 70166092V 869 1371
i :2001 JAN 1
12


21 LI; 813422.1:2001l 006417 967 1150
JAN 12 H 1


21 LI:813422.1;200170165798V 998 1514
JAN 12 1


21 LI :813422.1; 70164520V 1059 1493
2001 JAN 12 1


21 LI :813422.1:20017016606 1116 1651
JAN 12 7 V i


21 LI:813422.1:20016526780H 1118 1714
JAN 12 1


21 LI :813422.1; 70168451 1137 1627
2001 JAN 12 V 1


21 LI:813422.1;20015263856H2 1149 1244
JAN 12


21 LI :813422.1:20013338768H 1244 1485
JAN 12 1


21 LI :813422.1:200170166692V 1277 1766
JAN 12 1


21 LI:813422.1; 466542H 1322 1470
2001 JAN 12 1


21 LI :813422.1; 6433091 1331 1735
200 i JAN 12 H 1


21 LI:813422.1; 6433091 1413 1735
2001 JAN 12 T8


22 LI:118b426.1:200403060276 1584 7
7 JAN 12 906


22 LI :1186426.1:20017068262H 1789 1906
JAN 12 i


22 LI:1186426.1:2001JAN121749930H1 1799 1906


22 LI:1186426.1:2001392992576 1513 1906
JAN 12


22 LI;1186426.1;2001376599276 1529 1906
JAN 12


22 LI:1186426.1;20013973648H 1445 1721
JAN 12 1


22 LI :1186426.1:2001263999576 1563 1906
JAN 12


22 LI:1 i 86426.1:2001414084679 26 550
JAN 12


22 LI:1186426.1:20017254119H 491 935
JAN 12 1


22 LI:1186426.1;20013280090F7 1697 1906
JAN 12


22 LI:1186426.1:2001g389786 1040 1462
JAN 12


22 ~ LI ;1186426.1:20011269724F 1117 1602
JAN 12 1


22 LI :1186426.1:20016934544H 1361 1875
JAN 12 1


22 LI :1186426.1:2001111294F 1503 i
JAN 12 1 906


22 LI:1186426.1:20013418022H2 1525 1769
JAN 12


22 LI:1186426.1;20011309523H 961 1135
JAN 12 1


22 LI:1186426.1:2001g3281713 i 1463
JAN 12 T
05


22 LI:1186426.1:200111129476 1556 1906
JAN 12


22 LI;118b426.1:2001g6476089 1689 1906
JAN 12


22 LI:1186426.1:20011269724H 1117 1324
JAN 12 i


22 LI :1186426.1:20013236461 424 641
JAN 12 H 1


22 LI :1186426.1.20012639995H 830 1076
JAN 12 1


22 LI:1186426.1;20014326477F6 1 381
JAN 12


22 LI : i 186426.1:20014123479H 1222 1457
JAN 12 1


22 LI :1186426.1;20014729333H 1304 1392
JAN 12 1


22 LI:1186426.1:2001g2823562 1641 1906
JAN 12


22 LI:1186426.1:2001g2162375 725 915
JAN 12


22 LI:1186426.1:20016420281 653 1184
JAN 12 F7


22 L!:1186426.1:20016l 66536F8739 1220
JAN 12


.-__ X12 -.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEA ID Template ID Component
NO: IC Start
Stop


22 LI;118b426.1:20012639995F6 830 1347
JAN 12


22 LI;1186426.1:20015089719H 1005 1291
JAN 12 1


22 LI;1186426.1:2001g 1158071 1771 1906
JAN 12


22 LI:1186426.1;2001g5394522 1629 1906
JAN 12


22 LI:1186426.1;2001g4308651 1731 1906
JAN 12


22 LI:118b42b.1;20017278738H 1026 1590
JAN 12 1


22 LI;1186426.1;20011396166H 1252 1508
JAN 12 1


22 LI;118642b.1:20017081835H 125 677
JAN 12 1


22 LI;118b42b.1:20011946252H 961 1173
JAN 12 1


22 LI. T 186426.1;20012964009F6 1432 1743
JAN 12


22 LI:118642b.1;20012964009H 1433 1733
JAN 12 i


22 LI:1186426.1;20018016538) 650 1137
JAN 12 1


22 LI: i 186426.1:2001707351 i 689
JAN 12 OH 1 i
9


22 LI;118642b.1:200i6594866H2 204 442
JAN 12


22 LI;118b42b.1; 725411988 491 1055
200 i JAN 12


22 LI:1186426.1:20016405368H 315 599
JAN 12 1


22 LI .1186426.1:2001g 1994599 1754 1906
JAN 12


22 LI:118642b.1:20012639996F6 830 1218
JAN 12


22 LI:1186426.1:20012639996T6 1408 1938
JAN 12


22 LI:118642b.1:20014779651 7 1646
JAN 12 H 1 413


22 LI:1186426.1:20013398318H 1431 1691
JAN 12 1


22 LI;118b42b.1:20016420281 1446 1958
JAN 12 T8


22 LI ;1186426.1:20013973648T7 1492 1948
JAN 12


22 LI :1186426.1:2001815993H 1490 1725
JAN 12 1


22 LI :1186426.1:2001g 1636974 1586 1880
JAN 12


22 , LI:1186426.1:2001g4971669 1595 1906
JAN 12


22 LI:118642b.1;20014326477T6 1606 1906
JAN 12


22 LI:1186426.1:20013280090H 1 1837
JAN 12 1 6l
4


22 LI:1186426.1:20015658235H 1688 1942
JAN 12 1


22 LI;1186426.1:20017068162H 1734 1906
JAN 12 1


22 LI :1186426.1:20013530095H 1794 1938
JAN 12 1


22 LI:1186426.1;20013530095F6 1795 1906
JAN 12


22 LI: i 186426.1:2001400967H 1110 i
JAN i 2 i 287


22 LI:1186426. i 4326477H 3 1
:2001 JAN T 1 b0
2


22 LI:118b42b.1:2001375201 i 1915
JAN i 2 OTb 737


22 LI:118b426.1:20013888314H 52 314
JAN 12 1


22 LI;1186426.1:20011335071 515 737
JAN 12 H 1


22 LI:118642b.1:20012639980H 830 1081
JAN 12 1


22 LI:118b426.1:20017 760413H 1442 1684
JAN 12 1


22 LI;118b426.1:2001784584H 430 641
i JAN 12 1


22 LI;118642b.1:20011269724F6 1117 1519
JAN 12


23 LI:1182817.1:200193677386 1848 2321
JAN 12


23 LI :1182817.1:20014515965F8 109 297
JAN 12


23 LI:1182817.1;2001JAN12630308H1 107 236


23 LI :1182817,1:20014526224H 111 388
JAN 12 1


23 LI:1182817.1:2001JAN1271076180V11997 2541


23 LI:1182817.1:200171075053V 2008 2480
JAN 12 1


23 LI:1182817.1:20014761107H 1612 1898
JAN 12 1


23 LI;1182817.1:20015283776H 1559 1724
JAN 12 1


23 LI :1182817.1:200170012416D 4036 4158
JAN 12 1


_ . _ _ _ .__ ___ _____ __ _,
213


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO: IC Start
Stop


23 LI :11828 7 7,1:200170004713 3977
JAN 12 D 1 4158


23 LI:1182817.1;200171078853V 1642
JAN 12 1 2263


23 LI;1182817,1;20016815352F8 4518
JAN 12 4581


23 LI:1182817,1;20015296321 2513
JAN 12 T6 2895


23 LI:1182817.1; 70818565V 2586
2001 JAN 12 1 2917


23 LI:1182817,1;2001g2820002 3346
JAN 12 3673


23 LI:1182817,1:20012192463H 3763
JAN 12 1 3895


23 LI;1182817,1;20014054850H 121 409
JAN 12 T


23 LI;1 i 82817.1;20012072020H 98 350
JAN i 2 1


23 LI .1182817.1:200170875970V 2134
JAN 12 1 2337


23 LI;1182817,1:20016045182H 1880
JAN 12 1 2331


23 LI;1182817.1:20017979070H 1687
JAN 12 1 2316


23 L!:1182817.1:20015580371 2064
JAN 12 H2 2313


23 LI:1182817,1:20015989373F6 3959
JAN 12 4158


23 LI;1182817,1;20017119092F8 1 b7
JAN 12 697


23 LI:1182817,1;2001g5365749 3819
JAN 12 4158


23 LI:1182817,1;2001g2344282 447 712
JAN i 2


23 LI:1182817.1:200ig5398345 455 712
JAN 12


23 LI:1182817.1:20015960949F8 542 1148
JAN 12


23 LI;1182817.1;20013663229H 523 801
JAN 12 1


23 LI;1182817. 7 7 7 080192V1837
;2001 JAN i 1 2262
2


23 LI;1182817.1;200171077979V12080
JAN 12 2499


23 LI :1182817.1:20015702634H 112 376
JAN 12 1


23 LI;1182817,1:20015312350H 115 363
JAN 12 1


23 LI:1182817,1:2001g3959779 145 464
JAN 12


23 LI ;1182817.1:20017405974H 646 1035
JAN 12 1


23 LI:1182817.1;200170876563V 2245
JAN 12 1 2448


23 LI:1182817.1:20014441737F8 2420
JAN 12 3049


23 LI:1182817.1;2001444173778 2420
JAN 12 2969


23 LI:1182817.1:200171080855V 2496
JAN 12 1 2934


23 LI;1182817,1;20015081881 1195
JAN 12 H 1 1378


23 LI:1182817,1;20015950751 116 730
JAN 12 F6


23 LI:1182817,1:2001g498 i 107 5577
JAN 12 5 i


23 LI;1182817,1:2001598540776 116 539
JAN 12


23 LI;1182817.1:200162524786 150 759
JAN 12


23 LI;1182817,1:20016983827H 173 507
JAN 12 1


23 LI:1182817. i 703041586 274 868
:2001 JAN 12


23 LI:1182817.1:2001g4088247 294 764
JAN 12


23 LI :1182817.1:20014107871 438 685
JAN 12 T6


23 LI ;1182817.1:20016989339H 118 288
JAN 12 1


23 LI:11828 i 7. 70006686D 3809
i :2001 JAN 1 4 i
12 58


23 LI;1182817, i 70002146D 3687
:2001 JAN 12 1 3895


23 LI :1182817,1:20017405488H 3860
JAN 12 1 4158


23 LI;1182817,1;20016885580H 1034
JAN 12 1 1349


23 LI:1182817,1; g 1258928 1898
2001 JAN 12 2185


23 LI:1182817,1:20013680137H 3484
JAN 12 1 3771


23 LI:1182817,1.2001g3147068 3387
JAN 12 3734


23 LI;1182817,1:200107848486 4498
JAN 12 4581


23 LI;1182817.1:200170003344D 3955
JAN i 2 1 4158


23 LI:1182817.1:2001228051386 95 529
JAN 12


__ _X14___ _.


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Start
Stop


23 LI;11828 T 7.1:200182703382 3881
JAN 12 4158


23 LI;1182817.1;2001JAN1285152312 2556
2933


23 LI:1182817.1;20013957291 112 389
JAN 12 H2


23 LI:1182817.1;20011650615H 103 319
JAN 12 1


23 LI;1182817.1:2001625247H 150 412
JAN 12 1


23 Li;1182817.1:20015283776F7 1559
JAN 12 2015


23 LI: T i 828 i 70009908D 3990
7.1;200 i JAN 1 4158
i 2


23 LI:1182817.1; 5704237H 112 386
200 T JAN T 1
2


23 LI:1182817.1:200170008902D 3983
JAN 12 1 4466


23 L1;1182817.1:20014890756F6 4034
JAN 12 4570


23 LI:1182817.1;20014890756H 4034
JAN 12 1 4308


23 LI:1182817.1:200170010596D 4064
JAN 12 1 4483


23 LI;1182817.1:20011979451 4100
JAN 12 H 1 4301


23 Li:1182817.1:2001078484H 4239
JAN 12 i 4537


23 LI:1182817.1:2004057979H 4452
T JAN 12 1 4581


23 LI:1182817.1:200186576327 470 712
JAN 12


23 LI:1182817.1:200171075096V 1885
JAN 12 1 2328


23 LI:1182817.1;200171079767V12401
JAN 12 2932


23 LI:1182817.1:20015950751 111 432
JAN 12 H 1


23 LI:1182817.1:20012344725H 119 340
JAN 12 1


23 LI;1182817.1;2001g 1476715 3471
JAN T 2 3896


23 LI; T 182817.1;20017329981 1781
JAN 12 H T 2217


23 LI:1182817.1; 84687524 1821
2001 JAN 12 2239


23 LI:1182817.1; 736134H 613 712
2001 JAN 12 1


23 LI:1182817.1; 3047721 112 407
2001 JAN 12 H 1


23 LI;1182817.1;200171834483V191 951
JAN 12


23 LI .1182817.1:2001570178877 111 596
JAN 12


23 LI:1182817, i 6407301 88 582
; 200 i JAN H i
12


23 LI:1182817.1;20014594380H2 88 350
JAN 12


23 LI :1182817.1; 228051376 220 674
2001 JAN 12


23 LI:1182817.1:20012052608H 908 1199
JAN T 2 1


23 LI :1182817.1:20015293501 1176
JAN 12 H 1 1366


23 LI:1182817.1;20015960949H 533 1094
JAN 12 1


23 LI:1182817.1; 71834509V 499 875
2001 JAN 12 1


23 LI;1182817.1:20016885580F8 903 1349
JAN 12


23 LI: i 182817.1:2007636187H 1318
i JAN 12 1 1792


23 LI;11828 T 7.1:20011877551 4090
JAN 12 H 1 4158


23 LI:1182817.1;20013048843H 112 395
JAN 12 1


23 LI;1182817.1:20013209530F6 3421
JAN 12 3911


23 LI :1182817.1:20015403290H 3509
JAN 12 1 3773


23 LI;1182817.1:20014593603H 3562
JAN 12 1 3860


23 LI:1182817.1:2001700 i 03083687
JAN 12 D 1 4036


23 LI:1182817. T 86709927 3739
; 2001 JAN 12 419
i


23 LI;1182817.1:20012192463F6 3763
JAN 12 4192


23 L1:1182817.1:20015989373H 3938
JAN 12 1 4188


23 LI:1182817.1:20016058282F8 3958
JAN 12 4158


23 LI:1182817.1:20015684836H 98 332
JAN 12 1


23 LI:1182817.1:20017119092F6 1 b7
JAN 12 771


23 LI;1182817.1:200171079787V12012
JAN 12 2631


23 Lt:1182817.1:200171078724V T 740
JAN 12 1 2342


_. _ _ .__ 215 _ _._


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO: IC Start
Stop


23 LI;1182817.1;20016045182) 1880
JAN 12 1 2331


23 LI;1182817.1:200171078819V 1996
JAN 12 1 2663


23 LI;1182817.1:200170011951 3928
JAN 12 D 1 4158


23 LI:1182817.1:20017033305H 1702
JAN 12 1 2261


23 LI;1182817.1:20015651323H 1723
JAN 12 1 2267


23 LI:1182817.1:20012280513H 95 376
JAN 12 1


23 LI;1182817.1;20017430784H 2310
JAN 12 1 2549


23 LI;1182817.1:20016815352H 4518
JAN 12 1 4603


23 LI;1182817.1;20017035275H 4498
JAN 12 1 4592


23 LI:1182817.1:200170006655014498
JAN 12 4790


23 LI:1182817.1:20014885835H 4475
JAN 12 1 4581


23 LI: i i 82817.1;2001320953076 4498
JAN 12 4570


23 LI;1182817.1:200170005861 4498
JAN 12 D 1 4581


23 LI;1182817.1:20012766706H 115 406
JAN 12 1


23 LI:1182817.1;200171077855V 2080
JAN 12 1 2506


23 LI:1182817.1;20015701788F7 111 698
JAN 12


23 LI:1182817.1:20014441737H 2420
JAN 12 1 2554


23 LI:1182817.1;2001g 1163665 107 270
JAN 12


23 LI:1182817.1:20011553221 58 551
JAN 12 F6


23 LI ;11828 T 7.1:20015985407 1 664
JAN 12 F6


23 LI:1182817.1:20016904578H 75 604
JAN 12 1


23 LI:1182817.1:20015985407H 2 287
JAN 12 1


23 LI; i i 828 T i 553221 51 545
7.1:2001 JAN T6
12


23 LI;11828 T 7.1;2001698535988 79 299
JAN 12


23 Li;1182817.1;20015701788H i 10
JAN 12 1 379


23 LI;11828 T 7.1;20016765960H 3244
JAN 12 1 3645


23 LI:1182817.1; 3436771 3995
2001 JAN 12 H 1 4158


23 LI:1182817.1;20015704329H 111 366
JAN 12 1


23 LI:1182817.1;20015906230H 2674
JAN 12 1 2970


23 LI;1182817.1:2001g5754660 2807
JAN 12 3060


23 LI:1182817.1:20015637896H 2841
JAN 12 1 3099


23 LI;1182817.1:20014613246H 2863
JAN 12 1 3104


23 LI;1182817.1;20016478968F6 2955
JAN 12 3570


23 LI:1182817.1:20016478968H 2955
JAN 12 1 3544


23 LI: T T 82817.1:20015374252H 33 T
JAN 12 1 3 3535


23 LI;1182817.1:2001g6989944 3323
JAN 12 3734


23 LI:1182817.1:200170009231 3415
JAN 12 D 1 3906


23 LI;1182817.1:2004906748H2 98 390
i JAN 12


23 LI:1182817.1.20014515965H 112 349
JAN 12 1


23 LI;1182817.1:2007412642H 1762
JAN 12 1 1969


23 LI:1182817.1;20017400633H 3425
JAN 12 1 3930


23 LI:1182817.1:20015637796H 2840
JAN 12 1 3099


23 LI:1182817.1.2001605965H 508 712
JAN 12 1


23 LI:1182817.1:20018184771 2821
JAN 12 H 1 3341


23 LI:1182817.1:20017119092H 167 569
JAN 12 1


23 LI :1182817.1:200171078169V 2340
JAN 12 1 2700


23 LI:1182817.1:2001g296457 2107
JAN 12 3078


23 LI:1182817.1:2001547620H 4498
JAN 12 1 4581


23 LI ;1182817.1:20017039205H 1959
JAN 12 1 2437


23 LI:1182817.1:200171076276V 2595
JAN 12 1 2932


_._216 - _


CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template 1D Component
NO: IC Start
Stop


23 LI:1182817.1;2001604518288 1880
JAN 12 2330


23 LI;1182817.1;2001619505H 3405
JAN 12 1 3650


23 LI:1182817.1;20016045182F8 1880
JAN 12 2331


23 LI:1182817.1:200170007048D13770
JAN 12 4158


23 LI:1182817.1:200154762081 4495
JAN 12 4581


23 LI:1182817.1;20015635348H 2840
JAN 12 1 31 O1


23 LI:1182817.1:20016970765H 3925
JAN 12 1 4158


23 LI;1182817.1:20013209530H 3422
JAN 12 1 3599


23 LI:1182817.1;20013682137H 3484
JAN 12 1 3775


23 LI :1182817.1:20016407333 95 634
JAN 12 H T


23 LI;1182817.1:2001155322 95 238
JAN T 2 i H 1


24 Li; l i 70153.9;20016780 i 235 646
JAN 12 22J i


24 LI:1170153.9:20017637752H 1 492
JAN 12 1


24 LI:1170153.9;20017711512) 358 982
JAN i 2 1


24 Li: i 170153.9;2007711512H2 890 1411
i JAN 12


25 LI:1171553.1:2001g i 1399622929
JAN 12 3148


25 LI :1171553.1:20017279338H 1527
JAN 12 1 2023


25 LI:1171553.1;20012766073F6 32 524
JAN 12


25 LI:1171553.1:20016847280H 1718
JAN 12 1 2229


25 LI:1171553.1:20016847279H 1719
JAN 12 1 2222


25 LI:1171553.1:20017030822H 1505
JAN 12 1 2034


25 LI;1171553.1:200170749683V 1025
JAN 12 1 1592


25 LI :1171553.1;20014357194H 1045
JAN 12 1 1147


25 LI:1171553.1:2001g728172 1066
JAN 12 1333


25 LI:1171553.1:200170746065V 11 O
JAN 12 1 1 1676


25 LI:1171553.1;20016847280F6 1718
JAN 12 2238


25 LI :1171553.1:20016803301 34 400
JAN 12 H 1


25 LI;1171553.1:20016803301 34 410
JAN 12 J 1


25 LI:1171553.1;20017584018H 39 585
JAN 12 1


25 LI :1171553.1; 4622596H 57 301
2001 JAN 12 1


25 LI:1171553.1:2001579075H 59 320
JAN 12 1


25 LI :1171553.1:20012210964H 60 328
JAN 12 1


25 LI:1171553.1:20017269665H 172 689
JAN 12 1


25 LI: i 171553.1;200493487H 225 467
i JAN 12 1


25 LI:1171553.1:2001334614476 2619
JAN 12 3132


25 LI:1171553.1; 554389678 1641
2001 JAN i 2 220
i


25 LI:1171553.1:200170748784V 1486
JAN 12 1 2057


25 LI:1171553.1;20013965837H 854 964
JAN 12 1


25 LI;1171553.1:20013962267H 854 1084
JAN 12 1


25 LI :1171553.1; 70755507V 858 1049
2001 JAN 12 1


25 LI ;1171553.1:2001g 1927512 801 1268
JAN 12


25 LI:1171553.1:20018186679H 932 7
JAN 12 1 554


25 LI:1171553.1:200170746640V 947 1273
JAN 12 1


25 LI:1171553.1:20016570232H 809 1353
JAN 12 1


25 LI ;1171553.1:200170749500V 952 1020
JAN 12 1


25 LI :1171553.1:20016570232F8 835 1353
JAN 12


25 LI:1171553.1; 396226778 853 1276
2001 JAN 12


25 LI:1171553.1:20013962293H 854 1123
JAN 12 1


25 LI:1171553.1:20011525067H 990 1194
JAN 12 1


25 LI :1171553.1:2001g 1634138 992 1378
JAN 12


_ .__ . __ _-
217 1




CA 02434677 2003-07-09
WO 02/079473 PCT/US02/01009
TABLE 5
SEQ ID Template ID Component
NO; IC Starf
Stop


25 LI:1171553.1:20016465961 692 1262
JAN 12 F7


25 LI:1 T 71553. 6465961 692 1175
T :2001 JAN H 1
T 2


25 LI:1171553. T 70746313V 702 13
:2001 JAN 12 T T 0


25 LI;1171553.1;20016465961 717 1273
JAN 12 F8


25 LI;1171553.1;20016570232F6 745 1353
JAN 12


25 LI:1171553.1:2001JAN1270748165V1772 992


25 LI. T 17 T 553.1:200170747578V 1642
JAN 12 1 2230


25 LI.117 T 553.1:200150772481 1658
JAN 12 2 T
02


25 LI: i 171553. 507724H 1658
T :2001 JAN 1 1958
12


25 LI: T 171553.1:2001g505547 1561
JAN 12 1902


25 LI:1171553.1:2001396229379 1585
JAN 12 2087


25 LI;1171553.1;2001JAN123602618H1 1594
1910


25 LI:1171553.1;2001290776H 1807
JAN 12 1 2108


25 LI : T 171553.1;200118471276 1814
JAN 12 2183


25 LI:1171553.1;2001485856577 1822
JAN 12 2105


25 LI: T 171553.1:200g4988603 1835
i JAN 12 2223


25 LI: T 171553.1:2001g3960390 1835
JAN 12 2221


25 LI:1171553. T g 1927397 1853
:2001 JAN T 2221
2


25 LI:1171553.1:200180 T 7138)1894
JAN 12 1 2548


25 LI:1171553.1;20013384759F6 1910
JAN 12 2209


25 LI :1171553.1:20013384759H 1910
JAN 12 1 2169


25 LI:1171553.1;2001g4302161 1919
JAN 12 2222


25 LI :1171553.1; 70749172V 1960
2001 JAN 12 1 2572


25 LI:1 T 71553.1;200g2787880 1983
T JAN 12 2141


25 LI:1171553.1:200170747353V 2 i 79
JAN 12 1 2724


25 LI:1 T 71553.1:200170749489V 2180
JAN 12 1 2650


25 LI :1171553.1:20015574155H 2196
JAN 12 1 2448


25 LI:1171553.1; 70755004V 2200
2001 JAN 12 1 2348


25 LI:1171553.1:20016052963) 2302
JAN 12 1 2656


25 LI;1171553.1;20017690493H 2419
JAN 12 1 2507


25 LI;1171553.1:2001557415579 2501
JAN 12 3038


25 LI: i 171553.1; 276607376 2548
2001 JAN 12 31 i
1


25 LI ;1171553.1:200170747745V 1127
JAN 12 1 1742


25 LI:117 T 553.1:200170750202V 1148
JAN 12 1 T 612


25 LI:1171553.1:200118471286 1194
JAN 12 1548


25 LI:1171553.1:2001JAN12184712F1 1659
2241


25 LI;1171553.1:20016052963H 1669
JAN 12 1 2170


25 LI:1171553. T 70754414V 1512
:2001 JAN 12 T 2077


25 LI:1171553.1:20012766073H 32 324
JAN 12 1


25 LI:1171553. T 3371190H 1 209
;2001 JAN 12 1


25 LI:1171553.1:200170747296V 25 598
JAN 12 1


25 LI:1171553.1:200170761984V 32 354
JAN 12 1


25 LI:1171553.1:200170748552V 550 1139
JAN 12 1


25 LI:1171553.1:20016l 69913H 1538
JAN 12 1 1838


25 LI:1171553.1;20017042860H 1597
JAN 12 1 21 b3


25 LI:1171553.1;2001704286088 1598
JAN 12 2194


25 LI :1171553.1:20017042860F8 1598
JAN 12 2138


25 LI:1171553.1:200170749765V 348 964
JAN 12 1


25 LI:1171553. i 55068831 385 518
:2001 JAN 12 J 1


25 LI:1171553.1:200155068831 409 544
JAN 12 H 1


____-- ____
21~




DEMANDE OU BREVET VOLUMINEUX
LA PRESENTE PARTIE DE CETTE DEMANDE OU CE BREVET COMPREND
PLUS D'UN TOME.
CECI EST LE TOME 1 DE 2
CONTENANT LES PAGES 1 A 218
NOTE : Pour les tomes additionels, veuillez contacter 1e Bureau canadien des
brevets
JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 218
NOTE: For additional volumes, please contact the Canadian Patent Office
NOM DU FICHIER / FILE NAME
2002045
NOTE POUR LE TOME / VOLUME NOTE:

Representative Drawing

Sorry, the representative drawing for patent document number 2434677 was not found.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer , as well as the definitions for Patent , Administrative Status , Maintenance Fee  and Payment History  should be consulted.

Administrative Status

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2002-01-09
(87) PCT Publication Date 2002-10-10
(85) National Entry 2003-07-09
Dead Application 2005-10-12

Abandonment History

Abandonment Date Reason Reinstatement Date
2004-10-12 FAILURE TO RESPOND TO OFFICE LETTER
2005-01-10 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $300.00 2003-07-09
Maintenance Fee - Application - New Act 2 2004-01-09 $100.00 2003-12-23
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
PANZER, SCOTT R.
LINCOLN, STEPHEN E.
ALTUS, CHRISTINA M.
DUFOUR, GERARD E.
HILLMAN, JENNIFER L.
JONES, ANISSA L.
DAM, TAM C.
LIU, TOMMY F.
HARRIS, BERNARD
FLORES, VINCENT
DAFFO, ABEL
MARWAHA, RAKESH
CHEN, ALICE J.
CHANG, SIMON C.
GERSTIN, JR. EDWARD H.
PERALTA, CAREYNA H.
DAVID, MARIE H.
LEWIS, SAMANTHA A.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

To view selected files, please enter reCAPTCHA code :



To view images, click a link in the Document Description column. To download the documents, select one or more checkboxes in the first column and then click the "Download Selected in PDF format (Zip Archive)" or the "Download Selected as Single PDF" button.

List of published and non-published patent-specific documents on the CPD .

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document
Description 
Date
(yyyy-mm-dd) 
Number of pages   Size of Image (KB) 
Abstract 2003-07-09 1 88
Claims 2003-07-09 4 191
Cover Page 2003-09-17 2 52
Description 2003-07-09 220 15,251
Description 2003-07-09 110 6,565
PCT 2003-07-09 1 27
Assignment 2003-07-09 3 123
Correspondence 2003-09-15 1 24
Prosecution-Amendment 2003-07-09 2 53
PCT 2003-07-09 1 48
PCT 2003-07-10 8 390

Biological Sequence Listings

Choose a BSL submission then click the "Download BSL" button to download the file.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.

Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.

BSL Files

To view selected files, please enter reCAPTCHA code :