JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 253
NOTE: For additional volumes, please contact the Canadian Patent Office
CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
NOVEL NUCLEIC ACIDS AND SECRETED
POLYPEPTIDES
1. CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part application of U.S. Application
Serial No.
09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries",
Attorney Docket No. 784CIP, which in turn is a continuation-in-part
application of U.S.
Application Serial No. 09/488,725 filed, January 21, 2000 entitled "Novel
Contigs Obtained
from Various Libraries", Attorney Docket No. 784; U.S. Application Serial No.
09/491,404
filed January 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney
Docket No. 785; U.S. Application Serial No. 09/560,875 filed April 27, 2000
entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in
turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914
filed February 03,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn
is a
continuation-in-part application of U.S. Application Serial No. 09/515,126
filed February 28,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 788;
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 789CIP, which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/519,705
filed March 07,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 789;
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/540,217
filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP, which is in turn
a
continuation-in-part application of U.S. Application Serial No. 09/552,929
filed April 18,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 792; all of which are
incorporated
herein by reference in their entirety.
2. BACKGROUND OF THE INVENTION
2.1 TECHNICAL FIELD
The present invention provides novel polynucleotides and proteins encoded by
such
polynucleotides, along with uses for these polynucleotides and proteins, for
example in
therapeutic, diagnostic and research methods.
2.2 BACKGROUND
Technology aimed at the discovery of protein factors (including e.g.,
cytokines, such
as lymphokines, interferons, circulating soluble factors, chemokines, and
interleukins) has
matured rapidly over the past decade. The now routine hybridization cloning
and expression
cloning techniques clone novel polynucleotides "directly" in the sense that
they rely on
information directly related to the discovered protein (i.e., partial
DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein
in the case of
expression cloning). More recent "indirect" cloning techniques such as signal
sequence
cloning, which isolates DNA sequences based on the presence of a now well-
recognized
secretory leader sequence motif, as well as various PCR-based or low
stringency
hybridization-based cloning techniques, have advanced the state of the art by
making
available large numbers of DNA/amino acid sequences for proteins that are
known to have
biological activity, for example, by virtue of their secreted nature in the
case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-
based
techniques, or by virtue of structural similarity to other genes of known
biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications
in,
for example, diagnostics, forensics, gene mapping; identification of mutations
responsible
for genetic disorders or other traits, to assess biodiversity, and to produce
many other types
of data and products dependent on DNA and amino acid sequences.
3. SUMMARY OF THE INVENTION
The compositions of the present invention include novel isolated polypeptides,
novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules,
cloned genes or degenerate variants thereof, especially naturally occurring
variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize
one or more epitopes present on such polypeptides, as well as hybridomas
producing such
antibodies.
The compositions of the present invention additionally include vectors,
including
expression vectors, containing the polynucleotides of the invention, cells
genetically engineered
to contain such polynucleotides and cells genetically engineered to express
such
polynucleotides.
The present invention relates to a collection or library of at least one novel
nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more
public
databases. The invention relates also to the proteins encoded by such
polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides
and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and
are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence
Listing, A is adenine; C
is cytosine; G is guanine; T is thymine; and N is any of the four bases or
unknown. In the
amino acids provided in the Sequence Listing, * corresponds to the stop codon.
The nucleic acid sequences of the present invention also include nucleic acid
sequences
that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under
stringent
hybridization conditions; nucleic acid sequences which are allelic variants or
species
homologues of any of the nucleic acid sequences recited above, or nucleic acid
sequences that
encode a peptide comprising a specific domain or truncation of the peptides
encoded by SEQ
ID NO: 1-1041, or 2083-2534. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of
SEQ ID NO: 1-1041, or 2083-2534, or a degenerate variant or fragment thereof.
The identifying sequence can be 100 base pairs in length.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-
2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-
2534 that
uniquely identifies or represents the sequence information of SEQ ID NO: 1-
1041, or 2083-
2534.
A collection as used in this application can be a collection of only one
polynucleotide.
The collection of sequence information or identifying information of each
sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence
information are
provided on a nucleic acid array to detect the polynucleotide that contains
the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide
that contains the
segment. The collection can also be provided in a computer-readable format.
This invention also includes the reverse or direct complement of any of the
nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic
acid sequences;
and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences
(or their reverse or direct complements) according to the invention have
numerous applications
in a variety of techniques known to those skilled in the art of molecular
biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-
readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use
in the
recombinant production of protein, and use in the generation of anti-sense DNA
or RNA, their
chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or
2083-2534 or novel segments or parts of the nucleic acids of the invention are
used as primers in
expression assays that are well known in the art. In a particularly preferred
embodiment, the
nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or
parts of the
nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992),
as expressed
sequence tags for physical mapping of the human genome.
The isolated polynucleotides of the invention include, but are not limited to,
a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ
ID NO: 1-
1041, or 2083-2534; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding
sequences of SEQ ID NO: 1-1041, or 2083-2534. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that
hybridizes under stringent hybridization conditions to (a) the complement of
any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or
2083-2534; (b) a nucleotide sequence encoding any one of the amino acid
sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (c) a polynucleotide
which is an allelic variant of any polynucleotides recited above; (d) a
polynucleotide which encodes a species homolog (e.g. orthologs) of any of the
proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of any of the polypeptides
comprising an amino acid sequence set forth in SEQ ID NO: 1042-2082, or
2535-2986, or Tables 3, 5, 6, or 8.
The isolated polypeptides of the invention include, but are not limited to, a
polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing;
or the
corresponding full length or mature protein. Polypeptides of the invention
also include
polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b)
polynucleotides
that hybridize to the complement of the polynucleotides of (a) under stringent
hybridization
conditions. Biologically active variants of any of the polypeptide sequences
in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%,
70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain
biological
activity are also contemplated. The polypeptides of the invention may be
wholly or partially
chemically synthesized but are preferably produced by recombinant means using
the genetically
engineered cells (e.g. host cells) of the invention.
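As an illustration of the percent-identity thresholds listed above, the following minimal Python sketch computes identity between two pre-aligned amino acid sequences. The function names, the gap character "-", and the choice of denominator (all aligned columns) are assumptions made for this example; this passage does not prescribe a particular alignment method or identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks an alignment gap, and identity is counted over
    every aligned column that is not a gap in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With two hypothetical ten-residue sequences differing at one position, `percent_identity` gives 90%, which would satisfy a 90% threshold but not a 95% one. Other identity definitions (for example, dividing by the shorter sequence length) would give different values for gapped alignments.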
The invention also provides compositions comprising a polypeptide of the
invention.
Polypeptide compositions of the invention may further comprise an acceptable
carrier, such
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a
polynucleotide of the invention.
The invention also relates to methods for producing a polypeptide of the
invention
comprising growing a culture of the host cells of the invention in a suitable
culture medium
under conditions permitting expression of the desired polypeptide, and
purifying the
polypeptide from the culture or from the host cells. Preferred embodiments
include those in
which the protein produced by such processes is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a
variety
of techniques known to those skilled in the art of molecular biology. These
techniques
include use as hybridization probes, use as oligomers, or primers, for PCR,
use for
chromosome and gene mapping, use in the recombinant production of protein, and
use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For
example,
when the expression of an mRNA is largely restricted to a particular cell or
tissue type,
polynucleotides of the invention can be used as hybridization probes to detect
the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in
the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for
physical mapping of the human genome.
The polypeptides according to the invention can be used in a variety of
conventional
procedures and methods that are currently applied to other proteins. For
example, a
polypeptide of the invention can be used to generate an antibody that
specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful
for detecting or
quantitating the polypeptide in tissue. The polypeptides of the invention can
also be used as
molecular weight markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of
the present
invention and a pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be
utilized,
for example, in methods for the prevention and/or treatment of disorders
involving aberrant
protein expression or biological activity.
The present invention further relates to methods for detecting the presence of
the
polynucleotides or polypeptides of the invention in a sample. Such methods
can, for
example, be utilized as part of prognostic and diagnostic evaluation of
disorders as recited
herein and for the identification of subjects exhibiting a predisposition to
such conditions.
The invention provides a method for detecting the polynucleotides of the
invention in a
sample, comprising contacting the sample with a compound that binds to and
forms a
complex with the polynucleotide of interest for a period sufficient to form
the complex and
under conditions sufficient to form a complex and detecting the complex such
that if a
complex is detected, the polynucleotide of interest is detected. The invention
also provides a
method for detecting the polypeptides of the invention in a sample comprising
contacting the
sample with a compound that binds to and forms a complex with the polypeptide
under
conditions and for a period sufficient to form the complex and detecting the
formation of the
complex such that if a complex is formed, the polypeptide is detected.
The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out
methods of the
invention. Furthermore, the invention provides methods for evaluating the
efficacy of drugs,
and monitoring the progress of patients, involved in clinical trials for the
treatment of
disorders as recited above.
The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the
polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for
the
identification of compounds that can ameliorate symptoms of disorders as
recited herein.
Such methods can include, but are not limited to, assays for identifying
compounds and
other substances that interact with (e.g., bind to) the polypeptides of the
invention. The
invention provides a method for identifying a compound that binds to the
polypeptides of the
invention comprising contacting the compound with a polypeptide of the
invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives
expression of a reporter gene sequence in the cell; and detecting the complex
by detecting
the reporter gene sequence expression such that if expression of the reporter
gene is detected
the compound that binds to a polypeptide of the invention is identified.
The methods of the invention also provide methods for treatment which involve
the
administration of the polynucleotides or polypeptides of the invention to
individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses
methods for
treating diseases or disorders as recited herein comprising administering
compounds and
other substances that modulate the overall activity of the target gene
products. Compounds
and other substances can affect such modulation either on the level of target
gene/protein
expression or target protein activity.
The polypeptides of the present invention and the polynucleotides encoding
them are
also useful for the same functions known to one of skill in the art as the
polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which
they have a
signature region (as set forth in Table 3); or for which they have homology to
a gene family
(as set forth in Table 4). If no homology is set forth for a sequence, then
the polypeptides
and polynucleotides of the present invention are useful for a variety of
applications, as
described herein, including use in arrays for detection.
4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS
It must be noted that as used herein and in the appended claims, the singular
forms
"a", "an" and "the" include plural references unless the context clearly
dictates otherwise.
The term "active" refers to those forms of the polypeptide which retain the
biologic
and/or immunologic activities of any naturally occurring polypeptide. According
to the
invention, the terms "biologically active" or "biological activity" refer to a
protein or peptide
having structural, regulatory or biochemical functions of a naturally
occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the
capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune
response in
appropriate animals or cells and to bind with specific antibodies.
The term "activated cells" as used in this application refers to those cells which
are
engaged in extracellular or intracellular membrane trafficking, including the
export of
secretory or enzymatic molecules as part of a normal or disease process.
The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to
the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic
acids bind or it
may be "complete" such that total complementarity exists between the single
stranded
molecules. The degree of complementarity between the nucleic acid strands has
significant
effects on the efficiency and strength of the hybridization between the
nucleic acid strands.
The term "embryonic stem cells (ES)" refers to a cell that can give rise to
many
differentiated cell types in an embryo or an adult, including the germ cells.
The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem
cells that provide a
steady and continuous source of germ cells for the production of gametes. The
term
"primordial germ cells (PGCs)" refers to a small population of cells set aside
from other cell
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis
that have the potential to differentiate into germ cells and other cells. PGCs
are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells
are
capable of self renewal. Thus these cells not only populate the germ line and
give rise to a
plurality of terminally differentiated cells that comprise the adult
specialized organs, but are
able to regenerate themselves.
The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.
As used herein, a sequence is said to "modulate the expression of an operably
linked
sequence" when the expression of the sequence is altered by the presence of
the EMF.
EMFs include, but are not limited to, promoters, and promoter modulating
sequences
(inducible elements). One class of EMFs are nucleic acid fragments which
induce the
expression of an operably linked ORF in response to a specific regulatory
factor or
physiological event.
The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of
nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of
genomic or
synthetic origin which may be single-stranded or double-stranded and may
represent the
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-
like or RNA-like
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the
polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U
(uracil).
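The notation convention above (A, C, G, T, and N, with T replaced by U for RNA) can be expressed as a small helper. This is a minimal sketch; the function name `to_rna` and the decision to reject any other symbol are assumptions made for illustration.

```python
VALID_DNA_SYMBOLS = set("ACGTN")  # symbols used in the sequences herein


def to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by substituting U for T.

    N (any base or unknown) is passed through unchanged.
    """
    dna = dna.upper()
    unexpected = set(dna) - VALID_DNA_SYMBOLS
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return dna.replace("T", "U")
```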
Generally, nucleic acid segments provided by this invention may be assembled
from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic
nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising
regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene.
The terms "oligonucleotide fragment" or a "polynucleotide fragment",
"portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a
sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at
least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at
least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably
less than about 100 nucleotides, more preferably less than about 50
nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6
nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more
preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25
nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR),
various
hybridization procedures or microarray procedures to identify or amplify
identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify
each
polynucleotide sequence of the present invention. Preferably the fragment
comprises a
sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534.
Probes may, for example, be used to determine whether specific mRNA molecules
are present in a cell or tissue or to isolate similar nucleic acid sequences
from chromosomal
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl
1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or
other methods
well known in the art. Probes of the present invention, their preparation
and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory
Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current
Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are
incorporated
herein by reference in their entirety.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or
2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or
2083-2534
that uniquely identifies or represents the sequence information of that
sequence of SEQ ID
NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and
8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that
a twenty-
mer is fully matched in the human genome is 1 in 300. In the human genome,
there are three
billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there
are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully
matched in the
human genome is approximately 1 in 5. When these segments are used in arrays
for
expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is
fully matched in the expressed sequences is also approximately one in five
because
expressed sequences comprise less than approximately 5% of the entire genome
sequence.
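The arithmetic behind these estimates can be reproduced with a short sketch. It assumes the simple model implied by the text: each of the 4^k possible k-mers is equally likely, one set of human chromosomes contains three billion positions, and expressed sequences make up about 5% of that total. The function name `one_in` is hypothetical.

```python
GENOME_BP = 3_000_000_000   # "three billion base pairs", as stated in the text
EXPRESSED_FRACTION = 0.05   # "approximately 5%", as stated in the text


def one_in(k: int, target_bp: float) -> float:
    """Ratio of possible k-mers to positions in the target.

    This is the N in "1 in N": how many distinct k-mers exist for every
    position at which a k-mer could start (end effects are ignored).
    """
    return 4 ** k / target_bp


twenty_mer_genome = one_in(20, GENOME_BP)        # about 366
seventeen_mer_genome = one_in(17, GENOME_BP)     # about 5.7
fifteen_mer_expressed = one_in(15, EXPRESSED_FRACTION * GENOME_BP)  # about 7.2
```

Under this model the computed ratios are of the same order as the rounded figures quoted above (1 in 300 and approximately 1 in 5).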
Similarly, when using sequence information for detecting a single mismatch, a
segment
can be a twenty-five mer. The probability that the twenty-five mer would
appear in a human
genome with a single mismatch is calculated by multiplying the probability for
a full match
(1/4^25) times the increased probability for mismatch at each nucleotide
position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an
array for
expression studies is approximately one in five. The probability that a twenty-
mer with a single
mismatch can be detected in a human genome is approximately one in five.
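The single-mismatch calculation can be sketched in the same way. The sketch below first enumerates the single-mismatch variants of a segment to confirm the factor of 3 x k (three alternative bases at each of k positions), then applies the full-match probability of 1/4^k. The function names are hypothetical, and the model (uniform random bases, three billion positions, 5% expressed) is the same simplifying assumption used in the text.

```python
GENOME_BP = 3_000_000_000
EXPRESSED_BP = 0.05 * GENOME_BP


def single_mismatch_variants(segment: str) -> list:
    """All sequences that differ from the segment at exactly one position."""
    variants = []
    for i, base in enumerate(segment):
        for alt in "ACGT":
            if alt != base:
                variants.append(segment[:i] + alt + segment[i + 1:])
    return variants


def one_in_single_mismatch(k: int, target_bp: float) -> float:
    """The N in "1 in N" for a k-mer appearing with exactly one mismatch."""
    p_per_position = (3 * k) / 4 ** k   # (1/4^k) times (3 x k)
    return 1.0 / (p_per_position * target_bp)


twenty_five_mer_genome = one_in_single_mismatch(25, GENOME_BP)    # about 5000
twenty_mer_genome = one_in_single_mismatch(20, GENOME_BP)         # about 6
eighteen_mer_expressed = one_in_single_mismatch(18, EXPRESSED_BP)  # about 8.5
```

For a twenty-five-mer, `single_mismatch_variants` returns 75 sequences, which is the 3 x 25 factor used in the calculation above.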
The term "open reading frame," ORF, means a series of nucleotide triplets
coding for
amino acids without any termination codons and is a sequence translatable into
protein.
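As a concrete reading of this definition, the following sketch checks whether a sequence is a run of triplets free of termination codons, and finds the longest such run in a given reading frame. It assumes the standard genetic code stop codons (TAA, TAG, TGA) and does not require a start codon, since the definition above does not mention one; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def codons(seq: str, frame: int = 0) -> list:
    """Split a sequence into complete triplets starting at the given frame."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def is_orf(seq: str) -> bool:
    """True if seq is a whole number of triplets with no termination codon."""
    if not seq or len(seq) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(seq))


def longest_orf(seq: str, frame: int = 0) -> str:
    """Longest stop-free run of triplets in one reading frame."""
    best, current = [], []
    for codon in codons(seq, frame):
        if codon in STOP_CODONS:
            current = []          # a stop codon ends the current run
        else:
            current.append(codon)
            if len(current) > len(best):
                best = list(current)
    return "".join(best)
```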
The terms "operably linked" or "operably associated" refer to functionally
related
nucleic acid sequences. For example, a promoter is operably associated or
operably linked
with a coding sequence if the promoter controls the transcription of the
coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same
reading
frame, certain genetic elements e.g. repressor genes are not contiguously
linked to the coding
sequence but still control transcription/translation of the coding sequence.
The term "pluripotent" refers to the capability of a cell to differentiate
into a number
of differentiated cell types that are present in an adult organism. A
pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent
cell.
The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and
to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7
amino acids, more preferably at least about 9 amino acids and most preferably
at least about
17 or more amino acids. The peptide preferably is not greater than about 200
amino acids,
more preferably less than 150 amino acids and most preferably less than 100
amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any
polypeptide must have sufficient length to display biological and/or
immunological activity.
The term "naturally occurring polypeptide" refers to polypeptides produced by
cells
that have not been genetically engineered and specifically contemplates
various polypeptides
arising from post-translational modifications of the polypeptide including,
but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and
acylation.
The term "translated protein coding portion" means a sequence which encodes
for the
full-length protein which may include any leader sequence or any processing
sequence.
The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein
portion" means
that portion of the protein which does not include a signal or leader
sequence. The peptide
may have been produced by processing in the cell which removes any
leader/signal
sequence. The mature protein portion may or may not include the initial
methionine residue.
The methionine residue may be removed from the protein during processing in
the cell. The
peptide may be produced synthetically or the protein may have been produced
using a
polynucleotide only encoding for the mature protein coding sequence.
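The relationship between the full-length protein and its mature portion can be illustrated as follows. This sketch assumes the length of the signal or leader sequence is already known (for example, from experiment or a separate prediction step); it does not predict cleavage sites. The function name, parameters, and the example sequence are hypothetical.

```python
def mature_portion(full_length: str, signal_length: int,
                   remove_initial_met: bool = False) -> str:
    """Return the mature protein portion of a full-length protein.

    signal_length is the number of N-terminal residues making up the
    signal/leader sequence. Optionally drop an initial methionine, since
    the mature portion may or may not include it.
    """
    if not 0 <= signal_length <= len(full_length):
        raise ValueError("signal_length is out of range")
    mature = full_length[signal_length:]
    if remove_initial_met and mature.startswith("M"):
        mature = mature[1:]
    return mature
```

For a made-up 13-residue protein `"MKALLVLAAGSTQ"` with a 9-residue leader, the mature portion is `"GSTQ"`.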
The term "derivative" refers to polypeptides chemically modified by such
techniques
as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and
insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do
not normally
occur in human proteins.
The term "variant" (or "analog") refers to any polypeptide differing from
naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions,
created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues
may be replaced, added or deleted without abolishing activities of interest,
may be found by
comparing the sequence of the particular polypeptide with that of homologous
peptides and
minimizing the number of amino acid sequence changes made in regions of high
homology
(conserved regions) or by replacing amino acids with consensus sequence.
Alternatively, recombinant variants encoding these same or similar
polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic
code. Various
codon substitutions, such as the silent changes which produce various
restriction sites, may
be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may
be
reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such
as ligand-binding
affinities, interchain affinities, or degradation/turnover rate.
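The use of genetic code redundancy to introduce silent changes can be illustrated with a short sketch. It builds the standard genetic code table, lists the synonymous codons for a given codon, and checks whether a modified coding sequence still encodes the same polypeptide. The function names are hypothetical, and the example (creating a GGATCC BamHI site across a Gly-Ser codon pair) is chosen for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate complete triplets of a coding sequence ('*' is a stop)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(0, len(dna) - 2, 3))


def synonymous_codons(codon: str) -> list:
    """All codons encoding the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent(original: str, modified: str) -> bool:
    """True if both sequences have equal length and the same translation."""
    return (len(original) == len(modified)
            and translate(original) == translate(modified))
```

Changing `GGTTCC` to `GGATCC` swaps one glycine codon for another, so the encoded Gly-Ser dipeptide is unchanged while the sequence gains a restriction site.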
Preferably, amino acid "substitutions" are the result of replacing one amino
acid with
another amino acid having similar structural and/or chemical properties,
i.e., conservative
amino acid replacements. "Conservative" amino acid substitutions may be made
on the
basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the
amphipathic nature of the residues involved. For example, nonpolar
(hydrophobic) amino
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine,
tryptophan, and
methionine; polar neutral amino acids include glycine, serine, threonine,
cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic) amino acids include
arginine, lysine,
and histidine; and negatively charged (acidic) amino acids include aspartic
acid and glutamic
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20
amino acids,
more preferably 1 to 10 amino acids. The variation allowed may be
experimentally
determined by systematically making insertions, deletions, or substitutions of
amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the
resulting
recombinant variants for activity.
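The residue classes listed above can be expressed as a small lookup. The following is a minimal sketch only, assuming that a substitution counts as "conservative" when both residues fall in the same listed class; the names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical and are not part of the disclosure, and the other listed criteria (e.g., solubility or amphipathic nature) are not modeled.

```python
# Residue classes as listed in the text above, in one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(aa: str) -> str:
    """Return the name of the class that contains the given residue."""
    aa = aa.upper()
    for name, members in RESIDUE_CLASSES.items():
        if aa in members:
            return name
    raise ValueError(f"unknown amino acid: {aa!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Assumption: a replacement is conservative if both residues share a class."""
    return residue_class(original) == residue_class(replacement)
```

Under this assumption, replacing leucine with isoleucine is classed as conservative, whereas replacing leucine with aspartic acid is not.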
Alternatively, where alteration of function is desired, insertions, deletions
or
non-conservative alterations can be engineered to produce altered
polypeptides. Such
alterations can, for example, alter one or more of the biological functions or
biochemical
characteristics of the polypeptides of the invention. For example, such
alterations may
change polypeptide characteristics such as ligand-binding affinities,
interchain affinities, or
degradation/turnover rate. Further, such alterations can be selected so as to
generate
polypeptides that are better suited for expression, scale up and the like in
the host cells
chosen for expression. For example, cysteine residues can be deleted or
substituted with
another amino acid residue in order to eliminate disulfide bridges.
The terms "purified" or "substantially purified" as used herein denote that
the
indicated nucleic acid or polypeptide is present in the substantial absence of
other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight,
more preferably at least 99% by weight, of the indicated biological
macromolecules present
(but water, buffers, and other small molecules, especially molecules having a
molecular
weight of less than 1000 daltons, can be present).
The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated
from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic
acid or polypeptide in its natural source. In one embodiment, the nucleic acid
or polypeptide
is found in the presence of (if anything) only a solvent, buffer, ion, or
other component
normally present in a solution of the same. The terms "isolated" and
"purified" do not
encompass nucleic acids or polypeptides present in their natural source.
The term "recombinant," when used herein to refer to a polypeptide or protein,
means
that a polypeptide or protein is derived from recombinant (e.g., microbial,
insect, or
mammalian) expression systems. "Microbial" refers to recombinant polypeptides
or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant
microbial" defines a polypeptide or protein essentially free of native
endogenous substances
and unaccompanied by associated native glycosylation. Polypeptides or proteins
expressed
in most bacterial cultures, e.g., E. coli, will be free of glycosylation
modifications;
polypeptides or proteins expressed in yeast will have a glycosylation pattern
in general
different from those expressed in mammalian cells.
The term "recombinant expression vehicle or vector" refers to a plasmid or
phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression
vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element
or elements having a regulatory role in gene expression, for example,
promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA
and
translated into protein, and (3) appropriate transcription initiation and
termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems
preferably include
a leader sequence enabling extracellular secretion of translated protein by a
host cell.
Alternatively, where recombinant protein is expressed without a leader or
transport
sequence, it may include an amino terminal methionine residue. This residue
may or may
not be subsequently cleaved from the expressed recombinant protein to provide
a final
product.
The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry
the
recombinant transcriptional unit extrachromosomally. Recombinant expression
systems as
defined herein will express heterologous polypeptides or proteins upon
induction of the
regulatory elements linked to the DNA segment or synthetic gene to be
expressed. This term
also means host cells which have stably integrated a recombinant genetic
element or
elements having a regulatory role in gene expression, for example, promoters
or enhancers.
Recombinant expression systems as defined herein will express polypeptides or
proteins
endogenous to the cell upon induction of the regulatory elements linked to the
endogenous
DNA segment or gene to be expressed. The cells can be prokaryotic or
eukaryotic.
The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino
acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include
without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g.,
receptors) from the cell in
which they are expressed. "Secreted" proteins also include without limitation
proteins that
are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are
also intended to include proteins containing non-typical signal sequences
(e.g. Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and
factors
released from damaged cells (e.g. Interleukin-1 Receptor Antagonist, see
Arend, W.P. et al.
(1995) Annu. Rev. Immunol. 16:27-55).
Where desired, an expression vector may be designed to contain a "signal or
leader
sequence" which will direct the polypeptide through the membrane of a cell.
Such a
sequence may be naturally present on the polypeptides of the present invention
or provided
from heterologous protein sources by recombinant DNA techniques.
The term "stringent" is used to refer to conditions that are commonly
understood in
the art as stringent. Stringent conditions can include highly stringent
conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate
(SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and
moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other
exemplary hybridization
conditions are described herein in the examples.
In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base
oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
As used herein, "substantially equivalent" or "substantially similar" can
refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that
varies from a
reference sequence by one or more substitutions, deletions, or additions, the
net effect of
which does not result in an adverse functional dissimilarity between the
reference and
subject sequences. Typically, such a substantially equivalent sequence
varies from one of
those listed herein by no more than about 35% (i.e., the number of individual
residue
substitutions, additions, and/or deletions in a substantially equivalent
sequence, as compared
to the corresponding reference sequence, divided by the total number of
residues in the
substantially equivalent sequence is about 0.35 or less). Such a sequence is
said to have
65% sequence identity to the listed sequence. In one embodiment, a
substantially
equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more
than 30% (70% sequence identity); in a variation of this embodiment, by no
more than 25%
(75% sequence identity); and in a further variation of this embodiment, by no
more than
20% (80% sequence identity) and in a further variation of this embodiment, by
no more than
10% (90% sequence identity) and in a further variation of this embodiment, by
no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences
according to the invention preferably have at least 80% sequence identity with
a listed amino
acid sequence, more preferably at least 85% sequence identity, more preferably
at least 90%
sequence identity, more preferably at least 95% sequence identity, more
preferably at least
98% sequence identity, and most preferably at least 99% sequence identity.
Substantially
equivalent nucleotide sequences of the invention can have lower percent
sequence identities,
taking into account, for example, the redundancy or degeneracy of the genetic
code.
Preferably, the nucleotide sequence has at least about 65% identity, more
preferably at least
about 75% identity, more preferably at least about 80% sequence identity, more
preferably at
least 85% sequence identity, more preferably at least 90% sequence identity,
more preferably
at least about 95% sequence identity, more preferably at least 98% sequence
identity, and
most preferably at least 99% sequence identity. For the purposes of the
present invention,
sequences having substantially equivalent biological activity and
substantially equivalent
expression characteristics are considered substantially equivalent. For the
purposes of
determining equivalence, truncation of the mature sequence (e.g., via a
mutation which
creates a new stop codon) should be disregarded. Sequence identity may be
determined,
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol.
183:626-645).
Identity between sequences can also be determined by other methods known in
the art, e.g.
by varying hybridization conditions.
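The ratio used above to define "substantially equivalent" (residue substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be illustrated as follows. This is a minimal sketch, assuming the two sequences have already been aligned and that gaps are written as '-'; it is not the Jotun Hein method, and the function names are hypothetical.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Fraction of the subject sequence that varies from the reference.

    Both arguments are rows of one pairwise alignment ('-' marks a gap).
    Each differing column counts as one substitution, addition, or deletion,
    and the count is divided by the ungapped length of the subject sequence.
    """
    reference_aln = reference_aln.upper()
    subject_aln = subject_aln.upper()
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    for ref, sub in zip(reference_aln, subject_aln):
        if ref == "-" and sub == "-":
            continue  # column holds no residue in either sequence
        if ref != sub:
            changes += 1
    subject_length = sum(1 for ch in subject_aln if ch != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return changes / subject_length


def percent_identity(reference_aln: str, subject_aln: str) -> float:
    """Percent sequence identity, taken as 100 * (1 - fraction varied)."""
    return 100.0 * (1.0 - fraction_varied(reference_aln, subject_aln))
```

On this reading, a subject sequence whose fraction varied is 0.35 or less has at least 65% sequence identity to the reference sequence.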
The term "totipotent" refers to the capability of a cell to differentiate into
all of the
cell types of an adult organism.
The term "transformation" means introducing DNA into a suitable host cell so
that
the DNA is replicable, either as an extrachromosomal element, or by
chromosomal
integration. The term "transfection" refers to the taking up of an expression
vector by a
suitable host cell, whether or not any coding sequences are in fact expressed.
The term
"infection" refers to the introduction of nucleic acids into a suitable host
cell by use of a
virus or viral vector.
As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell.
UMFs can be
readily identified using known UMFs as a target sequence or target motif with
the
computer-based systems described below. The presence and activity of a UMF can
be
confirmed by attaching the suspected UMF to a marker sequence. The resulting
nucleic acid
molecule is then incubated with an appropriate host under appropriate
conditions and the
uptake of the marker sequence is determined. As described above, a UMF will
increase the
frequency of uptake of a linked marker sequence.
Each of the above terms is meant to encompass all that is described for each,
unless
the context dictates otherwise.
4.2 NUCLEIC ACIDS OF THE INVENTION
Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide
comprising
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide
encoding
any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising the nucleotide sequence encoding the mature protein
coding
sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534.
The
polynucleotides of the present invention also include, but are not limited to,
a polynucleotide
that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding
any one
of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c)
a
polynucleotide which is an allelic variant of any polynucleotide recited
above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited
above; or (e)
a polynucleotide that encodes a polypeptide comprising a specific domain or
truncation of
the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set
forth in
Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the
encoded
polypeptide; e.g., domains in receptor-like polypeptides include
ligand-binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof;
domains in
immunoglobulin-like proteins include the variable immunoglobulin-like
domains; domains
in enzyme-like polypeptides include catalytic and substrate binding domains;
and domains in
ligand polypeptides include receptor-binding domains.
The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include the entire coding region of the cDNA or may represent
a portion of
the coding region of the cDNA.
The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with
known methods
using the sequence information disclosed herein. Such methods include the
preparation of
probes or primers from the disclosed sequence information for identification
and/or
amplification of genes in appropriate genomic libraries or other sources of
genomic materials.
Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of
SEQ ID NO:
1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic
DNA
libraries under suitable hybridization conditions using any of the
polynucleotides of SEQ ID
NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the
polynucleotides of
SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s)
that allow
identification and/or amplification of genes in appropriate genomic DNA or
cDNA libraries.
The nucleic acid sequences of the invention can be assembled from ESTs and
sequences
(including cDNA and genomic sequences) obtained from one or more public
databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence
information, representative fragment or segment information, or novel segment
information for
the full-length gene.
The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides
recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%,
at least about
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%,
93%, 94%,
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence
identity to a
polynucleotide recited above.
Included within the scope of the nucleic acid sequences of the invention are
nucleic
acid sequence fragments that hybridize under stringent conditions to any of
the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which
fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably
greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of,
e.g. 15, 17, or 20
nucleotides or more that are selective for (i.e. specifically hybridize to)
any one of the
polynucleotides of the invention are contemplated. Probes capable of
specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of
the invention
from other polynucleotide sequences in the same family of genes or can
differentiate human
genes from genes of other species, and are preferably based on unique
nucleotide sequences.
The sequences falling within the scope of the present invention are not
limited to these
specific sequences, but also include allelic and species variations thereof.
Allelic and species
variations can be routinely determined by comparing the sequence provided in
SEQ ID NO: 1-
1041, or 2083-2534, a representative fragment thereof, or a nucleotide
sequence at least 90%
identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a
sequence from
another isolate of the same species. Furthermore, to accommodate codon
variability, the
invention includes nucleic acid molecules coding for the same amino acid
sequences as do the
specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of
one codon for another codon that encodes the same amino acid is expressly
contemplated.
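Whether two coding sequences differ only by such synonymous codon substitutions can be checked by translating both and comparing the results. The following is a minimal sketch, assuming the standard genetic code and an in-frame coding region; the names CODON_TABLE, translate and encodes_same_protein are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA) into one-letter codes."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def encodes_same_protein(cds_a: str, cds_b: str) -> bool:
    """True when the two coding sequences encode identical amino acid sequences."""
    return translate(cds_a) == translate(cds_b)
```

For example, GCT and GCC both encode alanine, so two ORFs differing only at that codon would be treated as coding for the same amino acid sequence.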
The nearest neighbor or homology results for the nucleic acids of the present
invention,
including SEQ ID NO: 1-1041, or 2083-2534 can be obtained by searching a
database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search
Tool) program is
used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and
Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively a FASTA
version 3 search
against Genpept, using the FASTXY algorithm may be performed.
Species homologs (or orthologs) of the disclosed polynucleotides and proteins
are
also provided by the present invention. Species homologs may be isolated and
identified by
making suitable probes or primers from the sequences provided herein and
screening a
suitable nucleic acid source from the desired species.
The invention also encompasses allelic variants of the disclosed
polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated
polynucleotide which
also encode proteins which are identical, homologous or related to that
encoded by the
polynucleotides.
The nucleic acid sequences of the invention are further directed to sequences
which
encode variants of the described nucleic acids. These amino acid sequence
variants may be
prepared by methods known in the art by introducing appropriate nucleotide
changes into a
native or variant polynucleotide. There are two variables in the construction
of amino acid
sequence variants: the location of the mutation and the nature of the
mutation. Nucleic
acids encoding the amino acid sequence variants are preferably constructed by
mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature.
These
nucleic acid alterations can be made at sites that differ in the nucleic acids
from different
species (variable positions) or in highly conserved regions (constant
regions). Sites at such
locations will typically be modified in series, e.g., by substituting first
with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid)
and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid),
and then
deletions or insertions may be made at the target site. Amino acid sequence
deletions
generally range from about 1 to 30 residues, preferably about 1 to 10
residues, and are
typically contiguous. Amino acid insertions include amino- and/or carboxyl-
terminal
fusions ranging in length from one to one hundred or more residues, as well as
intrasequence
insertions of single or multiple amino acid residues. Intrasequence insertions
may range
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues.
Examples of
terminal insertions include the heterologous signal sequences necessary for
secretion or for
intracellular targeting in different host cells and sequences such as FLAG or
poly-histidine
sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences
are
changed via site-directed mutagenesis. This method uses oligonucleotide
sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as
sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of
the site being changed. In general, the techniques of site-directed
mutagenesis are well
known to those of skill in the art and this technique is exemplified by
publications such as,
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for
producing
site-specific changes in a polynucleotide sequence was published by Zoller
and Smith,
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino
acid
sequence variants of the novel nucleic acids. When small amounts of template
DNA are
used as starting material, primer(s) that differ slightly in sequence from the
corresponding
region in the template DNA can generate the desired amino acid variant. PCR
amplification
results in a population of product DNA fragments that differ from the
polynucleotide
template encoding the polypeptide at the position specified by the primer. The
product DNA
fragments replace the corresponding region in the plasmid and this gives a
polynucleotide
encoding the desired amino acid variant.
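The layout of a mutagenic oligonucleotide as described above (the altered codon, flanked on both sides by template nucleotides) can be sketched as follows. This is an illustration only: the function name mutagenic_oligo and the default flank of 12 nucleotides are assumptions, since the text specifies only "sufficient adjacent nucleotides" to form a stable duplex, and no melting-temperature or secondary-structure check is made.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Return an oligo carrying new_codon with `flank` template bases per side.

    codon_index is zero-based and counted from the first base of the
    template, which is assumed to be in frame. The default flank length
    is an assumed value, not one taken from the text.
    """
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if codon_index < 0 or end > len(template):
        raise ValueError("codon_index lies outside the template")
    if start < flank or len(template) - end < flank:
        raise ValueError("not enough flanking sequence on both sides")
    # Upstream flank + changed codon + downstream flank, in template order.
    return template[start - flank:start] + new_codon + template[end:end + flank]
```

Used as a primer or annealed to a single-stranded template, such an oligonucleotide matches the template everywhere except at the codon to be changed.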
A further technique for generating amino acid variants is the cassette
mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other
mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et
al., supra, and
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of
the genetic code, other DNA sequences which encode substantially the same or a
functionally equivalent amino acid sequence may be used in the practice of the
invention for
the cloning and expression of these novel nucleic acids. Such DNA sequences
include those
which are capable of hybridizing to the appropriate novel nucleic acid
sequence under
stringent conditions.
Polynucleotides encoding preferred polypeptide truncations of the invention
could be
used to generate polynucleotides encoding chimeric or fusion proteins
comprising one or
more domains of the invention and heterologous protein sequences.
The polynucleotides of the invention additionally include the complement of
any of
the polynucleotides recited above. The polynucleotide can be DNA (genomic,
cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include,
for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides
of the desired sequence identities.
In accordance with the invention, polynucleotide sequences comprising the
mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or
2083-2534,
or functional equivalents thereof, may be used to generate recombinant DNA
molecules that
direct the expression of that nucleic acid, or a functional equivalent
thereof, in appropriate
host cells. Also included are the cDNA inserts of any of the clones identified
herein.
A polynucleotide according to the invention can be joined to any of a variety
of other
nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors,
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like,
that are well
known in the art. Accordingly, the invention also provides a vector including
a
polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the
vector contains an origin of replication functional in at least one organism,
convenient
restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to
the invention include expression vectors, replication vectors, probe
generation vectors, and
sequencing vectors. A host cell according to the invention can be a
prokaryotic or
eukaryotic cell and can be a unicellular organism or part of a multicellular
organism.
The present invention further provides recombinant constructs comprising a
nucleic
acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534
or a
fragment thereof or any other polynucleotides of the invention. In one
embodiment, the
recombinant constructs of the present invention comprise a vector, such as a
plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of
SEQ ID NO:
1-1041, or 2083-2534 or a fragment thereof is inserted, in a forward or reverse
orientation. In
the case of a vector comprising one of the ORFs of the present invention, the
vector may
further comprise regulatory sequences, including for example, a promoter,
operably linked to
the ORF. Large numbers of suitable vectors and promoters are known to those of
skill in the
art and are commercially available for generating the recombinant constructs
of the present
invention. The following vectors are provided by way of example: Bacterial:
pBs,
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia);
Eukaryotic:
pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL
(Pharmacia).
The isolated polynucleotide of the invention may be operably linked to an
expression
control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly.
Many suitable expression control sequences are known in the art. General
methods of
expressing recombinant proteins are also known and are exemplified in R.
Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked"
means
that the isolated polynucleotide of the invention and an expression control
sequence are
situated within a vector or cell in such a way that the protein is expressed
by a host cell
which has been transformed (transfected) with the ligated
polynucleotide/expression control
sequence.
Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable
markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse
metallothionein-I. Selection of the appropriate vector and promoter is well
within the level
of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of
replication and selectable markers permitting transformation of the host cell,
e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived
from a highly expressed gene to direct transcription of a downstream
structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock
proteins, among
others. The heterologous structural sequence is assembled in appropriate phase
with
translation initiation and termination sequences, and preferably, a leader
sequence capable of
directing secretion of translated protein into the periplasmic space or
extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
amino
terminal identification peptide imparting desired characteristics, e.g.,
stabilization or
simplified purification of expressed recombinant product. Useful expression
vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding
a desired
protein together with suitable translation initiation and termination signals
in operable
reading phase with a functional promoter. The vector will comprise one or more
phenotypic
selectable markers and an origin of replication to ensure maintenance of the
vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts
for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium
and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although
others may
also be employed as a matter of choice.
As a representative but non-limiting example, useful expression vectors for
bacterial
use can comprise a selectable marker and bacterial origin of replication
derived from
commercially available plasmids comprising genetic elements of the well known
cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech,
Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and
the structural sequence to be expressed. Following transformation of a
suitable host strain
and growth of the host strain to an appropriate cell density, the selected
promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical
induction) and cells
are cultured for an additional period. Cells are typically harvested by
centrifugation,
disrupted by physical or chemical means, and the resulting crude extract
retained for further
purification.
Polynucleotides of the invention can also be used to induce immune responses.
For
example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999),
incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies
against the encoded polypeptide following topical administration of naked
plasmid DNA or
following injection, and preferably intra-muscular injection of the DNA. The
nucleic acid
sequences are preferably inserted in a recombinant expression vector and may
be in the form
of naked DNA.
4.3 ANTISENSE
Another aspect of the invention pertains to isolated antisense nucleic acid
molecules
that are hybridizable to or complementary to the nucleic acid molecule
comprising the
nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs
or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide
sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided
that comprise a
sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules
encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO:
1-1041, or
2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID
NO: 1-1041, or 2083-2534 are additionally provided.
In one embodiment, an antisense nucleic acid molecule is antisense to a
"coding
region" of the coding strand of a nucleotide sequence of the invention. The
term "coding
region" refers to the region of the nucleotide sequence comprising codons
which are
translated into amino acid residues. In another embodiment, the antisense
nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a
nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences
that flank the
coding region and are not translated into amino acids (i.e., the 5' and 3'
untranslated regions).
Given the coding strand sequences encoding a nucleic acid disclosed herein
(e.g.,
SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can
be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The
antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but
more
preferably is an oligonucleotide that is antisense to only a portion of the
coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be
complementary to
the region surrounding the translation start site of an mRNA. An antisense
oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides
in length. An
antisense nucleic acid of the invention can be constructed using chemical
synthesis or
enzymatic ligation reactions using procedures known in the art. For example,
an antisense
nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to
increase the
biological stability of the molecules or to increase the physical stability of
the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and
acridine substituted nucleotides can be used.
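As an illustration of the Watson-Crick design rule described above, the following minimal Python sketch derives an antisense oligonucleotide complementary to the region surrounding the translation start site. The function names, the 15-nucleotide window, and the example fragment are hypothetical and chosen for illustration only; the sketch assumes an unmodified DNA coding strand written 5' to 3' and does not model Hoogsteen pairing or modified nucleotides.

```python
# Watson-Crick complements for DNA bases (coding strand given 5'->3').
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the antisense strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_around_start(sense: str, length: int = 20) -> str:
    """Antisense oligo covering a window around the first ATG of the coding strand.

    The window is clipped at the ends of the sequence, so the oligo may be
    shorter than `length` for very short inputs.
    """
    start = sense.upper().find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    centre = start + 1  # middle base of the ATG
    left = max(0, centre - length // 2)
    right = min(len(sense), left + length)
    return reverse_complement(sense[left:right])


# Hypothetical coding-strand fragment, for illustration only.
sense_fragment = "GGCACCATGGCTAGCTTGACC"
oligo = antisense_around_start(sense_fragment, length=15)
print(oligo)
```

The returned oligonucleotide contains CAT, the complement of the ATG start codon, and so spans the translation start site.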
Examples of modified nucleotides that can be used to generate the antisense
nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
Alternatively, the antisense nucleic acid can be produced
biologically using an expression vector into which a nucleic acid has been
subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid
will be of an
antisense orientation to a target nucleic acid of interest, described further
in the following
subsection).
The antisense nucleic acid molecules of the invention are typically
administered to a
subject or generated in situ such that they hybridize with or bind to cellular
mRNA and/or
genomic DNA encoding a protein according to the invention to thereby
inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for
example, in the
case of an antisense nucleic acid molecule that binds to DNA duplexes, through
specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention
includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be
modified to target
selected cells and then administered systemically. For example, for systemic
administration,
antisense molecules can be modified such that they specifically bind to
receptors or antigens
expressed on a selected cell surface, e.g., by linking the antisense nucleic
acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The
antisense nucleic
acid molecules can also be delivered to cells using the vectors described
herein. To achieve
sufficient intracellular concentrations of antisense molecules, vector
constructs in which the
antisense nucleic acid molecule is placed under the control of a strong pol II
or pol III
promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention
is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific
double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148)
or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
4.4 RIBOZYMES AND PNA MOIETIES
In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described
in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically
cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity
for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a
derivative
of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the
active site is complementary to the nucleotide sequence to be cleaved in an
mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific
ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993)
Science
261:1411-1418.
Alternatively, gene expression can be inhibited by targeting nucleotide
sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to
form triple
helical structures that prevent transcription of the gene in target cells. See
generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci.
660:27-36; and Maher (1992) BioEssays 14: 807-15.
In various embodiments, the nucleic acids of the invention can be modified at
the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the
stability,
hybridization, or solubility of the molecule. For example, the deoxyribose
phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic
acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four
natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow
for
specific hybridization to DNA and RNA under conditions of low ionic strength.
The
synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al.
(1996) PNAS 93:
14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications.
For
example, PNAs can be used as antisense or antigene agents for sequence-
specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B.
(1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al.
(1996), above;
Perry-O'Keefe (1996), above).
In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper
groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other
techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that
may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the
DNA
portion while the PNA portion would provide high binding affinity and
specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected
in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling
chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-
thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al.
(1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner
to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn
et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5'
DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5:
1119-11124.
In other embodiments, the oligonucleotide may include other appended groups
such
as peptides (e.g., for targeting host cell receptors in vivo), or agents
facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad.
Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered
cleavage agents
(see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be
conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking
agent, a transport
agent, a hybridization-triggered cleavage agent, etc.
4.5 HOSTS
The present invention further provides host cells genetically engineered to
contain
the polynucleotides of the invention. For example, such host cells may contain
nucleic acids
of the invention introduced into the host cell using known transformation,
transfection or
infection methods. The present invention still further provides host cells
genetically
engineered to express the polynucleotides of the invention, wherein such
polynucleotides are
in operative association with a regulatory sequence heterologous to the host
cell which
drives expression of the polynucleotides in the cell.
Knowledge of nucleic acid sequences allows for modification of cells to
permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g.,
by
homologous recombination) to provide increased polypeptide expression by
replacing, in
whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous
promoter is
inserted in such a manner that it is operatively linked to the encoding
sequences. See, for
example, PCT International Publication No. WO94/12650, PCT International
Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl
phosphate
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA
may be
inserted along with the heterologous promoter DNA. If linked to the coding
sequence,
amplification of the marker DNA by standard selection methods results in co-
amplification
of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell,
a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell
can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one
of the polynucleotides of the invention can be used in conventional manners
to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can
be used to
produce a heterologous protein under the control of the EMF.
Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts
such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express
the particular
polypeptide or protein or which express the polypeptide or protein at a low
natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other
cells under
the control of appropriate promoters. Cell-free translation systems can also
be employed to
produce such proteins using RNAs derived from the DNA constructs of the
present
invention. Appropriate cloning and expression vectors for use with
prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A
Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of
which is
hereby incorporated by reference.
Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines
capable of expressing a compatible vector are, for example, the C127, monkey
COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal
A431 cells,
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell
lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue,
primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian
expression
vectors will comprise an origin of replication, a suitable promoter and also
any necessary
ribosome binding sites, polyadenylation site, splice donor and acceptor sites,
transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences
derived
from the SV40 viral genome, for example, SV40 origin, early promoter,
enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed
genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture
are usually
isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous
ion exchange or size exclusion chromatography steps. Protein refolding steps
can be used,
as necessary, in completing configuration of the mature protein. Finally, high
performance
liquid chromatography (HPLC) can be employed for final purification steps.
Microbial cells
employed in expression of proteins can be disrupted by any convenient method,
including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.
Alternatively, it may be possible to produce the protein in lower eukaryotes
such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable
yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces
strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium,
or any bacterial
strain capable of expressing heterologous proteins. If the protein is made
in yeast or
bacteria, it may be necessary to modify the protein produced therein, for
example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain
the functional
protein. Such covalent attachments may be accomplished using known chemical or
enzymatic methods.
In another embodiment of the present invention, cells and tissues may be
engineered
to express an endogenous gene comprising the polynucleotides of the invention
under the
control of inducible regulatory elements, in which case the regulatory
sequences of the
endogenous gene may be replaced by homologous recombination. As described
herein, gene
targeting can be used to replace a gene's existing regulatory region with a
regulatory
sequence isolated from a different gene or a novel regulatory sequence
synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of
promoters,
enhancers, scaffold-attachment regions, negative regulatory elements,
transcriptional
initiation sites, and regulatory protein binding sites or combinations of said
sequences.
Alternatively, sequences which affect the structure or stability of the RNA or
protein
produced may be replaced, removed, added, or otherwise modified by
targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader
sequences for enhancing or modifying transport or secretion properties of the
protein, or
other sequences which alter or improve the function or stability of protein or
RNA
molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different
cell-type specificity than the naturally occurring elements. Here, the
naturally occurring
sequences are deleted and new sequences are added. In all cases, the
identification of the
targeting event may be facilitated by the use of one or more selectable marker
genes that are
contiguous with the targeting DNA, allowing for the selection of cells in
which the
exogenous DNA has integrated into the host cell genome. The identification of
the targeting
event may also be facilitated by the use of one or more marker genes
exhibiting the property
of negative selection, such that the negatively selectable marker is linked to
the exogenous
DNA, but configured such that the negatively selectable marker flanks the
targeting
sequence, and such that a correct homologous recombination event with
sequences in the
host cell genome does not result in the stable integration of the negatively
selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine
kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance
with this aspect of the invention are more particularly described in U.S.
Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated
by
reference herein in its entirety.
4.6 POLYPEPTIDES OF THE INVENTION
The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ
ID NO: 1042-
2082, or 2535-2986 or an amino acid sequence encoded by any one of the
nucleotide
sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or
mature
protein. Polypeptides of the invention also include polypeptides preferably
with biological or
immunological activity that are encoded by: (a) a polynucleotide having any
one of the
nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b)
polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082,
or 2535-
2986 or (c) polynucleotides that hybridize to the complement of the
polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also
provides biologically
active or immunologically active variants of any of the amino acid sequences
set forth as
SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature
protein;
and "substantial equivalents" thereof (e.g., with at least about 65%, at least
about 70%, at
least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%,
at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more
typically at least
about 98%, or most typically at least about 99% amino acid identity) that
retain biological
activity. Polypeptides encoded by allelic variants may have a similar,
increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or
2535-
2986.
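The percent amino acid identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with '-' marking gaps, and it counts identity only over columns in which both sequences carry a residue; other conventions for the denominator exist and give different values. The function names and example sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned to equal length, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical sequences differing at one of ten positions.
reference = "MKTLLVAGAL"
variant = "MKTLIVAGAL"
print(percent_identity(reference, variant))
```

Under these assumptions the example pair would satisfy the 65% through 90% thresholds but not the 95% threshold.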
Fragments of the proteins of the present invention which are capable of
exhibiting
biological activity are also encompassed by the present invention. Fragments
of the protein
may be in linear form or they may be cyclized using known methods, for
example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in
R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier
molecules such as
molecules such as
immunoglobulins for many purposes, including increasing the valency of protein
binding
sites. Fragments are also identified in Tables 3, 5, 6, and 8.
The present invention also provides both full-length and mature forms (for
example,
without a signal sequence or precursor sequence) of the disclosed proteins.
The protein
coding sequence is identified in the sequence listing by translation of the
disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6.
The mature
form of such protein may be obtained and confirmed by expression of a full-
length
polynucleotide in a suitable mammalian cell or other host cell and sequencing
of the cleaved
product. One of skill in the art will recognize that the actual cleavage site
may be different
than that predicted in Table 6. The sequence of the mature form of the protein
is also
determinable from the amino acid sequence of the full-length form.
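The relationship between the full-length and mature forms can be sketched as follows. This minimal Python example assumes that N-terminal sequencing of the cleaved product has yielded a short read, locates that read in the full-length sequence to find the actual cleavage site, and returns the mature sequence. The function names and both sequences are hypothetical and serve only to illustrate the comparison of a predicted and an observed cleavage site.

```python
def locate_cleavage_site(full_length: str, n_terminal_read: str) -> int:
    """Number of residues removed, found by locating the N-terminal read."""
    index = full_length.find(n_terminal_read)
    if index < 0:
        raise ValueError("N-terminal read not found in full-length sequence")
    return index


def mature_form(full_length: str, residues_removed: int) -> str:
    """Mature sequence after removal of the leading signal sequence."""
    return full_length[residues_removed:]


# Hypothetical full-length protein and sequencing read, for illustration only.
full_length = "MKLLVLALCLLAAPAQAEVQLTK"
observed_read = "EVQLT"
predicted_site = 15  # hypothetical prediction, standing in for a Table 6 entry

actual_site = locate_cleavage_site(full_length, observed_read)
mature = mature_form(full_length, actual_site)
print(actual_site, mature, actual_site == predicted_site)
```

In this hypothetical case the observed site differs from the predicted one, which is the situation the text anticipates.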
Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also
provided. In
such forms, part or all of the regions causing the proteins to be membrane
bound are deleted
so that the proteins are fully secreted from the cell in which they are
expressed.
Protein compositions of the present invention may further comprise an
acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The present invention further provides isolated polypeptides encoded by the
nucleic
acid fragments of the present invention or by degenerate variants of the
nucleic acid
fragments of the present invention. By "degenerate variant" is intended
nucleotide
fragments which differ from a nucleic acid fragment of the present invention
(e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode
an identical
polypeptide sequence. Preferred nucleic acid fragments of the present
invention are the
ORFs that encode proteins.
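The definition of a degenerate variant given above can be checked computationally. The following minimal Python sketch translates two open reading frames with the standard genetic code and reports whether they differ in nucleotide sequence yet encode an identical polypeptide. The function names and example ORFs are hypothetical, and the sketch assumes in-frame DNA sequences and the standard code only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Hypothetical ORFs, for illustration only.
original = "ATGGCTTCATAA"    # Met-Ala-Ser-stop
synonymous = "ATGGCCAGCTGA"  # different codons, same polypeptide
altered = "ATGGCTTGGTAA"     # Ser replaced by Trp

print(is_degenerate_variant(original, synonymous))
print(is_degenerate_variant(original, altered))
```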
A variety of methodologies known in the art can be utilized to obtain any one
of the
isolated polypeptides or proteins of the present invention. At the simplest
level, the amino
acid sequence can be synthesized using commercially available peptide
synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary,
secondary or
tertiary structural and/or conformational characteristics with proteins may
possess biological
properties in common therewith, including protein activity. This technique is
particularly
useful in producing small peptides and fragments of larger polypeptides.
Fragments are
useful, for example, in generating antibodies against the native polypeptide.
Thus, they may
be employed as biologically active or immunological substitutes for natural,
purified
proteins in screening of therapeutic compounds and in immunological processes
for the
development of antibodies.
The polypeptides and proteins of the present invention can alternatively be
purified
from cells which have been altered to express the desired polypeptide or
protein. As used
herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell,
through genetic manipulation, is made to produce a polypeptide or protein
which it normally
does not produce or which the cell normally produces at a lower level. One
skilled in the art
can readily adapt procedures for introducing and expressing either recombinant
or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell
which produces one
of the polypeptides or proteins of the present invention.
The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium,
and purifying
the protein from the cells or the culture in which the cells are grown. For
example, the
methods of the invention include a process for producing a polypeptide in
which a host cell
containing a suitable expression vector that includes a polynucleotide of the
invention is
cultured under conditions that allow expression of the encoded polypeptide.
The
polypeptide can be recovered from the culture, conveniently from the culture
medium, or
from a lysate prepared from the host cells and further purified. Preferred
embodiments
include those in which the protein produced by such process is a full length
or mature form
of the protein.
In an alternative method, the polypeptide or protein is purified from
bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can
readily follow
known methods for isolating polypeptides and proteins in order to obtain one
of the isolated
polypeptides or proteins of the present invention. These include, but are not
limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et
al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular
Biology. Polypeptide fragments that retain biological/immunological activity
include
fragments comprising greater than about 100 amino acids, or greater than about
200 amino
acids, and fragments that encode specific protein domains.
The purified polypeptides can be used in in vitro binding assays which are
well
known in the art to identify molecules which bind to the polypeptides. These
molecules
include, but are not limited to, small molecules, molecules from
combinatorial
libraries, antibodies or other proteins. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
In addition, the peptides of the invention or molecules capable of binding to
the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other
compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a
tumor. or other
cell by the specificity of the binding molecule for SEQ )D NO: 1042-2082, or
2S3S-2986.
The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or
sheep which are
characterized by somatic or germ cells containing a nucleotide sequence
encoding the
protein.
The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications
are naturally
provided or deliberately engineered. For example, modifications, in the
peptide or DNA
sequence, can be made by those skilled in the art using known techniques.
Modifications of
interest in the protein sequences may include the alteration, substitution,
replacement,
insertion or deletion of a selected amino acid residue in the coding sequence.
Fox example,
one or more of the cysteine residues may be deleted or replaced with another
amino acid to
alter the conformation of the molecule. Techniques for such alteration,
substitution,
replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or
deletion retains the desired activity of the protein. Regions of the protein
that are important
for the protein function can be determined by various methods known in the
art including the
alanine-scanning method which involves systematic substitution of single or
strings of
amino acids with alanine, followed by testing the resulting alanine-containing
variant for
biological activity. This type of analysis determines the importance of the
substituted amino
acid(s) in biological activity. Regions of the protein that are important for
protein function
may be determined by the eMATRIX program.
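The systematic substitution step of the alanine-scanning method can be sketched in Python as follows. The generator below enumerates variants in which a single residue, or a string of `window` consecutive residues, is replaced with alanine; testing each variant for biological activity is an experimental step and is not modeled. The function name, the 1-based position reporting, and the example sequence are hypothetical.

```python
from typing import Iterator, Tuple


def alanine_scan(protein: str, window: int = 1) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, variant) for each alanine-substituted window.

    Positions are 1-based and inclusive. Windows that already consist
    entirely of alanine are skipped because the variant would be unchanged.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        yield start + 1, start + window, variant


# Hypothetical four-residue peptide, for illustration only.
single = list(alanine_scan("MKAL"))
paired = list(alanine_scan("MKAL", window=2))
for first, last, variant in single:
    print(first, last, variant)
```

Each yielded variant corresponds to one construct that would then be assayed for retained activity.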
Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for
screening or other
immunological methodologies may also be easily made by those skilled in the
art given the
disclosures herein. Such modifications are encompassed by the present
invention.
The protein may also be produced by operably linking the isolated
polynucleotide of
polynucleotide of
the invention to suitable control sequences in one or more insect expression
vectors, and
employing an insect expression system. Materials and methods for
baculovirus/insect cell
expression systems are commercially available in kit form from, e.g.,
Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art,
as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No.
1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of
expressing a
polynucleotide of the present invention is "transformed."
The protein of the invention may be prepared by culturing transformed host
cells
under culture conditions suitable to express the recombinant protein. The
resulting
expressed protein may then be purified from such culture (i.e., from
culture medium or cell
extracts) using known purification processes, such as gel filtration and ion
exchange
chromatography. The purification of the protein may also include an affinity
column
containing agents which will bind to the protein; one or more column steps
over such affinity
resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA
Sepharose™;
one or more steps involving hydrophobic interaction chromatography using
such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.
Alternatively, the protein of the invention may also be expressed in a form
which will
facilitate purification. For example, it may be expressed as a fusion protein,
such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as
a His tag. Kits for expression and purification of such fusion proteins are
commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway,
N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and
subsequently
purified by using a specific antibody directed to such epitope. One such
epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).
Finally, one or more reverse-phase high performance liquid chromatography (RP-
HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having
pendant
methyl or other aliphatic groups, can be employed to further purify the
protein. Some or all
of the foregoing purification steps, in various combinations, can also be
employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus
purified is
substantially free of other mammalian proteins and is defined in accordance
with the present
invention as an "isolated protein."
The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids has been
deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide
or analog is fused to another moiety or moieties, e.g., targeting moiety or
another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or
stability.
Examples of moieties which may be fused to the polypeptide or an analog
include, for
example, targeting moieties which provide for the delivery of polypeptide to
pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-
cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptor and ligands expressed
on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include
therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such
as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also,
polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta
interferon.
4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY
Preferred identity and/or similarity are designed to give the largest match
between
the sequences tested. Methods to determine identity and similarity are
codified in computer
programs including, but not limited to, the GCG program package, including
GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA
(Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et
al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix
software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by
reference), eMotif
software (Nevill-Manning et al, ISMB-97, Vol. 4, pp. 202-209, herein
incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322
(1998), herein incorporated by reference) and the Kyte-Doolittle
hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982), incorporated herein by
reference).
Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that
separates the
proteins into three sets of locales: intracellular, membrane, or secreted.
This prediction is
based upon three characteristics of each polypeptide, including percentage of
cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein,
and Kyte-
Doolittle scores to calculate the longest hydrophobic stretch of the said
protein. Values of
predicted proteins are compared against the values from a set of 592 proteins
of known
cellular localization from the Swissprot database
(http://www.expasy.ch/sprot). Predictions
are based upon the maximum likelihood estimation.
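The three characteristics named above can be computed, by way of a hedged illustration, as follows. SeqLoc itself is proprietary and its exact scoring is not disclosed herein, so the function name seqloc_like_features, the use of a mean score over the first 20 residues, and the definition of a hydrophobic stretch as a run of residues with a Kyte-Doolittle score above zero are all assumptions made for this sketch. The maximum likelihood comparison against proteins of known localization is not reproduced.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol. 157:105-31 (1982)).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def seqloc_like_features(sequence, n_term=20, threshold=0.0):
    """Compute three illustrative features for a polypeptide sequence.

    Assumptions (not taken from the specification): the N-terminal feature
    is a mean score, and a "hydrophobic stretch" is a run of consecutive
    residues whose score exceeds `threshold`. Non-standard residue codes
    raise KeyError.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    percent_cysteine = 100.0 * sequence.count("C") / len(sequence)
    head = scores[:n_term]
    n_terminal_mean = sum(head) / len(head)
    longest = run = 0
    for score in scores:
        run = run + 1 if score > threshold else 0
        longest = max(longest, run)
    return {
        "percent_cysteine": percent_cysteine,
        "n_terminal_mean_kd": n_terminal_mean,
        "longest_hydrophobic_run": longest,
    }
```

A classifier of the kind described would take such feature values for a predicted protein and compare them with the values observed for the reference set of proteins of known localization.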
The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul,
S., et al.
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-
410
(1990)).
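By way of illustration only, percent identity over an alignment chosen to give the largest match can be sketched with a simple Needleman-Wunsch global alignment. The function names, the scoring values (match +1, mismatch -1, linear gap -1), and the choice of alignment length as the denominator are assumptions made for this sketch; they are not the parameters of GAP, BLAST, or any other program named above, and published programs use substitution matrices and affine gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

Other conventions divide by the length of the shorter sequence or by the number of aligned (non-gap) columns, so reported identity values are comparable only when the same convention is used.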
4.7 CHIMERIC AND FUSION PROTEINS
The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric
protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the
invention can
correspond to all or a portion of a protein according to the invention. In one
embodiment, a
fusion protein comprises at least one biologically active portion of a protein
according to the
invention. In another embodiment, a fusion protein comprises at least two
biologically
active portions of a protein according to the invention. Within the fusion
protein, the term
"operatively linked" is intended to indicate that the polypeptide according to
the invention
and the other polypeptide are fused in-frame to each other. The polypeptide
can be fused to
the N-terminus or C-terminus, or to the middle.
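The requirement that the two polypeptides be fused in-frame can be illustrated at the level of the coding sequences. The following is a minimal sketch under stated assumptions: the function name fuse_in_frame is hypothetical, both inputs are assumed to be complete open reading frames given as DNA strings, no linker is inserted, and the standard stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream one stays in frame.

    Removes a terminal stop codon from the upstream sequence, then checks
    that the fused sequence has no stop codon before its final codon.
    Illustrative helper only; not a method defined by the specification.
    """
    upstream = upstream_cds.upper()
    downstream = downstream_cds.upper()
    if len(upstream) % 3 or len(downstream) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    if upstream[-3:] in STOP_CODONS:
        # Translation must read through into the downstream partner.
        upstream = upstream[:-3]
    fused = upstream + downstream
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("fusion contains an internal stop codon")
    return fused
```

Because both inputs are whole numbers of codons, concatenation preserves the reading frame of the downstream sequence; a frame shift would arise only if a junction or linker of a length not divisible by three were introduced.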
For example, in one embodiment a fusion protein comprises a polypeptide
according
to the invention operably linked to the extracellular domain of a second
protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST
(i.e.,
glutathione S-transferase) sequences.
In another embodiment, the fusion protein is an immunoglobulin fusion protein
in
which the polypeptide sequences according to the invention comprise one or
more domains
fused to sequences derived from a member of the immunoglobulin protein family.
The
immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical
compositions and administered to a subject to inhibit an interaction between a
ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal
transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the
bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful
therapeutically for
both the treatment of proliferative and differentiative disorders, e.g.,
cancer as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the
immunoglobulin
fusion proteins of the invention can be used as immunogens to produce
antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that
inhibit the
interaction of a polypeptide of the invention with a ligand.
A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the
different
polypeptide sequences are ligated together in-frame in accordance with
conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for
ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive
ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by
conventional
techniques including automated DNA synthesizers. Alternatively, PCR
amplification of
gene fragments can be carried out using anchor primers that give rise to
complementary
overhangs between two consecutive gene fragments that can subsequently be
annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et
al. (eds.)
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a
fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be
cloned into such an expression vector such that the fusion moiety is linked in-
frame to the
protein of the invention.
4.8 GENE THERAPY
Mutations in a gene comprising the polynucleotides of the invention may result in loss of
normal
function of the encoded protein. The invention thus provides gene therapy to
restore normal
activity of the polypeptides of the invention; or to treat disease states
involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the
invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors,
and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a
retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical
treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998).
For
additional reviews of gene therapy technology see Friedmann, Science, 244:
1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-
460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene
encoding the
polypeptides of the present invention can also be accomplished with
extrachromosomal
substrates (transient expression) or artificial chromosomes (stable
expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention
in order to
proliferate or to produce a desired effect on or activity in such cells.
Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is
contemplated that in other
human disease states, preventing the expression of or inhibiting the activity
of polypeptides
of the invention will be useful in treating the disease states. It is
contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression
of
polypeptides of the invention.
Other methods inhibiting expression of a protein include the introduction of
antisense
molecules to the nucleic acids of the present invention, their complements, or
their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the
present
invention can be inhibited by using targeted deletion methods, or the
insertion of a negative
regulatory element such as a silencer, which is tissue specific.
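As a simple illustration of how an antisense sequence relates to a nucleic acid of the invention, the complement of a sense strand can be derived as its reverse complement. The function name antisense_dna is hypothetical, and the sketch assumes an unambiguous DNA sequence; it does not address the selection of target regions, chemical modification, or delivery.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense_strand):
    """Return the reverse complement of a DNA sense strand, written 5' to 3'."""
    return sense_strand.translate(_COMPLEMENT)[::-1]
```

For example, the sense sequence ATGGCC has the antisense sequence GGCCAT.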
The present invention still further provides cells genetically engineered in
vivo to
express the polynucleotides of the invention, wherein such polynucleotides are
in operative
association with a regulatory sequence heterologous to the host cell which
drives expression of
the polynucleotides in the cell. These methods can be used to increase or
decrease the
expression of the polynucleotides of the present invention.
Knowledge of DNA sequences provided by the invention allows for modification
of
cells to permit, increase, or decrease, expression of endogenous polypeptide.
Cells can be
modified (e.g., by homologous recombination) to provide increased
polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or
part of a heterologous
promoter so that the cells express the protein at higher levels. The
heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein
encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT
International
Publication No. WO 92/20808, and PCT International Publication No. WO
91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate
synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be
inserted along with
the heterologous promoter DNA. If linked to the desired protein coding
sequence,
amplification of the marker DNA by standard selection methods results in co-
amplification of
the desired protein coding sequences in the cells.
In another embodiment of the present invention, cells and tissues may be
engineered to
express an endogenous gene comprising the polynucleotides of the invention
under the control
of inducible regulatory elements, in which case the regulatory sequences of
the endogenous
gene may be replaced by homologous recombination. As described herein, gene
targeting can
be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic
engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-
attachment
regions, negative regulatory elements, transcriptional initiation sites,
regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which
affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or
otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA
stability
elements, splice sites, leader sequences for enhancing or modifying transport
or secretion
properties of the protein, or other sequences which alter or improve the
function or stability of
protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different cell-type
specificity than the naturally occurring elements. Here, the naturally
occurring sequences are
deleted and new sequences are added. In all cases, the identification of the
targeting event may
be facilitated by the use of one or more selectable marker genes that are
contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA
has integrated
into the cell genome. The identification of the targeting event may also be
facilitated by the use
of one or more marker genes exhibiting the property of negative selection,
such that the
negatively selectable marker is linked to the exogenous DNA, but configured
such that the
negatively selectable marker flanks the targeting sequence, and such that a
correct homologous
recombination event with sequences in the host cell genome does not result in
the stable
integration of the negatively selectable marker. Markers useful for this
purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine
phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance with
this aspect of the invention are more particularly described in U.S. Patent
No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by
reference herein in its entirety.
4.9 TRANSGENIC ANIMALS
In preferred methods to determine biological functions of the polypeptides of
the
invention in vivo, one or more genes provided by the invention are either over
expressed or
inactivated in the germ line of animals using homologous recombination
[Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the
regulatory
control of exogenous or endogenous promoter elements, are known as transgenic
animals.
Animals in which an endogenous gene has been inactivated by homologous
recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human
mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein
by reference.
Transgenic animals are useful to determine the roles polypeptides of the
invention play in
biological processes, and preferably in disease states. Transgenic animals are
useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic
animals,
preferably non-human mammals, are produced using methods as described in U.S.
Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by
reference.
Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter
the level of
expression of the polypeptides of the invention. Inactivation can be carried
out using
homologous recombination methods described above. Activation can be achieved
by
supplementing or even replacing the homologous promoter to provide for
increased protein
expression. The homologous promoter can be supplemented by insertion of one or
more
heterologous enhancer elements known to confer promoter activation in a
particular tissue.
The polynucleotides of the present invention also make possible the
development,
through, e.g., homologous recombination or knock out strategies, of animals
that fail to
express polypeptides of the invention or that express a variant polypeptide.
Such animals are
useful as models for studying the in vivo activities of polypeptide as well as
for studying
modulators of the polypeptides of the invention.
4.10 USES AND BIOLOGICAL ACTIVITY
The polynucleotides and proteins of the present invention are expected to
exhibit one
or more of the uses or biological activities (including those associated with
assays cited
herein) identified herein. Uses or activities described for proteins of the
present invention
may be provided by administration or use of such proteins or of
polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for
introduction of
DNA). The mechanism underlying the particular condition or pathology will
dictate whether
the polypeptides of the invention, the polynucleotides of the invention or
modulators
(activators or inhibitors) thereof would be beneficial to the subject in need
of treatment.
Thus, "therapeutic compositions of the invention" include compositions
comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and
degenerate
variants thereof) or polypeptides of the invention (including full length
protein, mature
protein and truncations or domains thereof), or compounds and other substances
that
modulate the overall activity of the target gene products, either at the level
of target
gene/protein expression or target protein activity. Such modulators include
polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and
other binding
proteins; chemical compounds that directly or indirectly activate or inhibit
the polypeptides
of the invention (identified, e.g., via drug screening assays as described
herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and
in particular
antibodies or other binding partners that specifically recognize one or more
epitopes of the
polypeptides of the invention.
The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.
4.10.1 RESEARCH USES AND UTILITIES
The polynucleotides provided by the present invention can be used by the
research
community for various purposes. The polynucleotides can be used to express
recombinant
protein for analysis, characterization or therapeutic use; as markers for
tissues in which the
corresponding protein is preferentially expressed (either constitutively or at
a particular stage
of tissue differentiation or development or in disease states); as molecular
weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or
to map
related gene positions; to compare with endogenous DNA sequences in patients
to identify
potential genetic disorders; as probes to hybridize and thus discover novel,
related DNA
sequences; as a source of information to derive PCR primers for genetic
fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other
novel
polynucleotides; for selecting and making oligomers for attachment to a "gene
chip" or other
support, including for examination of expression patterns; to raise anti-
protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA
antibodies or
elicit another immune response. Where the polynucleotide encodes a protein
which binds or
potentially binds to another protein (such as, for example, in a receptor-
ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for
example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the
binding interaction.
The polypeptides provided by the present invention can similarly be used in
assays to
determine biological activity, including in a panel of multiple proteins for
high-throughput
screening; to raise antibodies or to elicit another immune response; as a
reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of
the protein (or
its receptor) in biological fluids; as markers for tissues in which the
corresponding
polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue
differentiation or development or in a disease state); and, of course, to
isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also
be used to
screen for peptide or small molecule inhibitors or agonists of the binding
interaction.
Any or all of these research utilities are capable of being developed into
reagent
grade or kit format for commercialization as research products.
Methods for performing the uses listed above are well known to those skilled
in the
art. References disclosing such methods include without limitation "Molecular
Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J.,
E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to
Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.
4.10.2 NUTRITIONAL USES
Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use
as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and
use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention
can be added to
the feed of a particular organism or can be administered as a separate solid
or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or
capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be
added to the
medium in or on which the microorganism is cultured.
4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY
A polypeptide of the present invention may exhibit activity relating to
cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either
inducing or
inhibiting) activity or may induce production of other cytokines in certain
cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Many protein factors discovered to date, including all known cytokines, have
exhibited
activity in one or more factor-dependent cell proliferation assays, and
hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions
of the present invention is evidenced by any one of a number of routine factor
dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2,
DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can
be used in
the following:
Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al.,
Cellular Immunology
133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J.
Immunol. 152:1756-1761, 1994.
Assays for cytokine production and/or proliferation of spleen cells, lymph
node cells
or thymocytes include, without limitation, those described in: Polyclonal T
cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E.
e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and
lymphopoietic cells
include, without limitation, those described in: Measurement of Human and
Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E.
In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John
Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau
et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938,
1983; Measurement of mouse and human interleukin 6--Nordan, R. In Current
Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons,
Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of
human
Interleukin 11--Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In
Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons,
Toronto. 1991;
Measurement of mouse and human Interleukin 9--Ciarletta, A., Giannotti, J.,
Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1
pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among
others,
proteins that affect APC-T cell interactions as well as direct T-cell effects
by measuring
proliferation and cytokine production) include, without limitation, those
described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6,
Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et
al., Proc.
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun.
11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512,
1988.
4.10.4 STEM CELL GROWTH FACTOR ACTIVITY
A polypeptide of the present invention may exhibit stem cell growth factor
activity
and be involved in the proliferation, differentiation and survival of
pluripotent and totipotent
stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells
and/or germ line stem cells. Administration of the polypeptide of the
invention to stem cells
in vivo or ex vivo is expected to maintain and expand cell populations in a
totipotential or
pluripotential state which would be useful for re-engineering damaged or
diseased tissues,
transplantation, manufacture of bio-pharmaceuticals and the development of bio-
sensors.
The ability to produce large quantities of human cells has important working
applications for
the production of human proteins which currently must be obtained from non-
human sources
or donors, implantation of cells to treat diseases such as Parkinson's,
Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin,
cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea,
neural cells,
gastrointestinal cells and others; and organs for transplantation such as
kidney, liver,
pancreas (including islet cells), heart and lung.
It is contemplated that multiple different exogenous growth factors and/or
cytokines
may be administered in combination with the polypeptide of the invention to
achieve the
desired effect, including any of the growth factors listed herein, other stem
cell maintenance
factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF),
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6
receptor fused to IL-
6, macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF,
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor
(PDGF),
neural growth factors and basic fibroblast growth factor (bFGF).
Since totipotent stem cells can give rise to virtually any mature cell type,
expansion
of these cells in culture will facilitate the production of large quantities
of mature cells.
Techniques for culturing stem cells are known in the art and administration of
polypeptides
of the invention, optionally with other growth factors and/or cytokines, is
expected to
enhance the survival and proliferation of the stem cell populations. This can
be
accomplished by direct administration of the polypeptide of the invention to
the culture
medium. Alternatively, stroma cells transfected with a polynucleotide that
encodes for the
polypeptide of the invention can be used as a feeder layer for the stem cell
populations in
culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured
embryonic
fibroblasts (see U.S. Patent No. 5,690,926).
Stem cells themselves can be transfected with a polynucleotide of the
invention to
induce autocrine expression of the polypeptide of the invention. This will
allow for
generation of undifferentiated totipotential/pluripotential stem cell lines
that are useful as is
or that can then be differentiated into the desired mature cell types. These
stable cell lines
can also serve as a source of undifferentiated totipotential/pluripotential
mRNA to create
cDNA libraries and templates for polymerase chain reaction experiments. These
studies
would allow for the isolation and identification of differentially expressed
genes in stem cell
populations that regulate stem cell proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful
in the
treatment of many pathological conditions. For example, polypeptides of the
present
invention may be used to manipulate stem cells in culture to give rise to
neuroepithelial cells
that can be used to augment or replace cells damaged by illness, autoimmune
disease,
accidental damage or genetic disorders. The polypeptide of the invention may
be useful for
inducing the proliferation of neural cells and for the regeneration of nerve
and brain tissue,
i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as
well as mechanical and traumatic disorders which involve degeneration, death
or trauma to
neural cells or nerve tissue. In addition, the expanded stem cell populations
can also be
genetically altered for gene therapy purposes and to decrease host rejection
of replacement
tissues after grafting or implantation.
Expression of the polypeptide of the invention and its effect on stem cells
can also be
manipulated to achieve controlled differentiation of the stem cells into more
differentiated
cell types. A broadly applicable method of obtaining pure populations of a
specific
differentiated cell type from undifferentiated stem cell populations involves
the use of a cell-
type specific promoter driving a selectable marker. The selectable marker
allows only cells
of the desired type to survive. For example, stem cells can be induced to
differentiate into
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et
al., J. Clin.
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In:
Principles of
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively,
directed
differentiation of stem cells can be accomplished by culturing the stem cells
in the presence
of a differentiation factor such as retinoic acid and an antagonist of the
polypeptide of the
invention which would inhibit the effects of endogenous stem cell factor
activity and allow
differentiation to proceed.
In vitro cultures of stem cells can be used to determine if the polypeptide of
the
invention exhibits stem cell growth factor activity. Stem cells are isolated
from any one of
various cell sources (including hematopoietic stem cells and embryonic stem
cells) and
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad.
Sci, U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention
alone or in
combination with other growth factors or cytokines. The ability of the
polypeptide of the
invention to induce stem cell proliferation is determined by colony formation
on semi-solid
support e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
4.10.5 HEMATOPOIESIS REGULATING ACTIVITY
A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell
disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent
cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth
and proliferation of erythroid progenitor cells alone or in combination with
other cytokines,
thereby indicating utility, for example, in treating various anemias or for
use in conjunction
with irradiation/chemotherapy to stimulate the production of erythroid
precursors and/or
erythroid cells; in supporting the growth and proliferation of myeloid cells
such as
granulocytes and monocytes/macrophages (i.e., traditional CSF activity)
useful, for example,
in conjunction with chemotherapy to prevent or treat consequent myelo-
suppression; in
supporting the growth and proliferation of megakaryocytes and consequently of
platelets
thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to
platelet
transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells
which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and
therefore find therapeutic utility in various stem cell disorders (such as
those usually treated
with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell compartment post
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with
bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous
or
heterologous)) as normal cells or genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic
lines are
cited above.
Assays for embryonic stem cell differentiation (which will identify, among
others,
proteins that influence embryonic differentiation hematopoiesis) include,
without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995;
Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915,
1993.
Assays for stem cell survival and differentiation (which will identify, among
others,
proteins that regulate lympho-hematopoiesis) include, without limitation,
those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y.
1994;
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive
hematopoietic
colony forming cells with high proliferative potential, McNiece, I. K. and
Briddell, R. A. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39,
Wiley-Liss, Inc.,
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y.
1994; Long term
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter,
M. and Allen,
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-
179, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term culture initiating cell assay,
Sutherland, H. J. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162,
Wiley-Liss, Inc.,
New York, N.Y. 1994.
4.10.6 TISSUE GROWTH ACTIVITY
A polypeptide of the present invention also may be involved in bone,
cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in
wound healing and
tissue repair and replacement, and in healing of burns, incisions and
ulcers.
A polypeptide of the present invention which induces cartilage and/or bone
growth in
circumstances where bone is not normally formed, has application in the
healing of bone
fractures and cartilage damage or defects in humans and other animals.
Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention
may have
prophylactic use in closed as well as open fracture reduction and also in
the improved
fixation of artificial joints. De novo bone formation induced by an osteogenic
agent
contributes to the repair of congenital, trauma induced, or oncologic
resection induced
craniofacial defects, and also is useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting bone-
forming
cells, stimulating growth of bone-forming cells, or inducing
differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone
degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage
repair or by
blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast
activity, etc.) mediated by inflammatory processes may also be possible using
the
composition of the invention.
Another category of tissue regeneration activity that may involve the
polypeptide of
the present invention is tendon/ligament formation. Induction of
tendon/ligament-like tissue
or other tissue formation in circumstances where such tissue is not normally
formed, has
application in the healing of tendon or ligament tears, deformities and other
tendon or
ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in
preventing
damage to tendon or ligament tissue, as well as use in the improved fixation
of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or
ligament tissue. De
novo tendon/ligament-like tissue formation induced by a composition of the
present
invention contributes to the repair of congenital, trauma induced, or other
tendon or ligament
defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair
of tendons or ligaments. The compositions of the present invention may provide an
environment to attract tendon- or ligament-forming cells, stimulate growth of
tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to
effect tissue repair. The compositions of the invention may also be useful in
the treatment of
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions
may also include an appropriate matrix and/or sequestering agent as a carrier
as is well
known in the art.
The compositions of the present invention may also be useful for proliferation
of
neural cells and for regeneration of nerve and brain tissue, i.e. for the
treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical
and
traumatic disorders, which involve degeneration, death or trauma to neural
cells or nerve
tissue. More specifically, a composition may be used in the treatment of
diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral
neuropathy and
localized neuropathies, and central nervous system diseases, such as
Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager
syndrome. Further conditions which may be treated in accordance with the
present invention
include mechanical and traumatic disorders, such as spinal cord disorders,
head trauma and
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from
chemotherapy or other medical therapies may also be treatable using a
composition of the
invention.
Compositions of the invention may also be useful to promote better or faster
closure
of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with
vascular insufficiency, surgical and traumatic wounds, and the like.
Compositions of the present invention may also be involved in the generation
or
regeneration of other tissues, such as organs (including, for example,
pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac)
and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells
comprising
such tissues. Part of the desired effects may be by inhibition or modulation
of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present
invention may
also exhibit angiogenic activity.
A composition of the present invention may also be useful for gut protection
or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in
various tissues, and
conditions resulting from systemic cytokine damage.
A composition of the present invention may also be useful for promoting or
inhibiting differentiation of tissues described above from precursor tissues
or cells; or for
inhibiting the growth of tissues described above.
Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those
described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No.
WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described
in:
Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T.,
eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and
Mertz, J. Invest.
Dermatol 71:382-84 (1978).
4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY
A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for
which assays are
described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting
such activities. A protein may be useful in the treatment of various immune
deficiencies and
disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or
down) growth and proliferation of T and/or B lymphocytes, as well as effecting
the cytolytic
activity of NK cells and other cell populations. These immune deficiencies
may be genetic or
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or
may result from
autoimmune disorders. More specifically, infectious diseases caused by viral,
bacterial,
fungal or other infection may be treatable using a protein of the present
invention, including
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania
spp., malaria
spp. and various fungal infections such as candidiasis. Of course, in this
regard, proteins of
the present invention may also be useful where a boost to the immune system
generally may
be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present
invention
include, for example, connective tissue disease, multiple sclerosis, systemic
lupus
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such a
protein (or
antagonists thereof, including antibodies) of the present invention may also
be useful in
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum
sickness, drug
reactions, food allergies, insect venom allergies, mastocytosis, allergic
rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic
conjunctivitis,
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary
conjunctivitis and
contact allergies), such as asthma (particularly allergic asthma) or other
respiratory
problems. Other conditions, in which immune suppression is desired (including,
for
example, organ transplantation), may also be treatable using a protein (or
antagonists
thereof) of the present invention. The therapeutic effects of the polypeptides
or antagonists
thereof on allergic reactions can be evaluated by in vivo animal models such
as the
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin
sensitization test
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay
(Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune
responses, in a number of ways. Down regulation may be in the form of
inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of
an immune response. The functions of activated T cells may be inhibited by
suppressing T
cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T
cell responses is generally an active, non-antigen-specific, process which
requires continuous
exposure of the T cells to the suppressive agent. Tolerance, which involves
inducing
non-responsiveness or anergy in T cells, is distinguishable from
immunosuppression in that
it is generally antigen-specific and persists after exposure to the tolerizing
agent has ceased.
Operationally, tolerance can be demonstrated by the lack of a T cell response
upon
reexposure to specific antigen in the absence of the tolerizing agent.
Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g.,
preventing high
level lymphokine synthesis by activated T cells, will be useful in situations
of tissue, skin
and organ transplantation and in graft-versus-host disease (GVHD). For
example, blockage
of T cell function should result in reduced tissue destruction in tissue
transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition
as foreign by T cells, followed by an immune reaction that destroys the
transplant. The
administration of a therapeutic composition of the invention may prevent
cytokine synthesis
by immune cells, such as T cells, and thus acts as an immunosuppressant.
Moreover, a lack
of costimulation may also be sufficient to anergize the T cells, thereby
inducing tolerance in
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking
reagents may
avoid the necessity of repeated administration of these blocking reagents. To
achieve
sufficient immunosuppression or tolerance in a subject, it may also be
necessary to block the
function of a combination of B lymphocyte antigens.
The efficacy of particular therapeutic compositions in preventing organ
transplant
rejection or GVHD can be assessed using animal models that are predictive of
efficacy in
humans. Examples of appropriate systems which can be used include allogeneic
cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of
which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in
vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al.,
Proc. Natl. Aced.
Sci USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to
determine the effect of therapeutic compositions of the invention on the
development of that
disease.
Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation
of T cells that are reactive against self tissue and which promote the
production of cytokines
and autoantibodies involved in the pathology of the diseases. Preventing the
activation of
autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents
which block stimulation of T cells can be used to inhibit T cell activation
and prevent
production of autoantibodies or T cell-derived cytokines which may be involved
in the
disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of
autoreactive T cells which could lead to long-term relief from the disease.
The efficacy of
blocking reagents in preventing or alleviating autoimmune disorders can be
determined
using a number of well-characterized animal models of human autoimmune
diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989,
pp.
840-856).
Upregulation of an antigen function (e.g., a B lymphocyte antigen function),
as a
means of up regulating immune responses, may also be useful in therapy.
Upregulation of
immune responses may be in the form of enhancing an existing immune response
or eliciting
an initial immune response. For example, enhancing an immune response may
be useful in
cases of viral infection, including systemic viral diseases such as influenza,
the common
cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected
patient by
removing T cells from the patient, costimulating the T cells in vitro with
viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with
a stimulatory
form of a soluble peptide of the present invention and reintroducing the in
vitro activated T
cells into the patient. Another method of enhancing anti-viral immune
responses would be to
isolate infected cells from a patient, transfect them with a nucleic acid
encoding a protein of
the present invention as described herein such that the cells express all or a
portion of the
protein on their surface, and reintroduce the transfected cells into the
patient. The infected
cells would now be capable of delivering a costimulatory signal to, and
thereby activate, T
cells in vivo.
A polypeptide of the present invention may provide the necessary stimulation
signal
to T cells to induce a T cell mediated immune response against the transfected
tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or
which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be
transfected
with nucleic acid encoding all or a portion of (e.g., a cytoplasmic-domain
truncated portion)
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC
class II
alpha chain protein and an MHC class II beta chain protein to thereby express
MHC class I
or MHC class II proteins on the cell surface. Expression of the appropriate
class I or class II
MHC in conjunction with a peptide having the activity of a B lymphocyte
antigen (e.g.,
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the
transfected tumor
cell. Optionally, a gene encoding an antisense construct which blocks
expression of an MHC
class II associated protein, such as the invariant chain, can also be
cotransfected with a DNA
encoding a peptide having the activity of a B lymphocyte antigen to promote
presentation of
tumor associated antigens and induce tumor specific immunity. Thus, the
induction of a T
cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured
by
the following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function
3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad.
Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et
al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al.,
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-
3092,
1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent
antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those
described in:
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function:
In vitro
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J.
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto.
1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without
limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function
3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-
3500, 1986;
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol.
149:3778-3783,
1992.
Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without
limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al.,
Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of
Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-
260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al.,
Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-
1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and
Inaba et al.,
Journal of Experimental Medicine 172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that
regulate
lymphocyte homeostasis) include, without limitation, those described in:
Darzynkiewicz et
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk,
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897,
1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and
development
include, without limitation, those described in: Antica et al., Blood 84:111-
117, 1994; Fine
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-
2778, 1995;
Toki et al., Proc. Nat. Acad Sci. USA 88:7548-7551, 1991.
4.10.8 ACTIVIN/INHIBIN ACTIVITY
A polypeptide of the present invention may also exhibit activin- or inhibin-
related
activities. A polynucleotide of the invention may encode a polypeptide
exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the
release of follicle
stimulating hormone (FSH), while activins are characterized by their
ability to stimulate
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the
present
invention, alone or in heterodimers with a member of the inhibin family, may
be useful as a
contraceptive based on the ability of inhibins to decrease fertility in female
mammals and
decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other
inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the
invention, as a homodimer or as a heterodimer with other protein subunits of
the inhibin
group, may be useful as a fertility inducing therapeutic, based upon the
ability of activin
molecules in stimulating FSH release from cells of the anterior pituitary.
See, for example,
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for
advancement
of the onset of fertility in sexually immature mammals, so as to increase the
lifetime
reproductive performance of domestic animals such as, but not limited to,
cows, sheep and
pigs.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods.
Assays for activin/inhibin activity include, without limitation, those
described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782,
1986; Vale et
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage
et al., Proc.
Natl. Acad. Sci. USA 83:3091-3095, 1986.
4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY
A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts,
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A
polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a
desired cell population to a desired site of action. Chemotactic or
chemokinetic compositions
(e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular
advantages in treatment of wounds and other trauma to tissues, as well as in
treatment of
localized infections. For example, attraction of lymphocytes, monocytes or
neutrophils to
tumors or sites of infection may result in improved immune responses against
the tumor or
infecting agent.
A protein or peptide has chemotactic activity for a particular cell population
if it can
stimulate, directly or indirectly, the directed orientation or movement of
such cell
population. Preferably, the protein or peptide has the ability to directly
stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a
population of
cells can be readily determined by employing such protein or peptide in any
known assay for
cell chemotaxis.
Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or
prevent
chemotaxis) consist of assays that measure the ability of a protein to induce
the migration of
cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell
population to another cell population. Suitable assays for movement and
adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by
J. E.
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub.
Greene
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of
alpha and beta
Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995;
Lind et al.
APMIS 103:140-146, 1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber
et al. J. of
Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768,
1994.
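As a rough illustration of how a membrane-migration result of the kind described above might be summarized, the sketch below computes a chemotactic index, taken here as the ratio of cells that cross the membrane toward the test protein to cells that cross toward medium alone. The function name, the cell counts and the use of a simple ratio are assumptions made for this example; the cited protocols define their own readouts.

```python
def chemotactic_index(migrated_with_protein, migrated_with_medium):
    """Ratio of migrated cells (test protein) to migrated cells (medium alone).

    Both arguments are cell counts from the far side of the membrane.
    A value above 1 suggests induced migration; below 1 suggests inhibition.
    """
    if migrated_with_medium <= 0:
        raise ValueError("control migration count must be positive")
    return migrated_with_protein / migrated_with_medium


# Hypothetical counts per microscope field, chosen only for illustration.
index = chemotactic_index(240, 60)
print(index)  # 4.0
```

An index well above 1 in such a sketch would correspond to a protein that induces migration, and an index below 1 to one that prevents it.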
4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY
A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such
attributes. Compositions may be useful in treatment of various coagulation
disorders
(including hereditary disorders, such as hemophilias) or to enhance
coagulation and other
hemostatic events in treating wounds resulting from trauma, surgery or other
causes. A
composition of the invention may also be useful for dissolving or inhibiting
formation of
thromboses and for treatment and prevention of conditions resulting therefrom
(such as, for
example, infarction of cardiac and central nervous system vessels (e.g.,
stroke)).
Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation,
those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et
al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins 35:467-474, 1988.
4.10.11 CANCER DIAGNOSIS AND THERAPY
Polypeptides of the invention may be involved in cancer cell generation,
proliferation
or metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more
types of cancer.
For example, the presence or increased expression of a
polynucleotide/polypeptide of the
invention may indicate a hereditary risk of cancer, a precancerous condition,
or an ongoing
malignancy. Conversely, a defect in the gene or absence of the polypeptide
may be
associated with a cancer condition. Identification of single nucleotide
polymorphisms
associated with cancer or a predisposition to cancer may also be useful for
diagnosis or
prognosis.
Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor
growth) and/or prohibiting metastasis by reducing tumor cell motility or
invasiveness.
Therapeutic compositions of the invention may be effective in adult and
pediatric oncology
including in solid phase tumors/malignancies, locally advanced tumors, human
soft tissue
sarcomas, metastatic cancer, including lymphatic metastases, blood cell
malignancies
including multiple myeloma, acute and chronic leukemias, and lymphomas,
head and neck
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers
including
small cell carcinoma and non-small cell cancers, breast cancers including
small cell
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal
cancer,
stomach cancer, colon cancer, colorectal cancer and polyps associated with
colorectal
neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and
prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine
(including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers
including renal cell carcinoma, brain cancers including intrinsic brain
tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell
invasion in the central
nervous system, bone cancers including osteomas, skin cancers including
malignant
melanoma, tumor progression of human skin keratinocytes, squamous cell
carcinoma, basal
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.
Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the
polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can
be
administered in therapeutically effective dosages alone or in combination with
adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and
laser
therapy, and may provide a beneficial effect, e.g. reducing tumor size,
slowing rate of tumor
growth, inhibiting metastasis, or otherwise improving overall clinical
condition, without
necessarily eradicating the cancer.
The composition can also be administered in therapeutically effective amounts
as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of
the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically acceptable carrier for delivery. The use of anti-cancer
cocktails as a cancer
treatment is routine. Anti-cancer drugs that are well known in the art and
can be used as a
treatment in combination with the polypeptide or modulator of the invention
include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin,
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine
HCl
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl,
Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine,
5-Fluorouracil (5-Fu),
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a,
Interferon
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine,
Mechlorethamine
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX),
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin,
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine
sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone,
Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.
In addition, therapeutic compositions of the invention may be used for
prophylactic
treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to
developing
cancers. Under these circumstances, it may be beneficial to treat these
individuals with
therapeutically effective doses of the polypeptide of the invention to reduce
the risk of
developing cancers.
In vitro models can be used to determine the effective doses of the
polypeptide of the
invention as a potential cancer treatment. These in vitro models include
proliferation assays
of cultured tumor cells, growth of cultured tumor cells in soft agar (see
Freshney, (1987)
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY
Ch 18
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst.,
52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden
Chamber assays
as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and
angiogenesis
assays such as induction of vascularization of the chick chorioallantoic
membrane or
induction of vascular endothelial cell migration as described in Ribatti et
al., Intl. J. Dev.
Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9
(1999), respectively.
Suitable tumor cell lines are available, e.g. from American Type Tissue
Culture Collection
catalogs.
4.10.12 RECEPTOR/LIGAND ACTIVITY
A polypeptide of the present invention may also demonstrate activity as
receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A
polynucleotide of
the invention can encode a polypeptide exhibiting such characteristics.
Examples of such
receptors and ligands include, without limitation, cytokine receptors and
their ligands,
receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors
involved in cell-cell interactions and their ligands (including without
limitation, cellular
adhesion molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs
involved in antigen presentation, antigen recognition and development of
cellular and
humoral immune responses). Receptors and ligands are also useful for screening
of potential
peptide or small molecule inhibitors of the relevant receptor/ligand
interaction. A protein of
the present invention (including, without limitation, fragments of receptors
and ligands) may
itself be useful as an inhibitor of receptor/ligand interactions.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods:
Suitable assays for receptor-ligand activity include without limitation those
described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-
Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions
7.28.1- 7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160,
1989;
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell
80:661-670, 1995.
By way of example, the polypeptides of the invention may be used as a receptor
for a
ligand(s) thereby transmitting the biological activity of that ligand(s).
Ligands may be
identified through binding assays, affinity chromatography, dihybrid screening
assays,
BIAcore assays, gel overlay assays, or other methods known in the art.
Studies characterizing drugs or proteins as agonists, antagonists, partial
agonists or
partial antagonists require the use of other proteins as competing ligands.
The polypeptides
of the present invention or ligand(s) thereof may be labeled by being coupled
to
radioisotopes, colorimetric molecules or toxin molecules by conventional
methods.
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in
Enzymology Vol. 182
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but
are not
limited to, tritium and carbon-14. Examples of colorimetric molecules
include, but are not
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other
colorimetric
molecules. Examples of toxins include, but are not limited to, ricin.
4.10.13 DRUG SCREENING
This invention is particularly useful for screening chemical compounds by
using the
novel polypeptides or binding fragments thereof in any of a variety of drug
screening
techniques. The polypeptides or fragments employed in such a test may either
be free in
solution, affixed to a solid support, borne on a cell surface or located
intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which
are stably
transformed with recombinant nucleic acids expressing the polypeptide or a
fragment
thereof. Drugs are screened against such transformed cells in competitive
binding assays.
Such cells, either in viable or fixed form, can be used for standard binding
assays. One may
measure, for example, the formation of complexes between polypeptides of the
invention or
fragments and the agent being tested or examine the diminution in complex
formation
between the novel polypeptides and an appropriate cell line, which are well
known in the art.
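The diminution in complex formation mentioned above can be expressed as a percent inhibition. The following is a minimal sketch, assuming a binding readout (for example, counts from a labeled polypeptide) measured with and without the test agent and corrected for a background signal. The function name, parameters and numeric values are hypothetical and are not taken from any cited assay.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent reduction in specific binding caused by a test agent.

    signal_with_agent:    readout in the presence of the test agent
    signal_without_agent: readout in its absence (maximum complex formation)
    background:           non-specific signal subtracted from both readouts
    """
    total = signal_without_agent - background
    if total <= 0:
        raise ValueError("uninhibited signal must exceed background")
    bound = signal_with_agent - background
    return 100.0 * (total - bound) / total


# Hypothetical readouts (arbitrary units), for illustration only.
print(percent_inhibition(300.0, 1100.0, background=100.0))  # 80.0
```

A value near 0 would indicate that the agent does not compete for binding, while a value near 100 would indicate nearly complete displacement.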
Sources for test compounds that may be screened for ability to bind to or
modulate
(i.e., increase or decrease) the activity of polypeptides of the invention
include (1) iilorganic
and organic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic
molecules.
Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or
compounds
that are identified as "hits" or "leads" via natural product screening.
The sources of natural product libraries are microorganisms (including
bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and
libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from
soil, plant or
marine microorganisms or (2) extraction of the organisms themselves. Natural
product
libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants
thereof. For a review, see Science 282:63-68 (1998).
Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides
or organic compounds and can be readily prepared by traditional automated
synthesis
methods, PCR, cloning or proprietary synthetic methods. Of particular interest
are peptide
and oligonucleotide combinatorial libraries. Still other libraries of interest
include peptide,
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial,
and polypeptide
libraries. For a review of combinatorial chemistry and libraries created
therefrom, see
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of
peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23
(1998); Hruby
et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med
Chem,
4(5):709-15 (1996) (alkylated dipeptides).
Identification of modulators through use of the various libraries described
herein
permits modification of the candidate "hit" (or "lead") to optimize the
capacity of the "hit"
to bind a polypeptide of the invention. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g.,
ricin or
cholera, or with other compounds that are toxic to cells such as
radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by
the specificity of
the binding molecule for a polypeptide of the invention. Alternatively, the
binding
molecules may be complexed with imaging agents for targeting and imaging
purposes.
4.10.14 ASSAY FOR RECEPTOR ACTIVITY
The invention also provides methods to detect specific binding of a
polypeptide e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for
identifying
previously unknown binding partners for receptor polypeptides of the
invention. For
example, expression cloning using mammalian or bacterial cells, or dihybrid
screening
assays can be used to identify polynucleotides encoding binding partners. As
another
example, affinity chromatography with the appropriate immobilized polypeptide
of the
invention can be used to isolate polypeptides that recognize and bind
polypeptides of the
invention. There are a number of different libraries used for the
identification of
compounds, and in particular small molecules, that modulate (i.e., increase
or decrease)
biological activity of a polypeptide of the invention. Ligands for receptor
polypeptides of the
invention can also be identified by adding exogenous ligands, or cocktails of
ligands to two
cell populations that are genetically identical except for the expression of
the receptor of the
invention: one cell population expresses the receptor of the invention whereas
the other does
not. The responses of the two cell populations to the addition of
ligand(s) are then
compared. Alternatively, an expression library can be co-expressed with the
polypeptide of
the invention in cells and assayed for an autocrine response to identify
potential ligand(s). As
still another example, BIAcore assays, gel overlay assays, or other methods
known in the art
can be used to identify binding partner polypeptides, including, (1) organic
and inorganic
chemical libraries, (2) natural product libraries, and (3) combinatorial
libraries comprised of
random peptides, oligonucleotides or organic molecules.
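The comparison of two otherwise identical cell populations described above can be sketched as a simple fold-change filter: a candidate ligand is flagged when the receptor-expressing population responds substantially more than the non-expressing one. The function name, the response values, and the two-fold threshold are assumptions chosen for illustration; the text does not prescribe a particular readout or cutoff.

```python
def receptor_dependent_ligands(expressing, non_expressing, min_fold=2.0):
    """Return ligands whose response depends on expression of the receptor.

    expressing:     {ligand name: response of receptor-expressing cells}
    non_expressing: {ligand name: response of otherwise identical cells
                     that lack the receptor}
    min_fold:       minimum expressing/non-expressing ratio to call a hit
    """
    hits = []
    for ligand, response in expressing.items():
        baseline = non_expressing[ligand]
        if baseline > 0 and response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Hypothetical responses (arbitrary units), for illustration only.
with_receptor = {"ligand_A": 9.0, "ligand_B": 1.1, "ligand_C": 4.0}
without_receptor = {"ligand_A": 1.0, "ligand_B": 1.0, "ligand_C": 3.5}
print(receptor_dependent_ligands(with_receptor, without_receptor))  # ['ligand_A']
```

In this sketch ligand_C is not flagged even though it produces a strong response, because both populations respond to it, which suggests the effect does not depend on the receptor of interest.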
The role of downstream intracellular signaling molecules in the signaling
cascade of
the polypeptide of the invention can be determined. For example, a chimeric
protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to
the
extracellular portion of a protein, whose ligand has been identified, is
produced in a host
cell. The cell is then incubated with the ligand specific for the
extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream
proteins
involved in intracellular signaling can then be assayed for expected
modifications, e.g.,
phosphorylation. Other methods known to those in the art can also be used to
identify
signaling molecules involved in receptor activity.
4.10.15 ANTI-INFLAMMATORY ACTIVITY
Compositions of the present invention may also exhibit anti-inflammatory
activity.
The anti-inflammatory activity may be achieved by providing a stimulus to
cells involved in
the inflammatory response, by inhibiting or promoting cell-cell
interactions (such as, for
example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the
inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or
suppressing production of other factors which more directly inhibit or promote
an
inflammatory response. Compositions with such activities can be used to treat
inflammatory
conditions (including chronic or acute conditions), including without
limitation inflammation
associated with infection (such as septic shock, sepsis or systemic
inflammatory response
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis,
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-
induced lung
injury, inflammatory bowel disease, Crohn's disease or resulting from over
production of
cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat
anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this
invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis,
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus
host disease, inflammatory bowel disease, inflammation associated with
pulmonary disease,
other autoimmune disease or inflammatory disease, an antiproliferative agent
such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor
secondary to
intrauterine infections.
4.10.16 LEUKEMIAS
Leukemias and related disorders may be treated or prevented by administration
of a
therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of
the invention. Such leukemias and related disorders include but are not
limited to acute
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia,
chronic
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a
review of such
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co.,
Philadelphia).
4.10.17 NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested for
efficacy of
intervention with compounds that modulate the activity of the polynucleotides
and/or
polypeptides of the invention, and which can be treated upon thus observing an
indication of
therapeutic utility, include but are not limited to nervous system injuries,
and diseases or
disorders which result in either a disconnection of axons, a diminution or
degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a
patient
(including human and non-human mammalian patients) according to the invention
include
but are not limited to the following lesions of either the central (including
spinal cord, brain)
or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or
associated
with surgery, for example, lesions which sever a portion of the nervous
system, or
compression injuries;
(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system
results in neuronal injury or death, including cerebral infarction or
ischemia, or spinal cord
infarction or ischemia;
(iii) infectious lesions, in which a portion of the nervous system is
destroyed or
injured as a result of infection, for example, by an abscess or associated
with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with
Lyme
disease, tuberculosis, syphilis;
(iv) degenerative lesions, in which a portion of the nervous system is
destroyed or
injured as a result of a degenerative process including but not limited to
degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea,
or
amyotrophic lateral sclerosis;
(v) lesions associated with nutritional diseases or disorders, in which a
portion of
the nervous system is destroyed or injured by a nutritional disorder or
disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid
deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease
(primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;
(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus
erythematosus,
carcinoma, or sarcoidosis;
(vii) lesions caused by toxic substances including alcohol, lead, or
particular
neurotoxins; and
(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or
injured by a demyelinating disease including but not limited to multiple
sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various
etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.
Therapeutics which are useful according to the invention for treatment of a
nervous
system disorder may be selected by testing for biological activity in
promoting the survival
or differentiation of neurons. For example, and not by way of limitation,
therapeutics which
elicit any of the following effects may be useful according to the invention:
(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in
vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor
neurons; or
(iv) decreased symptoms of neuron dysfunction in vivo.
Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased
sprouting of neurons
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay,
antibody
binding, Northern blot assay, etc., depending on the molecule to be measured;
and motor
neuron dysfunction may be measured by assessing the physical manifestation
of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or
functional disability.
In specific embodiments, motor neuron disorders that may be treated according
to the
invention include but are not limited to disorders such as infarction,
infection, exposure to
toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor
neurons as well as other components of the nervous system, as well as
disorders that
selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited
to progressive spinal muscular atrophy, progressive bulbar palsy, primary
lateral sclerosis,
infantile and juvenile muscular atrophy, progressive bulbar paralysis of
childhood (Fazio-Londe
syndrome), poliomyelitis and the post polio syndrome, and Hereditary
Motorsensory
Neuropathy (Charcot-Marie-Tooth Disease).
4.10.18 OTHER ACTIVITIES
A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function
of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and
other parasites;
effecting (suppressing or enhancing) bodily characteristics, including,
without limitation,
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or
organ or body part size or shape (such as, for example, breast augmentation or
diminution,
change in bone form or shape); effecting biorhythms or circadian cycles or
rhythms;
effecting the fertility of male or female subjects; effecting the metabolism,
catabolism,
anabolism, processing, utilization, storage or elimination of dietary fat,
lipid, protein,
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or
component(s);
effecting behavioral characteristics, including, without limitation, appetite,
libido, stress,
cognition (including cognitive disorders), depression (including depressive
disorders) and
violent behaviors; providing analgesic effects or other pain reducing effects;
promoting
differentiation and growth of embryonic stem cells in lineages other than
hematopoietic
lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of
the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative
disorders (such as, for example, psoriasis); immunoglobulin-like activity
(such as, for
example, the ability to bind antigens or complement); and the ability to act
as an antigen in a
vaccine composition to raise an immune response against such protein or
another material or
entity which is cross-reactive with such protein.
4.10.19 IDENTIFICATION OF POLYMORPHISMS
The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this
information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g.,
differential
predisposition or susceptibility to various disease states (such as disorders
involving
inflammation or immune response) or a differential response to drug
administration, and this
genetic information can be used to tailor preventive or therapeutic treatment
appropriately.
For example, the existence of a polymorphism associated with a predisposition
to
inflammation or autoimmune disease makes possible the diagnosis of this
condition in
humans by identifying the presence of the polymorphism.
Polymorphisms can be identified in a variety of ways known in the art which
all
generally involve obtaining a sample from a patient, analyzing DNA from the
sample,
optionally involving isolation or amplification of the DNA, and identifying
the presence of
the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA
may be
subjected to allele-specific oligonucleotide hybridization (in which
appropriate
oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single
base mismatch) or to a single nucleotide extension assay (in which an
oligonucleotide that
hybridizes immediately adjacent to the position of the polymorphism is
extended with one or
more labeled nucleotides). In addition, traditional restriction fragment
length polymorphism
analysis (using restriction enzymes that provide differential digestion of the
genomic DNA
depending on the presence or absence of the polymorphism) may be performed.
Arrays with
nucleotide sequences of the present invention can be used to detect
polymorphisms. The
array can comprise modified nucleotide sequences of the present invention in
order to detect
the nucleotide sequences of the present invention. In the alternative, any one
of the
nucleotide sequences of the present invention can be placed on the array to
detect changes
from those sequences.
Alternatively, a polymorphism resulting in a change in the amino acid
sequence could
sequence could
also be detected by detecting a corresponding change in amino acid sequence of
the protein,
e.g., by an antibody specific to the variant sequence.
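Once an amplified fragment has been sequenced as described above, identifying a polymorphism reduces to comparing the sample sequence with a reference sequence. The sketch below assumes the two sequences are already aligned and of equal length, and it reports single-base differences only; the function name and the example sequences are hypothetical.

```python
def find_single_base_differences(reference, sample):
    """List (position, reference_base, sample_base) for each mismatch.

    Positions are 1-based. The sequences must already be aligned and of
    equal length; insertions and deletions are outside this sketch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Hypothetical sequences, for illustration only.
print(find_single_base_differences("ATGCCGTA", "ATGCTGTA"))  # [(5, 'C', 'T')]
```

An empty list indicates that the sample matches the reference over the compared region.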
4.10.20 ARTHRITIS AND INFLAMMATION
The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the
protocol is described
by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch.
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single
injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in
complete Freund's adjuvant (CFA). The route of injection can vary, but rats
may be injected
at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate
buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of
administering
PBS only.
The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately
administering the test compound and subsequent treatment every other day until
day 24. At
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an
overall arthritis
score may be obtained as described by J. Holoshitz above. An analysis of the
data would
reveal that the test compound would have a dramatic effect on the swelling
of the joints as
measured by a decrease of the arthritis score.
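The comparison implied by this protocol can be sketched as follows: arthritis scores are collected on days 14, 15, 18, 20, 22 and 24 for a PBS-only control group and a treated group, and the mean score per day is compared between the groups. The data layout, the function names and the placeholder scores below are assumptions made for illustration; they are not experimental results.

```python
from statistics import mean

# Scoring days stated in the protocol (days after injection of CFA).
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def mean_score_by_day(scores):
    """Average the per-animal arthritis scores for each scoring day."""
    return {day: mean(scores[day]) for day in SCORING_DAYS}


def score_reduction(control_scores, treated_scores):
    """Mean control score minus mean treated score, per scoring day."""
    control = mean_score_by_day(control_scores)
    treated = mean_score_by_day(treated_scores)
    return {day: control[day] - treated[day] for day in SCORING_DAYS}


# Placeholder scores for three animals per group; NOT measured data.
control_group = {day: [8, 10, 12] for day in SCORING_DAYS}
treated_group = {day: [4, 5, 6] for day in SCORING_DAYS}
print(score_reduction(control_group, treated_group))
```

A positive value for a given day means the treated group scored lower than the PBS-only control on that day.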
4.11 THERAPEUTIC METHODS
The compositions (including polypeptide fragments, analogs, variants and
antibodies
or other binding partners or modulators including antisense polynucleotides)
of the invention
have numerous applications in a variety of therapeutic methods. Examples of
therapeutic
applications include, but are not limited to, those exemplified herein.
4.11.1 EXAMPLE
One embodiment of the invention is the administration of an effective amount
of the
polypeptides or other composition of the invention to individuals affected by
a disease or
disorder that can be modulated by regulating the peptides of the invention.
While the mode
of administration is not particularly important, parenteral administration is
preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The
dosage of the
polypeptides or other composition of the invention will normally be determined
by the
prescribing physician. It is to be expected that the dosage will vary
according to the age,
weight, condition and response of the individual patient. Typically, the
amount of
polypeptide administered per dose will be in the range of about 0.01 µg/kg to
100 mg/kg of
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of
patient body
weight. For parenteral administration, polypeptides of the invention will be
formulated in an
injectable form combined with a pharmaceutically acceptable parenteral
vehicle. Such
vehicles are well known in the art and examples include water, saline,
Ringer's solution,
dextrose solution, and solutions containing small amounts of human
serum albumin.
The vehicle may contain minor amounts of additives that maintain the
isotonicity and
stability of the polypeptide or other active ingredient. The preparation of
such solutions is
within the skill of the art.
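As an illustration of the per-dose ranges stated above, the following sketch converts the per-kilogram bounds into absolute amounts for a given body weight. It is arithmetic only; the function and constant names are hypothetical, and the actual dosage remains a matter for the prescribing physician.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kg of body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)      # about 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)    # about 0.1 ug/kg to 10 mg/kg


def dose_range_ug(weight_kg, range_ug_per_kg):
    """Return the (low, high) amount per dose, in micrograms, for one patient."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg, high * weight_kg)


if __name__ == "__main__":
    # Hypothetical 70 kg patient.
    print(dose_range_ug(70, PREFERRED_RANGE_UG_PER_KG))
```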
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION
A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant
sources and
including antibodies and other binding partners of the polypeptides of the
invention) may be
administered to a patient in need, by itself, or in pharmaceutical
compositions where it is
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a
variety of
disorders. Such a composition may optionally contain (in addition to protein
or other active
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers,
solubilizers, and other
materials well known in the art. The term "pharmaceutically acceptable" means
a non-toxic
material that does not interfere with the effectiveness of the biological
activity of the active
ingredient(s). The characteristics of the carrier will depend on the route of
administration.
The pharmaceutical composition of the invention may also contain cytokines,
lymphokines,
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3,
IL-4, IL-5,
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNFO,
TNF1, TNF2,
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In
further
compositions, proteins of the invention may be combined with other agents
beneficial to the
treatment of the disease or disorder in question. These agents include various
growth factors
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF),
transforming
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well
as cytokines
described herein.
The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement
its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or
other active
ingredient of the invention, or to minimize side effects. Conversely, protein
or other active
ingredient of the present invention may be included in formulations of the
particular clotting
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic
factor, or anti-inflammatory agent to minimize side effects of the clotting
factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF,
corticosteroids,
immunosuppressive agents). A protein of the present invention may be active in
multimers
(e.g., heterodimers or homodimers) or complexes with itself or other proteins.
As a result,
pharmaceutical compositions of the invention may comprise a protein of the
invention in
such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the
invention
including a first protein, a second protein or a therapeutic agent may be
concurrently
administered with the first protein (e.g., at the same time, or at differing
times provided that
therapeutic concentrations of the combination of agents are achieved at the
treatment site).
Techniques for formulation and administration of the compounds of the instant
application
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co.,
Easton, PA,
latest edition. A therapeutically effective dose further refers to that amount
of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing,
prevention or
amelioration of the relevant medical condition, or an increase in rate of
treatment, healing,
prevention or amelioration of such conditions. When applied to an individual
active
ingredient, administered alone, a therapeutically effective dose refers to
that ingredient
alone. When applied to a combination, a therapeutically effective dose refers
to combined
amounts of the active ingredients that result in the therapeutic effect,
whether administered
in combination, serially or simultaneously.
In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the
present invention
is administered to a mammal having a condition to be treated. Protein or other
active
ingredient of the present invention may be administered in accordance with the
method of
the invention either alone or in combination with other therapies such as
treatments
employing cytokines, lymphokines or other hematopoietic factors. When
co-administered
with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other
active ingredient of the present invention may be administered either
simultaneously with
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or
anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician
will decide on the appropriate sequence of administering protein or other
active ingredient of
the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic
factor(s), thrombolytic or anti-thrombotic factors.
4.12.1 ROUTES OF ADMINISTRATION
Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including
intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct
intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections.
Administration of protein
or other active ingredient of the present invention used in the pharmaceutical
composition or
to practice the method of the present invention can be carried out in a variety
of conventional
ways, such as oral ingestion, inhalation, topical application or cutaneous,
subcutaneous,
intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient
is preferred.
Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic
joints or into
fibrotic tissue, often in a depot or sustained release formulation. In order
to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds
may be administered topically, for example, as eye drops. Furthermore, one may
administer
the drug in a targeted drug delivery system, for example, in a liposome coated
with a specific
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes
will be targeted
to and taken up selectively by the afflicted tissue.
The polypeptides of the invention are administered by any route that delivers
an
effective dosage to the desired site of action. The determination of a
suitable route of
administration and an effective dosage for a particular indication is within
the level of skill
in the art. Preferably for wound treatment, one administers the therapeutic
compound
directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be
extrapolated from these dosages or from similar studies in appropriate animal
models.
Dosages can then be adjusted as necessary by the clinician to provide maximal
therapeutic
benefit.
4.12.2 COMPOSITIONS/FORMULATIONS
Pharmaceutical compositions for use in accordance with the present invention
thus
may be formulated in a conventional manner using one or more physiologically
acceptable
carriers comprising excipients and auxiliaries which facilitate processing of
the active
compounds into preparations which can be used pharmaceutically. These
pharmaceutical
compositions may be manufactured in a manner that is itself known, e.g., by
means of
conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying,
encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon
the route of administration chosen. When a therapeutically effective amount of
protein or
other active ingredient of the present invention is administered orally,
protein or other active
ingredient of the present invention will be in the form of a tablet, capsule,
powder, solution
or elixir. When administered in tablet form, the pharmaceutical composition of
the invention
may additionally contain a solid carrier such as a gelatin or an adjuvant. The
tablet, capsule,
and powder contain from about 5 to 95% protein or other active ingredient of
the present
invention, and preferably from about 25 to 90% protein or other active
ingredient of the
present invention. When administered in liquid form, a liquid carrier such as
water,
petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or
sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical
composition may further contain physiological saline solution, dextrose or
other saccharide
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene
glycol. When
administered in liquid form, the pharmaceutical composition contains from
about 0.5 to 90%
by weight of protein or other active ingredient of the present invention, and
preferably from
about 1 to 50% protein or other active ingredient of the present invention.
When a therapeutically effective amount of protein or other active ingredient
of the
present invention is administered by intravenous, cutaneous or subcutaneous
injection,
protein or other active ingredient of the present invention will be in the
form of a
pyrogen-free, parenterally acceptable aqueous solution. The preparation of
such parenterally
acceptable protein or other active ingredient solutions, having due regard
to pH, isotonicity,
stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition
for intravenous, cutaneous, or subcutaneous injection should contain, in
addition to protein
or other active ingredient of the present invention, an isotonic vehicle such
as Sodium
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and
Sodium Chloride
Injection, Lactated Ringer's Injection, or other vehicle as known in the
art. The
pharmaceutical composition of the present invention may also contain
stabilizers,
preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For
injection, the agents of the invention may be formulated in aqueous solutions,
preferably in
physiologically compatible buffers such as Hanks' solution, Ringer's
solution, or
physiological saline buffer. For transmucosal administration, penetrants
appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are
generally known in
the art.
For oral administration, the compounds can be formulated readily by combining
the
active compounds with pharmaceutically acceptable carriers well known in the
art. Such
carriers enable the compounds of the invention to be formulated as tablets,
pills, dragees,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a
patient to be treated. Pharmaceutical preparations for oral use can be
obtained by combining the active compounds with a solid
excipient, optionally grinding a resulting mixture, and processing the mixture
of granules,
after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable
excipients are, in particular, fillers such as sugars, including lactose,
sucrose, mannitol, or
sorbitol; cellulose preparations such as, for example, maize starch, wheat
starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethylcellulose,
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl
pyrrolidone, agar, or
alginic acid or a salt thereof such as sodium alginate. Dragee cores are
provided with
suitable coatings. For this purpose, concentrated sugar solutions may be used,
which may
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel,
polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for
identification or to
characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules
made
of gelatin, as well as soft, sealed capsules made of gelatin and a
plasticizer, such as glycerol
or sorbitol. The push-fit capsules can contain the active ingredients in
admixture with filler
such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium
stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene
glycols. In addition, stabilizers may be added. All formulations for oral
administration
should be in dosages suitable for such administration. For buccal
administration, the
compositions may take the form of tablets or lozenges formulated in
conventional manner.
For administration by inhalation, the compounds for use according to the
present
invention are conveniently delivered in the form of an aerosol spray
presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane,
carbon dioxide
or other suitable gas. In the case of a pressurized aerosol the dosage unit
may be determined
by providing a valve to deliver a metered amount. Capsules and cartridges of,
e.g., gelatin
for use in an inhaler or insufflator may be formulated containing a powder mix
of the
compound and a suitable powder base such as lactose or starch. The compounds
may be
formulated for parenteral administration by injection, e.g., by bolus
injection or continuous
infusion. Formulations for injection may be presented in unit dosage form,
e.g., in ampules
or in multi-dose containers, with an added preservative. The compositions may
take such
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and
may contain
formulatory agents such as suspending, stabilizing and/or dispersing agents.
Pharmaceutical formulations for parenteral administration include aqueous
solutions
of the active compounds in water-soluble form. Additionally, suspensions of
the active
compounds may be prepared as appropriate oily injection suspensions. Suitable
lipophilic
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty
acid esters, such
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions
may contain
substances which increase the viscosity of the suspension, such as sodium
carboxymethyl
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain
suitable
stabilizers or agents which increase the solubility of the compounds to allow
for the
preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in
powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before
use.
The compounds may also be formulated in rectal compositions such as
suppositories
or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or
other glycerides. In addition to the formulations described previously, the
compounds may
also be formulated as a depot preparation. Such long acting formulations may
be
administered by implantation (for example subcutaneously or intramuscularly)
or by
intramuscular injection. Thus, for example, the compounds may be formulated
with suitable
polymeric or hydrophobic materials (for example as an emulsion in an
acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a
sparingly soluble salt.
A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a
water-miscible organic
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent
system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant
polysorbate
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute
ethanol. The VPD
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in
water
solution. This co-solvent system dissolves hydrophobic compounds well, and
itself produces
low toxicity upon systemic administration. Naturally, the proportions of a
co-solvent system
may be varied considerably without destroying its solubility and toxicity
characteristics.
Furthermore, the identity of the co-solvent components may be varied: for
example, other
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the
fraction size of
polyethylene glycol may be varied; other biocompatible polymers may replace
polyethylene
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may
substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds
may be employed. Liposomes and emulsions are well known examples of delivery
vehicles
or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also
may be employed, although usually at the cost of greater toxicity.
Additionally, the
compounds may be delivered using a sustained-release system, such as
semipermeable
matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of
sustained-release materials have been established and are well known by those
skilled in the
art. Sustained-release capsules may, depending on their chemical nature,
release the
compounds for a few weeks up to over 100 days. Depending on the chemical
nature and the
biological stability of the therapeutic reagent, additional strategies for
protein or other active
ingredient stabilization may be employed.
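The composition of the diluted VPD co-solvent system described above can be worked out with a short sketch. It assumes that "diluted 1:1" means equal volumes of VPD and the 5% dextrose solution and that the volumes are additive on mixing; the names used are hypothetical.

```python
# VPD composition from the text, in % w/v (the remainder is absolute ethanol).
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_PERCENT_WV = 5.0  # 5% dextrose in water


def dilute_one_to_one(vpd, dextrose_percent):
    """Final % w/v after mixing equal volumes of VPD and dextrose solution.

    Assumes additive volumes, so every concentration is halved.
    """
    final = {name: pct / 2 for name, pct in vpd.items()}
    final["dextrose"] = dextrose_percent / 2
    return final


if __name__ == "__main__":
    for name, pct in dilute_one_to_one(VPD_PERCENT_WV, DEXTROSE_PERCENT_WV).items():
        print(f"{name}: {pct}% w/v")
```

Under these assumptions the diluted system contains 1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose (all w/v).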
The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are
not limited to
calcium carbonate, calcium phosphate, various sugars, starches, cellulose
derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active
ingredients of the
invention may be provided as salts with pharmaceutically compatible counter
ions. Such
pharmaceutically acceptable base addition salts are those salts which retain
the biological
effectiveness and properties of the free acids and which are obtained by
reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide,
ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium
acetate,
potassium benzoate, triethanol amine and the like.
The pharmaceutical composition of the invention may be in the form of a
complex of
the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal
to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface
immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor
(TCR)
following presentation of the antigen by MHC proteins. MHC and structurally
related
proteins including those encoded by class I and class II MHC genes on host
cells will serve
to present the peptide antigen(s) to T lymphocytes. The antigen components
could also be
supplied as purified MHC-peptide complexes alone or with co-stimulatory
molecules that
can directly signal T cells. Alternatively antibodies able to bind surface
immunoglobulin
and other molecules on B cells as well as antibodies able to bind the TCR and
other
molecules on T cells can be combined with the pharmaceutical composition of
the invention.
The pharmaceutical composition of the invention may be in the form of a
liposome in
which protein of the present invention is combined, in addition to other
pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in
aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous
solution.
Suitable lipids for liposomal formulation include, without limitation,
monoglycerides,
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like.
Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed,
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of
which are incorporated herein by reference.
The amount of protein or other active ingredient of the present invention in
the
pharmaceutical composition of the present invention will depend upon the
nature and
severity of the condition being treated, and on the nature of prior treatments
which the
patient has undergone. Ultimately, the attending physician will decide the
amount of protein
or other active ingredient of the present invention with which to treat each
individual patient.
Initially, the attending physician will administer low doses of protein or
other active
ingredient of the present invention and observe the patient's response. Larger
doses of
protein or other active ingredient of the present invention may be
administered until the
optimal therapeutic effect is obtained for the patient, and at that point the
dosage is not
increased further. It is contemplated that the various pharmaceutical
compositions used to
practice the method of the present invention should contain about 0.01 µg to
about 100 mg
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to
about 1 mg) of
protein or other active ingredient of the present invention per kg body
weight. For
compositions of the present invention which are useful for bone, cartilage,
tendon or
ligament regeneration, the therapeutic method includes administering the
composition
topically, systemically, or locally as an implant or device. When
administered, the
therapeutic composition for use in this invention is, of course, in a
pyrogen-free,
physiologically acceptable form. Further, the composition may desirably be
encapsulated or
injected in a viscous form for delivery to the site of bone, cartilage or
tissue damage.
Topical administration may be suitable for wound healing and tissue repair.
Therapeutically
useful agents other than a protein or other active ingredient of the invention
which may also
optionally be included in the composition as described above, may
alternatively or
additionally, be administered simultaneously or sequentially with the
composition in the
methods of the invention. Preferably for bone and/or cartilage formation, the
composition
would include a matrix capable of delivering the protein-containing or other
active
ingredient-containing composition to the site of bone and/or cartilage damage,
providing a
structure for the developing bone and cartilage and optimally capable of being
resorbed into
the body. Such matrices may be formed of materials presently in use for other
implanted
medical applications.
The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The
particular
application of the compositions will define the appropriate formulation.
Potential matrices
for the compositions may be biodegradable and chemically defined calcium
sulfate,
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides.
Other potential materials are biodegradable and biologically well-defined,
such as bone or
dermal collagen. Further matrices are comprised of pure proteins or
extracellular matrix
components. Other potential matrices are nonbiodegradable and chemically
defined, such as
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may
be comprised
of combinations of any of the above-mentioned types of material, such as
polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be
altered in
composition, such as in calcium-aluminate-phosphate and processing to alter
pore size,
particle size, particle shape, and biodegradability. Presently preferred is a
50:50 (mole
weight) copolymer of lactic acid and glycolic acid in the form of porous
particles having
diameters ranging from 150 to 800 microns. In some applications, it will be
useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood
clot, to prevent
the protein compositions from disassociating from the matrix.
A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred
being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering
agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol),
polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent
useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation
weight, which
represents the amount necessary to prevent desorption of the protein from the
polymer
matrix and to provide appropriate handling of the composition, yet not so much
that the
progenitor cells are prevented from infiltrating the matrix, thereby providing
the protein the
opportunity to assist the osteogenic activity of the progenitor cells. In
further compositions,
proteins or other active ingredients of the invention may be combined with
other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question.
These agents include various growth factors such as epidermal growth factor
(EGF), platelet-
derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
and
insulin-like growth factor (IGF).
The therapeutic compositions are also presently valuable for veterinary
applications.
Particularly domestic animals and thoroughbred horses, in addition to humans,
are desired
patients for such treatment with proteins or other active ingredients of the
present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be
used in tissue
regeneration will be determined by the attending physician considering various
factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be
formed, the site
of damage, the condition of the damaged tissue, the size of a wound, type of
damaged tissue
(e.g., bone), the patient's age, sex, and diet, the severity of any infection,
time of
administration and other clinical factors. The dosage may vary with the type
of matrix used
in the reconstitution and with inclusion of other proteins in the
pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF I
(insulin like growth
factor I), to the final composition, may also affect the dosage. Progress can
be monitored by
periodic assessment of tissue/bone growth and/or repair, for example, X-rays,
histomorphometric determinations and tetracycline labeling.
Polynucleotides of the present invention can also be used for gene therapy.
Such
polynucleotides can be introduced either in vivo or ex vivo into cells for
expression in a
mammalian subject. Polynucleotides of the invention may also be administered
by other
known methods for introduction of nucleic acid into a cell or organism
(including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in
the presence of proteins of the present invention in order to proliferate or
to produce a
desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for
therapeutic purposes.
4.12.3 EFFECTIVE DOSAGE
Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective
amount to achieve
their intended purpose. More specifically, a therapeutically effective amount
means an amount
effective to prevent development of or to alleviate the existing symptoms of
the subject
being treated. Determination of the effective amount is well within the
capability of those
skilled in the art, especially in light of the detailed disclosure provided
herein. For any
compound used in the method of the invention, the therapeutically effective
dose can be
estimated initially from appropriate in vitro assays. For example, a dose can
be formulated in
animal models to achieve a circulating concentration range that includes the
IC50 as
determined in cell culture (i.e., the concentration of the test compound
which achieves a
half-maximal inhibition of the protein's biological activity). Such
information can be used
to more accurately determine useful doses in humans.
A therapeutically effective dose refers to that amount of the compound that
results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity
and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical
procedures in
cell cultures or experimental animals, e.g., for determining the LDso (the
dose lethal to 50%
of the population) and the EDSO (the dose therapeutically effective in 50% of
the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index
and it can be
expressed as the ratio between LDso and EDso. Compounds which exhibit high
therapeutic
indices are preferred. The data obtained from these cell culture assays and
animal studies
can be used in formulating a range of dosage for use in humans. The dosage of
such
compounds lies preferably within a range of circulating concentrations that
include the ED50
with little or no toxicity. The dosage may vary within this range depending
upon the dosage
form employed and the route of administration utilized. The exact formulation,
route of
administration and dosage can be chosen by the individual physician in view of
the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of
Therapeutics", Ch.
1 p.1. Dosage amount and interval may be adjusted individually to provide
plasma levels of
the active moiety which are sufficient to maintain the desired effects, or
minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated
from in
vitro data. Dosages necessary to achieve the MEC will depend on individual
characteristics
and route of administration. However, HPLC assays or bioassays can be used to
determine
plasma concentrations.
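The therapeutic index defined above is a simple ratio, which the following Python sketch makes explicit. The function name and the numerical values are hypothetical and serve only to illustrate the arithmetic; both doses must be expressed in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    ld50: dose lethal to 50% of the population.
    ed50: dose therapeutically effective in 50% of the population.
    Both values must use the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical compounds: (LD50, ED50) in mg/kg.
compounds = {"compound_A": (500.0, 25.0), "compound_B": (300.0, 60.0)}

# Compounds with higher therapeutic indices are preferred, so sort descending.
ranked = sorted(compounds, key=lambda name: therapeutic_index(*compounds[name]), reverse=True)
```

With these made-up values, compound_A has an index of 20 and compound_B an index of 5, so compound_A ranks first.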
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for
10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In
cases of local
administration or selective uptake, the effective local concentration of the
drug may not be
related to plasma concentration.
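One way to check a candidate regimen against the time-above-MEC criterion is sketched below in Python. The sketch assumes a single intravenous bolus with first-order (one-compartment) elimination, C(t) = C0 * exp(-k * t); this pharmacokinetic model, the function name and the parameter values are assumptions for illustration and are not specified in this disclosure. Accumulation over repeated doses is ignored.

```python
import math


def fraction_above_mec(c0, k_elim, mec, interval_h):
    """Fraction of a dosing interval during which plasma level exceeds the MEC.

    Assumes C(t) = c0 * exp(-k_elim * t) after each dose (illustrative model).
    c0: peak plasma concentration immediately after dosing.
    k_elim: first-order elimination rate constant (1/h).
    mec: minimal effective concentration, same units as c0.
    interval_h: dosing interval in hours.
    """
    if c0 <= mec:
        return 0.0  # The level never rises above the MEC.
    # Solve c0 * exp(-k * t) = mec for t.
    t_above = math.log(c0 / mec) / k_elim
    return min(1.0, t_above / interval_h)


# Hypothetical values: 4 h half-life, peak of 8 units, MEC of 1 unit, 24 h interval.
k = math.log(2) / 4.0
fraction = fraction_above_mec(c0=8.0, k_elim=k, mec=1.0, interval_h=24.0)
```

In this hypothetical case the level stays above the MEC for three half-lives (12 h), i.e. about 50% of a 24-hour interval, which falls within the 10-90% and 30-90% ranges stated above and at the lower boundary of the most preferred 50-90% range.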
An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight
daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight
daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at
longer or shorter intervals.
The amount of composition administered will, of course, be dependent on the
subject
being treated, on the subject's age and weight, the severity of the
affliction, the manner of
administration and the judgment of the prescribing physician.
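The weight-based ranges stated above can be converted into absolute daily amounts as in the following Python sketch. The function name and the 70 kg body weight are hypothetical choices for illustration; the sketch only performs the unit arithmetic and does not select a dose.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def daily_dose_range_mg(weight_kg, low_ug_per_kg, high_mg_per_kg):
    """Convert a per-kilogram dosage range into an absolute daily range in mg.

    The lower bound is given in micrograms per kg and the upper bound in
    milligrams per kg, matching the units used in the text.
    """
    low_mg = weight_kg * low_ug_per_kg / UG_PER_MG
    high_mg = weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Exemplary range (about 0.01 µg/kg to 100 mg/kg) for a hypothetical 70 kg subject.
exemplary = daily_dose_range_mg(70.0, 0.01, 100.0)
# Preferred range (about 0.1 µg/kg to 25 mg/kg) for the same subject.
preferred = daily_dose_range_mg(70.0, 0.1, 25.0)
```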
4.12.4 PACKAGING
The compositions may, if desired, be presented in a pack or dispenser device
which
may contain one or more unit dosage forms containing the active ingredient.
The pack may,
for example, comprise metal or plastic foil, such as a blister pack. The pack
or dispenser
device may be accompanied by instructions for administration. Compositions
comprising a
compound of the invention formulated in a compatible pharmaceutical carrier
may also be
prepared, placed in an appropriate container, and labeled for treatment of an
indicated
condition.
4.13 ANTIBODIES
Also included in the invention are antibodies to proteins, or fragments of
proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin
molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e.,
molecules that
contain an antigen-binding site that specifically binds (immunoreacts with)
an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric,
single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an
antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD,
which differ
from one another by the nature of the heavy chain present in the molecule.
Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in
humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies
includes a
reference to all such classes, subclasses and types of human antibody species.
An isolated related protein of the invention may be intended to serve as an
antigen, or
a portion or fragment thereof, and additionally can be used as an immunogen to
generate
antibodies that immunospecifically bind the antigen, using standard techniques
for
polyclonal and monoclonal antibody preparation. The full-length protein can be
used or,
alternatively, the invention provides antigenic peptide fragments of the
antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid
residues of the
amino acid sequence of the full length protein, such as an amino acid sequence
shown in
SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses
an epitope
thereof such that an antibody raised against the peptide forms a specific
immune complex
with the full length protein or with any fragment that contains the epitope.
Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15
amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid
residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that
are located on
its surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by
the
antigenic peptide is a surface region of the protein, e.g., a hydrophilic
region. A
hydrophobicity analysis of the human related protein sequence will indicate
which regions of
a related protein are particularly hydrophilic and, therefore, are likely to
encode surface
residues useful for targeting antibody production. As a means for targeting
antibody
production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity may be
generated by any method well known in the art, including, for example, the
Kyte-Doolittle or
the Hopp-Woods methods, either with or without Fourier transformation. See,
e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982,
J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its
entirety.
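A hydropathy analysis of the kind described above can be sketched in Python as follows. The sketch uses the published Kyte-Doolittle hydropathy values with a sliding-window average and no Fourier transformation; the function names, the window length of 7 residues, the threshold of 0.0 and the example sequence are assumptions chosen for illustration rather than parameters specified in this disclosure.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue code.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """List (start_index, segment, mean_score) for windows scoring below threshold.

    Such hydrophilic segments are candidate surface regions and therefore
    candidate antigenic peptides.
    """
    profile = hydropathy_profile(sequence, window)
    return [
        (i, sequence[i:i + window], score)
        for i, score in enumerate(profile)
        if score < threshold
    ]


# Hypothetical sequence: a hydrophobic stretch followed by a lysine-rich stretch.
candidates = hydrophilic_windows("IIIIIIIKKKKKKK")
```

Shorter windows resolve short surface loops, while longer windows smooth the profile; the window length is therefore a tuning choice rather than a fixed part of the method.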
Antibodies that are specific for one or more domains within an antigenic
protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
A protein of the invention, or a derivative, fragment, analog, homolog or
ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.
The term "specific for" indicates that the variable regions of the antibodies
of the
invention recognize and bind polypeptides of the invention exclusively (i.e.,
able to
distinguish the polypeptide of the invention from other similar polypeptides
despite sequence
identity, homology, or similarity found in the family of polypeptides), but
may also interact
with other proteins (for example, S. aureus protein A or other antibodies in
ELISA
techniques) through interactions with sequences outside the variable region of
the antibodies,
and in particular, in the constant region of the molecule. Screening assays to
determine
binding specificity of an antibody of the invention are well known and
routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al.
(Eds), Antibodies
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY
(1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of
the
invention are also contemplated, provided that the antibodies are first and
foremost specific
for, as defined above, full-length polypeptides of the invention. As with
antibodies that are
specific for full length polypeptides of the invention, antibodies of the
invention that
recognize fragments are those which can distinguish polypeptides from the same
family of
polypeptides despite inherent sequence identity, homology, or similarity found
in the family
of proteins.
Antibodies of the invention are useful for, for example, therapeutic purposes
(by
modulating activity of a polypeptide of the invention), diagnostic purposes to
detect or
quantitate a polypeptide of the invention, as well as purification of a
polypeptide of the
invention. Kits comprising an antibody of the invention for any of the
purposes described
herein are also comprehended. In general, a kit of the invention also
includes a control
antigen for which the antibody is immunospecific. The invention further
provides a
hybridoma that produces an antibody according to the invention. Antibodies of
the
invention are useful for detection and/or purification of the polypeptides of
the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing
monoclonal
antibodies binding to the protein may also be useful therapeutics for both
conditions
associated with the protein and also in the treatment of some forms of cancer
where
abnormal expression of the protein is involved. In the case of cancerous cells
or leukemic
cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and
preventing the metastatic spread of the cancerous cells, which may be
mediated by the
protein.
The labeled antibodies of the present invention can be used for in
vitro, in vivo, and
in situ assays to identify cells or tissues in which a fragment of the
polypeptide of interest is
expressed. The antibodies may also be used directly in therapies or other
diagnostics. The
present invention further provides the above-described antibodies
immobilized on a solid
support. Examples of such solid supports include plastics such as
polycarbonate, complex
carbohydrates such as agarose and Sepharose®, acrylic resins such as
polyacrylamide
and latex beads. Techniques for coupling antibodies to such solid supports are
well known
in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed.,
Blackwell
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et
al., Meth.
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the
present
invention can be used for in vitro, in vivo, and in situ assays as well as
for immuno-affinity
purification of the proteins of the present invention.
Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the
invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for
example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory
Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these
antibodies are
discussed below.
4.13.1 POLYCLONAL ANTIBODIES
For the production of polyclonal antibodies, various suitable host animals
(e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more
injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing.
An appropriate
immunogenic preparation can contain, for example, the naturally occurring
immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be
conjugated
to a second protein known to be immunogenic in the mammal being immunized.
Examples
of such immunogenic proteins include but are not limited to keyhole limpet
hemocyanin,
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The
preparation can
further include an adjuvant. Various adjuvants used to increase the
immunological response
include, but are not limited to, Freund's (complete and incomplete), mineral
gels (e.g.,
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic
polyols,
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in
humans such as
Bacille Calmette-Guerin and Corynebacterium parvum, or similar
immunostimulatory
agents. Additional examples of adjuvants that can be employed include MPL-TDM
adjuvant
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).
The polyclonal antibody molecules directed against the immunogenic protein can
be
isolated from the mammal (e.g., from the blood) and further purified by well
known
techniques, such as affinity chromatography using protein A or protein G,
which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively,
the specific
antigen which is the target of the immunoglobulin sought, or an epitope
thereof, may be
immobilized on a column to purify the immune specific antibody by
immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by
D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA,
Vol. 14, No. 8
(April 17, 2000), pp. 25-28).
4.13.2 MONOCLONAL ANTIBODIES
The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as
used herein, refers to a population of antibody molecules that contain only
one molecular
species of antibody molecule consisting of a unique light chain gene product
and a unique
heavy chain gene product. In particular, the complementarity determining
regions (CDRs)
of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus
contain an antigen-binding site capable of immunoreacting with a particular
epitope of the
antigen characterized by a unique binding affinity for it.
Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma
method, a
mouse, hamster, or other appropriate host animal, is typically immunized with
an
immunizing agent to elicit lymphocytes that produce or are capable of
producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be
immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment
thereof
or a fusion protein thereof. Generally, either peripheral blood lymphocytes
are used if cells
of human origin are desired, or spleen cells or lymph node cells are used if
non-human
mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press,
(1986) pp. 59-
103). Immortalized cell lines are usually transformed mammalian cells,
particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture
medium that
preferably contains one or more substances that inhibit the growth or survival
of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the
hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which
substances prevent the growth of HGPRT-deficient cells.
Preferred immortalized cell lines are those that fuse efficiently, support
stable high
level expression of antibody by the selected antibody-producing cells, and are
sensitive to a
medium such as HAT medium. More preferred immortalized cell lines are murine
myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center,
San Diego, California and the American Type Culture Collection, Manassas,
Virginia.
Human myeloma and mouse-human heteromyeloma cell lines also have been
described for
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001
(1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications,
Marcel
Dekker, Inc., New York, (1987) pp. 51-63).
The culture medium in which the hybridoma cells are cultured can then be
assayed
for the presence of monoclonal antibodies directed against the antigen.
Preferably, the
binding specificity of monoclonal antibodies produced by the hybridoma cells
is determined
by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are
known in
the art. The binding affinity of the monoclonal antibody can, for example, be
determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980).
Preferably,
antibodies having a high degree of specificity and a high binding affinity for
the target
antigen are isolated.
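The principle of a Scatchard analysis can be illustrated with the following Python sketch, which fits the linear Scatchard relation B/F = (Bmax - B)/Kd by ordinary least squares to obtain the dissociation constant Kd and the binding capacity Bmax. This is a plain linear fit shown for illustration, not the procedure of the reference cited above; the function name and the synthetic data are assumptions.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from a linear Scatchard plot.

    bound: concentrations of bound ligand (B) at equilibrium.
    free: corresponding concentrations of free ligand (F).
    Fits B/F = -(1/Kd) * B + Bmax/Kd by least squares, so that
    Kd = -1/slope and Bmax = intercept * Kd.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax


# Synthetic, noise-free data for a single binding site with Kd = 2 and Bmax = 10.
free_ligand = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_ligand = [10.0 * f / (2.0 + f) for f in free_ligand]
kd, bmax = scatchard_fit(bound_ligand, free_ligand)
```

A lower Kd corresponds to a higher binding affinity, so antibodies could be ranked by the fitted Kd when selecting high-affinity clones. With real, noisy data a nonlinear fit of the binding isotherm is generally more robust than the linearized plot.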
After the desired hybridoma cells are identified, the clones can be subcloned
by
limiting dilution procedures and grown by standard methods. Suitable culture
media for this
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in
a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or
purified from
the culture medium or ascites fluid by conventional immunoglobulin
purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.
The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of
the invention can be readily isolated and sequenced using conventional
procedures (e.g., by
using oligonucleotide probes that are capable of binding specifically to genes
encoding the
heavy and light chains of murine antibodies). The hybridoma cells of the
invention serve as
a preferred source of such DNA. Once isolated, the DNA can be placed into
expression
vectors, which are then transfected into host cells such as simian COS cells,
Chinese hamster
ovary (CHO) cells, or myeloma cells that do not otherwise produce
immunoglobulin protein,
to obtain the synthesis of monoclonal antibodies in the recombinant host
cells. The DNA
also can be modified, for example, by substituting the coding sequence for
human heavy and
light chain constant domains in place of the homologous murine sequences
(U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to
the
immunoglobulin coding sequence all or part of the coding sequence for a non-
immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be
substituted
for the constant domains of an antibody of the invention, or can be
substituted for the
variable domains of one antigen-combining site of an antibody of the invention
to create a
chimeric bivalent antibody.
4.13.3 HUMANIZED ANTIBODIES
The antibodies directed against the protein antigens of the invention can
further
comprise humanized antibodies or human antibodies. These antibodies are
suitable for
administration to humans without engendering an immune response by the human
against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab,
Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are
principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a
non-human immunoglobulin. Humanization can be performed following the method
of
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et
al., Nature,
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by
substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human
antibody. (See
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of
the human
immunoglobulin are replaced by corresponding non-human residues. Humanized
antibodies
can also comprise residues that are found neither in the recipient antibody
nor in the
imported CDR or framework sequences. In general, the humanized antibody will
comprise
substantially all of at least one, and typically two, variable domains, in
which all or
substantially all of the CDR regions correspond to those of a non-human
immunoglobulin
and all or substantially all of the framework regions are those of a human
immunoglobulin
consensus sequence. The humanized antibody optimally also will comprise at
least a portion
of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct.
Biol., 2, 593-596
(1992)).
4.13.4 HUMAN ANTIBODIES
Fully human antibodies relate to antibody molecules in which essentially the
entire
sequences of both the light chain and the heavy chain, including the CDRs,
arise from
human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies"
herein. Human monoclonal antibodies can be prepared by the trioma technique;
the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72)
and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al.,
1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Human
monoclonal antibodies may be utilized in the practice of the present invention
and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci
USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in
vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96).
In addition, human antibodies can also be produced using additional
techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227,
381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can
be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice
in which the
endogenous immunoglobulin genes have been partially or completely inactivated.
Upon
challenge, human antibody production is observed, which closely resembles that
seen in
humans in all respects, including gene rearrangement, assembly, and antibody
repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807;
5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-
783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature
368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger
(Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol.
13, 65-93
(1995)).
Human antibodies may additionally be produced using transgenic nonhuman
animals
that are modified so as to produce fully human antibodies rather than the
animal's
endogenous antibodies in response to challenge by an antigen. (See PCT
publication
WO 94/02602). The endogenous genes encoding the heavy and light immunoglobulin
chains
in the nonhuman host have been incapacitated, and active loci encoding human
heavy and
light chain immunoglobulins are inserted into the host's genome. The human
genes are
incorporated, for example, using yeast artificial chromosomes containing the
requisite
human DNA segments. An animal which provides all the desired modifications is
then
obtained as progeny by crossbreeding intermediate transgenic animals
containing fewer than
the full complement of the modifications. The preferred embodiment of such a
nonhuman
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT
publications WO
96/33735 and WO 96/34096. This animal produces B cells that secrete fully
human
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of
a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal,
such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding
the
immunoglobulins with human variable regions can be recovered and expressed to
obtain the
antibodies directly, or can be further modified to obtain analogs of
antibodies such as, for
example, single chain Fv molecules.
An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in
U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J
segment genes
from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent
rearrangement of the locus and to prevent formation of a transcript of a
rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector
containing a gene encoding a selectable marker; and producing from the
embryonic stem cell
a transgenic mouse whose somatic and germ cells contain the gene encoding the
selectable
marker.
A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression
vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host
cell in
culture, introducing an expression vector containing a nucleotide sequence
encoding a light
chain into another mammalian host cell, and fusing the two cells to form a
hybrid cell. The
hybrid cell expresses an antibody containing the heavy chain and the light
chain.
In a further improvement on this procedure, a method for identifying a
clinically
relevant epitope on an immunogen, and a correlative method for selecting an
antibody that
binds immunospecifically to the relevant epitope with high affinity, are
disclosed in PCT
publication WO 99/53049.
4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES
According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see
e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of
Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid
and effective
identification of monoclonal Fab fragments with the desired specificity for a
protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the
idiotypes to a protein antigen may be produced by techniques known in the art
including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2
fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain
and a reducing
agent and (iv) Fv fragments.
4.13.6 BISPECIFIC ANTIBODIES
Bispecific antibodies are monoclonal, preferably human or humanized,
antibodies
that have binding specificities for at least two different antigens. In the
present case, one of
the binding specificities is for an antigenic protein of the invention. The
second binding
target is any other antigen, and advantageously is a cell-surface protein or
receptor or
receptor subunit.
Methods for making bispecific antibodies are known in the art. Traditionally,
the
recombinant production of bispecific antibodies is based on the co-expression
of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of
the random
assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas)
produce a potential mixture of ten different antibody molecules, of which only
one has the
correct bispecific structure. The purification of the correct molecule is
usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
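The figure of ten different antibody molecules follows from counting the ways in which two heavy chains and two light chains can assort. The following Python sketch enumerates them under stated assumptions: each antibody consists of two heavy-light "arms", the two arms are interchangeable, and the chain labels H1, L1, H2 and L2 (with H1/L1 and H2/L2 as the cognate pairs) are hypothetical names introduced for illustration.

```python
from itertools import combinations_with_replacement, product


def quadroma_species(heavy=("H1", "H2"), light=("L1", "L2")):
    """Enumerate the distinct antibody molecules a quadroma can assemble.

    Every heavy chain may pair with either light chain, giving four possible
    heavy-light arms; an antibody is an unordered pair of such arms.
    """
    arms = [h + l for h, l in product(heavy, light)]  # H1L1, H1L2, H2L1, H2L2
    return list(combinations_with_replacement(arms, 2))


species = quadroma_species()

# Only the molecule carrying both cognate arms has the correct bispecific structure.
bispecific = [s for s in species if set(s) == {"H1L1", "H2L2"}]
```

The enumeration yields ten species, of which exactly one pairs each heavy chain with its cognate light chain on separate arms, which is why affinity chromatography steps are needed to recover the desired molecule.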
Antibody variable domains with the desired binding specificities (antibody-
antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The
fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising
at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-
chain constant
region (CH1) containing the site necessary for light-chain binding present in
at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if
desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and
are co-
transfected into a suitable host organism. For further details of generating
bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210
(1986).
According to another approach described in WO 96/27011, the interface between
a
pair of antibody molecules can be engineered to maximize the percentage of
heterodimers
that are recovered from recombinant cell culture. The preferred interface
comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one
or more small
amino acid side chains from the interface of the first antibody molecule are
replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of
identical or
similar size to the large side chain(s) are created on the interface of the
second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g.
alanine or
threonine). This provides a mechanism for increasing the yield of the
heterodimer over other
unwanted end-products such as homodimers.
Bispecific antibodies can be prepared as full-length antibodies or antibody
fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from
antibody fragments have been described in the literature. For example,
bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985)
describe a
procedure wherein intact antibodies are proteolytically cleaved to generate
F(ab')2
fragments. These fragments are reduced in the presence of the dithiol
complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB)
derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by
reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced
can be used
as agents for the selective immobilization of enzymes.
Additionally, Fab' fragments can be directly recovered from E. coli and
chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-
225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2
molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed
chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody
thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T
cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast
tumor targets.
Various techniques for making and isolating bispecific antibody fragments
directly
from recombinant cell culture have also been described. For example,
bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked
to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers
were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody
homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci.
USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific
antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected
to a
light-chain variable domain (VL) by a linker which is too short to allow
pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one
fragment are
forced to pair with the complementary VL and VH domains of another fragment,
thereby
forming two antigen-binding sites. Another strategy for making bispecific
antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported.
See, Gruber et
al., J. Immunol. 152, 5368 (1994).
Antibodies with more than two valencies are contemplated. For example,
trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least
one of
which originates in the protein antigen of the invention. Alternatively, an
anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a
triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the
particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells
which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm
which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or
TETA.
Another bispecific antibody of interest binds the protein antigen described
herein and further
binds tissue factor (TF).
4.13.7 HETEROCONJUGATE ANTIBODIES
Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies.
Such
antibodies have, for example, been proposed to target immune system cells to
unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO
91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared
in vitro using
known methods in synthetic protein chemistry, including those involving
crosslinking
agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose
include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for
example, in U.S.
Patent No. 4,676,980.
4.13.8 EFFECTOR FUNCTION ENGINEERING
It can be desirable to modify the antibody of the invention with respect to
effector
function, so as to enhance, e.g., the effectiveness of the antibody in
treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby
allowing
interchain disulfide bond formation in this region. The homodimeric antibody
thus
generated can have improved internalization capability and/or increased
complement-
mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See
Caron et
al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-
2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared
using
heterobifunctional cross-linkers as described in Wolff et al. Cancer Research,
53, 2560-
2565 (1993). Alternatively, an antibody can be engineered that has dual Fc
regions and can
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et
al.,
Anti-Cancer Drug Design, 3, 219-230 (1989).
4.13.9 IMMUNOCONJUGATES
The invention also pertains to immunoconjugates comprising an antibody
conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof),
or a radioactive
isotope (i.e., a radioconjugate).
Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that
can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A
chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin,
Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of
radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-
pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-
diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine),
diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-
2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-
isothiocyanatobenzyl-3-
methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating
agent for
conjugation of radionucleotide to the antibody. See WO 94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-
receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the
circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin)
that is in turn
conjugated to a cytotoxic agent.
4.14 COMPUTER READABLE SEQUENCES
In one application of this embodiment, a nucleotide sequence of the present
invention
can be recorded on computer readable media. As used herein, "computer readable
media"
refers to any medium which can be read and accessed directly by a computer.
Such media
include, but are not limited to: magnetic storage media, such as floppy discs,
hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A skilled artisan can readily appreciate how
any of the
presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide
sequence of the
present invention. As used herein, "recorded" refers to a process for storing
information on
computer readable medium. A skilled artisan can readily adopt any of the
presently known
methods for recording information on computer readable medium to generate
manufactures
comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for
creating a
computer readable medium having recorded thereon a nucleotide sequence of the
present
invention. The choice of the data storage structure will generally be based on
the means
chosen to access the stored information. In addition, a variety of data
processor programs
and formats can be used to store the nucleotide sequence information of the
present
invention on computer readable medium. The sequence information can be
represented in a
word processing text file, formatted in commercially-available software such
as WordPerfect
and Microsoft Word, or represented in the form of an ASCII file, stored in a
database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any
number of data processor structuring formats (e.g. text file or database) in
order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of
the present invention.
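By way of illustration only, the two storage formats mentioned above (a plain ASCII text file and a database application) can be sketched as follows. The sketch is written in Python and substitutes the SQLite engine of the Python standard library for the DB2, Sybase, or Oracle applications named above. The table name, the column names, and any identifiers passed in are hypothetical and do not correspond to any sequence of the invention.

```python
import sqlite3


def to_fasta(seq_id: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as an ASCII FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + seq_id + "\n" + "\n".join(lines) + "\n"


def store_sequences(conn: sqlite3.Connection, records: dict) -> None:
    """Record (identifier, residues) pairs in a single relational table.

    The table layout is an assumption made for this sketch only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_sequence(conn: sqlite3.Connection, seq_id: str):
    """Return the stored residues for seq_id, or None if it is not recorded."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

The choice between the two formats follows the consideration stated above: a flat text file suits sequential reading by search programs, whereas a database table suits retrieval of individual records by identifier.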
By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534
or
a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer
readable form, a
skilled artisan can routinely access the sequence information for a variety of
purposes.
Computer software is publicly available which allows a skilled artisan to
access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol.
Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search
algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be
useful in
producing commercially important proteins such as enzymes used in fermentation
reactions
and in the production of commercially useful metabolites.
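As a simplified illustration of what identifying an ORF within a stored nucleic acid sequence involves, the following Python sketch scans all six reading frames for a start codon followed in frame by a stop codon. It is not an implementation of the BLAST or BLAZE algorithms cited above, which compare sequences against databases. It is a naive start/stop scan, and the function names and the minimum-length parameter are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Return (strand, frame, start, end, orf) tuples for ATG...stop stretches.

    start and end are 0-based, end-exclusive offsets on the scanned strand.
    min_codons counts every codon in the stretch, including start and stop.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens a candidate
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        hits.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next candidate in this frame
    return hits
```

In practice a candidate ORF found this way would still be compared against known sequences, for example with the search programs cited above, before being treated as protein-encoding.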
As used herein, "a computer-based system" refers to the hardware means,
software
means, and data storage means used to analyze the nucleotide sequence
information of the
present invention. The minimum hardware means of the computer-based systems of
the
present invention comprises a central processing unit (CPU), input means,
output means, and
data storage means. A skilled artisan can readily appreciate that any one of
the currently
available computer-based systems is suitable for use in the present
invention. As stated
above, the computer-based systems of the present invention comprise a data
storage means
having stored therein a nucleotide sequence of the present invention and the
necessary
hardware means and software means for supporting and implementing a search
means. As
used herein, "data storage means" refers to memory which can store nucleotide
sequence
information of the present invention, or a memory access means which can
access
manufactures having recorded thereon the nucleotide sequence information of
the present
invention.
As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or
target structural
motif with the sequence information stored within the data storage means.
Search means are
used to identify fragments or regions of a known sequence which match a
particular target
sequence or target motif. A variety of known algorithms are disclosed publicly
and a variety
of commercially available software packages for conducting search means exist and can be used in the
used in the
computer-based systems of the present invention. Examples of such software
include, but
are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A skilled artisan can readily recognize that any one of the
available
algorithms or implementing software packages for conducting homology searches
can be
adapted for use in the present computer-based systems. As used herein, a
"target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or
two or more
amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the
less likely a target sequence will be present as a random occurrence in the
database. The
most preferred sequence length of a target sequence is from about 10 to 300
amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well
recognized
that searches for commercially important fragments, such as sequence fragments
involved in
gene expression and protein processing, may be of shorter length.
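The relationship between target length and chance matches can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, that residues in the database are independent and equally frequent, which real sequences are not. Under that assumption the expected number of chance occurrences of a target of length k in a database of N residues is approximately (N - k + 1) multiplied by the alphabet size raised to the power -k. The function name and parameters are hypothetical.

```python
def expected_random_hits(target_length: int,
                         database_length: int,
                         alphabet_size: int = 4) -> float:
    """Expected chance occurrences of one fixed target in a random database.

    Assumes independent, uniformly distributed residues: alphabet_size is 4
    for nucleotides and 20 for amino acids.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    # Compare a short and a longer nucleotide target against the same database.
    for k in (6, 12, 30):
        print(k, expected_random_hits(k, 3_000_000))
```

Each additional nucleotide lowers the expectation fourfold, and each additional amino acid lowers it twentyfold, which is why short amino acid targets can be as selective as considerably longer nucleotide targets.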
As used herein, "a target structural motif," or "target motif," refers to any
rationally
selected sequence or combination of sequences in which the sequence(s) are
chosen based on
a three-dimensional configuration which is formed upon the folding of the
target motif.
There are a variety of target motifs known in the art. Protein target motifs
include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target
motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible
expression
elements (protein binding sequences).
4.15 TRIPLE HELIX FORMATION
In addition, the fragments of the present invention, as broadly described, can
be used
to control gene expression through triple helix formation or antisense DNA or
RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or
RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40
bases in length and
are designed to be complementary to a region of the gene involved in
transcription (triple
helix-see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science
241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself
(antisense-
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense
Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation
optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA
hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have
been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide.
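As a minimal sketch of how sequence information feeds the design of an antisense oligonucleotide, the Python function below returns the DNA oligonucleotide complementary to a chosen region of an mRNA, written 5' to 3', and enforces the preferred length of 20 to 40 bases stated above. The function name, its parameters, and the choice of a DNA rather than RNA oligonucleotide are assumptions for the example. Selection of the target region, and considerations such as secondary structure or specificity, are outside the scope of the sketch.

```python
# Map each RNA (or cDNA) target base to its complementary DNA base.
_TO_DNA_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return the antisense DNA oligonucleotide (5'->3') for a target region.

    start is the 0-based offset of the target region within the mRNA.
    """
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("target region falls outside the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_TO_DNA_COMPLEMENT)[::-1]
```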
4.16 DIAGNOSTIC ASSAYS AND KITS
The present invention further provides methods to identify the presence or
expression
of one of the ORFs of the present invention, or homolog thereof, in a test
sample, using a
nucleic acid probe or antibodies of the present invention, optionally
conjugated or otherwise
associated with a suitable label.
In general, methods for detecting a polynucleotide of the invention can
comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the
complex, so
that if a complex is detected, a polynucleotide of the invention is detected
in the sample.
Such methods can also comprise contacting a sample under stringent
hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the
invention under
such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is
amplified, a polynucleotide of the invention is detected in the sample.
In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the
complex, so that if
a complex is detected, a polypeptide of the invention is detected in the
sample.
In detail, such methods comprise incubating a test sample with one or more of
the
antibodies or one or more of the nucleic acid probes of the present invention
and assaying
for binding of the nucleic acid probes or antibodies to components within the
test sample.
Conditions for incubating a nucleic acid probe or antibody with a test sample
vary.
Incubation conditions depend on the format employed in the assay, the
detection methods
employed, and the type and nature of the nucleic acid probe or antibody used
in the assay.
One skilled in the art will recognize that any one of the commonly available
hybridization,
amplification or immunological assay formats can readily be adapted to employ
the nucleic
acid probes or antibodies of the present invention. Examples of such assays
can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques,
Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al.,
Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983),
Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory
Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam,
The
Netherlands (1985). The test samples of the present invention include cells,
protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum,
plasma, or
urine. The test sample used in the above-described method will vary based on
the assay
format, nature of the detection method and the tissues, cells or extracts used
as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of
cells are well
known in the art and can readily be adapted in order to obtain a sample
which is
compatible with the system utilized.
In another embodiment of the present invention, kits are provided which
contain the
necessary reagents to carry out the assays of the present invention.
Specifically, the
invention provides a compartment kit to receive, in close confinement, one or
more
containers which comprises: (a) a first container comprising one of the probes
or antibodies
of the present invention; and (b) one or more other containers comprising one
or more of the
following: wash reagents, reagents capable of detecting presence of a bound
probe or
antibody.
In detail, a compartment kit includes any kit in which reagents are contained
in
separate containers. Such containers include small glass containers, plastic
containers or
strips of plastic or paper. Such containers allow one to efficiently transfer
reagents from
one compartment to another compartment such that the samples and reagents are
not
cross-contaminated, and the agents or solutions of each container can be added
in a
quantitative fashion from one compartment to another. Such containers will
include a
container which will accept the test sample, a container which contains the
antibodies used
in the assay, containers which contain wash reagents (such as phosphate
buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect
the bound
antibody or probe. Types of detection reagents include labeled nucleic acid
probes, labeled
secondary antibodies; or in the alternative, if the primary antibody is
labeled, the enzymatic,
or antibody binding reagents which are capable of reacting with the labeled
antibody. One
skilled in the art will readily recognize that the disclosed probes and
antibodies of the present
invention can be readily incorporated into one of the established kit formats
which are well
known in the art.
4.17 MEDICAL IMAGING
The novel polypeptides and binding partners of the invention are useful in
medical
imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the
invention is involved in the immune response, for imaging sites of
inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods
involve
chemical attachment of a labeling or imaging agent, administration of the
labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging
the labeled
polypeptide in vivo at the target site.
4.18 SCREENING ASSAYS
Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which
bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences
set forth
in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the
polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:
(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.
In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound
complex, and
detecting the complex, so that if a polynucleotide/compound complex is
detected, a
compound that binds to a polynucleotide of the invention is identified.
Likewise, in general, therefore, such methods for identifying compounds that
bind to
a polypeptide of the invention can comprise contacting a compound with a
polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex,
and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound
that binds
to a polypeptide of the invention is identified.
Methods for identifying compounds that bind to a polypeptide of the invention
can
also comprise contacting a compound with a polypeptide of the invention in a
cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives
expression
of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a
compound
that binds a polypeptide of the invention is identified.
Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its
activity, relative to
activity observed in the absence of the compound). Alternatively, compounds
identified via
such methods can include compounds which modulate the expression of a
polynucleotide of
the invention (that is, increase or decrease expression relative to expression
levels observed
in the absence of the compound). Compounds, such as compounds identified via
the
methods of the invention, can be tested using standard assays well known to
those of skill in
the art for their ability to modulate activity/expression.
The agents screened in the above assay can be, but are not limited to,
peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents
can be
selected and screened at random or rationally selected or designed using
protein modeling
techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents
and the like are selected at random and are assayed for their ability to bind
to the protein
encoded by the ORF of the present invention. Alternatively, agents may be
rationally
selected or designed. As used herein, an agent is said to be "rationally
selected or designed"
when the agent is chosen based on the configuration of the particular protein.
For example,
one skilled in the art can readily adapt currently available procedures to
generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide
sequence, in
order to generate rationally designed antipeptide peptides, for example see
Hruby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides,
A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al.,
Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as
broadly
described, can be used to control gene expression through binding to one of
the ORFs or
EMFs of the present invention. As described above, such agents can be randomly
screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design
sequence specific or element specific agents, modulating the expression of
either a single
ORF or multiple ORFs which rely on the same EMF for expression control. One
class of
DNA binding agents are agents which contain base residues which hybridize or
form a triple
helix formation by binding to DNA or RNA. Such agents can be based on the
classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl
or polymeric
derivatives which have base attachment capacity.
Agents suitable for use in these methods preferably contain 20 to 40 bases and
are
designed to be complementary to a region of the gene involved in transcription
(triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241,
456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense-
Okano, J.
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation
optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide and other DNA binding agents.
Agents which bind to a protein encoded by one of the ORFs of the present
invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by
one of the
ORFs of the present invention can be formulated using known techniques to
generate a
pharmaceutical composition.
4.19 USE OF NUCLEIC ACIDS AS PROBES
Another aspect of the subject invention is to provide for polypeptide-specific
nucleic
acid hybridization probes capable of hybridizing with naturally occurring
nucleotide
sequences. The hybridization probes of the subject invention may be derived
from any of
the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the
corresponding
gene is only expressed in a limited number of tissues, a hybridization probe
derived from
any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as
an
indicator of the presence of RNA of cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in
situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of
identical
sequences or a degenerate pool of possible sequences for identification of
closely related
genomic sequences.
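The notion of a degenerate pool of possible sequences can be illustrated by enumerating every discrete sequence that a degenerate probe specification represents. The Python sketch below uses the standard IUPAC nucleotide ambiguity codes. The function name is hypothetical, and the sketch says nothing about how a pool would be synthesized or how its members would perform in hybridization.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so it grows quickly as ambiguous positions are added. A probe with no ambiguous positions expands to itself, which corresponds to the discrete nucleotide sequence case described above.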
Other means for producing specific hybridization probes for nucleic acids
include the
cloning of nucleic acid sequences into vectors for the production of mRNA
probes. Such
vectors are known in the art and are commercially available and may be used to
synthesize
RNA probes in vitro by means of the addition of the appropriate RNA
polymerase such as T7 or
SP6 RNA polymerase and the appropriate radioactively labeled nucleotides. The
nucleotide
sequences may be used to construct hybridization probes for mapping their
respective
genomic sequences. The nucleotide sequence provided herein may be mapped to a
chromosome or specific regions of a chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ
hybridization, linkage
analysis against known chromosomal markers, hybridization screening with
libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the
like. The
technique of fluorescent in situ hybridization of chromosome spreads has been
described,
among other places, in Verma et al (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other
physical
chromosome mapping techniques may be correlated with additional genetic map
data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal
map and a specific disease (or predisposition to a specific disease) may help
delimit the
region of DNA associated with that genetic disease. The nucleotide sequences
of the subject
invention may be used to detect differences in gene sequences between normal,
carrier or
affected individuals.
4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES
Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared
by, for
example, directly synthesizing the oligonucleotide by chemical means, as is
commonly
practiced using an automated oligonucleotide synthesizer.
Support bound oligonucleotides may be prepared by any of the methods known to
those
of skill in the art using any suitable support such as glass, polystyrene or
Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6), 1469-
72); using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey &
Collins, (1989) Mol.
Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller
et al., 1988;
1989); all references being specifically incorporated herein.
Another strategy that may be employed is the use of the strong biotin-
streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad.
Sci. USA 91(8),
3072-6, describe the use of biotinylated probes, although these are duplex
probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads
may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is
applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from
various sources,
such as, e.g., Operon Technologies (Alameda, CA).
Nunc Laboratories (Naperville, IL) is also selling suitable material that
could be used.
Nunc Laboratories have developed a method by which DNA can be covalently bound
to the
microwell surface termed CovaLink NH. CovaLink NH is a polystyrene surface
grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent
coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be
bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of
more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-
42).
The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-
end
has been described (Rasmussen et al., (1991)). In this technology, a
phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is
beneficial as
immobilization using only a single covalent bond is preferred. The
phosphoramidate bond joins
the DNA to the CovaLink NH secondary amino groups that are positioned at the
end of spacer
arms covalently grafted onto the polystyrene surface through a 2 nm long
spacer arm. To link
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the
oligonucleotide terminus
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin
to be covalently
bound to CovaLink and then streptavidin used to bind the probes.
More specifically, the linkage method includes dissolving DNA in water (7.5
ng/µl) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold
0.1 M 1-
methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of
10 mM 1-MeIm7.
A ss DNA solution is then dispensed into CovaLink NH strips (75 µl/well)
standing on ice.
Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The
strips are incubated
for 5 hours at 50°C. After incubation the strips are washed using,
e.g., Nunc-Immuno Wash;
first the wells are washed 3 times, then they are soaked with washing solution
for 5 min., and
finally they are washed 3 times (where in the washing solution is 0.4 N NaOH,
0.25% SDS
heated to 50°C).
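By way of illustration only, the per-well quantities implied by the foregoing
volumes and concentrations may be worked out as sketched below. The function
name and its parameters are hypothetical, and the sketch assumes that the 7.5
ng/µl DNA concentration is measured before the 0.1 M 1-methylimidazole stock is
added at one part in ten; under that assumption each well receives roughly 506
ng of DNA and the EDC is diluted to 50 mM in a 100 µl reaction.

```python
def well_quantities(dna_ng_per_ul=7.5, dna_volume_ul=75.0,
                    meim_stock_mM=100.0, meim_final_mM=10.0,
                    edc_stock_mM=200.0, edc_volume_ul=25.0):
    """Return DNA mass per well and final EDC concentration after mixing."""
    # Assumption: the 1-methylimidazole stock is added at one part in ten,
    # so the DNA concentration drops by the same factor.
    dilution = 1.0 - meim_final_mM / meim_stock_mM
    dna_ng_per_well = dna_ng_per_ul * dilution * dna_volume_ul

    # EDC is diluted by the DNA solution already present in the well.
    total_volume_ul = dna_volume_ul + edc_volume_ul
    edc_final_mM = edc_stock_mM * edc_volume_ul / total_volume_ul

    return {
        "dna_ng_per_well": dna_ng_per_well,
        "total_volume_ul": total_volume_ul,
        "edc_final_mM": edc_final_mM,
    }


if __name__ == "__main__":
    print(well_quantities())
```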
It is contemplated that a further suitable method for use with the present
invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos),
incorporated
herein by reference. This method of preparing an oligonucleotide bound to a
support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is
then synthesized on
the supported nucleoside and protecting groups removed from the synthetic
oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from
the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen
phosphorate.
An on-chip strategy for the preparation of DNA probe arrays may be employed.
For example, addressable laser-activated
photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass
surface, as described
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by
reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al.
(1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan &
Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated
herein.
Linking an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective
activation of the 5'-amine of oligonucleotides with cyanuric chloride.
One particular way to prepare support-bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. ((1994) Proc. Natl. Acad.
Sci. USA 91(11), 5022-6, incorporated herein by reference). These authors used
current photolithographic techniques to generate arrays of immobilized
oligonucleotide probes (DNA chips). These methods, in which light is used to
direct the synthesis of oligonucleotide probes in high-density, miniaturized
arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites, surface linker chemistry and versatile combinatorial
synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes
may be generated in this manner.
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook
et al. (1989) describe three protocols for the isolation of high molecular
weight DNA from
mammalian cells (p. 9.14-9.23).
DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification
methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of
DNA
samples may be prepared in 2-500 ml of final volume.
The nucleic acids would then be fragmented by any of the methods known to
those of
skill in the art including, for example, using restriction enzymes as
described at 9.24-9.28 of
Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.
Low pressure shearing is also appropriate, as described by Schriefer et al.
((1990) Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference).
In this method, DNA samples are passed through a small French pressure cell at
a variety of low to intermediate pressures. A lever device allows controlled
application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic
and enzymatic DNA fragmentation methods.
One particularly suitable way for fragmenting DNA is contemplated to be that
using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al.
(1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and
fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun
cloning and sequencing.
The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions,
which alter the specificity of this enzyme (CviJI**), yield a quasi-random
distribution of DNA fragments from the small molecule pUC19 (2688 base pairs).
Fitzgerald et al. (1992) quantitatively evaluated the randomness of this
fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without
end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76 clones
showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites,
and that new sequence data is accumulated at a rate consistent with random
fragmentation.
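The site specificities described above may be illustrated by a short in silico
digest. The following is a minimal sketch, not the method of Fitzgerald et al.;
the function names are hypothetical, the input is assumed to be a linear
sequence over A, C, G and T, and complete digestion at every matching site is
assumed, which an actual partial digest would not achieve.

```python
import re

# Pu = A or G, Py = C or T. Lookaheads allow overlapping sites to be found.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")                # PuGCPy
RELAXED_SITES = re.compile(r"(?=[AG]GC[ACGT]|[CT]GC[CT])")  # PuGCPy, PuGCPu, PyGCPy


def cut_positions(seq, relaxed=False):
    """Return indices at which the sequence is cut (between G and C)."""
    pattern = RELAXED_SITES if relaxed else NORMAL_SITES
    # A site starts at m.start(); the cut falls after the G, i.e. at start + 2.
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def digest(seq, relaxed=False):
    """Return the blunt-ended fragments of a complete digest."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "TTAGCTAACGCAATTCGCTGG"
    print(digest(example))                # normal specificity
    print(digest(example, relaxed=True))  # CviJI** specificity
```

In the example sequence, the site CGCA (PyGCPu) is left uncut under both
specificities, while CGCT (PyGCPy) is cut only under the relaxed specificity.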
As reported in the literature, advantages of this approach compared to
sonication and agarose gel fractionation include: smaller amounts of DNA are
required (0.2-0.5 µg instead of 2-5 µg); and fewer steps are involved (no
preligation, end repair, chemical extraction, or agarose gel electrophoresis
and elution are needed).
Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared,
it is important to denature the DNA to give single stranded pieces available
for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C.
The solution is
then cooled quickly to 2°C to prevent renaturation of the DNA fragments
before they are
contacted with the chip. Phosphate groups must also be removed from genomic
DNA by
methods known in the art.
4.22 PREPARATION OF DNA ARRAYS
Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the
positions of which correspond to an array of wells in a microtiter plate) to
repeatedly transfer about 20 nl of a
DNA solution to a nylon membrane. By offset printing, a density of dots higher
than the density
of the wells is achieved. One to 25 dots may be accommodated in 1 mm2,
depending on the
type of label used. By avoiding spotting in some preselected number of rows
and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the
same genomic
segment of DNA (or the same gene) from different individuals, or may be
different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the
same samples. In
one example, a selected gene segment may be amplified from 64 patients. For
each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing
the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all
samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from
each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm2 and there may
be a 1 mm
space between subarrays.
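The 64-patient example may be made concrete with a short layout calculation.
The sketch below is illustrative only and rests on assumptions not stated in
the text: the 96 pins are taken to lie on a 9 mm pitch in 8 rows by 12 columns
(the usual 96-well spacing), and the 64 dots of each subarray are taken to form
an 8 x 8 grid on a 1 mm pitch. Under these assumptions each subarray spans 8 mm,
leaving the 1 mm space between subarrays, and the whole array fits on an 8 x 12
cm membrane. All names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device (assumed 8 x 12 arrangement)
PIN_PITCH_MM = 9.0           # assumed pin spacing, as for 96-well plates
DOT_PITCH_MM = 1.0           # offset between successive prints
SUBARRAY_SIDE = 8            # 8 x 8 = 64 dots per subarray (assumed layout)


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot for a 0-based patient and pin index."""
    if not 0 <= patient < SUBARRAY_SIDE ** 2:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index outside the 8 x 12 device")
    # Each patient plate is printed with the pin array shifted by a small
    # offset, so every pin fills one subarray with one dot per patient.
    offset_row, offset_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + offset_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + offset_row * DOT_PITCH_MM
    return x, y


if __name__ == "__main__":
    print(dot_position(0, 0, 0))     # first patient, first pin
    print(dot_position(63, 7, 11))   # last patient, last pin
```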
Another approach is to use membranes or plates (available from NUNC,
Naperville,
Illinois) which may be partitioned by physical spacers e.g. a plastic grid
molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom
of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for
imaging by exposure
to flat phosphor-storage screens or x-ray films.
The present invention is illustrated in the following examples. Upon
consideration of
the present disclosure, one of skill in the art will appreciate that many
other embodiments and
variations may be made in the scope of the present invention. Accordingly, it
is intended that
the broader aspects of the present invention not be limited to the disclosure
of the following
examples. The present invention is not to be limited in scope by the
exemplified embodiments
which are intended as illustrations of single aspects of the invention, and
compositions and
methods which are functionally equivalent are within the scope of the
invention. Indeed,
numerous modifications and variations in the practice of the invention are
expected to occur to
those skilled in the art upon consideration of the present preferred
embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention
are those which
appear in the appended claims.
All references cited within the body of the instant specification are hereby
incorporated
by reference in their entirety.
5.0 EXAMPLES
5.1 EXAMPLE 1
Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared
from
various human tissues and in some cases isolated from a genomic library
derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger
sequencing
techniques. The inserts of the library were amplified with PCR using primers
specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were
spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to
obtain signature
sequences. The clones were clustered into groups of similar or identical
sequences.
Representative clones were selected for sequencing.
In some cases, the 5' sequence of the amplified inserts was then deduced using
a typical
Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377
Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.
5.2 EXAMPLE 2
Assemblage of Novel Contigs
The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were
assembled using an EST sequence as a seed. Then a recursive algorithm was used
to extend the
seed EST into an extended assemblage, by pulling additional sequences from
different
databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and
UniGene, and
exons from public domain genomic sequences predicted by GenScan) that belong
to this
assemblage. The algorithm terminated when there were no additional sequences
from the
above databases that would extend the assemblage. Further, inclusion of
component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with
BLAST
score greater than 300 and percent identity greater than 95%.
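The extension procedure described above may be sketched as follows. This is a
minimal illustration under stated assumptions and is not Hyseq's actual
implementation: the loop is written iteratively rather than recursively, and
the BLASTN comparison is represented by a caller-supplied function (here named
blast_hit, a hypothetical name) that returns a score and percent identity for a
candidate against the current assemblage. The pairwise values in PAIR_HITS are
invented solely to exercise the code.

```python
def extend_assemblage(seed_id, candidate_ids, blast_hit,
                      min_score=300, min_identity=95.0):
    """Grow an assemblage from a seed until no candidate passes the cutoffs."""
    members = [seed_id]
    remaining = set(candidate_ids) - {seed_id}
    added = True
    while added:                      # stop when a full pass adds nothing
        added = False
        for cand in sorted(remaining):
            score, identity = blast_hit(members, cand)
            if score > min_score and identity > min_identity:
                members.append(cand)
                remaining.discard(cand)
                added = True
    return members


# Hypothetical pairwise results standing in for real BLASTN output.
PAIR_HITS = {
    ("seed", "est1"): (450, 98.0),
    ("est1", "est2"): (320, 96.5),
    ("est2", "est3"): (500, 90.0),    # high score but identity too low
}


def toy_hit(members, cand):
    """Return the best (score, identity) of cand against any member."""
    best = (0, 0.0)
    for member in members:
        hit = PAIR_HITS.get((member, cand)) or PAIR_HITS.get((cand, member))
        if hit and hit > best:
            best = hit
    return best


if __name__ == "__main__":
    print(extend_assemblage("seed", ["est1", "est2", "est3"], toy_hit))
```

In this toy run est2 is admitted only because est1 was added first, which is
the behaviour the extension step is meant to capture.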
Table 8 sets forth the novel predicted polypeptides (including proteins)
encoded by the
novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and
their
corresponding translation start and stop nucleotide locations to each of SEQ
ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted.
Method A refers to
a polypeptide obtained by using a software program called FASTY (available
from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a
comparison of the
translated novel polynucleotide to known polynucleotides (W.R. Pearson,
Methods in
Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a
polypeptide obtained by using a software program called GenScan for
human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing)
that predicts
the polypeptide based on a probabilistic model of gene structure/compositional
properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by
reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary
software program that
translates the novel polynucleotide and its complementary strand into six
possible amino acid
sequences (forward and reverse frames) and chooses the polypeptide with the
longest open
reading frame.
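The six-frame translation underlying Method C may be illustrated as follows.
The Hyseq program itself is proprietary and is not reproduced here; this is a
sketch under assumptions of our own, namely that the standard genetic code is
used, that an open reading frame runs from the first methionine after a stop
(or after the start of the frame) to the next stop or the end of the frame, and
that codons containing other characters are translated as X. All function names
are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return the three forward and three reverse-strand translations."""
    forward = seq.upper()
    strands = (forward, reverse_complement(forward))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def longest_orf(seq):
    """Return the longest Met-initiated stretch found in any of six frames."""
    best = ""
    for protein in six_frames(seq):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCCAAATAG"))
```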
5.3 EXAMPLE 3
Novel Nucleic Acids
The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were
assembled from Hyseq's proprietary EST sequences as described in Example 1 and
human genome sequences that are available from the public databases
(http://www.ncbi.nlm.nih.gov/).
Exons were predicted from human genome sequences using GenScan
(http://genes.mit.edu/GENSCANinfo.html); HMMgene
(http://www.cbs.dtu.dk/services/HMMgene/hmmgene1_1.html); and GeneMark.hmm
(http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq
proprietary EST sequences and the predicted exons were assembled based on a
BLASTN hit to the extending assemblage with BLAST score greater than 300 and
percent identity greater than 95%. Then, the predicted genes were analyzed
using Neural Network SignalP V1.1 program (from Center for Biological Sequence
Analysis, The Technical University of Denmark) for presence of a signal
peptide. These sequences were further analyzed for absence of a transmembrane
region using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).
Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.
The homologs for polypeptides SEQ ID NO: 1042-2082, that correspond to
nucleotide sequences SEQ ID NO: 1-1041, were obtained by a BLASTP version 2.0a1
19MP-WashU search against Genpept release 124 using the BLAST algorithm. The
results showing homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown
in Table 2.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/
herein
incorporated by reference), all the polypeptide sequences were examined to
determine
whether they had identifiable signature regions. Scoring matrices of the
eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix
signature found
in the indicated polypeptide sequence, its description, and the results
obtained which include
accession number subtype; raw score; p-value; and the position of signature in
amino acid
sequence.
Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences
were examined for domains with homology to certain peptide domains. Table 4
shows the
name of the Pfam model found, the description, the e-value and the Pfam score
for the
identified model within the sequence. Further description of the Pfam models
can be found
at http://pfam.wustl.edu/.
The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San Diego,
CA) was used to predict the three-dimensional structure models for the
polypeptides encoded by SEQ ID NO: 1-1041 (i.e. SEQ ID NO: 1042-2082). Models
were generated by (1) PSI-BLAST which is a multiple alignment sequence
profile-based searching developed by Altschul et al. (Nucl. Acids. Res. 25,
3389-3408 (1997)), (2) High Throughput Modeling (HTM) (Molecular Simulations
Inc. (MSI), San Diego, CA) which is an automated sequence and structure
searching procedure (http://www.msi.com/), and (3) SeqFold™ which is a fold
recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209,
779-791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the
invention with
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional
structures
as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier
given to the template structure; "Chain ID", identifier of the subcomponent of
the PDB template structure; "Compound Information", information of the PDB
template structure and/or its subcomponents; "PDB Function Annotation", the
function of the PDB template as annotated by the PDB files
(http://www.rcsb.org/PDB/); the start and end amino acid position of the
protein sequence aligned; the PSI-BLAST score; the verify score; the SeqFold
score; and the Potential(s) of Mean Force (PMF). The verify score, produced by
GeneAtlas™ software (MSI), is based on the Profile-3D threading program
developed in Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and
Luthy, Bowie, and Eisenberg, Nature, 356:83-85 (1992)) and a publication by R.
Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA, 95:13597-13602. GeneAtlas
normalizes the verify score for proteins with different lengths so that a
unified cutoff can be used to select good models as follows:
Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
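As a worked illustration of the normalization, the formula may be written as a
small function. This is only a sketch: the function name is hypothetical, and
the "high score" is taken as a given input, since the manner in which GeneAtlas
derives it for a protein of a given length is not described here. A raw score
equal to the high score normalizes to 1, and a raw score equal to half the high
score normalizes to 0.

```python
def normalized_verify_score(raw_score, high_score):
    """Apply (raw - 1/2 high) / (1/2 high) to a raw verify score."""
    half_high = 0.5 * high_score
    if half_high == 0:
        raise ValueError("high score must be non-zero")
    return (raw_score - half_high) / half_high


if __name__ == "__main__":
    for raw in (50.0, 75.0, 100.0):
        print(raw, normalized_verify_score(raw, high_score=100.0))
```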
The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence
identity in the alignment used to build the model, and pairwise and surface
mean force potentials (MFP). As given in Table 5, a verify score between 0 and
1.0, with 1 being the best, represents a good model. Similarly, a PMF score
between 0 and 1.0, with 1 being the best, represents a good model. A SeqFold™
score of more than 50 is considered significant. A good model may also be
determined by one of skill in the art based on all the information in Table 5
taken in totality.
Table 6 shows the position of the signal peptide in each of the polypeptides
and the
maximum score and mean score associated with that signal peptide using Neural
Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The
Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic
signal peptides and their cleavage sites is also disclosed by Henrik Nielsen,
Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of
their cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997),
incorporated herein by reference. A maximum S score and a mean S score, as
described in the Nielsen et al. reference, were obtained for the polypeptide
sequences.
Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal
location.
Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-
1041, their corresponding polypeptide sequences SEQ ID NO: 1042-2082, their
corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their
corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and
the US
serial number of the priority application in which the contig sequence was
filed.
Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-
1041, the novel polypeptide sequences SEQ ID NO: 1042-2082, and the
corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.
Table 1
Tissue Origin   RNA/Tissue Source   Library Name   SEQ ID NO:
adrenal gland Clontech ADR002 13 23 34 45 77 111
115 122 187
194 210-211 249-250
255 290
320 357-358 362 420
443 451
492 499 551 577 630
698 702
713 718 805 808 819
841-843
845 861 896 899 909
924 937
949 985 1037
adult bladder Invitrogen BLD001 9 87 189 320-321
358 563 768
840 970
adult brain Clontech ABR001 184-186 277 282 352
558 849
871 898 958
adult brain Clontech ABR006 30 45 170 199 210
226 260 292-
294 340 357 413 443-444
478
499 551-552 579 582
584-588
632-637 646 654-655
676 683
731-732 755-756 777
813-827
861 872 874 880 883
1002 1012
adult brain Clontech ABR008 15 45 54 61 67 81
87 101 106
108 122-123 143-144
170 181-
183 195-209 215 222
245-248
261-270 283-289 292-293
296
306 308-310 327 340
358 370
394-407 409 421 428
440 442
459 477-478 496 531-547
551-
552 556 565-566 578-579
606
618 620-621 629-630
651 653-
655 664 667-668 707
713-714
729 745 750 753 756
772 779
788 790 793-794 799-800
802
808 812 823 826-827
849-850
859 862 872 883 885
898 917
919 921 930 935-936
947 974
985-986 992 1002
1006 1012
1028 1030 1036 1039
adult brain Clontech ABR011 1012
adult brain GIBCO AB3001 23 57-58 67 85 296
492 499 579
853 898-899 950 1012
adult brain GIBCO ABD003 45 59-62 67 72 82
85-88 156
179-180 182 296 299
355-356
440 458 474 483 499
563 823
840 852 860 885 898
992 999
1012
adult brain Invitrogen ABR014 45 115 238 470 599
653 974-976
adult brain Invitrogen ABR015 45 600 885 1012
adult brain Invitrogen ABR016 599 1012
adult brain Invitrogen ABT004 34 45 54 74 84 118
138-143 170-
171 180-181 208 255
277 359
379 428 438 499 501
536 715
731 783 793 799 805
809 824
862 898 912 977 998
1012
adult cervix BioChain CVX001 23 26 48 54 57 67
77 118 121
177 183 238 255 271-272
296
303 311-319 325 352
361-362
411-412 419-420 424
428 440
447 478 541 567 569
599-600
622 699 793 805 813
831 836-
837 839 844-845 848
863 872
913 928-929 944 958
965 970
973 1001 1004
adult colon Invitrogen CLN001 250 322-325 429 630
788 970
985
adult heart GIBCO AHR001 28-30 45 61 67 90-94
118 122
150-151 183 193 250-251
279
349-351 369-370 410
419 474
483 485 490 493 552
563 719
773 835-836 853 861
961 976
1030
adult kidney GIBCO AKD001 24 31-34 44-46 48
55 62 67 81
121 144 151 162 176-178
183
251 255 258 277 352
358 369-
370 386 408 420 429
483 490
536 546 579 599-600
602 645
698 793 805 874 898
913
adult kidney Invitrogen AKT002 32 53-54 67 85 177
251 260 341
386 408 419-420 431-436
478
490 493 507 561 582
596-599
698 728 788 805 819
837 844-
848 885 898 969 989
1013
adult liver Clontech ALV003 101 121 193 579 638-639
729
890-893 919 1007 1017
adult liver Invitrogen ALV002 75 157 173 183 212-214
236 240
263 292 323 335 386
408 415
495-499 552 577 589
599 727
782 858 869 898-900
924 968
adult lung GIBCO ALG001 67 77 152 369 386
419 443 483
583 732 849 907
adult ovary Invitrogen AOV001 5 26 34 43 45 48 55
61-62 64-67
77 87 101-102 105
115 118 122-
129 143 151 155-163
170 174-
175 177 181-183 193
251-252
286 292 338 347 353-354
369
381 410 415 420 424
451 458
483 489 497 499 515
536 541
546 552 577 579 595
599-600
604 647 658 661 665
699 744
782-783 800 805-806
814 831
835 839-840 844 853
874 895
898-899 913 924 929
941-942
949 973 977 994 1004
1007 1012
1016 1031 1037
adult placenta Clontech APL001 67 419 688 728 848
930
adult spleen Clontech SPLc01 82 101 187 255 260
358 370 447
483 489 579 586 648
768 835
845 848 853-857 863
885 913
917 962 986
adult spleen GIBCO ASP001 87 105 108 122 158
172 215 299
380 492 499 552 599
622 785
830 840 850 889
adult testis GIBCO ATS001 68-69 106 183 251
301 360 386
520 541 570 753 788
832 840
890 916
bone marrow Clontech BMD001 10-12 16-19 24-26
35 46 48 58
77 85 95-96 98-99
122 156 164
172 187 222 251 385
424 429
458 478 483 489 519
568-569
599 622-623 630-631
696 700
758 765 794 844 914
919 924
944 971 985 992 1001
1017
bone marrow GF BMD002 23 45 81-82 104-105
115 136
144 156 170 172-173
181 183
247 287 292 306 319-320
327
362 370 418 478-483
489 492
536 548-552 565 569-570
572
579 596 599 614-622
630 640-
641 643 653 668 691
699 708
715-718 726 743 756
758 772
789 841 889 917 920
947 958
994 1006 1010 1037
1039
cultured preadipocytes Stratagene ADP001 121 255 400 490-494
511 629
689 758 793 835 861
913 944
949 984
endothelial cells Stratagene EDT001 34 45 54 58 67 120-122
144 151-
154 183 193 299 385
440 451
458 483 490 499 515
552 563
569 577 579 599 622-623
752
793 800 844-845 898-899
942
944 949
fetal brain Clontech FBR001 139 168 356 599 702
712 831
845 850 872-873 898
921 1037
fetal brain Clontech FBR004 138 168 250 363 873-875
882
fetal brain Clontech FBR006 14 29 45 51 81 87
101 104 118
131 143-144 157 171
177 206
208-209 215 229 238
251 261
273 279 283 291-293
326-332
358 362 370-371 397
400 402
413 419 428 461 472
485 551-
560 568-569 579 618
620 629-
630 653-657 659-661
663-673
675 700 714 739-742
744-746
766 779 793 809 815
819 822
840 850 859 862 872
875-885
930 958 972 995 1002
1006 1028
1030-1031 1038
fetal brain GIBCO HFB001 13-15 54-57 62 67
70-72 84 121
174 177 180 183 410
417 424
485 518 520 542 552
578-579
599 785 793 805 831-832
840
858 871 883 898-899
977 1012
fetal brain Invitrogen FBT002 7 45 49 144-149 157
180 255 263
356 493 501 600 630
707 748
832 845 858 913 1012
fetal heart Invitrogen FHR001 24 45 81-82 104 114-115
118
121 144 152 181 239
247 288
292 327 362 370 381
419 428
444 453 458 478 486
493 503
569 571 576 582 596
618 640
668 674-688 719-722
731 744
753 762 772 784 794
819 823
836 850 885 914 944
949 957-
958 1017
fetal kidney Clontech FKD001 82 107 208 458 483
485 536 758
760 819 836 894 1017
fetal kidney Clontech FKD002 61 101 105 183 189
238 247 263
292 327 340 370 405
416 419
517 569 586 620 648
668 689-
691 731 746-752 763
771-772
787-788 819 840 842
854 861
872 944 958 961 969
fetal kidney Invitrogen FKD007 116
fetal liver Clontech FLV002 410 429 454 692-695
704 781
805 894-895 1017
fetal liver Clontech FLV004 67 107 115 118 151
187 241 255
287 370 466 478 492
518 548
552 569 582 589 630
653 668
696-699 752-757 784
789 805
885 908 985
fetal liver Invitrogen FLV001 45 101 130-137 157
222 240 337
386 428-429 492 552
589 693
727 840
fetal liver-spleen Columbia University FLS001 1-9 18 20-23 27 34
36-38 45 55
67 70 83 89 94 118
122 158 164
172-173 177 183 219
238 240
246 251 292 299 323
335 338
358 369 376 385-386
397 408
416 419 421-422 429
451 456-
460 466 472 478 483
489-490
493 516 536 543 546
551 569-
573 579 586 588-589
593-595
599-603 619 622 668
676 691
699 702 724 731 734
743 787
789 794 800 805 834-835
840
848 853 874 880 885
890-891
899 908 910 923 926-927
930
939-940 944 949 958
973 980
992 999 1004 1007
1009 1013
fetal liver-spleen Columbia University FLS002 3 8 17 22 36-37 46
55 61 63 70
72 85 89-90 94 106
122 148 156
158 165 172 177 181
194 213
215 219 246 251 292
299 304-
307 323-324 338 346
355 366
371 374 380-381 386
392 397
410 417 421 440 455
462-464
466-468 489-490 492-493
507-
521 536 552 565-566
569 571-
576 592 596 599 619
630 650
655 661 688 698-699
712 718
723-729 731 735-737
753 767
783 824 831 834 840
845 871
885 891 894 899 902
906-909
913 923-930 940 943
949 958
973 980 992 999 1003
1007 1017
1032 1040-1041
fetal liver-spleen Columbia University FLS003 23 67 106 150 158
193 338 374
376 411 443 478 493
546 565
569-570 582 589 609-613
630
661 699 724 727-734
767 809
812 834-835 845 880
890 910
929-930 958 973 980
985 1013
fetal lung Clontech FLG001 728 824 1008
fetal lung Clontech FLG004 115 668
fetal lung Invitrogen FLG003 120 183 322 333-336
476 516
691 831 835 850 1012
fetal muscle Invitrogen FMS001 45 338-339 365 369
386 429 431
496-497 789 793 856
970 1008
1019 1033 1035
fetal muscle Invitrogen FMS002 45 115 171 247 327
365 370 405
536 642-652 668 710-711
719
726 758-761 765 836
899 901
907 913 948 965 1037
fetal skin Invitrogen FSK001 29 57 67 74 81 118
152 177 180
193 294 340-342 345
375 397
419 437-443 445-451
454 475
532 541 546 565 598
604 630
650 668 728 742 772
789 793
804-805 823 828-830
837 840
849 899 901 922 958
970 1007
1022 1033
fetal skin Invitrogen FSK002 34 45 77 81 85 115
173 200 279
292-293 360 370 381
419 428-
429 451 466 490 551
569-570
579 600 604 630 647
668 698
700-706 729 731 746
750 758
762-766 768-773 780
794 840
850 859 861 885 901
911 913
957 961 965 973 1038
fibroblast Stratagene LFB001 55 72 143 255 490
502-505 587
599 627 861 863 885
984 1037
induced neuron-cells Stratagene NTD001 30 82 111 124 181
206 356 392
410 417 484-488 578
831-834
898 977 1036 1039
infant brain Columbia University IB2002 18 21 45 66 73-75
100-103 118
152 168-171 177 180
241-242
252 292-295 340 345
366-367
413 438 454 499 501
542 561-
562 578-580 599 668
702 728-
729 745 765 768 772
793 796-
799 823-824 863 874
887 899
948-949 967 975 977
981 983
992 995 1012
infant brain Columbia University IB2003 81 101 113 118 177
180 241 252
293 340 345 367 371
379 381
400 417 499-501 536
562 578
580-581 629-630 702
713 745
796-805 824 831 837
840 845
874 885 967 977 981
985 1012
1030
infant brain Columbia University IBM002 168 358 413-414 913
infant brain Columbia University IBS001 415 417 533 581 886-888
977
leukocyte Clontech LUC003 77 619 889 949
leukocyte GIBCO LUC001 34 36 38-42 50-52
55 67 77 81-
83 85 121 137 144
158 172 183
223 226 251 254 258
291 324
368-374 378 424 429
443 483
492 536 552 564 600
602 732
760 768 782 785 805
838 844-
845 848 850 889 898
905 908
946 973 992
leg 55 72 143
255 490
502-505 587
599
627 861 863
885
984 1037
lung tumor Invitrogen LGT002 55 61 65 77-79 82
102 105 115
156-157 165-167 170
182-183
197 243-244 251 253
296-297
325 370 386 418-419
421-425
478 483 492 499 520
531 533
541 569 577 582 600
788 844-
845 848 874 899 911
913 916-
918 939 944 949 956
970 976
lymph node Clontech ALN001 47 63 104-105 183
483 492 691
894 1017
lymphocytes ATCC LPC001 45 53 77 158 193 251
392 421
455 469-474 483 507
536 546
579 581 618 621 640
765 780-
787 793 838 845 875
924 968
978 999
macrophage Invitrogen HMP001 122 147 157 183 251
255 493
738 898-899 903-905
mammary gland Invitrogen MMG001 45 64 67 83-84 101
113 143 148
152 158 164 177 181-183
189
216-218 253 255 258
263 274
299 336 419 421 423
426-430
440 466 478 490 520
533 536
564 569 579 582 630
646 753
768 782 789 800 835
840 848
850 883 912-913 944
950 958
melanoma from cell line ATCC #CRL-1424 Clontech MEL004 62 158 181 298 362
364 402 419
515 536 896-897 958
973 1004
1008
*Mixture of 16 tissues - mRNA Various Vendors CGd010 353 358 823 942 982
1020
*Mixture of 16 tissues - mRNA Various Vendors CGd011 569 630 944 955 999
*Mixture of 16 tissues - mRNA Various Vendors CGd012 9 38 59 63 80 85 122-123
152
154 177 195 217 232
246 250
296 300 306 323-324
381 427
434 438-439 478 489
499 507
517 538 558 565 571
575 630
657 681 701 736 762
792 800
802 823-824 861 871-872
899
929 941 955 968 974
985-1003
1006 1011-1012 1033
*Mixture of 16 tissues - mRNA Various Vendors CGd013 232 434 748 956-958
992
*Mixture of 16 tissues - mRNA Various Vendors CGd015 18 69 115 324 335
548 551 569
582 600 622 731 819
899 911
944 957-958 1012 1017-1018
*Mixture of 16 tissues - mRNA Various Vendors CGd016 46 172 183 323 371
481 493 565
569 571 596 599 630
654 698
745 762 786 849 907
944 1004-
1013 1037 1039
neuronal cells Stratagene NTU001 7 33 45 107 113 121
150 183 286
385 440 478 483 485
487 489
536 569 582 756 768
772 819
836 944 958 966 1001
pituitary gland Clontech PIT004 158 222 255 345 356
370 379
569 579 819 831 861-862
885
898 922 1017
placenta Clontech PLA003 7 36 61 279 419 478
489 582 586
599 641 647 668 681
707-711
774-779 1001
placenta Invitrogen APL002 57 173 536 728 793
800
prostate Clontech PRT001 26 219-222 229 412
599 665 762
835 837 860 878 951
1031
rectum Invitrogen REC001 9 292 343-346 431
546 714 800
863 918
retinoic acid-induced neuronal cells Stratagene NTR001 112 400 478 569 582
629 756
758 800 819 831 835-836
850
906 944 958
salivary gland Clontech SAL001 58 61 77 118 150 158
294 347-
348 483 492-493 546
752 830
915
skeletal muscle Clontech SKM001 80 118 247 365 483
719 805 812
823
small intestine Clontech SIN001 34 37 45 52 60 93
106 119 121
138 144 177 180 208
223-225
238 247 294 323 335-336
343
362 370 380 386 397
409-411
416 420 440 451 455
478 489
493 536 571 577 579
590 602
604-608 614 622 624-628
655
668 688 700 714 805-812
831
841 872 894 899 914
924 926
929 958 961 965 973
991 998
1017
spinal cord Clontech SPC001 51 164 182-183 190
226-228
255-257 275-277 286
296 299
451 454 542 552 579
591 728
753 770 786 790 831
835 849-852 898 907 958 1000
1012
stomach Clontech STO001 72 222 232 247 258
366 645
thalamus Clontech THA002 45 49 113 155 164
180 183 191-192 208 229-232 238
345 417
443 512 551 558 592
630 728
800 823 840 858-860
885 898
976 1012
thymus Clontech THM001 45 141 160 183 258
360 378-379
418 451 460 569 602
619 731
788-790 819 835 845
958 965
1004
thymus Clontech THMc02 47 108 115 121 144
157 173 247
259-260 300 327 340
358 362
375-393 409 453 455
461 478-479 489 551 565 569-570
579
582 615 630 640 653
668 708
744 752 758 766 790-795
810
819 823 835-836 845
850 853
861 885 911 919 938
958 962
994 1001 1027
thyroid gland Clontech THR001 46 58 67 80 82 144
160 177 183
193-194 233-235 251
255 263
268 278-280 286 299
301-303
324 358 370 386 397
408 410
420 440 474 483 493
506 519-520 533 594 599-600
602 658
661 719 758 772 785
788 793
830 851 853 864-867
898 904
909 924 929 961 973
991 998
1001 1009
trachea Clontech TRC001 45 154 236 238 281
323 416 571
602 868-869 913
umbilical cord BioChain FUC001 34 45 54 58 67 70
85 152 154
177 180 188 208 251
299 370
409 415 419 434 451-455
483
596 599 647 661 733
742 793
808 839-840 845 849-850
861
888 911 913 992
uterus Clontech UTR001 177 237-239 255 258
417 493
520 567 599 604 646
844 870
874 898 973
young liver GIBCO ALV001 45 419 440 443 490
653 732 753
805 845 898 904
*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult
brain mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal
fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA (Invitrogen),
5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA
(Invitrogen), 7) Normal fetal skin mRNA (Invitrogen), 8) Human adrenal gland
mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human leukemia
lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) Human
lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) Human
thyroid mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human
conceptional umbilical cord mRNA (BioChain).
Table 2
SEQ ID NO:  Accession No.  Species  Description  Score  % Identity
1044  AAB32400  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 30 SEQ ID NO:86.  339  100
1044  AAM74711  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35017.  335  100
1044  AAM61909  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34014.  335  100
1045  gi3859599  Arabidopsis thaliana  similar to class I chitinases (Pfam: PF00182, E=1.2e-142, N=1)  74  27
1045  gi15292107  Drosophila melanogaster  LD38671p  74  33
1045  gi2258324  Fusarium oxysporum f. sp. ciceris  yellowing-associated protein  73  32
1046  gi17428204  Ralstonia solanacearum  CONSERVED HYPOTHETICAL PROTEIN  74  32
1046  gi4314432  Homo sapiens  similar to phosphatidylinositol (4,5)bisphosphate 5-phosphatase; match to PID:g1399105  71  30
1046  gi|17545909|ref|NP_519311.1  Ralstonia solanacearum  CONSERVED HYPOTHETICAL PROTEIN  74  32
1047  gi9756017  Actinoplanes sp. 50/110  alpha-amylase  69  38
1047  gi|6572499|gb|AAF17291.1|  Homo sapiens  LHX3 protein  67  26
1047  gi|18572988|ref|XP_029170.2  Homo sapiens  LIM homeobox protein 3  67  26
1048  AAY28474  Homo sapiens  UYJO Human Capon protein.  721  99
1048  gi2895555  Homo sapiens  carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase  721  99
1048  gi2895557  Rattus norvegicus  carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase  654  92
1049  gi19713721  Fusobacterium nucleatum subsp. nucleatum ATCC 25586  GTP-binding protein era  66  28
1050  gi31291  Homo sapiens  fumarylacetoacetase (AA 1-349)  175  70
1050  gi182393  Homo sapiens  fumarylacetoacetate hydrolase  175  70
1050  gi12803409  Homo sapiens  fumarylacetoacetate  175  70
1052  gi4680089  Human immunodeficiency virus type 1  envelope glycoprotein  79  26
1052  gi3868997  Ephydatia fluviatilis  EFPDE2  74  20
1052  gi4679590  Human immunodeficiency virus type 1  envelope glycoprotein  74  25
1054  gi3844648  Mycoplasma genitalium  glycerol kinase (glpK)  71  28
1054  gi18448155  Ipomoea leaf curl virus  AC3  70  27
1054  gi|12044888|ref|NP_072698.1  Mycoplasma genitalium  glycerol kinase (glpK)  71  28
1056  AAM56747  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28852.  229  72
1056  AAM67067  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27373.  224  69
1056  AAM54664  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26769.  224  69
1058  gi|13310191|gb|AAK18189.1|AF331500_1  multiple sclerosis associated retrovirus element  recombinant envelope protein  228  79
1058  gi|21103962|gb|AAM33141.1  Homo sapiens  enverin-2  209  77
1058  gi|8272468|gb|AAF74215.1|AF156963_1  Homo sapiens  envelope protein  198  75
1059  gi20380199  Homo sapiens  Similar to LOC168246  251  100
1059  gi|8388692|emb|CAB94042.1|  Leishmania major  probable DNA-binding protein  67  46
1060  gi|21292780|gb|EAA04925.1|  Anopheles gambiae str. PEST  agCP4203  70  39
1061  gi330862  Equine herpesvirus 1  membrane glycoprotein  179  30
1061  gi17221106  Equine herpesvirus 1  glycoprotein gp2  178  34
1061  AAE03643  Homo sapiens  INCY- Human extracellular matrix and cell adhesion molecule-7 (XMAD-7).  175  29
1062  gi|11037117|gb|AAG27485.1|AF194537_1  Homo sapiens  NAG13  334  66
1062  gi|1335205|emb|CAA36480.1|  Homo sapiens  ORFII  332  66
1063  gi21323402  Corynebacterium glutamicum ATCC 13032  ABC-type transporter, periplasmic component  70  36
1063  gi|19551869|ref|NP_599871.1|  Corynebacterium glutamicum  COG1464:ABC-type uncharacterized transport systems, periplasmic component  70  36
1063  gi|17551878|ref|NP_499090.1  Caenorhabditis elegans  TPR Domain  67  37
1064  gi2308977  Aspergillus nidulans  chitin synthase  66  29
1065  gi18076958  Yarrowia lipolytica  Opt1 protein  74  30
1065  gi786145  Walleye dermal sarcoma virus  envelope polyprotein  73  28
1065  gi2801522  Walleye dermal sarcoma virus  gPr env  73  28
1066  gi9294279  Arabidopsis thaliana  Ta11-like non-LTR retroelement protein-like; CHP-rich zinc finger protein-like  67  32
1066  gi|20848817|ref|XP_138010.1  Mus musculus  similar to HEAT SHOCK COGNATE PROTEIN 80  83  69
1069  AAM77637  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37943.  96  65
1069  AAM64901  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37006.  96  65
1069  gi|17473741|ref|XP_062380.1  Homo sapiens  similar to Meningioma-expressed antigen 6/11 (MEA6) (MEA11)  112  56
1070  gi296288  Homo sapiens  histone H1  77  44
1070  gi5923857  Artemisia annua  squalene synthase  75  35
1070  AAO08837  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 22729.  73  39
1071  gi21483554  Drosophila melanogaster  SD02058p  72  29
1071  gi8515845  Homo sapiens  hepatocellular carcinoma associated protein TD26  71  38
1071  gi|21483554|gb|AAM52752.1|  Drosophila melanogaster  SD02058p  72  29
1072  gi5902896  Streptomyces avermitilis  type I polyketide synthase AVES 4  74  50
1072  gi|21301752|gb|EAA13897.1  Anopheles gambiae str. PEST  agCP8235  70  34
1073  AAV30916_aa1  Homo sapiens  GEMY Human secreted protein AR415 4 cDNA.  99  66
1073  ABB89113  Homo sapiens  HUMA- Human polypeptide SEQ ID NO 1489.  99  66
1073  AAB90679  Homo sapiens  GEMY Human AR415 4 protein sequence SEQ ID 35.  99  66
1074  AAG99338  Homo sapiens  TAKE Human atypical tachykinin protein fragment SEQ ID NO: 20.  380  92
1074  AAG99336  Homo sapiens  TAKE Human atypical tachykinin protein fragment SEQ ID NO: 13.  329  91
1074  AAG99333  Homo sapiens  TAKE Human atypical tachykinin protein fragment SEQ ID NO: 3.  324  91
1075  gi17945760  Drosophila melanogaster  RE33302p  305  29
1075  gi1039447  Saccharomyces cerevisiae  Lpblp  91  25
1075  AAB64777  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 5 SEQ ID NO:63.  78  77
1076  AAB50261  Homo sapiens  CORI- Human breast cancer associated B726P-20 protein.  308  39
1076  AAB50244  Homo sapiens  CORI- Human breast cancer associated B726P-79 protein.  308  39
1076  AAB84702  Homo sapiens  CORR Amino acid sequence of a human cancer associated antigen.  308  39
1077  gi2529735  Gorilla gorilla  glycophorin B/E precursor  71  31
1077  AAB74724  Homo sapiens  INCY- Human membrane associated protein MEMAP-30.  70  31
1077  gi4164424  Schizosaccharomyces pombe  similar to yeast cytoskeleton control protein Bni1p  70  24
1078  gi18145107  Clostridium perfringens  probable transcriptional regulator  71  28
1078  gi|9581801|emb|CAC00546.1  Plasmodium falciparum  guanylyl cyclase  69  24
1078  gi|16805032|ref|NP_473061.1  Plasmodium falciparum  Ser/Thr protein kinase  69  26
1079  gi|20886321|ref|XP_140614.1  Mus musculus  similar to olfactory receptor, family 5, subfamily V, member 1; olfactory receptor, family 5, subfamily V member 1  72  34
1081  gi9650824  Petroselinum crispum  common plant regulatory factor 5  76  28
1081  gi559695  Hydrolagus colliei  This CDS feature is included to show the translation of the corresponding C_region. Presently translation qualifiers on C_region features are illegal  74  31
1081  gi476622  Hydrolagus colliei  immunoglobulin light chain  74  31
1082  AAM39205  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 2350.  363  71
1082  AAO07159  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 21051.  357  76
1082  AAM40991  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 5922.  343  79
1083  gi|17229222|ref|NP_485770.1  Nostoc sp. PCC 7120  similar to HetF protein  72  30
1084  gi17221628  Felis catus  T-lymphocyte surface CD2 antigen  76  38
1084  gi18565073  Crimean-Congo hemorrhagic fever virus  envelope glycoprotein precursor  74  29
1084  gi|17221628|dbj|BAB78475.1  Felis catus  T-lymphocyte surface CD2 antigen  76  38
1085  gi17430213  Ralstonia solanacearum  PUTATIVE HEMAGGLUTININ-RELATED PROTEIN  74  26
1087  gi2323287  multiple sclerosis associated retrovirus  polyprotein  618  79
1087  gi|4996596|dbj|BAA78549.1|  Human endogenous retrovirus W  polyprotein  317  74
1087  gi|9630708|ref|NP_047255.1  Feline leukemia virus  gag-pol precursor polyprotein gPr80  293  38
1088  gi15075953  Sinorhizobium meliloti  PUTATIVE MOLYBDENUM TRANSPORT SYSTEM PERMEASE ABC TRANSPORTER PROTEIN  70  56
1088  gi2288880  Arthrobacter nicotinovorans  transmembrane protein  67  56
1088  gi17298547  Bradyrhizobium japonicum  ModB  67  56
1089  AAY95660  Homo sapiens  ZYMO Human Zntr2 protein.  231  61
1089  AAU83682  Homo sapiens  GETH Human PRO protein, Seq ID No 182.  210  59
1089  AAY99386  Homo sapiens  GETH Human PRO1305 (UNQ671) amino acid sequence SEQ ID NO:153.  210  59
1090  gi7688355  Solanum tuberosum  Dof zinc finger protein  70  31
1090  gi4389445  Drosophila melanogaster  transcription factor  67  32
1090  gi|7688355|emb|CAB89831.1  Solanum tuberosum  Dof zinc finger protein  70  31
1092  AAG78884  Homo sapiens  BIOW- Human ribosomal protein s5-17.  90  44
1092  AAM91239  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:18832.  72  53
1092  AAM95026  Homo sapiens  HUMA- Human reproductive system related antigen SEQ ID NO: 3684.  72  48
1094  gi18676450  Homo sapiens  FLJ00122 protein  69  38
1094  gi18073428  Homo sapiens  stabilin-2  69  38
1094  gi|20806091|ref|NP_060034.8  Homo sapiens  stabilin-2; CD44-like precursor FELL  69  38
1095  gi20906397  Methanosarcina mazei Goe1  conserved protein  76  44
1095  gi|21299784|gb|EAA11929.1|  Anopheles gambiae str. PEST  agCP6531  75  30
1095  gi|17549046|ref|NP_522386.1  Ralstonia solanacearum  CONSERVED HYPOTHETICAL PROTEIN  73  32
1096  AAB58317  Homo sapiens  ROSE/ Lung cancer associated polypeptide sequence SEQ ID 655.  678  100
1096  gi862600  Drosophila melanogaster  male-specific lethal-1 protein  176  25
1096  gi601930  Oryctolagus cuniculus  neurofilament-H  115  24
1097  AAU83109  Homo sapiens  ZYMO Novel secreted protein Z701935G4P.  76  85
1097  gi|20348496|ref|XP_111712.1  Mus musculus  similar to RIKEN cDNA 9030605E16  72  57
1098  gi18031887  Mus musculus  Fanconi anemia complementation group G  77  29
1098  gi12002137  Mus musculus  Fanconi anemia group G protein  77  29
1098  AAB72381  Homo sapiens  LEEM/ Human hairy and enhancer of split homologue amino acid sequence.  75  28
1099  gi8217648  Homo sapiens  dJ579F20.1 (high-mobility group (nonhistone chromosomal) protein 1-like 1)  159  70
1099  gi5815432  Gallus gallus  high mobility group protein HMG1  154  70
1099  gi4140289  Gallus gallus  high mobility group 1 protein  154  70
1100  ABB11527  Homo sapiens  HYSE- Human apolipoprotein B receptor homologue, SEQ ID NO:1897.  84  26
1100  gi487347  Homo sapiens  breakpoint cluster region protein  81  32
1100  gi144050  Bordetella pertussis  filamentous hemagglutinin  78  30
1102  AAM68946  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29252.  327  81
1102  AAM79768  Homo sapiens  HYSE- Human protein SEQ ID NO 3414.  324  80
1102  AAM78784  Homo sapiens  HYSE- Human protein SEQ ID NO 1446.  324  80
1103  AAZ11186_aa1  Homo sapiens  SAGA Gene encoding transmembrane domain containing protein clone HP02239.  143  68
1103  AAD31079_aa1  Homo sapiens  INCY- Human cornichon protein (CORN) cDNA.  143  68
1103  AAA88439_aa1  Homo sapiens  GETH Antitumour PRO181 cDNA clone DNA23330-1390.  143  68
1104  ABB07527  Homo sapiens  INCY- Human drug metabolizing enzyme (DME) (ID: 5643401CD1).  562  100
1104  ABB07515  Homo sapiens  INCY- Human drug metabolizing enzyme (DME) (ID: 8097779CD1).  562  100
1104  gi13161409  Mus musculus  family 4 cytochrome P450  431  76
1107  gi13542874  Mus musculus  Similar to CGI-67 protein  677  64
1107  AAU81978  Homo sapiens  INCY- Human secreted protein SECP4.  665  65
1107  AAU77137  Homo sapiens  MILL- Human alpha/beta hydrolase 38618 polypeptide.  665  65
1108  gi13620885  Homo sapiens  mitochondrial ribosomal protein S6  323  100
1108  gi13620887  Mus musculus  mitochondrial ribosomal protein S6  284  82
1108  gi19713140  Fusobacterium nucleatum subsp. nucleatum ATCC 25586  Fusobacterium outer membrane protein family  79  28
1109  gi18378673  Homo sapiens  PATE  607  89
1109  gi5305193  Rattus norvegicus  sperm protein 10  108  30
1109  gi969103  Mus musculus  mSP-10  107  27
1110  gi2462979  Bos taurus  Tenascin-X  119  34
1110  gi3413958  Homo sapiens  LDL receptor related protein 105  110  27
1110  gi13938519  Homo sapiens  low density lipoprotein receptor-related protein 3  110  27
1111  gi17981053  Mus musculus  transcription factor NFAT5  82  32
1111  gi15425825  Mus musculus  tonicity-responsive enhancer binding protein  82  32
1111  gi6911148  Mus musculus  transcription factor NFAT5 isoform b  82  32
1112  gi6634473  Metarhizium anisopliae var. anisopliae  adenylate cyclase, ACY  73  30
1113  AAU19759  Homo sapiens  HUMA- Human novel extracellular matrix protein, Seq ID No 409.  900  70
1113  gi3171934  Mus musculus  neuronal-STOP protein  886  52
1113  gi2769587  Mus musculus  STOP protein  885  52
1114  gi18652188  Oenococcus oeni  OppF  72  41
1115  gi9119  Drosophila sp.  fos-related antigen  69  37
1115  gi7769652  Drosophila melanogaster  Fos-related antigen  69  37
1115  gi17862946  Drosophila melanogaster  SD04477p  69  37
1116  gi21212948  Mus musculus  peroxisomal protein (PeP)  243  83
1116  gi2347114  Mus musculus  CC chemokine receptor-5  72  28
1116  gi2431976  Mus musculus  CCR5  72  28
1117  gi|20825251|ref|XP_131998.1|  Mus musculus  similar to RE1-silencing transcription factor; neuron restrictive silencer factor; repressor binding to the X2 box  77  40
1117  gi|15597871|ref|NP_251365.1  Pseudomonas aeruginosa  probable type II secretion system protein  69  41
1118  gi|3860513|emb|CAA13574.1|  Mus famulus  reverse transcriptase  303  82
1118  gi|3860536|emb|CAA13577.1|  Mus saxicola  reverse transcriptase  303  81
1118  gi|3860510|emb|CAA13573.1  Mus dunni  reverse transcriptase  298  63
1119  AAO04758  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 18650.  234  59
1119  AAM69569  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29875.  220  63
1119  AAM67717  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28023.  219  49
1120  gi21107877  Xanthomonas axonopodis pv. citri str. 306  cytochrome C  78  27
1120  gi15292331  Drosophila melanogaster  LD47230p  77  42
1120  gi15072444  Avian paramyxovirus 6  phosphoprotein  72  38
1121  AAB44126  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:1571.  150  83
1121  gi550015  Homo sapiens  ribosomal protein L21  150  83
1121  gi619788  Homo sapiens  L21 ribosomal protein  150  83
1122  AAU74448  Homo sapiens  OULU- Human protein sequence of lysyl hydroxylase 1 (LH1).  125  100
1122  gi190074  Homo sapiens  lysyl hydroxylase  125  100
1122  gi5817297  Homo sapiens  lysyl hydroxylase 1  125  100
1123  gi21281601  Caenorhabditis elegans  C. elegans PQN-44 protein (corresponding sequence F55A12.9c)  78  34
1123  gi14578225  Caenorhabditis elegans  C. elegans PQN-44 protein (corresponding sequence F55A12.9b)  76  38
1123  gi2088669  Caenorhabditis elegans  C. elegans PQN-44 protein (corresponding sequence F55A12.9a)  76  38
1125  AAU17301  Homo sapiens  HUMA- Novel signal transduction pathway protein, Seq ID 866.  344  88
1125  AAE11776  Homo sapiens  INCY- Human kinase (PKIN)-10 protein.  344  88
1125  AAU17304  Homo sapiens  HUMA- Novel signal transduction pathway protein, Seq ID 869.  340  86
1126  AAM41712  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 6643.  152  96
1126  AAM39926  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3071.  152  96
1126  AAM79067  Homo sapiens  HYSE- Human protein SEQ ID NO 1729.  152  96
1127  AAE02938  Homo sapiens  MILL- Human adenylate cyclase 25678.  252  98
1127  AAB02006  Homo sapiens  TEXA Adenylyl cyclase type II-C2 C2 alpha domain.  252  98
1127  gi202752  Rattus norvegicus  adenylyl cyclase type II  252  98
1128  AAA94860_aa1  Homo sapiens  TEXA Human caspase activator Smac coding sequence.  96  100
1128  AAU78447  Homo sapiens  UYJE- Inhibitor of apoptosis (IAP) protein Smac.  96  100
1128  AAB26210  Homo sapiens  TEXA Human caspase activator Smac.  96  100
1129  gi3874765  Caenorhabditis elegans  Similarity to Drosophila acetylcholine receptor protein (SW:ACH1_DROME), contains similarity to Pfam domain: PF00065 (Neurotransmitter-gated ion-channel), Score=296.9, E-value=5e-86, N=3  97  30
1129  gi6681597  Yaba monkey tumor virus  similar to vaccinia G8R  72  28
1129  gi|17548199|ref|NP_509932.1|  Caenorhabditis elegans  acetylcholine receptor  97  30
1130  gi|17564116|ref|NP_506484.1  Caenorhabditis elegans  tyrosine-protein kinase  73  29
1131  gi13925613  Homo sapiens  insulinoma-associated protein IA-6  88  27
1131  gi158485  Drosophila melanogaster  son of sevenless protein  85  24
1131  gi7287782  05-Feb-1998  symbol=Sos; synonym=BG:DS00941.4; match=method:"sim4", score:"1000.0", desc:"GenBank::M83931:Drosophila melanogaster son of sevenless (Sos) mRNA, complete cds. CDS:346..5133; PID:g158485.", species:"Drosophila melanogaster"; match=method:"BLASTX", version:"2.0a19MP-WashU [Build sol2.5-ultra 01:47:30  85  24
1132  gi9696  Mytilus edulis  polyphenolic adhesive protein  75  25
1134  gi13562016  Plectreurys tristis  fibroin 2  72  29
1134  gi1129074  Bacillus subtilis  beta-N-acetylglucosaminidase  69  28
1134  gi2636104  Bacillus subtilis  N-acetylglucosaminidase (major autolysin (CWBP90)  69  28
1135  AAB58870  Homo sapiens  HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 578.  72  80
1135  gi11595476  Homo sapiens  RPB11b1beta protein  72  80
1135  AAB44840  Homo sapiens  HUMA- Human secreted protein encoded by gene 11.  69  45
1137  gi206985  Rattus norvegicus  troponin I  70  46
1137  gi16945895  Takifugu rubripes  SUN-like 1  70  31
1137  gi|8394466|ref|NP_058881.1  Rattus norvegicus  troponin I, skeletal, fast 2  70  46
1140  AAO04998  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 18890.  277  96
1140  gi19917538  Methanosarcina acetivorans str. C2A  mttA/Hcf106 protein  80  28
1140  gi4959705  Mus musculus  fibulin-2  76  28
1141  gi10141010  Vesicular exanthema of swine virus  non-structural polyprotein  91  31
1141  gi6566147  Drosophila melanogaster  large Forked protein  85  30
1141  gi2317953  murid herpesvirus 4  glycoprotein 150  79  28
1142  AAB54067  Homo sapiens  HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:519.  218  56
1142  gi1710365  Mus musculus  noggin  89  29
1142  gi21105761  Equus caballus  noggin  89  29
1143  gi|21295753|gb|EAA07898.1|  Anopheles gambiae str. PEST  agCP1560  69  26
1144  gi505094  Homo sapiens  similar to an actin bundling protein, dematin.  127  35
1144  gi2337952  Homo sapiens  actin-binding double-zinc-finger protein  122  36
1144  gi21304227  Oryza sativa  ovule development aintegumenta-like protein BNM3  76  29
1145  gi|21298336|gb|EAA10481.1|  Anopheles gambiae str. PEST  agCP2121  68  37
1146  AAW22049  Homo sapiens  INCY- Interferon gamma inducing factor-2 (IGIF-2) alternate transcript variant.  221  100
1146  AAV05368_aa1  Homo sapiens  SCHE cDNA encoding human interleukin-1-gamma.  167  84
1146  AAH78060_aa1  Homo sapiens  STRD Nucleotide sequence of human interleukin 18 (IL-18).  167  84
1147  AAY57937  Homo sapiens  INCY- Human transmembrane protein HTMPN-61.  123  100
1147  gi|20345904|ref|XP_109823.1  Mus musculus  similar to delta-like homolog (Drosophila)  105  86
1148  gi19069293  Encephalitozoon cuniculi  similarity to ADP/ATP CARRIER PROTEIN  75  32
1148  gi8978336  Arabidopsis thaliana  contains similarity to CHP-rich zinc finger protein~gene_id:K23F3.4  74  26
1148  gi19716318  Aspergillus flavus  antigenic cell wall protein MP1  74  32
1149  gi5456699  Emericella nidulans  ATP-binding cassette multidrug transport protein ATRC  70  35
1149  gi|20898840|ref|XP_139387.1|  Mus musculus  similar to HSPC038 protein  69  31
1150  gi3883128  Arabidopsis thaliana  arabinogalactan-protein  96  32
1150  gi17429208  Ralstonia solanacearum  CONSERVED HYPOTHETICAL PROTEIN  92  26
1150  gi4063766  Emericella nidulans  chitinase  91  27
1151  gi13561058  Homo sapiens  dJ1108D11.1 (novel protein similar to C. elegans T22C1.7)  107  31
1151  gi21105299  Mytilus galloprovincialis  precollagen-NG  105  26
1151  gi14164347  Oncorhynchus mykiss  collagen a1(I)  96  28
1152  gi18479434  Mus musculus  olfactory receptor MOR188-1  76  33
1152  gi2653915  Oran virus  glycoprotein G1 and G2 precursor; envelope glycoprotein precursor  72  46
1152  gi18479436  Mus musculus  olfactory receptor MOR188-2  72  33
1153  gi3403167  Homo sapiens  GBAS  161  86
1153  gi12804791  Homo sapiens  glioblastoma amplified sequence  161  86
1153  AAB57149  Homo sapiens  ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1727.  134  81
1154  gi17742234  Agrobacterium tumefaciens str. C58 (U. Washington)  histidase  87  35
1154  gi15159496  Agrobacterium tumefaciens str. C58 (Cereon)  AGR_L_1400GMp  87  35
1154  gi158521  Drosophila melanogaster  seven-up protein type 2  80  32
1155  gi|10441551|gb|AAG17099.1|AF189115_1  Cryptotermes domesticus  cytochrome b  65  28
1156  AAO12089  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 25981.  475  98
1156  gi20147787  Xenopus laevis  nuclear receptor corepressor  74  25
1156  gi19881705  Oryza sativa  Putative transposable element  72  32
1157  gi9963851  Homo sapiens  HT019  80  34
1157  AAB93530  Homo sapiens  HELI- Human protein sequence SEQ ID NO:12884.  77  34
1157  gi1040970  Homo sapiens  fus-like protein  77  42
1158  gi9795254  Sepia officinalis  GABA-A receptor beta subunit  71  27
1158  gi15026157  Clostridium acetobutylicum  amidase, germination specific (cwlC/cwlD B.subtilis ortholog)  68  34
1158  gi|9795254|gb|AAF97816.1  Sepia officinalis  GABA-A receptor beta subunit  71  27
1159  AAB93423  Homo sapiens  HELI- Human protein sequence SEQ ID NO:12641.  336  100
1159  gi13097768  Homo sapiens  Similar to RIKEN cDNA 2900073H19 gene  336  100
1159  gi20071708  Mus musculus  RIKEN cDNA 2900073H19 gene  334  96
1160  AAM72558  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 32864.  274  100
1160  AAM59959  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32064.  274  100
1161  AAB07704  Homo sapiens  INMR Protein encoded by the endogenetic fragment of HERV-W.  139  36
1161  gi8272464  Homo sapiens  gag  139  36
1161  gi|5726238|gb|AAD48375.1|AF123881_1  multiple sclerosis associated retrovirus element  gag polyprotein  131  35
1162  AAU25448  Homo sapiens  INCY- Human mddt protein from clone LG:1083264.1:2000MAY19.  346  79
1162  AAU11265  Homo sapiens  BODE- Human zinc finger protein 51.  319  65
1162  AAB95637  Homo sapiens  HELI- Human protein sequence SEQ ID NO:18371.  314  67
1163  gi14189950  Homo sapiens  connexin 58  536  84
1163  gi9957542  Homo sapiens  connexin 59  536  84
1163  gi10946367  Danio rerio  connexin 55.5  485  81
1164  gi755700  Bombyx mori  sericin1B  76  27
1164  gi19569861  Dictyostelium discoideum  RTOA protein (Ratio-A).  76  28
1164  gi10580635  Halobacterium sp. NRC-1  Vng1087c  76  25
1165  gi19915386  Methanosarcina acetivorans str. C2A  WD-domain containing protein  89  28
1165  gi5639663  Homo sapiens  WD repeat protein WDR3  83  28
1165  gi11544739  Homo sapiens  dJ776P7.2 (WD repeat domain 3  83  28
1166  AAM69338  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29644.  72  31
1166  AAM56953  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29058.  72  31
1166  gi20197507  Arabidopsis thaliana  expressed protein  67  39
1167  gi5802812  Homo sapiens  Gag protein  83  30
1167  gi7160650  Bordetella bronchiseptica  pertactin (P.68)  79  31
1167  gi13173444  Bordetella bronchiseptica  pertactin  79  31
1168  gi1495029  Danio rerio  protein kinase CK2 alpha'  84  24
1168  gi643443  Penicillium chrysogenum  PHOG  82  32
1168  gi|18858419|ref|NP_571315.1  Danio rerio  casein kinase 2 alpha 2  84  24
1169  gi206716  Rattus norvegicus  salivary proline-rich protein  90  31
1169  gi15029903  Mus musculus  Similar to proline-rich protein BstNI subfamily 2  89  36
1169  gi53182  Mus musculus  proline rich protein  81  34
1170  gi|17553370|ref|NP_498318.1  Caenorhabditis elegans  F40H6.5.p  78  33
1170  gi|15215731|gb|AAK91411.1  Arabidopsis thaliana  AT4g36780/C7A10_580  73  30
1171  gi340446  Homo sapiens  zinc finger protein 7 (ZFP7)  218  61
1171  AAB43928  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:1373.  216  58
1171  AAB21040  Homo sapiens  INCY- Human nucleic acid-binding protein, NuABP-44.  213  48
1172  AAE04368  Homo sapiens  INCY- Human kinase (PKIN)-9.  120  85
1172  AAM79153  Homo sapiens  HYSE- Human protein SEQ ID NO 1815.  120  85
1172  AAE10614  Homo sapiens  CURA- Human novel STE20-like protein, NOV-3d.  120  85
1173  gi218572  Pan troglodytes  prot GOR  74  29
1173  gi243898  Pan  GOR  74  29
1173  gi1666473  Mus musculus  NOV protein  71  50
1174  gi5901830  Drosophila melanogaster  BcDNA.GH07910  74  31
1174  AAM80237  Homo sapiens  HYSE- Human protein SEQ ID NO 3883.  71  38
1174  ABB11528  Homo sapiens  HYSE- Human secreted protein homologue, SEQ ID NO:1898.  71  38
1175  gi|12054759|emb|CAC20748.1  Podospora anserina  catalase A  65  33
1176  AAM93289  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 2777.  145  100
1176  gi17431512  Ralstonia solanacearum  PUTATIVE OUTER MEMBRANE CHANNEL LIPOPROTEIN TRANSMEMBRANE  71  26
1176  gi15823991  Streptomyces avermitilis  modular polyketide synthase  70  51
1177  AAM41939  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 6870.  84  61
1177  gi870751  Homo sapiens  N-acetylgalactosamine 6-sulfate sulfatase (GALNS)  84  61
1177  gi618426  Homo sapiens  N-acetylgalactosamine 6-sulphatase  84  61
1178  gi435855  Mus sp.  CREB-binding protein; CBP  89  22
1178  AAW40058  Homo sapiens  USSH Cellular transcriptional factor CBP.  87  22
1178  gi17944308  Drosophila melanogaster  RE12101p  86  26
1179  AAM25814  Homo sapiens  HYSE- Human protein sequence SEQ ID NO:1329.  73  93
1179  AAM25290  Homo sapiens  HYSE- Human protein sequence SEQ ID NO:805.  73  93
1179  AAM79441  Homo sapiens  HYSE- Human protein SEQ ID NO 3087.  73  93
1180  AAB88388  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0131.  719  97
1180  gi20810493  Homo sapiens  Similar to RIKEN cDNA 2810417M05 gene  716  96
1180  AAD30543_aa1  Homo sapiens  MILL- Human B7RP-2 DNA.  83  38
1181  ABB14686  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 3343.  190  97
1181  gi14329731  Secale cereale  high molecular weight glutenin subunit x  88  27
1181  gi14329761  Triticum aestivum  high molecular weight glutenin subunit x  84  26
1182  gi11692645  Mus musculus  aspartyl beta-hydroxylase  74  28
1182  gi11878112  Mus musculus  aspartyl beta-hydroxylase 6.6 kb transcript  74  28
1182  gi11878110  Mus musculus  aspartyl beta-hydroxylase 4.5 kb transcript  74  28
1183  gi15485622  Homo sapiens  Q9H4T4 like  80  25
1183  gi19714949  Fusobacterium nucleatum subsp. nucleatum ATCC 25586  TonB protein  78  32
1183  gi7717375  Homo sapiens  human CHD2-52 down syndrome cell adhesion molecule  71  23
1184  AAU83667  Homo sapiens  GETH Human PRO protein, Seq ID No 152.  388  100
1184  AAG89161  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 281.  388  100
1184  AAY99348  Homo sapiens  GETH Human PRO1194 (UNQ607) amino acid sequence SEQ ID NO:29.  388  100
1185  AAB93506  Homo sapiens  HELI- Human protein sequence SEQ ID NO:12830.  543  100
1185  AAB87570  Homo sapiens  GETH Human PRO1268.  426  95
1185  AAY78808  Homo sapiens  PROT- Hydrophobic domain containing protein clone HP10537 protein sequence.  426  95
1187  gi15823978  Streptomyces avermitilis  modular polyketide synthase  75  41
1187  AAB66657  Homo sapiens  HSCR- Human elastin protein without signal peptide.  71  39
1187  AAY69137  Homo sapiens  UNSY Amino acid sequence of a human tropoelastin derivative.  71  39
1188  gi6907090  Oryza sativa (japonica cultivar-group)  Similar to Oryza sativa root-specific RCc3 mRNA. (L27208)  76  30
1188  AAY36063  Homo sapiens  GEST Extended human secreted protein sequence, SEQ ID NO. 448.  74  26
1188  AAY35971  Homo sapiens  GEST Extended human secreted protein sequence, SEQ ID NO. 220.  73  26
1189  gi9827989  Leishmania major  possible CG12797 protein  72  36
1189  gi|13625467|gb|AAK35068.1  Leishmania donovani  LACK protective antigen  68  27
1190  gi17027071  Xiphocentron sp. UMSP000029372-Costa Rica  elongation factor-1 alpha  107  27
1190  gi310665  Strongylocentrotus purpuratus  Nf Y-A subunit  88  24
1190  gi21743  Triticum aestivum  high molecular weight glutenin subunit 1Ax1  86  23
1191  gi16878287  Homo sapiens  Similar to C-terminal modulator protein  167  96
1191  gi15866714  Homo sapiens  C-terminal modulator protein  167  96
1191  AAO06984  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 20876.  132  83
1192  AAD05496_aa1  Homo sapiens  HUMA- Human secreted protein-encoding gene 5 cDNA clone HHBCS39, SEQ ID NO:15.  859  100
1192  AAE01707  Homo sapiens  HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:119.  859  100
1192  AAE01676  Homo sapiens  HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:88.  859  100
1193  gi18650588  Homo sapiens  retinoic acid early transcript 1  1312  99
1193  AAB15540  Homo sapiens  INCY- Human immune system molecule from Incyte clone 3402252.  1283  97
1193  ABB84887  Homo sapiens  GETH Human PRO791 protein sequence SEQ ID NO:142.  1234  94
1195 11196427 Homo sa a 2 protein 248 50
iens
1195 gi1780975 Human endogenous retrovirus K gag protein 248 50
1195 gi1556397 Human endogenous retrovirus K gag 248 50
1196 gi556256 Leishmania donovani G protein alpha subunit 72 22
1197 AAY07237 Homo sapiens ISTF Wild type monocyte chemotactic protein 2. 121 100
1197 AAY05300 Homo sapiens ISTF C-C chemokine, MCP2. 121 100
1197 AAW42072 Homo sa INCY- Human MC roprotein.121 100
iens
1198 ABB57423 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 69. 187 79
1198 ABB57394 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 40. 187 79
1198 AAY59757 Homo sapiens META- Human normal ovarian tissue derived protein 34. 187 79
1199 AAY72603 Homo sapiens INCY- Human Electron Transfer Protein, ETRN-1. 155 100
1199 AAB88465 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0259. 155 100
1199 AAE03926 Homo sapiens HUMA- Human gene 29 encoded secreted protein HTADC63, SEQ ID NO:89. 155 100
1200 gi6458884 Deinococcus radiodurans chorismate mutase/prephenate dehydratase 73 42
1201 gi20803920 Mesorhizobium loti HYPOTHETICAL PROTEIN 68 32
1201 gi|17545158|ref|NP_518560.1| Ralstonia solanacearum PUTATIVE LIPASE/ESTERASE PROTEIN 66 31
1202 AAM67586 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27892. 69 30
1202 AAM55191 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27296. 69 30
1202 gi849219 Saccharomyces cerevisiae Pro1p: Glutamate 5-kinase (Swiss Prot. accession number P32264) 69 33
1203 gi18676554 Homo sapiens FLJ00174 protein 269 84
1203 gi|20913341|ref|XP_126763.1| Mus musculus similar to FLJ00174 protein 125 81
1203 gi|20850247|ref|XP_136664.1| Mus musculus similar to proline-rich protein 121 33
1204 AAM68056 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28362. 140 84
1204 AAM55676 Homo sapiens MOLE- Human brain expressed 140 84
single
exon probe encoded protein
SEQ ID
NO: 27781.
1205 gi541624 Drosophila virilis pdm2 71 39
1205 gi9955855 Aspergillus oryzae RNA polymerase II largest subunit 69 38
1205 gi662296 Rattus norvegicus MIBP1 68 32
1206 ABB50703 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:651. 260 94
1206 AAW88802 Homo sapiens HUMA- Polypeptide fragment encoded by gene 52. 260 94
1206 ABB50706 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:654. 143 96
1207 AAM79588 Homo sapiens HYSE- Human protein SEQ ID NO 3234. 72 41
1207 AAM78604 Homo sapiens HYSE- Human protein SEQ ID NO 1266. 72 41
1207 AAB58944 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 652. 72 41
1208 AAE03429 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 112. 575 64
1208 gi19110438 Homo sapiens polycystin-1L1 575 64
1208 AAE03463 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 146. 185 97
1209 gi6760015 Homo sapiens brain protein 1114 85
1209 gi1747306 Mus musculus SDR2 151 31
1209 gi20381292 Mus musculus stromal cell derived factor receptor 2 151 31
1211 gi14043211 Homo sapiens Similar to RIKEN cDNA 4931428F04 gene 460 89
1211 gi190508 Homo sapiens salivary proline-rich protein precursor 113 28
1211 gi12862320 Homo sapiens WDC146 102 28
1212 AAO14407 Homo sapiens FARB Human 11 beta-hydroxysteroid dehydrogenase 1-like enzyme. 291 63
1212 AAM79592 Homo sapiens HYSE- Human protein SEQ ID NO 3238. 217 45
1212 gi4581319 Homo sapiens dJ28O10.3 (HSD11B1 (hydroxysteroid (11-beta) dehydrogenase 1) 217 45
1213 AAR06514 Homo sapiens STRI Natural human Platelet Factor-4var1 encoded by EcoRI fragment. 238 64
1213 gi292390 Homo sapiens platelet factor 4 238 64
1213 AAZ28361_aa1 Homo sapiens SMIK Platelet factor-4 (PF-4) nucleotide sequence. 200 56
1214 AAD12580_aa1 Homo sapiens SAGA Human protein having hydrophobic domain encoding cDNA clone HP10753. 162 82
1214 AAD08193_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 3 cDNA clone HNTAC64, SEQ ID NO:13. 162 82
1214 AAD05544_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 12 cDNA clone HNTAC64, SEQ ID NO:63. 162 82
1215 gi21429094 Drosophila melanogaster LD38004p 354 49
1215 gi15292155 Drosophila melanogaster LD40717p 354 49
1215 AAG75596 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6360. 294 50
1216 gi7248894Xeno us laevisAr rotein-tyrosine kinase84 35
1216 1402191 Mus musculusHNF-3beta 80 26
1216 gi404764 Mus musculus fork head related protein 80 26
1218 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 559 74
1218 AAO03505 Homo sapiens HYSE- Human polypeptide SEQ ID NO 17397. 502 81
1218 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 467 66
1220 AAO01188 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15080. 248 86
1220 AAY73334 Homo sapiens1NCY- HT1ZM clone 180506179 35
protein
se uence.
1220 gi20249 Oryza sativa gt-2 77 32
1221 gi4519619 Haliotis discus collagen pro alpha-chain 90 28
1221 gi7380690 Neisseria meningitidis Z2491 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37
1221 gi7225645 Neisseria meningitidis MC58 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37
1222 ABA05334_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 coding sequence. 2154 99
1222 AAM47905 Homo sapiens MILL- Human fucosyltransferase family member 32132. 2154 99
1222 ABA05333_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 encoding cDNA. 2154 99
1223 AAY21852 Homo sapiens INCY- Human signal peptide-containing protein (SIGP) (clone ID 2652271). 150 100
1223 AAY48563 Homo sapiens META- Human breast tumour-associated protein 24. 150 100
1223 AAW75103 Homo sapiens HUMA- Human secreted protein encoded by gene 47 clone HMCBP63. 150 100
1224 AAM67078 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27384. 517 99
1224 AAM54676 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26781. 517 99
1224 gi17467358 Sus scrofa MIF2 suppressor 184 80
1225 gi9454237 Cochliobolus sativus DNA binding protein MAT-1 73 30
1225 gi21428792 Drosophila melanogaster GH03582p 72 38
1225 gi6633838 Arabidopsis thaliana F2K11.15 70 31
1226 gi21430124 Drosophila melanogaster HL01222p 76 28
1226 AAM77437 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37743. 72 33
1226 AAM64659 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36764. 72 33
1227 AAM50715 Homo sapiens MILL- Human TRP-like calcium channel-5 (TLCC-5). 243 83
1227 gi|20874183|ref|XP_131003.1| Mus musculus similar to hornerin 80 29
1227 gi|17864717|gb|AAK15791.1| Mus musculus hornerin 80 29
1229 gi4019247 Ateline herpesvirus 3 thymidine kinase 71 46
1229 gi2760368 Drosophila melanogaster Shar pei/DRhoGEF2 70 26
1229 gi17862944 Drosophila melanogaster SD04476p 70 26
1230 gi4559296 Mus musculus silencing mediator of retinoic acid and thyroid hormone receptor extended isoform 80 30
1230 gi18181872 Mus musculus GATA-2 protein 78 41
1230 gi18033511 Rattus norvegicus transcription factor GATA-2 78 41
1231 gi13365501 Cyprinus carpio integrin beta2-chain 75 27
1231 gi3322933 Treponema pallidum DNA ligase (lig) 73 32
1231 gi|13365501|dbj|BAB39130.1| Cyprinus carpio integrin beta2-chain 75 27
1232 AAM79791 Homo sapiens HYSE- Human protein SEQ ID NO 3437. 78 35
1232 AAM78807 Homo sapiens HYSE- Human protein SEQ ID NO 1469. 78 35
1232 AAB19338 Homo sapiens INCY- Amino acid sequence of a human fibrous protein (FIBR). 78 35
1233 AAU21459 Homo sapiens HUMA- Human novel foetal antigen, SEQ ID NO 1703. 87 26
1233 gi15081227 Arabidopsis thaliana glycine-rich protein GRP20 75 37
1233 gi2645433 Homo sapiens CHD3 74 30
1234 AAU83676 Homo sapiens GETH Human PRO protein, Seq ID No 170. 178 97
1234 ABB84911 Homo sapiens GETH Human PRO1244 protein sequence SEQ ID NO:190. 178 97
1234 AAB62403 Homo sapiensCURA- Human MBSP7 polypeptide178 97
(clone 3499605Ø64 .
1235 ABB10348 Homo sapiens HUMA- Human cDNA SEQ 409 61
ID NO:
656.
1235 AAU18012 Homo sapiens HUMA- Human immunoglobulin polypeptide SEQ ID No 157. 178 83
1235 ABB89226 Homo sapiens HUMA- Human polypeptide SEQ ID NO 1602. 78 82
1236 gi10566951 Rattus norvegicus s-gicerin/MUC18 85 45
1236 gi10566949 Rattus norvegicus l-gicerin/MUC18 85 45
1236 AAB90798 Homo sapiens NOJI/ Human shear stress-response protein SEQ ID NO: 96. 84 42
1238 gi21464300 Drosophila melanogaster GH20068p 95 36
1238 gi3868879 Xenopus laevis Zic-related-2 88 35
1238 gi1841756 Mus musculus GATA-5 cardiac transcription factor 87 52
1239 gi17946266 Drosophila melanogaster RE61793p 96 40
1239 gi15636898 Gallus gallus formin binding protein 11-related protein 91 27
1239 gi780454 African swine fever virus pB407L 88 30
1240 AAE05302 Homo sapiens MILL- Human TANGO 457 protein. 1331 100
1240 AAE05303 Homo sapiens MILL- Human mature TANGO 457 protein. 1207 100
1240 AAE05305 Homo sapiens MILL- Human TANGO 457 protein cytoplasmic domain. 1201 100
1241 gi5640111 Lycopersicon esculentum RAD23 protein 84 25
1241 gi17131739 Nostoc sp. PCC 7120 polyketide synthase type I 76 33
1241 gi|5640111|emb|CAB51544.1| Lycopersicon esculentum RAD23 protein 84 25
1242 AAG03496 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7577. 67 39
1242 gi|13876270|gb|AAK26055.1| Mus musculus protocadherin alpha 8 66 35
1243 AAE16665 Homo sapiens MILL- Human calcium channel family member, 21784 protein. 196 87
1243 AAB62248 Homo sapiens WARN Human calcium channel alpha2delta subunit. 196 87
1243 AAY92320 Homo sapiens WARN Human alpha-2-delta-C calcium channel subunit polypeptide. 196 87
1244 gi|4102990|gb|AAD01637.1| Aspergillus nidulans DNA polymerase epsilon homolog 70 30
1245 gi5917666 Zea mays extensin-like protein 94 26
1245 gi19481644 shrimp white spot syndrome virus WSSV052 89 36
1245 gi17016928 shrimp white spot syndrome virus wsv001 89 36
1246 AAO12623 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26515. 169 69
1246 AAO12822 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26714. 153 75
1246 AAO02255 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16147. 123 65
1247 gi1653353 Synechocystis sp. PCC 6803 nodulation protein 75 28
1247 gi4468626 Mus musculus TEF-5 74 26
1247 gi17430764 Ralstonia solanacearum SKWP PROTEIN 5 74 23
1248 gi15139973 Sinorhizobium meliloti CONSERVED HYPOTHETICAL PROTEIN 77 47
1249 gi7191078 Leishmania major L712.2 99 29
1249 gi17384256 Homo sapiens mucin 5 85 31
1249 gi5821153 Homo sapiens RNA binding protein 83 33
1250 AAY36495 Homo sapiens HUMA- Fragment of human secreted protein encoded by gene 27. 124 86
1250 AAO12122 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26014. 123 91
1250 AAB95063 Homo sapiens HELI- Human protein sequence SEQ ID NO:16901. 121 90
1252 gi|15839838|ref|NP_334875.1| Mycobacterium tuberculosis CDC1551 membrane protein, MmpL family 68 27
1254 AAG00399 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4480. 328 100
1254 gi21428466 Drosophila melanogaster LD22609p 85 24
1254 g119914274Methanosarcinasensory transduction 85 26
histidine kinase
acetivorans[Methanosarcina
str.
C2A
1256 gi14161094 Choloepus didactylus von Willebrand Factor 80 24
1256 gi14161092 Cyclopes didactylus von Willebrand Factor 78 23
1256 gi13872552 Acomys cahirinus von Willebrand Factor 77 23
1258 gi7008025 Callithrix jacchus prochymosin 715 64
1258 gi11990126 Camelus dromedarius chymosin 634 57
1258 gi491952 synthetic construct preprochymosin 618 56
1259 gi~21402709~Bacillus AMP-binding, AMP-binding72 34
enzyme
ref~NP_6586anthracis [Bacillus anthracis
A2012
94.1
1260 gi|4505431|ref|NP_002510.1| Homo sapiens nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33
1260 gi|15309894|ref|XP_040846.2| Homo sapiens similar to nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33
1260 gi|1304114|dbj|BAA11861.1| Homo sapiens NPAT 64 33
1261 gi4519535 Homo sapiens Leukotriene B4 omega-hydroxylase 133 49
1261 gi1857022 Homo sapiens leukotriene B4 omega-hydroxylase 133 49
1261 gi18266446 Homo sapiens cytochrome P450, subfamily IVF, polypeptide 2 133 49
1262 gi13363530 Escherichia coli O157:H7 cell division protein HflB/FtsH protease 79 26
1262 gi746401 Escherichia coli ATP-binding protein 79 26
1262 gi146028 Escherichia coli ftsH 79 26
1263 AAW67859 Homo sapiens HUMA- Human secreted protein encoded by gene 53 clone HBMCL41. 283 100
1264 gi11066248 Helix lucorum presenilin 85 21
1264 gi|19115422|ref|NP_594510.1| Schizosaccharomyces pombe ribonuclease II RNB family protein; dis3-like 69 30
1264 gi|14720912|ref|XP_038204.1| Homo sapiens similar to Matrin 3 69 32
1265 gi5757703 Mus musculus syntrophin-associated serine-threonine protein kinase 82 38
1265 gi4996035 Human herpesvirus 6 69.8% identical to U47 gene of strain U1102 of HHV-6 76 42
1265 gi330951 Gallid herpesvirus 1 ICP4 76 36
1266 gi|17511177|ref|NP_493324.1| Caenorhabditis elegans ZK1053.3.p 75 40
1266 gi|17538077|ref|NP_495159.1| Caenorhabditis elegans ZK1248.2.p 69 34
1267 gi915540 Ovis aries pregnancy-specific antigen 85 25
1267 gi6179989 Capra hircus pregnancy-associated glycoprotein-2 84 25
1267 gi9798658 Rhinolophus ferrumequinum pepsinogen A 80 23
1268 gi|15789526|ref|NP_279350.1| Halobacterium sp. NRC-1 serine proteinase; HtrA 69 30
1269 gi9988674 Influenza A virus (A/Swine/Wisconsin/14094/99(H3N2)) hemagglutinin protein 70 24
1269 gi6552676 Influenza A virus (A/Bangkok/1/97(H3N2)) hemagglutinin 70 25
1269 gi6552638 Influenza A virus (A/Trinidad/51/96(H3N2)) hemagglutinin 70 24
1270 gi3378527 Zea mays anther specific protein 87 41
1270 AAW15787 Homo sapiens PENN- Human metastasis suppressor KISS-1. 85 28
1270 gi21410770 Homo sapiens Similar to RIKEN cDNA 1500005K14 gene 84 46
1271 gi1335527 Human poliovirus 1 reading frame VP3 75 38
1271 gi61253 Human poliovirus 1 polyprotein 75 38
1271 gi|17453412|ref|XP_063132.1| Homo sapiens similar to 60S ribosomal protein L7A (Surfeit locus protein 3) 76 40
1272 AAU87081 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-11. 69 43
1272 AAU87077 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3d. 69 43
1272 AAU87076 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3c. 69 43
1273 AAA09121_aa1 Homo sapiens CURA- Clone 2355875 cDNA (update), encodes syncollin homologue. 720 100
1273 AAY92233 Homo sapiens CURA- Clone 2355875f - syncollin homologue. 720 100
1273 AAB54267 Homo sapiens HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:719. 715 100
1274 gi15559064 Mus musculus SNAG1 198 59
1274 AAU17435 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 1000. 131 62
1274 AAW99023 Homo sapiens MOUN 1762 peptide sequence. 131 62
1275 gi|6753732|ref|NP_034243.1| Mus musculus epidermal growth factor 65 30
1275 gi|50801|emb|CAA24115.1| Mus musculus polyprotein 65 30
1275 gi|20341089|ref|XP_109385.1| Mus musculus epidermal growth factor 65 30
1276 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 447 78
1276 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 424 74
1276 AAO07159 Homo sapiens HYSE- Human polypeptide SEQ ID NO 21051. 401 75
1277 gi13905120 Mus musculus RIKEN cDNA 0610013I17 gene 134 35
1277 gi13936283 Mus musculus TRH3 134 35
1277 AAB92625 Homo sapiens HELI- Human protein sequence SEQ ID NO:10921. 127 35
1279 AAM66940 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27246. 362 85
1279 AAM54534 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26639. 362 85
1279 gi|208153|gb|AAA73184.1| synthetic construct crystal toxin 79 40
1280 AAE05187 Homo sapiens INCY- Human drug metabolising enzyme (DME-18) protein. 484 100
1280 AAU12266 Homo sapiens GETH Human PRO5780 polypeptide sequence. 484 100
1280 AAY91631 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 24 SEQ ID NO:304. 484 100
1281 AAH46856_aa1 Homo sapiens HUMA- Human serine/threonine phosphatase encoding cDNA (clone ID HLD0020). 238 100
1281 AAG77801 Homo sapiens HUMA- Human HLD0020 serine/threonine phosphatase protein sequence. 238 100
1281 AAB85476 Homo sapiens HUMA- Human serine/threonine phosphatase (clone ID HLD0020). 238 100
1282 gi|14762786|ref|XP_047871.1| Homo sapiens GS2 gene 70 30
1283 gi3860165 Arabidopsis thaliana disease resistance protein RPP1-WsB 69 38
1283 AAO09033 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22925. 68 38
1283 gi6967115 Arabidopsis thaliana disease resistance protein homolog 68 38
1285 gi1055252 Rattus norvegicus pheromone receptor VN5 78 32
1285 gi2746733 Drosophila virilis circadian clock protein 73 26
1285 gi2641617 Drosophila virilis TIM 73 26
1286 gi6013135 Rattus norvegicus coxsackie-adenovirus-receptor homolog 86 67
1286 AAV50429_aa1 Homo sapiens UYNY Human coxsackievirus and Ad2 and Ad5 receptor (HCAR) cDNA. 83 75
1286 AAV28845_aa1 Homo sapiens DAND Human coxsackievirus and adenovirus receptor encoding DNA. 83 75
1287 AAU83224 Homo sapiens ZYMO Novel secreted protein Z930757G12P. 642 100
1287 AAY70692 Homo sapiens DAND Human soluble attractin-2. 84 54
1287 AAY70691 Homo sapiens DAND Human membrane attractin-2. 84 54
1288 AAW70326 Homo sapiens GEMY Secreted protein DU123 1. 1655 99
1288 ABB12473 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 312. 547 72
1288 gi5689736 Homo sapiens Myopodin protein 475 100
1289 gi4103543 Tomato chlorosis virus heat shock protein 70 73 29
1289 gi12247413 Cristatella mucedo cytochrome b 72 30
1289 gi|4103543|gb|AAD01790.1| Tomato chlorosis virus heat shock protein 70 73 29
1291 AAB94128 Homo sapiens HELI- Human protein sequence SEQ ID NO:14383. 520 98
1291 AAY85576 Homo sapiens JANC Hs-UNC-53/1 fragment/GFP fusion insert of plasmid pGI3150. 520 98
1291 AAY85564 Homo sapiens JANC Human homologue 520 98
of UNC-53
(Hs-UNC-53/1) sequence.
1292 AAY01413 Homo sapiens HUMA- Secreted protein encoded by gene 31 clone HHBAG64. 207 97
1292 AAY05324 Homo sapiens GEMY Human secreted protein 1j167 5. 207 97
1292 gi15157864 Agrobacterium tumefaciens str. C58 (Cereon) AGR_C_4816p 71 34
1294 AAB12146 Homo sapiens PROT- Hydrophobic domain protein from clone HP10672 isolated from Thymus cells. 219 100
1295 gi|17228767|ref|NP_485315.1| Nostoc sp. PCC 7120 probable glycogen phosphorylase 78 34
1295 gi|10835203|ref|NP_001127.1| Homo sapiens advanced glycosylation end product-specific receptor 65 58
1295 gi|190846|gb|AAA03574.1| Homo sapiens receptor for advanced glycosylation end products 65 58
1296 gi17511816 Homo sapiens Similar to RIKEN cDNA 1110032022 gene 1268 99
1296 AAB88440 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0222. 688 100
1296 gi7211438 Homo sapiens golgin-67 94 30
1298 gi18314436 Homo sapiens Similar to RIKEN cDNA 4921511004 gene 481 79
1298 gi1872546 Mus musculus NIK 86 25
1298 gi5533305 Homo sapiens somatostatin receptor interacting protein splice variant a 85 29
1299 gi1334643 Xenopus laevis APEG precursor protein 105 27
1299 gi17428053 Ralstonia solanacearum PROBABLE RIBONUCLEASE E (RNASE E) PROTEIN 100 32
1299 gi6690017 Herpesvirus papio NTR 96 25
1300 AAB87346 Homo sapiens HUMA- Human gene 5 encoded secreted protein HDPIE85, SEQ ID NO:87. 586 74
1300 AAB44298 Homo sapiens GETH Human PRO706 (UNQ370) protein sequence SEQ ID NO:385. 586 74
1300 AAY41742 Homo sapiens GETH Human PRO706 protein sequence. 586 74
1301 gi218572 Pan troglodytes prot GOR 1344 62
1301 gi243898 Pan GOR 1040 68
1301 gi17862570 Drosophila melanogaster LD38414p 486 45
1302 gi13276598 Homo sapiens dJ614O4.7 (Novel protein) 260 28
1302 gi13397804 Homo sapiens dJ616B8.3 (novel gene) 230 30
1302 AAB56641 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1219. 226 30
1303 gi603989 Drosophila melanogaster salivary gland glue protein 149 23
1303 gi13324584 Borrelia burgdorferi LMP1 129 17
1303 gi161956 Trypanosoma cruzi surface antigen 128 13
1304 gi13569248 Human immunodeficiency virus type 1 gag protein 81 34
1304 gi4324832 Human immunodeficiency virus type 1 gag-pol polyprotein 80 29
1304 gi11691875 Mus musculus ADP-ribosylation factor 1 GTPase activating protein 79 22
1305 AAO06469 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20361. 191 100
1305 gi3608368 Xenopus laevis origin recognition complex associated protein p81 69 30
1305 ABB15196 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3853. 68 36
1306 AAE03657 Homo sapiens INCY- Human extracellular matrix and cell adhesion molecule-21 (XMAD-21). 109 27
1306 ABB11890 Homo sapiens HYSE- Human protocadherin Flamingo 1 homologue, SEQ ID NO:2260. 109 27
1306 gi3449298 Homo sapiens MEGF2 109 27
1308 gi9294050 Arabidopsis thaliana protein kinase-like protein 84 32
1308 gi15983765 Arabidopsis thaliana AT3g24550/MOB24_8 84 32
1308 gi13877617 Arabidopsis thaliana protein kinase-like protein 84 32
1309 AAU00375 Homo sapiens BERN/ Human stem cell growth factor receptor. 127 54
1309 AAE07145 Homo sapiens SALK Human Kit/stem cell factor receptor kinase insert region. 127 54
1309 gi3236223 Equus caballus tyrosine kinase receptor homolog 127 50
1310 gi21449343 Actinosynnema pretiosum subsp. auranticum polyketide synthase 77 46
1310 gi21114513 Xanthomonas campestris pv. campestris str. ATCC 33913 transcriptional regulator 75 36
1310 gi13364364 Escherichia coli O157:H7 acetylglutamate kinase 73 36
1311 gi20146220 Oryza sativa (japonica cultivar-group) similar to splicing factor/activator protein 110 33
1311 gi206712 Rattus norvegicus salivary proline-rich protein 104 27
1311 AAY84592 Homo sapiens UNIW Amino acid sequence of a human artemin polypeptide. 103 34
1312 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 530 69
__ gi~10834720~Homo sapiensPP565 249 66
1312
gb~AAG237
90.1 ~AF258
587_1
1312 gi|13194728|gb|AAK15526.1|AF329451_1 Gallus gallus pol-like protein ENS-3 115 21
1313 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 147 58
1313 gi1339910 Homo sapiens DOCK180 protein 147 58
1313 gi1504002 Homo sapiens similar to a human major CRK-binding protein DOCK180. 111 43
1314 gi12007418 Mus musculus B3 olfactory receptor 76 38
1314 gi18480290 Mus musculus olfactory receptor MOR260-3 76 38
1314 gi12007432 Mus musculus B3 olfactory receptor 76 38
1315 gi483581 Mus musculus Notch 3 82 26
1315 gi18159668 Pyrobaculum aerophilum paREP2b 81 29
1315 gi4584086 Spermatozopsis similis p210 protein 79 25
1316 AAM71305 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31611. 422 98
1316 AAM58790 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30895. 422 98
1316 gi149490 Lactococcus lactis sucrose-6-phosphate hydrolase 72 31
1317 gi1620040 Paramecium bursaria Chlorella virus 1 Asp-rich 72 28
1317 gi3721615 Cyprinus carpio MEF2C 71 25
1317 gi|9631936|ref|NP_048725.1| Paramecium bursaria Chlorella virus 1 Asp-rich 72 28
1318 gi|21291797|gb|EAA03942.1| Anopheles gambiae str. PEST agCP3974 74 35
1319 gi21306283 Chlamydomonas reinhardtii iron transporter Ftr1 74 30
1319 AAB60461 Homo sapiens INCY- Human cell cycle and proliferation protein CCYPR-9, SEQ ID NO:9. 73 33
1319 g16013155Homo Sapiensp35s ' 73 33
1320 gi9717245 Mus musculus cytoplasmic dynein heavy chain 430 94
1320 gi402528 Rattus norvegicus cytoplasmic dynein heavy chain 430 94
1320 gi294543 Rattus norvegicus dynein heavy chain 430 94
1323 gi|17221411|emb|CAD12639.1| Burkholderia cepacia kdo transferase 70 34
1324 gi1698601 Cricetulus griseus beta-1,6-N-acetylglucosaminyltransferase 440 38
1324 gi349091 Rattus norvegicus N-acetylglucosaminyltransferase V 438 43
1324 gi18997007 Mus musculus N-acetylglucosaminyltransferase V 438 43
1325 AAM70545 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30851. 115 47
1325 AAM58098 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30203. 115 47
1325 AAM72994 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33300. 111 28
1326 gi12724969 Lactococcus lactis subsp. lactis phenolic acid decarboxylase 77 46
1327 AAB53097 Homo sapiens GETH Human angiogenesis-associated protein PRO1246, SEQ ID NO:167. 372 63
1327 AAU12416 Homo sapiens GETH Human PRO1246 polypeptide sequence. 372 63
1327 AAY99377 Homo sapiens GETH Human PRO1246 (UNQ630) amino acid sequence SEQ ID NO:132. 372 63
1328 gi6014505 Hepatitis GB virus B polyprotein 76 43
1328 gi765145 Hepatitis GB virus B polypeptide 68 41
1328 gi|20544059|ref|XP_086220.4| Homo sapiens similar to U4/U6-associated RNA splicing factor 294 100
1329 AAV42689_aa1 Homo sapiens SIBI- DNA encoding human calcium channel alpha-2 subunit. 158 91
1329 AAQ84667_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2c. 158 91
1329 AAQ84664_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2b. 158 91
1330 gi19923 Nicotiana tabacum pistil extensin like protein, partial CDS 71 38
1330 gi|144429|gb|AAA56792.1| Cellulomonas fimi beta-1,4-xylanase 67 30
1331 gi2388676 Mytilus edulis precollagen P 85 35
1331 gi17862044 Drosophila melanogaster LD06016p 75 30
1331 gi13879780 Mycobacterium tuberculosis CDC1551 PE_PGRS family protein 74 30
1333 AAO00015 Homo sapiens HYSE- Human polypeptide SEQ ID NO 13907. 442 61
1333 AAB82479 Homo sapiens ZYMO Human RING finger protein Zapop2. 81 31
1333 gi20975274 Homo sapiens skeletrophin 81 31
1334 ABB11819 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2189. 367 82
1334 AAW80398 Homo sapiens GEMY A secreted protein encoded by clone cw1543 3. 130 67
1334 gi5081693 Samanea saman pulvinus inward-rectifying channel SPICK2 70 34
1335 ABB89969 Homo sapiens HUMA- Human polypeptide 142 96
SEQ ID
NO 2345.
1335 AAB38385 Homo sapiens HUMA- Human secreted protein encoded by gene 18 clone HTLEJ24. 142 96
1335 AAB38338 Homo sapiens HUMA- Human secreted protein encoded by gene 18 clone HTLFE57. 142 96
1336 gi|14590195|ref|NP_142260.1| Pyrococcus horikoshii asparaginyl-tRNA synthetase 70 37
1337 gi3879419 Caenorhabditis elegans contains similarity to Pfam domain: PF00102 (Protein-tyrosine phosphatase), Score=51.6, E-value=1.8e-14, N=1 69 29
1337 gi|17563828|ref|NP_505965.1| Caenorhabditis elegans protein tyrosine phosphatase 69 29
1338 gi|2072960|gb|AAC51268.1| Homo sapiens p40 138 33
1338 gi|4185940|emb|CAA76880.1| Human endogenous retrovirus K env protein 124 75
1338 gi|757872|emb|CAA57723.1| Human endogenous retrovirus env 124 75
1340 gi1491979 Molluscum contagiosum virus subtype 1 MC036R 78 33
1340 gi|9628968|ref|NP_043987.1| Molluscum contagiosum virus MC036R 78 33
1341 gi18676514 Homo sapiens FLJ00154 protein 1560 100
1341 AAB84252 Homo sapiens HUMA- Amino acid sequence of a human cytokine receptor-like protein. 572 63
1341 AAB84251 Homo sapiens HUMA- Human cytokine receptor-like protein fragment. 572 63
1342 AAY27757 Homo sapiens HUMA- Human secreted protein encoded by gene No. 47: 152 71
1342 AAB27551 Homo sapiens MYRI- Human tumour suppressor BRG1 encoded by cDNA mutated at base 1705. 77 32
1342 AAB27550 Homo sapiens MYRI- Human tumour suppressor BRG1 protein from cell lines DU145 and NCI-H 1300. 77 32
1344 gi21464394 Drosophila melanogaster RE18651p 78 26
1344 AAM39065 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2210. 77 21
1344 gi338290 Homo sapiens son3 protein 77 21
1345 gi2202 Canis sp. Clox 135 37
1345 g13879551Caenorhabditiscontains similarity to 125 33
Pfam domain:
elegans PF01391 (Collagen triple
helix repeat
(20 copies)), Score=56.4,
E-value=2e-
13, N=2; PF01484 (Nematode
cuticle
collagen N-terminal domain),
Score=87.2, E-value=1.1e-22,
N=1
1345 gi158695 Drosophila melanogaster tropomyosin isoform 33 (9C) 118 30
1346 gi7862077 Giardia intestinalis 3-hydroxy-3-methylglutaryl-coenzyme A reductase 90 26
1346 gi1098615 Mycoplasma pneumoniae adhesin-related 30 kDa protein 87 23
1346 gi20380058 Homo sapiens Similar to PRAM-1 protein 84 28
1347 gi13905302 Mus musculus Similar to ATPase, class II, type 9A 736 85
1347 gi17862322 Drosophila melanogaster LD22119p 633 72
1347 AAM25271 Homo sapiens HYSE- Human protein sequence SEQ ID NO:786. 572 100
1348 gi456319 Bacteriophage FC1 74kDa protein 75 33
1348 gi1524115 Lycopersicon esculentum subtilisin-like endoprotease 73 28
1348 gi4200334 Lycopersicon esculentum P69A protein 73 28
1349 gi21391988 Drosophila melanogaster HL08052p 78 31
1349 gi20148339 Arabidopsis thaliana cyclin delta-3 77 25
1349 gi|17647607|ref|NP_523423.1| Drosophila melanogaster maroon-like; bronzy; section 5 78 31
1351 gi18676524 Homo sapiens FLJ00159 protein 164 52
1351 gi21392066 Drosophila melanogaster RE04357p 139 34
1351 AAB92637 Homo sapiens HELI- Human protein sequence SEQ ID NO:10953. 81 43
1352 gi19071965 Aspergillus oryzae chitin synthase 79 28
1352 gi17945592 Drosophila melanogaster RE26660p 78 41
1352 gi16184663 Drosophila melanogaster LD28370p 74 22
1353 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 307 65
1353 gi|1335205|emb|CAA36480.1| Homo sapiens ORFII 305 65
1354 gi1388166 Drosophila melanogaster Bowel 80 32
1354 gi15553187 Scyliorhinus canicula homeodomain protein Otx1 79 22
1354 AAY85573 Homo sapiens JANC Hs-UNC-53/3 fragment/GFP fusion insert of plasmid pGI3303. 78 26
1358 gi|21288288|gb|EAA00609.1| Anopheles gambiae str. PEST agCP9766 71 30
1358 gi|17465558| Homo sapiens similar to mucin 68 36
ref|XP_069888.1|
1359 gi|21302892|gb|EAA15037.1| Anopheles gambiae str. PEST agCP5020 70 31
1361 gi15080686 Lentinula edodes CDC5 79 26
1361 gi495516 Plasmodium vivax circumsporozoite protein 77 31
1361 gi21070569 Dictyostelium discoideum VSAE2 (FRAGMENT). 3/101 76 31
1362 gi8953400 Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase-like protein 73 23
1362 gi|15239030|ref|NP_196699.1| Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase - like protein 73 23
1363 gi2444430 Xenopus laevis deacetylase 327 81
1363 gi602098 Xenopus laevis yeast RPD3 homologue 324 80
1363 AAB49954 Homo sapiens METH- Human histone deacetylase HDAC-1. 323 80
1364 AAM69686 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29992. 418 55
1364 AAM57281 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29386. 418 55
1364 gi~1780971~eHuman gag protein 172 37
mb~CAA714endogenous
16.1 ~ retrovirus
K
1365 gi437084 Gallus gallusvitamin D3 hydroxylase 510 41
associated
protein
1365 12149156 Homo Sapiensfatty acid amide hydrolase477 38
1365 AAW57783 Homo SapiensSCRI Human fatty acid 468 38
amide
hydrolase.
1366 g13510695Homo SapiensDNA polymerase theta 77 21
1366 g1309132 Mus musculuscalnexin 72 22
1366 g115214567Mus musculusSimilar to calnexin 72 22
1367 gi~17508849~Caenorhabditishelicase 73 40
re~NP elegans
4914
26.1 ~
1368 gi5457567 Pyrococcus abyssi Na+/H+ antiporter (napA-1) 76 33
1368 gi8247211 Candida albicans She9 protein 69 31
1368 gi|14590079|ref|NP_142143.1| Pyrococcus horikoshii Na(+)/H(+) antiporter 76 30
1369 gi17644260 Homo sapiens bB206I21.1 (ATPase, Class VI, type 11C). 305 98
1369 AA014200 Homo SapiensINCY- Human transporter166 50
and ion
channel TRICH-17.
1369 gi5080816 Arabidopsis thaliana Putative ATPase 166 49
1370 gi|18573281|ref|XP_095933.1| Homo sapiens similar to 40S ribosomal protein S3A 70 38
1372 gi6683562 Mus musculus heparan sulfate 6-sulfotransferase 3 886 91
1372 gi6683558 Mus musculus heparan sulfate 6-sulfotransferase 2 265 72
1372 ABL39900_aa1 Homo sapiens SEGK Human HS6ST2v encoding cDNA SEQ ID NO:1. 262 71
1373 gi|20882231|ref|XP_139203.1| Mus musculus similar to LIM domain only 7 76 24
1373 gi|20302988|gb|AAM18948.1|AF498989_1 Medicago sativa nodule-specific glycine-rich protein 3 72 26
1373 gi|9965267|gb|AAG10008.1| infectious hypodermal and hematopoietic necrosis virus non-structural protein 2 72 24
1374 13355835 Rhizobium etli RBSK 78 32
1374 gi7453560 Polyangium cellulosum epoD 73 28
1374 gi1749684 Schizosaccharomyces pombe similar to Saccharomyces cerevisiae porphobilinogen deaminase, SWISS-PROT Accession Number P28789 72 28
1375 116973455Danio reriobeta-3-galactosyltransferase1050 63
1375 AAB24035 Homo SapiensGETH Human PR04397 protein725 46
sequence SEQ ID NO:42.
1375 AAB88404 Homo SapiensHELI- Human membrane 709 43
or secretory
protein clone PSEC0159.
1376 gi7668 Drosophila melanogaster bsg25D protein 73 33
1376 gi20177037 Drosophila melanogaster LD21844p 73 33
1376 gi1353669 Caenorhabditis elegans UNC-24 69 43
1379 AAS16182_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA. 245 67
1379 AAU10534 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) polypeptide. 245 67
1379 AAS16825_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA coding sequence. 245 67
1380 AAY36290 Homo sapiensHUMA- Human secreted 177 74
protein
encoded by gene 67.
1380 gi16551305 Tatianyx arnacites DNA-directed RNA polymerase beta' subunit 2 71 38
1380 13411013 Candida albicans protein mannosyltransferase 1 68 35
1381 AAM80132 Homo SapiensHYSE- Human protein SEQ 173 66
ID NO
3778.
1381 g14731867Dictyosteliumsterol glucosyltransferase107 30
discoideum
1381 AAB74726 Homo SapiensINCY- Human membrane 89 41
associated
protein MEMAP-32.
1382 AAB62100 Homo sapiens WIST- Human bridging integrator-2 (Bin2) protein. 78 27
1382 gi6527168 Homo sapiens breast cancer associated protein BRAP1 78 27
1382 gi5852834 Homo sapiens bridging integrator-2 78 27
1383 gi7670050 Xenopus laevis type I collagen alpha 1 92 27
1383 AA001606 Homo SapiensHYSE- Human polypeptide 85 29
SEQ ID
NO 15498.
1383 gi17738485 Agrobacterium tumefaciens str. C58 (U. Washington) biopolymer transport protein 85 28
1384 gi20451261 Caenorhabditis elegans C. elegans GCY-17 protein (corresponding sequence W03F11.2) 71 26
1384 gi2665714 Agrobacterium tumefaciens moaC 71 29
1384 gi|20864452|ref|XP_150076.1| Mus musculus RIKEN cDNA 2410018E23 130 59
1385 AAY94938 Homo sapiens GEMY Human secreted protein clone ye78_1 protein sequence SEQ ID NO:82. 103 25
1385 gi12831176Agelaius gamma filamin protein 96 29
phoeniceus
1385 AAU81998 Homo sapiensINCY- Human secreted 87 27
protein
SECP24.
1386 gi10440468Homo SapiensFLJ00070 protein 102 41
1386 gi11136912 Danio rerio RPTP-alpha protein 94 32
1386 120377083Homo Sapiensp78 92 36
1387 AAM40810 Homo SapiensHYSE- Human polypeptide 190 59
SEQ ID
NO 5741.
1387 AAM39024 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2169. 190 59
1387 gi15080474 Homo sapiens Similar to RIKEN cDNA 1700023O11 gene 190 59
1388 gi12802591 Bovine herpesvirus 4 tegument protein 82 30
1388 gi950226 Saccharomyces cerevisiae Trf4p 73 26
1388 gi|13095641|ref|NP_076556.1| Bovine herpesvirus 4 tegument protein 82 30
1389 AAI67224_aa1 Homo sapiens CORI- BS11S cDNA sequence. 363 100
1389 AAF85500_aa1 Homo sapiens EOSB- Nucleotide sequence of a human breast cancer protein designated BCH1. 363 100
1389 AAA54120_aa1 Homo sapiens EOSB- Breast cancer protein BCH1 coding sequence. 363 100
1390 gi184653 Homo sapiens IFN-alpha responsive transcription factor 74 30
1390 gi|2580453|gb|AAB82336.1| Xenopus laevis Xbap 68 47
1391 AAB88456 Homo SapiensHELI- Human membrane 85 52
or secretory
protein clone PSEC0246.
1391 AAB62392 Homo sapiens LEXI- Human LDL receptor family protein (LDLP). 85 52
1392 ABB12009 Homo sapiens HYSE- Human RAMP1 homologue, SEQ ID NO:2379. 90 100
1392 gi3171910 Homo sapiens RAMP1 90 100
1392 gi12653551 Homo sapiens receptor (calcitonin) activity modifying protein 1 90 100
1394 gi4467343 Drosophila melanogaster EG:140G11.1 70 27
1394 gi6018879Drosophila BACN4L24.d 70 27
melanogaster
1394 gi157993 Drosophila developmental protein 70 27
melanogaster
1395 gi4928919 Arabidopsis thaliana zinc finger protein 2 86 26
1395 gi2702272 Arabidopsis thaliana expressed protein 86 26
1396 AAM25276 Homo sapiens HYSE- Human protein sequence SEQ ID NO:791. 729 93
1396 AAE14340 Homo sapiensINCY- Human protease 528 33
PRTS-5
protein.
1396 AAB47561 Homo sapiens INCY- Protease PRTS-3. 528 33
1397 gi18369843Infectious P6 89 40
salmon anemia
virus
1397 gi4092530Infectious NS1 protein 87 39
salmon anemia
virus
1397 gi14009648Infectious NS1 87 39
salmon anemia
virus
1398 AAW63707 Homo sapiens UYOR- Human hSK2 protein. 331 91
1398 gi1575663 Rattus norvegicus calcium-activated potassium channel rSK2 331 91
1398 gi15082148 Homo sapiens small-conductance calcium-activated potassium channel 331 91
1399 AAB01381 Homo sapiens INCY- Neuron-associated protein. 1653 68
1399 gi18157547 Mus musculus pecanex-like 3 1620 66
1399 16650377 Mus musculus pecanex 1 1277 51
1400 gi|20887681|ref|XP_140575.1| Mus musculus similar to melastatin 1 468 91
1400 gi|3243075|gb|AAC80000.1| Homo sapiens melastatin 1 355 75
1400 gi|20552333|ref|XP_007662.9| Homo sapiens similar to melastatin 1 355 75
1401 AAU15955 Homo SapiensHUMA- Human novel secreted931 92
protein,
Seq ID 908.
1401 gi3978441 Homo sapiens PITSLRE protein kinase alpha SV9 isoform 95 24
1401 gi1517914 Homo sapiens monocytic leukaemia zinc finger protein 91 28
1402 gi1289326 Mus musculus ROR-alpha 1 84 25
1402 gi530878 Chlamydomonas eugametos amino acid feature: N-glycosylation sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 .. 74, 107 .. 109, 128 .. 130, 132 .. 134, 158 .. 160, 163 .. 165; amino acid feature: Rod protein domain, aa 169 .. 340; amino acid feature: globular protein domain, aa 32 .. 168 79 32
1402 gi220763 Rattus norvegicus HES-3 factor 79 52
1403 gi|20479430|ref|XP_114955.1| Homo sapiens similar to olfactory receptor MOR231-1 71 32
1403 gi|20480897|ref|XP_115014.1| Homo sapiens similar to olfactory receptor MOR234-3 71 32
1404 AAA88548_aa1 Homo sapiens SMIK Human CASB616 cDNA. 89 100
1404 AAB19591 Homo sapiens SMIK Human CASB616. 89 100
1404 11100110 Homo sapiens protein-tyrosine kinase 89 100
1405 gi4206753 Oryctolagus cuniculus homeodomain-containing protein 74 24
1405 gi13445253 Mus musculus orphan Gpr37-like protein 1 72 33
1405 gi3080552 Mus musculus Hoxa-9 71 50
1406 AAM50585 Homo SapiensNISB Benign prostatic 325 100
hyperplasia
associated protein JT460914.
1406 g118031947Homo SapiensSOCS box protein ASB-5 325 100
1406 AAU20593 Homo sapiensHUMA- Human secreted 316 100
protein, Seq
ID No 585.
1407 AAU83222 Homo SapiensZYMO Novel secreted protein895 97
Z930005G2P.
1407 AAY02712 Homo SapiensHUMA- Human secreted 91 56
protein
encoded by gene 63 clone
HBJFV28.
1407 AA000641 Homo SapiensHYSE- Human polypeptide 86 64
SEQ ID
NO 14533.
1408 ABB17944 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6601. 81 53
1408 AAM77906 Homo SapiensMOLE- Human bone marrow 72 40
expressed probe encoded
protein SEQ
ID NO: 38212.
1408 AAM65199 Homo SapiensMOLE- Human brain expressed72 40
single
exon probe encoded protein
SEQ ID
NO: 37304.
1409 gi5230847 Vitreoscilla sp. C1 glutamine synthetase homolog 68 33
1409 gi8515736 Drosophila melanogaster highwire 67 35
1409 gi3138797 Sulfolobus shibatae Ssh7b 65 48
1410 AAW23309 Homo sapiensEIJI- Human Werner's 151 96
syndrome WS-2
protein.
1410 gi1913785 Homo sapiens Rep-8 151 96
1410 gi18089098 Homo sapiens reproduction 8 151 96
1411 gi|21297468|gb|EAA09613.1| Anopheles gambiae str. PEST agCP15537 166 56
1411 gi|20983200|ref|XP_135812.1| Mus musculus RIKEN cDNA 1810030007 73 24
1412 gi532572 Hordeum lipoxygenase 1 82 28
vulgare
1412 gi945419 Mus musculushepatoma derived growth 77 35
factor
(HDGF)
1412 gi17932895 stork hepatitis B virus preC/core antigen 77 26
1413 gi2370143 Homo sapiens immunoglobulin-like domain-containing 1 169 42
1413 gi2645890 Homo sapiens IGSF1 169 42
1413 AAB40232 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 46 SEQ ID NO:142. 162 40
1414 gi21204314 Staphylococcus aureus subsp. aureus MW2 proline-tRNA ligase 78 32
1414 gi14247033Staphylococcusproline-tRNA ligase 78 32
aureus subsp.
aureus Mu50
1414 gi13701063Staphylococcusproline-tRNA ligase 78 32
aureus subsp.
aureus N315
1415 gi9948469 Pseudomonas aeruginosa probable non-ribosomal peptide synthetase 78 31
1415 AAE19251 Homo sapiens BIOI- SOS1 protein sequence from PS462. 75 23
1415 AAU84311 Homo sapiens BAAI~/ Protein ABCB2 differentially expressed in breast cancer tissue. 74 30
1416 gi18676710 Homo sapiens FLJ00254 protein 623 75
1416 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 583 69
1416 gi|18676710|dbj|BAB85007.1| Homo sapiens FLJ00254 protein 623 75
1417 AAR85785 Homo SapiensUYNY Human GRB-10. 77 32
1417 gi841210 Mus musculus growth factor receptor binding protein Grb10 77 32
1417 AAM90963 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:18556. 74 32
1419 AAM79990 Homo SapiensHYSE- Human protein SEQ 82 100
ID NO
3636.
1419 AAM79006 Homo SapiensHYSE- Human protein SEQ 82 100
ID NO
1668.
1419 AAR28494 Homo SapiensXIAM/ Sequence encoded 82 100
by the
CAMPATH-1 antigen cDNA.
1420 AAU01383 Homo SapiensMILL- Human TANGO 499 828 73
form 2,
variant 1 amino acid
sequence.
1420 AAU01382 Homo sapiens MILL- Human TANGO 499 form 2, variant 4 amino acid sequence. 828 73
1420 AAU01380 Homo sapiens MILL- Human TANGO 499 form 2, amino acid sequence. 828 73
1421 gi19069609 Encephalitozoon cuniculi PROTEASOME REGULATORY SUBUNIT YTA6 OF THE AAA FAMILY OF ATPASES 76 26
1422 AAM66177 Homo SapiensMOLE- Human bone marrow199 72
expressed probe encoded
protein SEQ
ID NO: 26483.
1422 AAM53791 Homo SapiensMOLE- Human brain expressed199 72
single
exon probe encoded protein
SEQ ID
NO: 25896.
1422 AAM68472 Homo SapiensMOLE- Human bone marrow176 81
expressed probe encoded
protein SEQ
ID NO: 28778.
1423 11800227 Oryza sativa Bowman-Birk proteinase inhibitor 74 34
1423 gi10141005 San Miguel sea lion virus non-structural polyprotein 74 26
1423 gi|17490177|ref|XP_062300.1| Homo sapiens similar to RING finger protein 18 (Testis-specific ring-finger protein) 76 28
1424 gi461336 Pyrenomonas salina hsp70 75 29
1424 gi13880037 Mycobacterium tuberculosis CDC1551 membrane protein, MmpL family 75 24
1424 gi1449306 Mycobacterium tuberculosis H37Rv mmpL2 75 24
1425 gi15600 Enterobacteria phage T7 gene 7.3, host range 79 30
1425 gi16198065 Drosophila melanogaster LD28477p 77 30
1425 gi11870012 Drosophila melanogaster xnp/atr-x DNA helicase 77 30
1426 gi16185397 Drosophila melanogaster LD39815p 204 44
1426 gi2244793 Arabidopsis thaliana disease resistance N like protein 86 30
1426 AAU84280 Homo sapiens BGHM Human endometrial cancer related protein, HERC1. 77 26
1427 AAY36302 Homo SapiensHUMA- Human secreted 183 79
protein
encoded by gene 79.
1427 AAB88359 Homo SapiensHELI- Human membrane 178 80
or secretory
protein clone PSEC0087.
1427 AAM41635 Homo SapiensHYSE- Human polypeptide178 80
SEQ ID
NO 6566.
1428 AAU82008 Homo sapiens INCY- Human secreted protein SECP34. 114 64
1428 AAB32391 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 21 SEQ ID NO:77. 114 64
1428 AAY08306 Homo sapiens FIBR- Human collagen IX alpha-3 chain protein. 74 45
1429 gi2792523 Ralstonia solanacearum alternative RNA sigma factor RpoS 69 30
1429 gi17428221 Ralstonia solanacearum RNA POLYMERASE SIGMA S (SIGMA-38) FACTOR TRANSCRIPTION REGULATOR PROTEIN 69 33
1429 gi|5032313|ref|NP_004014.1| Homo sapiens dystrophin Dp140bc isoform; Dystrophin (muscular dystrophy, Duchenne and Becker types) 73 26
1433 gi9954445 Rattus norvegicus TEMO 171 62
1433 gi14030260 maize rayado fino virus polyprotein 79 32
1433 AAB95656 Homo sapiens HELI- Human protein sequence SEQ ID NO:18419. 77 36
1434 AAR04212 Homo sapiens CALB- Human 32K alveolar surfactant protein. 391 43
1434 AAP60661 Homo SapiensKUSH/ Genomic sequence 386 43
of human
alveolar surfactant
protein
(hASP)encoded by genomic
DNA.
1434 AAB58135 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 473. 366 42
1435 gi17224904 Mus musculus immunoglobulin superfamily member 9 180 48
1435 gi20988778 Homo sapiens Similar to immunoglobulin superfamily, member 9 173 53
1435 gi14149050 Drosophila melanogaster turtle protein, isoform 4 114 36
1436 gi1465855 Caenorhabditis elegans C. elegans PQN-57 protein (corresponding sequence R09F10.7) 85 23
1436 gi1465856 Caenorhabditis elegans C. elegans PQN-56 protein (corresponding sequence R09F10.2) 85 23
1436 117864717 Mus musculus hornerin 83 26
1437 gi|21292574|gb|EAA04719.1| Anopheles gambiae str. PEST agCP3449 66 33
1438 ABB10160 Homo sapiens HUMA- Human cDNA SEQ ID NO: 468. 166 62
1438 gi9657279 Vibrio cholerae aspartokinase II/homoserine dehydrogenase, methionine-sensitive 71 28
1439 gi4582571 Gallus gallus Hyperion protein, 419 kD isoform 75 24
1439 gi13165 Oenothera biennis ATPase alpha-subunit (aa 1-511) 72 26
1439 gi903838 Oenothera berteriana F-1-ATPase alpha subunit 72 26
1440 gi4558758 Homo sapiens testis-specific chromodomain Y-like protein 233 62
1440 gi4558762 Mus musculus testis-specific chromodomain Y-like protein 231 36
1440 gi3342716 Homo sapiens testis-specific ChromoDomain Y isoform 1 195 36
1441 gi155627 Acanthamoeba castellanii myosin I heavy chain 118 42
1441 gi13093370 Mycobacterium leprae initiation factor IF-2 116 33
1441 AAY20289 Homo sapiens UYRO- Human apolipoprotein E mutant protein fragment 5. 114 39
1442 gi2253707 Mus musculus Daxx 84 36
1442 gi1934970 Plasmodium falciparum AARP1 protein 79 65
1442 14050098 Mus musculus Fas-binding protein 78 34
1443 gi2425111 Dictyostelium discoideum ZipA 90 26
1443 AAY06119 Homo sapiens HARD Human CIITA interacting protein 104 (CIP104). 88 26
1443 gi5420387 Leishmania major proteophosphoglycan 86 21
1444 gi893355 Acinetobacter baumannii L-2,4-diaminobutyrate decarboxylase 77 26
1445 ABB55744 Homo sapiensFECH/ Human polypeptide 135 47
SEQ ID
NO 94.
1445 AAU39035 Homo sapiens GEMY Human secreted protein nh328_5. 135 47
1445 AAY28679 Homo sapiens GEMY Human nh328_5 secreted protein. 135 47
1446 gi19744390 Homo sapiens retinoic acid inducible in neuroblastoma cells RAINB1d 247 54
1446 gi19744388 Homo sapiens retinoic acid inducible in neuroblastoma cells RAINB1 247 54
1446 AAY85565 Homo sapiens JANC Human homologue of UNC-53 (Hs-UNC-53/2) sequence. 240 52
1447 AAU19716 Homo sapiens HUMA- Human novel extracellular matrix protein, Seq ID No 366. 71 31
1447 gi18025476 cercopithicine herpesvirus 15 BPLF1 71 38
1447 AAS14575_aa1 Homo sapiens MILL- Human cDNA encoding G protein-coupled receptor, GPCR, 52872. 69 62
1448 gi14027507 Mesorhizobium loti salicylate hydroxylase 69 31
1449 AAG64798 Homo sapiens SREH- Human peptide methionine sulphoxide reductase (hPMSR). 192 71
1449 AAB81893 Homo SapiensSEQU- Human genomic database192 71
related protein SEQ ID
NO: 38.
1449 AAM42046 Homo SapiensHYSE- Human polypeptide 192 71
SEQ ID
NO 6977.
1450 gi18249657 Mus musculus NC8 1063 80
1450 1406748 Mus musculus zinc finger protein 250 37
1450 AAB43498 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:943. 249 37
1451 ABB89331 Homo SapiensHUMA- Human polypeptide 732 88
SEQ ID
NO 1707.
1451 gi13421927 Caulobacter crescentus CB15 MaoC family protein 273 42
1451 gi19338616 Methylobacterium extorquens R-specific enoyl-CoA hydratase 261 44
1452 gi|20908171|ref|XP_139715.1| Mus musculus similar to NADPH oxidase 3; NADPH oxidase catalytic subunit-like 3 68 30
1452 gi|17533619|ref|NP_495516.1| Caenorhabditis elegans F32A5.8.p 67 42
1453 gi|15614051|ref|NP_242354.1| Bacillus halodurans sodium-dependent phosphate transporter 65 34
1454 gi|17551878|ref|NP_499090.1| Caenorhabditis elegans TPR Domain 76 29
1455 AAM40727 Homo SapiensHYSE- Human polypeptide 191 56
SEQ ID
NO 5658.
1455 AAM38941 Homo SapiensHYSE- Human polypeptide 191 56
SEQ ID
NO 2086.
1455 gi19702127 Homo sapiens P-Rex1 protein 191 56
1456 ABB05666 Homo sapiens GEHU- Human nucleic acid management protein clone amy2 l 1n4. 496 91
1456 AAE03372 Homo sapiens HUMA- Human gene 18 encoded secreted protein fragment, SEQ ID NO:152. 496 91
1456 AAE03371 Homo sapiens HUMA- Human gene 18 encoded secreted protein fragment, SEQ ID NO:150. 496 91
1457 AAM66940 Homo SapiensMOLE- Human bone marrow 290 77
expressed probe encoded
protein SEQ
ID NO: 27246.
1457 AAM54534 Homo SapiensMOLE- Human brain expressed290 77
single
exon probe encoded protein
SEQ ID
NO: 26639.
1457 AAM64410 Homo SapiensMOLE- Human brain expressed287 77
single
exon probe encoded protein
SEQ ID
NO: 36515.
1458 AAB53445 Homo sapiens HUMA- Human colon cancer antigen protein sequence SEQ ID NO:985. 335 100
1458 AAY30055 Homo sapiens ARIA- Amino acid sequence of a FK506-binding protein (FKBP). 165 91
1458 AAQ52277_aa1 Homo sapiens VERT- FK506 binding protein (FKBP12A) cDNA. 159 100
1460 AAU20255 Homo sapiens HUMA- Human novel endocrine antigen, SEQ ID No 312. 104 76
1460 ABB17663 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6320. 94 77
1460 AA002331 Homo SapiensHYSE- Human polypeptide 88 61
SEQ ID
NO 16223.
1461 AAM65951 Homo SapiensMOLE- Human bone marrow 97 57
expressed probe encoded
protein SEQ
ID NO: 26257.
1461 AAM53568 Homo SapiensMOLE- Human brain expressed97 57
single
exon probe encoded protein
SEQ ID
NO: 25673.
1461 AAU83199 Homo sapiensZYMO Novel secreted protein96 38
Z891639G1P.
1463 15565687 Homo sapiens topoisomerase-related function protein 514 75
1463 15139669 Homo sapiens LAK-1 468 75
1463 gi21430468 Drosophila melanogaster LP06848p 332 51
1464 AAY91421 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:142. 109 35
1464 AAY91396 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:117. 109 35
1464 AAY91352 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:73. 109 35
1465 AAU15978 Homo sapiens HUMA- Human novel secreted protein, Seq ID 931. 575 100
1465 AAU15958 Homo sapiens HUMA- Human novel secreted protein, Seq ID 911. 575 100
1465 116041675 Homo sapiens joined to JAZF1 575 100
1466 AA001502 Homo SapiensHYSE- Human polypeptide 173 66
SEQ ID
NO 15394.
1466 gi|10947038|ref|NP_065209.1| Homo sapiens ankyrin 1, isoform 1; ankyrin-1, erythrocytic; ankyrin-R 74 28
1466 gi|10947036|ref|NP_065208.1| Homo sapiens ankyrin 1, isoform 4; ankyrin-1, erythrocytic; ankyrin-R 74 28
1467 gi19354550 Mus musculus similar to src homology three (SH3) and cysteine rich domain 842 91
1467 AAU17352 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 917. 361 98
1467 gi1799566 Mus musculus stet 302 44
1468 gi13506771 Mus musculus structural protein FBF1 767 74
1468 gi7549210 Babesia bigemina 200 kDa antigen p200 213 29
1468 gi1747 Oryctolagus cuniculus trichohyalin 191 30
1469 111345048 Homo sapiens SCAN domain-containing protein 2 86 32
1469 111320940Homo SapiensSCAND2 86 32
1469 g114210722Tupaia t41 86 30
herpesvirus
1470 AAY88278 Homo sapiens MILL- Human TANGO 188 protein. 1442 100
1470 114336711 Homo sapiens similar to C. Elegans protein F17C8.5 1442 100
1470 AAA39947_aa1 Homo sapiens MILL- Human TANGO 188 cDNA. 1438 99
1471 AAE10204 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 69. 71 44
1471 AAA23458_aa1 Homo sapiens ALPH- cDNA encoding human secreted protein vpl5_l, SEQ ID NO:71. 67 46
1471 AAB80228 Homo sapiens GETH Human PRO269 protein. 67 46
1472 AAB88433 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0210. 136 86
1472 AAB95155 Homo sapiens HELI- Human protein sequence SEQ ID NO:17188. 136 86
1472 AAE01745 Homo sapiens HUMA- Human gene 2 encoded secreted protein HOGCS52 variant, SEQ ID NO:160. 136 86
1473 gi9294201 Arabidopsis thaliana disease resistance protein 70 24
1474 AAE19157 Homo sapiens THOR/ Human kinase polypeptide (PKIN-15). 631 98
1474 AAM79131 Homo sapiens HYSE- Human protein SEQ ID NO 1793. 494 72
1474 AAW19920 Homo sapiens REGC Human Ksr (kinase suppressor of Ras). 494 72
1475 AAD12609_aa1 Homo sapiens SAGA Human protein having hydrophobic domain encoding cDNA clone HP03974. 657 73
1475 AA014199 Homo sapiens INCY- Human transporter and ion channel TRICH-16. 657 73
1475 AAE06614 Homo sapiens SAGA Human protein having hydrophobic domain, HP03974. 657 73
1476 113905246 Mus musculus RIKEN cDNA 2410024K20 gene 71 34
1476 gi|17505208|ref|NP_081629.1| Mus musculus CD2 antigen (cytoplasmic tail) binding protein 2; 1500011B02Rik 71 34
1477 gi806491 Rattus norvegicus guanylyl cyclase 140 65
1477 gi2648066 Canis familiaris guanylate cyclase E 118 55
1477 gi2623074 Bos taurus rod outer segment guanylate cyclase precursor 116 55
1478 12065210Mus musculusPro-Pol-dUTPase polyprotein585 73
1478 118676710Homo SapiensFLJ00254 protein 408 69
1478 AA004042Homo SapiensHYSE- Human polypeptide 392 75
SEQ ID
NO 17934.
1479 AAU05396 Homo sapiens GEHO Human titin (connectin) protein sequence. 208 29
1479 gi1212992 Homo sapiens Protein sequence and annotation available soon via Swiss-Prot; available at present via e-mail from LABEIT@EMBL-Heidelberg.DE 208 29
1479 gi17066105 Homo sapiens Titin 208 29
1480 AAV44685_aa1 Homo sapiens TEXA Osteoclast inhibitor protein, OIP-1, coding sequence. 94 41
1480 AAB35287 Homo sapiens UROG- Human stem cell antigen-2. 94 41
1480 AAY99709 Homo sapiens REGC Human stem cell antigen-2, hSCA-2. 94 41
1481 AAB57094 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1672. 122 100
1481 gi32672 Homo sapiens interferon alpha/beta receptor 122 100
1481 AAQ49625_aa1 Homo sapiens EUBI- Human interferon receptor extracellular domain coding sequence. 118 96
1482 AAD17516_aa1 Homo sapiens SENO- Human taste receptor, hT1R1 cDNA coding sequence. 890 94
1482 ABB77319 Homo sapiens INCY- Human G-protein coupled receptor SEQ ID NO 3. 890 94
1482 AAE10372 Homo sapiens SENO- Human taste receptor, hT1R1 protein. 890 94
1483 gi18376312 Neurospora crassa related to SSD1 protein 109 39
1483 gi2645173 Schizosaccharomyces pombe sts5+ 99 42
1483 gi2459997 Candida albicans protein phosphatase Ssd1 homolog 99 40
1484 gi|18569064|ref|XP_095378.1| Homo sapiens similar to 40S RIBOSOMAL PROTEIN S3A (V-FOS TRANSFORMATION EFFECTOR PROTEIN) 319 96
1484 gi|20539276|ref|XP_095220.2| Homo sapiens similar to olfactory receptor MOR145-2 259 94
1484 gi|21295882|gb|EAA08027.1| Anopheles gambiae str. PEST agCP1347 68 32
1485 ABB11761 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2131. 197 36
1485 gi930259 Woolly monkey sarcoma virus reverse transcriptase (476 AA) 148 33
1485 gi18076262 porcine endogenous retrovirus Pol protein 147 38
1486 AAM74887 Homo SapiensMOLE- Human bone marrow 172 100
expressed probe encoded
protein SEQ
ID NO: 35193.
1486 AAM62085 Homo sapiensMOLE- Human brain expressed172 100
single
exon probe encoded protein
SEQ ID
NO: 34190.
1486 1152661 Plasmid SB24.2 neomycin resistance protein 75 26
1487 112653493 Homo sapiens Similar to brain acid-soluble protein 1 75 34
1487 gi17428832 Ralstonia solanacearum PROBABLE AVRBS3-LIKE PROTEIN 75 33
1487 gi7329672 Arabidopsis thaliana phosphatidate cytidylyltransferase-like protein 72 46
1488 AAU74754 Homo sapiens INCY- Human protease PRTS-14 protein sequence. 2042 83
1488 AAU74752 Homo sapiens INCY- Human protease PRTS-12 protein sequence. 476 39
1488 111935122 Mus musculus a ilin 431 40
1489 gi|17543712|ref|NP_499976.1| Caenorhabditis elegans YSSF3C.8.p 72 32
1489 gi|20344600|ref|XP_109579.1| Mus musculus RIKEN cDNA 4933431K05 70 30
1489 gi|11692798|gb|AAG40002.1|AF320125_1 Xenopus laevis ataxia telangiectasia and Rad3-related protein 69 26
1490 AAB95817 Homo sapiens HELI- Human protein sequence SEQ ID NO:18817. 256 63
1490 ABB06369 Homo sapiens BODE- Human neurogenesis related protein 12 SEQ ID NO:2. 173 64
1490 AAB44394 Homo sapiens HUMA- Gene 10 encoded human secreted protein fragment as BLASTX query sequence. 83 66
1491 gi438795 Mus musculus serotonin 1A receptor 73 26
1491 gi1066326 Mus musculus serotonin 1A receptor 72 26
1491 gi|438795|gb|AAA16850.1| Mus musculus serotonin 1A receptor 73 26
1492 gi16198083 Drosophila melanogaster LD29875p 87 33
1492 gi2327063 Pneumocystis carinii f. sp. carinii protease 1 75 34
1492 120420 Prunus dulcis extensin 75 34
1493 AAG67087 Homo sapiens SHAN- Human ATP-dependent serine protein hydrolase 13. 106 67
1493 AAM76636 Homo SapiensMOLE- Human bone marrow103 68
expressed probe encoded
protein SEQ
ID NO: 36942.
1493 AAM63822 Homo SapiensMOLE- Human brain expressed103 68
single
exon probe encoded protein
SEQ ID
NO: 35927.
1494 AAY31225 Homo sapiens AVET Human RNA helicase p135 protein. 73 38
1494 gi3123906 Homo sapiens pre-mRNA splicing factor 73 38
1494 gi13278975 Homo sapiens pre-mRNA splicing factor similar to S. cerevisiae P 16 73 38
1495 gi|17568307|ref|NP_509837.1| Caenorhabditis elegans collagen 74 35
1496 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 410 81
1496 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 301 77
1496 gi|6753924|ref|NP_034374.1| Mus musculus Friend virus susceptibility 1 127 37
1497 gi20901968 Caenorhabditis elegans C. elegans RPL-36 protein (corresponding sequence F37C12.4) 71 34
1497 gi|17554754|ref|NP_498573.1| Caenorhabditis elegans Ribosomal protein YL39 71 34
1498 gi5305335 Mycobacterium tuberculosis proline-rich mucin homolog 102 27
1498 gi330130 human herpesvirus 1 latency associated transcript (LAT) ORF-2 97 37
1498 AAU83682 Homo sapiens GETH Human PRO protein, Seq ID No 182. 94 30
1499 AAY57937 Homo sapiens INCY- Human transmembrane protein HTMPN-61. 199 81
1499 AAY36295 Homo sapiens HUMA- Human secreted protein encoded by gene 72. 151 100
1499 AAG75708 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6472. 141 92
1500 gi21428712 Drosophila melanogaster SD05267p 165 54
1500 gi20975274 Homo sapiens skeletrophin 114 40
1500 gi19773434 Mus musculus skeletrophin 99 52
1501 ABB17830 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6487. 82 37
1501 AA012929 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26821. 73 43
1502 gi8778340 Arabidopsis thaliana F15O4.13 77 39
1503 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 144 33
1503 11339910 Homo sapiens DOCK180 protein 144 33
1503 113195147 Mus musculus HCH 129 25
1505 AAM70790 Homo SapiensMOLE- Human bone marrow 77 53
expressed probe encoded
protein SEQ
ID NO: 31096.
1505 AAM58316 Homo SapiensMOLE- Human brain expressed77 53
single
exon probe encoded protein
SEQ ID
NO: 30421.
1505 gi|21302711|gb|EAA14856.1| Anopheles gambiae str. PEST agCP4916 77 30
1506 AAU75102 Homo sapiens MYRI- Heat shock protein 8 (Hsp8). 592 79
1506 AAB82535 Homo sapiens UYCO- Human heat shock protein Hsc70. 592 79
1506 AAE12987 Homo sapiens SRIV/ Human Hsp70 family homologue, Hsc70. 592 79
1507 ABL53627_aa1 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) cDNA. 213 92
1507 ABB75677 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) protein. 213 92
1507 AAY99421 Homo sapiens GETH Human PRO1433 (UNQ738) amino acid sequence SEQ ID NO:292. 213 92
1508 AAW15565 Homo sapiens UYJO Human intracellular tyrosine kinase Tnk1-alpha. 79 29
1508 gi233062 Gallus gallus src downstream region 78 33
1508 gi18376366 Neurospora crassa related to ribosomal protein S15 precursor (mitochondrial) 72 30
1509 gi|21297482|gb|EAA09627.1| Anopheles gambiae str. PEST agCP15541 68 36
1510 AAM41631 Homo SapiensHYSE- Human polypeptide 127 37
SEQ ID
NO 6562.
1510 AAM39845 Homo sapiensHYSE- Human polypeptide 127 37
SEQ ID
NO 2990.
1510 AAM79502 Homo SapiensHYSE- Human protein SEQ 127 37
ID NO
3148.
1511 gi21217669 Mus musculus myosin IIIA 70 28
1511 gi|21302393|gb|EAA14538.1| Anopheles gambiae str. PEST agCP8799 71 36
1511 gi|20822589|ref|XP_140854.1| Mus musculus similar to myosin IIIA 70 28
1512 gi6911049 Babesia bovis p9.6.2-like variant erythrocyte surface antigen-1a 82 28
1512 gi6911045 Babesia bovis p9.6.2 variant erythrocyte surface antigen-1a 82 28
1512 gi6911047 Babesia bovis p8.4.1 variant erythrocyte surface antigen-1a 81 28
1513 gi10174843 Bacillus halodurans maltose transport system (permease) 77 25
1513 gi56312 Rattus norvegicus Gephyrin 76 31
1513 gi4325371 Arabidopsis thaliana contains similarity to Medicago truncatula N7 protein (GB:Y17613) 74 28
1514 AAY14196Homo SapiensTAKEI T cell receptor 95 100
zeta chain
protein sequence.
1514 1623042 Homo SapiensT-cell receptor zeta 95 100
chain
1514 14960202Sus scrofa CD3 zeta chain 95 100
1515 ABB07508 Homo sapiens INCY- Human aminoacyl tRNA synthetase (ATRS) polypeptide (ID: 7474756CD1). 726 100
1515 AAB43670 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1115. 604 82
1515 gi1464742 Homo sapiens threonyl-tRNA synthetase 604 82
1516 gi21109348 Xanthomonas axonopodis pv. citri str. 306 cytochrome B561 77 29
1516 gi21114046 Xanthomonas campestris pv. campestris str. ATCC 33913 cytochrome B561 76 28
1516 gi|21243760|ref|NP_643342.1| Xanthomonas axonopodis pv. citri str. 306 cytochrome B561 77 29
1517 ABB11450 Homo sapiens HYSE- Human neurotoxin homologue, SEQ ID NO:1820. 119 33
1517 18809770 Mus musculus Ly-6I.1 94 30
1517 18809768 Mus musculus lymphocyte antigen LY6I precursor 94 30
1519 gi|59977|emb|CAA78662.1| Human endogenous retrovirus tripartite fusion transcript PLA2L 171 67
1519 gi|17826947|dbj|BAB79287.1| Pseudomonas sp. ND137 beta-1,4-xylanase 73 34
1519 gi|21232680|ref|NP_638597.1| Xanthomonas campestris pv. campestris str. ATCC 33913 ribonuclease PH 72 30
1520 AAM78023Homo sapiensMOLE- Human bone marrow 190 100
expressed probe encoded
protein SEQ
ID NO: 38329.
1520 AAM65326Homo sapiensMOLE- Human brain expressed190 100
single
exon probe encoded protein
SEQ ID
NO: 37431.
1520 gi13447468 Emericella nidulans FH1/FH2 protein homolog 121 49
1522 AAG81417 Homo sapiens ZYMO Human AFP protein sequence SEQ ID NO:352. 287 100
1523 AAY90349 Homo sapiens SMIK Human fatty acid synthase (FAS) protein sequence. 158 85
1523 AAB43871 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1316. 158 85
1523  1915392  Homo sapiens  fatty acid synthase  158  85
1525  AAG03819  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 7900.  93  100
1525  11311466  Homo sapiens  24-kDa subunit of Complex I  93  100
1525  g1188852  Homo sapiens  NADH-ubiquinone reductase  93  100
1526  AAD02855_aa1  Homo sapiens  SUKA Human platelet membrane glycoprotein VI (GPVI) cDNA.  73  31
1526  AAB49403  Homo sapiens  MERE Human glycoprotein VI mature protein.  73  31
1526  AAB61257  Homo sapiens  MILL- Mature human TANGO 268 protein.  73  31
1527  g117864896  Mus musculus  protocadherin 18 precursor  81  31
1527  g115980222  Yersinia pestis  aconitate hydratase 1  79  30
1527  g112248353  Fasciola hepatica  NADH dehydrogenase subunit 5  75  56
1528 g12440214Trypanosomainvariant surface glycoprotein83 28
100
bruceibrucei
1528 g110567463Rhizobium probable viral gene 78 22
rhizogenes
.
1529 g12231279Porcine envelope protein 66 31
reproductive
and
respiratory
syndrome
virus
1530  gi|199851|gb|AAA39757.1|  Mus musculus  pol protein  257  42
1530  gi|1498648|gb|AAB06450.1|  Mus musculus  Gag-Pol polyprotein  257  42
1530  gi|331995|gb|AAB03091.1|  AKV murine leukemia virus  gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV)  257  42
1533  g1435698  Homo sapiens  CD44SP  136  100
1533  AAV63461_aa1  Homo sapiens  GEHO Human CD44 antigen cDNA.  130  100
1533  AAT14724_aa1  Homo sapiens  GEHO Human haematopoietic CD44 cDNA clone CD44.5.  130  100
1534  g12622165  Methanothermobacter thermautotrophicus str. Delta H  acetyltransferase  71  29
1534  gi|15679078|ref|NP_276195.1|  Methanothermobacter thermautotrophicus  acetyltransferase  71  29
1535 g17777 Drosophila protein H 73 28
melanogaster
1535 g1457146 Plasmodium rhoptryprotein 73 38
yoelii
1535 g113195258Plasmodium 235 kDa rhoptry protein 73 38
yoelii yoelii
1536 ABB09740 Homo sapiensBODE- Amino acid sequence132 43
of human
protein hos hatase 11.66.
1536 gi~20830386~Mus musculussimilar to importin alpha72 35
1b
reflXP
1456
42.1
1537 gi14039907Rattus cytochrome P450 monooxygenase353 39
norvegicus CYP2T1
1537 gi2920650Mus musculuscytochrome P450 CYP2B19 275 44
1537 12353336 Capra hircuscytochrome P450 271 31
1538 AAU83175 Homo SapiensZYMO Novel secreted protein282 100
Z874015G4P.
1538 g16714803Streptomycesintegral membrane protein.77 26
coelicolor
A3(2)
1539 g112963397Prunus x ribulose-1,5-bisphosphate74 32
yedoensis carboxylase/oxygenase
lar a subunit
1539 g1466436 SaccharomycesBOI1 69 31
cerevisiae
1539 g15833897Besleria ribulose 1,5-bisphosphate69 31
affinis carboxylase
large subunit
1542 AAY32193 Homo SapiensINCY- Human receptor 73 26
molecule
(REC) encoded by Incyte
clone
044150.
1542 g17576677HelicobacterIceAl 72 44
ylori
1542 gi~20841498~Mus musculussimilar to MUF1 protein 73 26
re~XP_l
315
41.1
1546 114581448Homo SapiensFSHD Region Gene 2 protein73 42
1546 g115982852ArabidopsisAT5g66850/MUD21_ll 71 34
thaliana
1546 gi~14581448~Homo SapiensFSHD Region Gene 2 protein73 42
gb~AAK219
77.1 ~
1547 g118676660Homo sa FLJ00229 protein 192 92
iens
1547 AAU21409 Homo SapiensHUMA- Human novel foetal179 100
antigen,
SEQ ID NO 1653.
1547 AAM42128 Homo SapiensHYSE- Human polypeptide 114 53
SEQ ID
NO 7059.
1548 AAG64494 Homo SapiensSHAN- Human natriuretic 539 100
peptide
receptor 18.
1548 118676710Homo sa FLJ00254 rotein 268 77
iens
1548 AAB28764 Homo SapiensHUMA- Sequence homologous249 72
to
rotein fragment encoded
by gene 21.
1549 AAB67055 Homo Sapiens1NCY- Human immune response606 82
molecule (IMUN) protein
SEQ ID NO:
9.
1549 AA001862 Homo SapiensHYSE- Human polypeptide 404 72
SEQ ID
NO 15754.
1549 gi~6753924~rMus musculusFriend virus susceptibility213 36
1
ef~NP
0343
_
74.1 ~
1550 1190129 Homo Sapiens70kDa peroxisomal membrane92 100
protein
1550 g1825711 Homo Sapiens7bkD peroxisomal integral92 100
membrane
protein
1550 g1220862 Rattus PMP70 89 94
norve icus
1551 AAM69543 Homo SapiensMOLE- Human bone marrow 228 100
expressed robe encoded
rotein SEQ
ID NO: 29849.
1551 AAM57148 Homo SapiensMOLE- Human brain expressed228 100
single
exon probe encoded protein
SEQ ID
NO: 29253.
1551 AAB93944 Homo SapiensHELI- Human protein 94 57
sequence SEQ
ID N0:13960.
1552 gi4884924Rangiferine glycoprotein C 75 34
he esvirus
1
1552 gi~18556240~Homo sapienssimilar to Salivary 78 30
glue protein SGS-3
ref~~ precursor
0676
28.2
1552 gi~4884924~gRangiferine glycoprotein C 75 34
b~AAD3187herpesvirus
1
6.1~
1553  gi|2193870|dbj|BAA20419.1|  Mus musculus  reverse transcriptase  176  35
1553  gi|2731767|gb|AAC53542.1|  Mus musculus  endonuclease/reverse transcriptase  176  35
1554 ABB08776 Homo SapiensBODE- Human neuregulin 75 29
55 SEQ ID
NO 2.
1554 AAM92816 Homo SapiensHUMA- Human digestive 71 29
system
antigen SEQ ID NO: 2165.
1554 gi~6322838~rSaccharomycesProtein required for 70 27
cell viability;
ef~NP cerevisiae Yk1014cp
0129
_
11.1
1555 gi7528184Drosophila bicoid-interacting protein78 28
B1N3
melanogaster
1555 gi15292595Drosophila SD09926p 78 28
melanogaster
1555 gi4514620Mus musculusRor2 71 24
1557 ABA91504_Homo SapiensEYEE- Human epidermal 144 93
growth factor
aal rece for recursor cDNA.
1557 AAF85332_Homo SapiensNOVS Nucleotide sequence144 93
of wild
aal a EGFRl.
1557 AAM50768 Homo SapiensEPEE- Human epidermal 144 93
growth factor
receptor precursor.
1558 AAB99950 Homo SapiensSHAN- Human alkylated-DNA-protein221 100
cysteine methyltransferase
14.
1558 AAU16267 Homo SapiensHUMA- Human novel secreted221 100
protein,
Seq ID 1220.
1558 ABB 11507Homo SapiensHYSE- Human secreted 183 97
protein
homologue, SEQ ID N0:1877.
1559 gi14599730Sachea correaematurase 71 28
1559 gi14599648Blepharandramaturase 71 30
hetero etala
1559 gi14599673Galphimia maturase 70 28
acilis
1560 gi2323287multiple polyprotein 340 83
sclerosis
associated
retrovirus
1560 gi 13310191multiple recombinant envelope 260 70
protein
gb~AAK181sclerosis
89.1~AF331associated
500_1 retrovirus
element
1560 gi~21103962~Homo Sapiensenverin-2 248 84
gb~AAM331
41.1
1561 AAB94698 Homo SapiensHELI- Human protein sequence107 95
SEQ
ID NO:15680.
1561 AAU18480 Homo SapiensHUMA- Human endocrine 107 95
polypeptide
SEQ ID No 435.
1561 ABB 10288Homo sapiensHUMA- Human cDNA SEQ 107 95
ID NO:
596.
1562 gi969078 Drosophila S-adenosylhomocysteine 73 26
hydrolase
melanogaster
1562 gi21064553Drosophila RE58316p 73 26
melano aster
1562 AAM41205 Homo SapiensHYSE- Human polypeptide 72 30
SEQ ID
NO 6136.
1563 gi1778844DictyosteliumLimA 71 34
discoideum
1563 gi~20985456~Mus musculussimilar to actin beta 75 36
chain - human
ref~XP-1421
11.1
1563 gi~1778844~gDictyosteliumLimA 71 34
b~AAB4092discoideum
9.1~
1564 gi~9507757~rPlasmid resolvase 507 91
F
etlNP_0614
23.1
1564 gi~148589~gbPlasmid Protein D 507 91
F
~AAA24900.
1~
1564 gi~10955295~Escherichiaresolvase 501 90
coli
retlNP_0526
36.1
1565 gi7649370Arabidopsisguanine nucleotide-exchange-like77 38
thaliana rotein
1565 gi1674160Mycoplasma involved in cytadherence,71 35
see:
neumoniae MPN142
1565 gi~15229258~Arabidopsisguanine nucleotide-exchange77 38
- like
ref~NP_1899thaliana protein
16.1
1566 gi1799600SwissProt similar to 1051 99
Accession
Number P31458
1566 gi13814506Sulfolobus Mandelate racemase /muconate286 35
solfataricuslactonizing enzyme related
protein
(MR/MLE)
1566 gi10640034Thermoplasmastarvation-sensing protein270 35
rspA related
acido hilumprotein
1567 gi13359972Escherichiaacridine efflux pump 573 98
coli
0157:H7
1567 gi1773144Escherichiaprobable transmembrane 573 98
coli protein AcrE
1567 gi532311 Escherichia114 kDa rotein 573 98
coli
1569 gi8918871YccA of 96 pct identical to gp:AB021078288 98
plasmid 30
ColIb-P9]
[Plasmid
F
1569 gi~17136976~Drosophila repo-P1; Antibody RK2 71 33
ref~NP_4770melanogaster
26.1)
1569 gi~6502544~gGlomus homeobox protein HB 1 70 31
b~AAF14351intraradices
.1~AF11019
81
1570 gi13363792Escherichiazinc-transporting ATPase410 87
coli
0157:H7
1570 gi466605 EscherichiaNo definition line found410 87
coli
1570 gi12518128Escherichiazinc-transporting ATPase410 87
coli
0157:H7
EDL933
1571 AAU83186 Homo SapiensZYMO Novel secreted protein1006 100
Z887014G7P.
1571 gi7248459Zea mays arabinogalactan protein 85 29
1571 gi3513742Arabidopsiscontains similarity to 82 35
Zea mays
thaliana embryogenesis transmembrane
protein
(GB:X97570)
1572 gi12597465CaenorhabditisCED-1 72 44
elegans
1572 gi19571666Caenorhabditissimilar to EGF-like domain72 44
elegans
1572 gi4883938Drosophila laminin alphal,2 67 31
melanogaster
1573 ABB12490 Homo sapiensHYSE- Human bone marrow 106 38
expressed
rotein SEQ ID NO: 329.
1574 11478205 Mus musculusPNG rotein 75 41
1574 AAM40148 Homo SapiensHYSE- Human polypeptide 69 56
SEQ ID
NO 3293.
1574 AAM79341 Homo SapiensHYSE- Human protein SEQ 69 35
ID NO
2987.
1576 gi~20882651~Mus musculusATPase, class 2, member 234 91
b
ref~XP_1233
03.1
1576 gi~7656918~rMus musculusATPase, class 2, member 234 91
b; ATPase
ef]NP_0566 9B, class II; ATPase
9B, p type
20.1 ~
1577 g118143418Alteromonaschitinase A 77 39
Sp.
O-7
1577 g115426105Leishmania probable surface antigen75 24
protein
ma'or
1578  119702241  Homo sapiens  rabconnectin  439  93
1578  g17452946  Homo sapiens  X-like 1 protein  132  41
1578  g11279384  Drosophila melanogaster  X  109  29
1580 AAE20337 Homo SapiensHUMA- Human B7-H11 protein122 23
mature extracellular
domain.
1580 AAE20336 Homo SapiensHUMA- Human B7-H11 protein122 23
extracellular domain.
1580 gi2062702Homo sa butyrophilin 122 23
iens
1581 AAE18640 Homo SapiensINCY- Human G-protein 70 35
coupled
rece for (GCREC-1).
1581 118369751Oryza sativaethylene res onsive rotein70 50
1581 g115217292Oryza sativa]Putative AP2 domain containing70 50
[Oryza sativaprotein
(japonica
cultivar-
oup)
1583  g16468047  Homo sapiens  Kruppel-like factor  85  73
1583  g15916096  Homo sapiens  Kruppel-like factor LKLF  85  73
1583  g14583418  Homo sapiens  Kruppel-like zinc finger transcription factor  85  73
1585  g12570021  Homo sapiens  paired box containing transcription factor  77  37
1585  13115988  Homo sapiens  dJ394P2-1.1 (PAX-7)  77  37
1585  12570015  Homo sapiens  alternative  77  37
1586 g17861533Rattus retina specific protein 72 43
PAL
norvegicus
1586 g120977028Xenopus mitotic hosphoprotein 72 34
laevis 39
1586 AAB58458 Homo SapiensROSE/ Lung cancer associated68 39
polype tide se uence
SEQ ID 796.
1587 g15901864Drosophila BcDNA.LD27873 81 24
melanogaster
1587 g115458514StreptococcusPneumococcal histidine 78 27
triad protein D
neumoniae precursor
R6
1587 15042400 Homo sa NFI-X3=transcription 75 30
iens factor AA
1592 g14210501Homo sa BC85722_1 253 61
iens
1592 g114794910Homo sa ca icua protein 253 61
iens
1592 114794914Mus musculusca icua protein 253 61
1593 gi~8131854~gTrypanosomaantigen JL8 69 34
b~AAF73108cruzi
.1 CAF
14795
61
1595 g118892729Pyrococcus 3-hydroxyisobutyrate 70 27
dehydrogenase
furiosus
DSM
3638
1595 gi~20847046~Mus musculussimilar to Transcription70 28
factor BTF3
ref~XP_1366 (RNA polymerise B transcription
21.1 factor 3)
1595 gi~18977088~Pyrococcus 3-hydroxyisobutyrate 70 27
dehydrogenase
ref~NP_5784furiosus
DSM
45.1 3638
1597 AAU83621 Homo SapiensGETH Human PRO protein, 151 42
Seq ID No
60.
1597 AA005826 Homo SapiensHYSE- Human polypeptide 146 83
SEQ ID
NO 19718.
1597 AAM41346 Homo SapiensHYSE- Human polypeptide 102 46
SEQ ID
NO 6277.
1598 AAM79503 Homo SapiensHYSE- Human protein SEQ 80 35
ID NO
3149.
1598 AAM78519 Homo SapiensHYSE- Human protein SEQ 80 35
ID NO
1181.
1598 g118676526Homo sa FLJ00160 rotein 80 35
iens
1599 g12149640ArabidopsisAr~onaute protein 72 33
thaliana
1599 gi15027491respiratoryglycoprotein 71 32
syncytial
virus
1599 gig 15221177Arabidopsisleaf development protein72 33
Argonaute
reflNP-1752thaliana
74.1
1601 gi17130010Nostoc Sp. WD-40 repeat protein 136 28
PCC
7120
1601 gi1653631Synechocystisbeta transducin-like 131 26
protein
s . PCC '
6803
1601 gi17135261Nostoc Sp. WD-40 repeat protein 115 27
PCC
7120
1602 gi1103853Rattus rHAPl-A 89 33
norve icus
1602 gi1103851Rattus huntingtin associated 89 33
protein
norve icus
1602 gi14579673Takifugu pericentriolar material 87 30
1 protein
rubripes
1603 gi537446 ArabidopsisAtHSP101 75 31
thaliana
1603 gi12324908Arabidopsisheat shock protein 101; 75 31
13093-16240
thaliana
1603 gi6715468Arabidopsisheat shock protein 101 75 31
thaliana
1604 12190531 Vibrio choleraemethyl acceptin chemotaxis71 26
rotein
1604 g19657614Vibrio choleraehemolysin secretion protein71 26
HyIB
1604 g19655306Vibrio choleraeheat shock rotein E 70 35
1605 g13912936Geobacillusornithine carbamoyltransferase68 31
stearothermophil
us
1606 g18797 Drosophila CYS3HIS finger protein 678 51
melano aster
1606 g115291975Drosophila LD33756p 617 65
melanogaster
1606 g16967181Homo Sapiensc399E4.1 (similar to 549 75
D.melanogaster
unkem t protein.)
1607 gi~21301783~Anopheles agCP8730 72 35
gb~EAA139gambiae
str.
28.1 PEST
1607 gi~21361276~Homo Sapiensinterferon-stimulated 68 29
transcription
ref~NP_0060 factor 3, gamma (48kD);
interferon-
75.2~ stimulated gene factor
3, gamma
subunit (48 kD)
1609 g12661094Spinacia cold acclimation protein76 32
oleracea
1612 gi~1780975~eHuman gag protein 312 34
mb~CAA714endogenous
18.1 ~ retrovirus
K
1612 gi~5802810~gHomo SapiensGag-Pro-Pol protein 309 34
b~AAD5179
1.1~
1612 gi~887448~eHuman gag 309 34
mb~CAA513endogenous
06.1 ~ retrovirus
1613 AA013889Homo SapiensHYSE- Human polypeptide 73 42
SEQ ID
NO 27781.
1614  111065727  Homo sapiens  dJ493F7.1 (similar to murine BET3)  347  100
1614  g12791806  Mus musculus  beta  253  69
1614  113277654  Mus musculus  Bet3 homolog (S. cerevisiae)  253  69
1615 g11122901SaccharomycesMSP8 77 20
cerevisiae
1615 g1825546SaccharomycesCatBp 77 20
cerevisiae
1615 g117978563Xeno us laevisSpl-like zinc-finger 75 40
protein XSPR-1
1616  AAY02536  Homo sapiens  ICOS- Human ICAM-6 protein sequence.  458  98
1616  g112248907  Homo sapiens  TCAM-1  458  98
1616  g14579740  Rattus norvegicus  testicular cell adhesion molecule 1 (TCAM1)  366  76
1617 AAM67067Homo SapiensMOLE- Human bone marrow 271 64
expressed probe encoded
protein SEQ
ID NO: 27373.
1617 AAM54664Homo SapiensMOLE- Human brain expressed271 64
single
exon probe encoded protein
SEQ ID
NO: 26769.
1617 AAM56747Homo SapiensMOLE- Human brain expressed229 69
single
exon probe encoded protein
SEQ ID
NO: 28852.
1618  g15802814  Homo sapiens  Gag-Pro-Pol-Env protein  532  52
1618  g11780973  Human endogenous retrovirus K  pol protein  531  52
1618  15802821  Homo sapiens  Gag-Pro-Pol protein  531  52
1619  g12769587  Mus musculus  STOP protein  662  86
1619  g11370291  Rattus norvegicus  STOP protein  662  92
1619  g13287265  Rattus norvegicus  E-STOP protein  662  92
1620 AAM65980Homo sapiensMOLE- Human bone marrow 266 100
expressed probe encoded
protein SEQ
ID N0: 26286.
1620 AAM53601Homo SapiensMOLE- Human brain expressed266 100
single
exon probe encoded protein
SEQ ID
NO: 25706.
1620 gi~20270271~Mus musculusRIKEN cDNA 1190017012 198 80
ref~NP_6200
82.1
1621 g111862941Mus musculusDDM36E 74 33
1621 111862939Mus musculusDDM36 74 33
1621 g17650186Mus musculusneighbor of Punc e1 l 73 33
rotein
1622 g13157464Thermos Sp. integral membrane rotein74 38
A4
1623 gi~59977~emHuman tripartite fusion transcript129 82
PLA2L
b~CAA7866endogenous
2.1 ~ retrovirus
1623 gi~20161147~Oryza sativaVsaA -like protein 88 32
dbj~BAB900(japonica
75.1 cultivar-group)
~
1623 gi~17864474~Drosophila domino ~ 87 41
ref~NP_5248melanogaster
33.1
1626 AA000498 Homo SapiensHYSE- Human polypeptide99 43
SEQ ID
NO 14390.
1627 g114041733Xenorhabdus XptA2 protein 70 23
nematophila
1627 gi~15641593~Vibrio choleraecatalase 69 23
re~NP_2312
25.1
1628 g119888204MethanopyrusSite-specific DNA methylase80 27
kandleri
AV 19
1628 g16358691Simian Pol protein 78 32
immunodeficienc
y virus
1628 gi~20094956~MethanopyrusSite-specific DNA methylase80 27
ref~NP-6148kandleri
AV19
03.1 ~
1629 AAB07704 Homo Sapiens1NMR Protein encoded 594 67
by the
endogenetic fragment
of HERV-W.
1629 g18272464Homo sa iensgag 594 67
1629 AAB07703 Homo SapiensINMR Protein encoded 590 66
by the
endogenetic fragment
of HERV-W.
1630  g132498  Homo sapiens  precursor (AA -23 to 476)  145  100
1630  1339595  Homo sapiens  triglyceride lipase precursor  145  100
1630  1386859  Homo sapiens  hepatic lipase  145  100
1631 g18777465Rattus cytoplasmic dynein heavy703 77
chain
norvegicus
1631 g117019507Tripneustes dynein heavy chain isotype505 53
1B
gratilla
1631 AAB93815 Homo SapiensHELI- Human protein 457 71
sequence SEQ
ID N0:13606.
1632 AAM68837 Homo SapiensMOLE- Human bone marrow122 48
expressed probe encoded
protein SEQ
ID NO: 29143.
1632 AAM56460 Homo SapiensMOLE- Human brain expressed122 48
single
exon probe encoded protein
SEQ ID
NO: 28565.
1632 g117861826Drosophila GM01964p 90 51
melano aster
1633 gi~21300783~Anopheles ebiP1105 77 33
gb~EAA129gambiae str.
28.1 ~ PEST
1633 gi~19880523~Bactrocera vitellogenin 1 precursor68 27
gb~AAM003dorsalis
72.1 ~AF3
68
053 1
1633 gi~21070999~Homo Sapiensstromal interaction 68 39
molecule 2
ref~NP-0659 precursor
11.1
1637 g12323287multiple polyprotein 289 91
sclerosis
associated
retrovirus
1637 gi~21103962~Homo Sapiensenverin-2 261 82
gb~AAM331
41.1
1637 gi~13310191~multiple recombinant envelope 259 82
protein
gb~AAK181sclerosis
89.1~AF331associated
500_1 retrovirus
element
1638  AAR58809  Homo sapiens  UYNY Human RPTP-gamma.  86  26
1638  gi292411  Homo sapiens  receptor-type protein tyrosine phosphatase gamma  86  26
1638  11263069  Homo sapiens  receptor tyrosine phosphatase gamma  86  26
1639 g19857054Leishmania possible CG7055 protein74 27
maj or
1639 gi~20853034~Mus musculusexpressed sequence AI44751973 35
ref~XP_1259
62.1
1639 gi~7008003~dMus musculustranscription factor 73 35
MAZR
bj ~BAA9087
4.1~
1640 AAG03810 Homo SapiensGEST Human secreted 220 95
protein, SEQ ID
NO: 7891.
1640 1186800 Homo Sapiensribosomal protein L12 220 95
1640 g157680 Rattus rattusribosomal protein L12 220 95
1641 AAB44286 Homo SapiensGETH Human PR01072 (UNQ529)1709 100
protein sequence SEQ
ID N0:303.
1641 AAY41730 Homo sapiensGETH Human PR01072 protein1709 100
sequence.
1641 114602625Homo sapiensPAN2 rotein 1709 100
1642 g120147241Arabidopsis ATSg09850/MYH9 6 74 32
thaliana
1642 g114329782Homo sa iensdJ1121G12.3 (Novel gene)72 28
1642 gi~16648730~Arabidopsis ATSg09850/MYH9_6 74 32
gb~AAL255thaliana
57.1
1643  g12952340  Rattus norvegicus  insulin receptor substrate 2  89  31
1643  g12653351  Bovine herpesvirus type 1.1  product of latency-related gene  83  30
1643  14511969  Homo sapiens  insulin receptor substrate-2  82  26
1644 g19964099Chlamydia inclusion membrane protein73 35
trachomatis
1644 g119171028EncephalitozoonATP DEPENDENT DNA BINDING67 29
cuniculi HELICASE (RAD3/XPD
SUBFAMILY OF HELICASES)
1644 gi~9964095~gChlamydia inclusion membrane protein73 35
b~AAG0982trachomatis
1.1 ~AF2793
62 1
1646 gi~10863995~Homo Sapiensclones 23667 and 23775 67 42
zinc finger
ref~NP_0670 protein
11.1
1647 11196425 Homo sa iensenvelo a rotein 93 39
1647 g1200296 Mus musculusperlecan 85 26
1647 18131894 Homo Sapiensmitofilin 84 27
1648 g11573040Haemophilusaspartokinase I / homoserine73 36
influenzae dehydrogenase I (thrA
Rd
1648 g18778726ArabidopsisT25N20.14 73 31
thaliana
1648 gi~16272063~Haemophilusaspartokinase I / homoserine73 36
refjNP-4382influenzae dehydrogenase I (thrA)
Rd
62.1
1649 g1295642 Saccharomycesphospholipase C 79 36
cerevisiae
1649 g17548846Saccharomycesdelta class phosphoinositide-specific77 36
cerevisiae hos holi ase C homolo
1649 g1161104 Schistosomaengrailed-like homeodomain74 35
protein
mansoni
1651 gi~13129464~Oryza sativa]Polyprotein 66 40
gb~AAK131[Oryza sativa
22.1~AC080(japonica
019 14 cultivar-
ou )
1652 AAG81446 Homo SapiensZYMO Human AFP protein 249 100
sequence
SEQ ID N0:410.
1652 118032212Homo sa histone acetyltransferase89 34
iens MOZ2
1652 AAR34936 Homo sapiensUYJO CENP-B. 77 35
1653 g120145484Bos taurus SCO-spondin 71 29
1655 AAM86382 Homo SapiensHUMA- Human 129 55
immune/haematopoietic
antigen SEQ
ID N0:13975.
1655 ABB03887 Homo SapiensHLTMA- Human musculoskeletal118 62
system related polypeptide
SEQ ID NO
1834.
1655 AAM75964 Homo SapiensMOLE- Human bone marrow 85 56
expressed probe encoded
protein SEQ
ID NO: 36270.
1659 g138035 Homo Sapiensp25 protein 110 45
1659 g1330915 Equine IR4 protein 99 28
herpesvirus
1
1659 g1156606 Chironomus SpId 84 30
tentans
1660 g19654641Vibrio cholerae3-deoxy-D-manno-octulosonic-acid84 23
transferase
1660 gi~20835446~Mus musculussimilar to STARP antigen73 25
reflXP-1444
09.1 ~
1660 gi~15596880~Pseudomonasprobable sugar aldolase 72 26
re~NP_2503aeruginosa
74.1
1661 g14062318EscherichiaHeat-responsive re ulatory79 36
coli protein
1661 g1976025 EscherichiaHrsA 79 36
coli
1661 g11786951Escherichiaprotein modification 79 36
coli enzyme, induction
K12 of om C
1662 AAM68588 Homo sapiensMOLE- Human bone marrow 155 100
expressed probe encoded
protein SEQ
ID NO: 28894.
1662 AAM56212 Homo SapiensMOLE- Human brain expressed155 100
single
exon probe encoded rotein
SEQ ID
NO: 28317.
1662 gi3845169Plasmodium phosphatase (acid phosphatase66 52
family)
falci arum
3D7
1663 AAG89215 Homo SapiensGEST Human secreted protein,218 100
SEQ ID
NO: 335.
1663 gi20070921Mus musculusRIKEN cDNA 2410008M22 130 55
ene
1663 AAR77602 Homo SapiensFORSI Human circulating 92 44
cytokine
CC-1 C-terminal fragment.
1664 AAE18212 Homo SapiensCURA- Human MOL4 protein.75 47
1664 AAM00966 Homo SapiensHYSE- Human bone marrow 72 35
protein,
SEQ ID NO: 442.
1665 AAB92828 Homo SapiensHELI- Human protein sequence74 93
SEQ
ID N0:11365.
1665 AAG63852 Homo SapiensINCY- Amino acid sequence74 93
of human
GTPase activating protein
GTPAP2.
1665 AAG63851 Homo SapiensINCY- Amino acid sequence74 93
of human
GTPase activatin protein
GTPAP 1.
1666 AAM72897 Homo sapiensMOLE- Human bone marrow 135 65
expressed probe encoded
protein SEQ
ID NO: 33203.
1666 AAM60268 Homo SapiensMOLE- Human brain expressed135 65
single
exon probe encoded protein
SEQ ID
NO: 32373.
1666 gi4007097Homo SapiensdJ1118D24.2 (60S Ribosomal135 65
Protein
L 10 LIKE)
1667 gi212267 Gallus anuscartilage link protein 917 49
1667 12010 Sus scrofa link rotein recursor 913 51
(AA -15 to 339)
1667 g1459439 E uus caballuslink protein 910 51
1668 110443237Mus musculuss licing factor 3a, subunit276 36
2
1668 g1396743 Podocoryne Pod-EPPT 276 30
carnea
1668 g1294131 Plasmodium circumsporozoite protein266 22
falcipanxm
1669 AAM49641 Homo sapiensBOEH Human tumour-associated132 65
antigen B345 rotein SEQ
ID NO 4.
1669 AAU12252 Homo SapiensGETH Human PRO5773 polypeptide132 65
se uence.
1669 AAY91592 Homo SapiensHUMA- Human secreted 132 65
protein
sequence encoded by gene
6 SEQ ID
N0:265.
1670 g14835383Homo sa alias DLC1 226 47
iens
1670 g14704343Homo Sapiensalias DLC1; candidate 226 47
tumor
suppressor ene
1670 g1155627 Acanthamoebamyosin I heavy chain 118 42
castellanii
1671 ABB 12490Homo SapiensHYSE- Human bone marrow 237 88
expressed
protein SEQ ID NO: 329.
1671 g16002932Streptomycesglycosyltransferase 67 35
fradiae
1671 gi~9634613~rHuman Ll 65 39
ef~NP_0381papillomavirus
50.1 ~ type 69
1672 g113938013Homo SapiensSimilar to RIKEN cDNA 333 66
2610509612
ene
1672 gi2388970Schizosaccharomtat-binding homolog 235 41
7, AAA ATPase
yces pombe family roteiii
1672 gi6850321Arabidopsis Contains similarity 214 40
to YTA7 ATPase
thaliana gene from Saccharomyces
cerevisiae
gb~X81072, and contains
Bromodomain
PF~00439, AAA PF~00004,,
and Sigma-
54 PF~00158 transcription
factor
domains.
1673 gil 1066113Drosophila Misexpression suppressor71 29
of ras 4
melano aster
1673 gi~20829387~Mus musculusRIKEN cDNA 4930455F23 77 27
rel]XP-1295
40.1
1673 gi~17647635~Drosophila Misexpression suppressor71 29
of ras 4
ref~NP,5237melanogaster
75.1
1674 gi~20535935~Homo sapienssimilar to splicing 75 37
coactivator subunit
ref~XP-1157 SRm300; RNA binding
protein; AT-
87.1 rich element bindin
factor
1674 gi~17544226~CaenorhabditisY76B12C.4.p 72 34
re~NP_5001elegans
51.1
1674 gi~17559826)CaenorhabditissepB domain 70 26
ref~NP_5057elegans
99.1
1675 gi5708067Oryctolagus hyperpolarization activated99 27
cation
cuniculus channel
1675 gi402558 Canis familiarismucin 98 27
1675 110636484Homo Sapienspolyglutamine-containin96 26
protein
1676 AAM95365 Homo SapiensHUMA- Human reproductive73 26
system
related antigen SEQ
ID NO: 4023.
1676 AAB56709 Homo SapiensROSEI Human prostate 72 34
cancer antigen
protein sequence SEQ
ID NO:1287.
1676 g11881288Bacillus FUNCTION UNKNOWN, SIMILAR71 30
subtilis
PRODUCT IN E.COLI, H.
INFLUENZAE AND NEISSERIA
MENINGITIDIS.
1677 gi~15892512~EC:2.7.7.41]phosphatidate cytidylyltransferase65 34
ref~NP_3602[Rickettsia
26.1 conorii
1679 g114231 SaccharomycesNADH dehydrogenase (ubiquinone)75 31
cerevisiae
1679 g1805022 SaccharomycesNdilp 73 31
cerevisiae
1679 g11353352Chlamydomonasalanine aminotransferase70 27
reinhardtii
1680 g11805421Bacillus surfactin production 77 36
subtilis
1680 g1396482 Bacillus srfA2 77 36
subtilis
1680 g1516360 Bacillus surfactin synthetase 77 36
subtilis
1681 AAG64494 Homo SapiensSHAN- Human natriuretic156 80
peptide
rece for 18.
1681 AAE16275 Homo SapiensINCY- Human kinase PKIN-21154 73
protein.
1681 AAM40599 ~ Homo Sapiens~ HYSE- Human polypeptide~ 154 ~ 73
SEQ ID I
NO 5530.
1682 g12323287multiple polyprotein 1646 75
sclerosis
associated
retrovirus
1682  gi|2351212|dbj|BAA22064.1|  Friend murine leukemia virus  gag-pol polyprotein (precursor protein)  807  40
1682  gi|9626961|ref|NP_057933.1|  Murine leukemia virus  Pr180  802  40
1683 AAM39205 Homo SapiensHYSE- Human polypeptide 457 53
SEQ ID
NO 2350.
1683 g13033415Gibbon ape gag polyprotein 353 38
leukemia
virus
1683 gi~6524623~gPhascolarctosgag protein 343 38
b~AAF15097cinereus
.1~
1684 g119110438Homo Sapienspolycystin-1L1 712 98
1684 g16361629Periplanetavitellogenin 81 25
americana
1684 13115393 Rana 1 iensguanylate cyclase inhibitory80 35
protein
1686 AAY91542 Homo SapiensHUMA- Human secreted 212 84
protein
sequence encoded by gene
92 SEQ ID
N0:215.
1686  11279841  Bos taurus  glycine transporter  72  36
1686  119879917  Oryza sativa  acid phosphatase  70  35
1687  g112056568  Homo sapiens  MSTP063  212  88
1687  113539684  Homo sapiens  zinc finger protein 291  212  88
1687  gi|12056568|gb|AAG47945.1|AF119814_1  Homo sapiens  MSTP063  212  88
1689 g15689766Homosa ienszinc finger 2.2 222 91
1689 AAU16267 Homo SapiensHUMA- Human novel secreted178 58
protein,
Seq ID 1220.
1689 AAB99950 Homo SapiensSHAN- Human alkylated-DNA-protein177 60
cysteine methyltransferase
14.
1690  g13328880  Chlamydia trachomatis  Protein Export  73  29
1690  g12832232  Brucella melitensis biovar Abortus  flagellin; FliC  67  29
1690  g117984285  Brucella melitensis  FLAGELLIN  67  29
1692  g14927443  Haemophilus influenzae  hemoglobin/hemoglobin-haptoglobin binding protein  93  80
1692  g14204775  Haemophilus influenzae  hemoglobin and hemoglobin-haptoglobin binding protein  93  80
1692  g13647226  Haemophilus influenzae  hemoglobin binding protein  93  80
1694 AAW95631 Homo SapiensGEMY Homo Sapiens secreted102 100
protein
gene clone hj968 2.
1694 g113162186Homo Sapiens~ calsyntenin-3 protein ~ 102 ~ 100
1695 AA004205 Homo SapiensHYSE- Human polypeptide 81 37
SEQ ID
NO 18097.
1695 gi160180 Plasmodium circumsporozoite antigen81 29
cynomolgi
1695 gi495522 Plasmodium circumsporozoite protein80 30
simiovale
1696 AAM80223 Homo SapiensHYSE- Human protein SEQ 252 66
ID NO
3869.
1696 AAM79239 Homo SapiensHYSE- Human protein SEQ 252 66
ID NO
1901.
1696 gi3688394Homo sa triple LIM domain rotein252 66
iens
1697 gi19887715MethanopyrusPredicted membrane protein74 28
kandleri
AV 19
1698 AAM93184 Homo SapiensHELI- Human polypeptide,269 87
SEQ ID
NO: 2552.
1698 118044066Mus musculusRIKEN cDNA 5033406L14 226 76
gene
1698 AAB95302 Homo SapiensHELI- Human protein sequence194 78
SEQ
ID N0:17538.
1699 ABB17279 Homo SapiensHUMA- Human nervous system110 56
related
olypeptide SEQ ID NO
5936.
1699 AA013013 Homo SapiensHYSE- Human polypeptide 101 71
SEQ ID
NO 26905.
1699 gi~7650258~gHepatitis polyprotein 74 28
C virus
b~AAF65960
.1 ~AF20777
0 1
1700 g112697585Arabidopsis4-(cytidine 5'-phospho)-2-C-methyl-D-69 40
thaliana erithritol kinase
1701 g116740569Homo sa Similar to thymus expressed84 27
iens gene 3
1701 g117940760Mus musculuscask-interacting protein79 26
2
1701 g117940758Homo sapienscask-interacting protein77 26
1
1702  g117385401  Homo sapiens  TPIP alpha lipid phosphatase  234  62
1702  AAU75783  Homo sapiens  INCY- Human protein phosphatase 1 (PP1) protein sequence.  208  57
1702  AAG67638  Homo sapiens  HELI- Amino acid sequence of a human protein.  202  56
1703 AAO07887 Homo SapiensHYSE- Human polypeptide 246 85
SEQ ID
NO 21779.
1703 AA008651 Homo SapiensHYSE- Human polypeptide 239 83
SEQ ID
NO 22543.
1703 AA008732 Homo SapiensHYSE- Human polypeptide 221 80
SEQ ID
NO 22624.
1704 AAB94588 Homo SapiensHELI- Human protein sequence82 52
SEQ
ID N0:15392.
1704 g13288914Mus musculusaortic carboxypeptidase-like82 24
protein
ACLP
1704 AAM93437 Homo SapiensHELI- Human polypeptide,81 32
SEQ ID
NO: 3074.
1706 AAM86104 Homo SapiensHUMA- Human 179 100
immune/haematopoietic
antigen SEQ
ID N0:13697.
1706 g110039425E uus caballusALR rotein 120 40
1706 120502826Eimeria cGMP-dependent rotein 115 35
maxima kinase
1707 AAM70251 Homo sapiensMOLE- Human bone marrow ~ 115 ~ 78
expressed probe encoded
protein SEQ
ID NO: 30557.
1707 AAM57834 Homo SapiensMOLE- Human brain expressed115 78
single
exon probe encoded protein
SEQ ID
NO: 29939.
1707  gi15450860  Arabidopsis thaliana  serine/threonine-protein kinase Mak (male germ cell-associated kinase)-like protein  71  56
1708  11620403  Homo sapiens  SF1-Bo isoform  82  41
1708  119072991  Hypocrea virens  class III chitinase precursor  82  40
1708  118765873  Hypocrea virens  class III chitinase  82  40
1709 AAM52240 Homo sa 1NCY- Human MFAP4 SEQ 1384 100
iens ID NO 3.
1709 g1790817 Homo sa microfibril-associated 1384 100
iens glycoprotein 4
1709 AAM52239 Homo sapiensINCY- Human MAG4V SEQ 1374 100
ID NO 1.
1710 g116769882Drosophila SD07884p 67 27
melanogaster
1710 gi~17545505~Ralstonia CONSERVED HYPOTHETICAL 66 41
ret)NP_5189solanacearumPROTEIN
07.1
1711  AAU82954  Homo sapiens  ANAD- Human homologue of MPT1 protein target for antifungal compound.  111  27
1711  g12058326  Homo sapiens  subunit of RNA polymerase II transcription factor TFIID  111  27
1711  g113559031  Homo sapiens  bA11M20.1 (TATA box binding protein (TBP)-associated factor, RNA polymerase II, C1, 130kD)  108  26
1712 AAB65626 Homo SapiensSUGE- Novel protein kinase,209 82
SEQ ID
NO: 152.
1712 AAM25283 Homo sapiensHYSE- Human protein sequence209 82
SEQ
ID N0:798.
1712 AAU17269 Homo SapiensHUMA- Novel signal transduction176 67
pathway protein, Seq ID
834.
1713 g118256065Mus musculusSimilar to ATPase, class127 67
II, type 9A
1713 AAM76495 Homo SapiensMOLE- Human bone marrow 123 70
expressed probe encoded
protein SEQ
ID NO: 36801.
1713 AAM63681 Homo SapiensMOLE- Human brain expressed123 70
single
exon probe encoded protein
SEQ ID
NO: 35786.
1714 g18096269Nicotiana KED 149 28
tabacum
1714 g11752736Saccharomycesgene required for phosphoylation148 30
of
cerevisiae oligosaccharides/ has
high homology
with YJR061w
1714 g12292986Rattus cyclic nucleotide-gated 141 28
channel beta
norvegicus subunit
1715 AAM72995 Homo SapiensMOLE- Human bone marrow 158 47
expressed probe encoded
protein SEQ
ID NO: 33301.
1715 AAM60359 Homo SapiensMOLE- Human brain expressed158 47
single
exon probe encoded protein
SEQ ID
NO: 32464.
1715 gi~13539605~Paramecium cyclophilin-RNA interacting 144 45
protein
emb~CAC35tetraurelia
733.1
~
1716 AAM71015 Homo SapiensMOLE- Human bone marrow251 64
expressed probe encoded
protein SEQ
ID NO: 31321.
1716 AAM58517 Homo sapiensMOLE- Human brain expressed251 64
single
exon probe encoded protein
SEQ ID
NO: 30622.
1716 AAU19766 Homo SapiensHUMA- Human novel extracellular161 44
matrix protein, Seq ID
No 416.
1718 g11420924Zea mays IN1 75 27
1718 gi~14521970~Pyrococcus O-sialoglycoprotein 73 35
endopeptidase
ref~NP_1274abyssi
47.1
1719 gi20513851 Hordeum BPM 74 35
vulgare
1719 g121039126Cryptosporidium60 kDa glycoprotein 74 26
parvum
1719 gi207158 Rattus big tau 73 36
norvegicus
1720 g118181943Caenorhabditisheparan sulfate GIcNAc 67 34
transferase-I/II
elegans
1720 g12058699Caenorhabditismultiple exostoses homolog67 34
2
elegans
1720 gi|17554740| Caenorhabditis MULTIPLE EXOSTOSES 67 34
ref|NP_499368.1| elegans HOMOLOG 2
1721 AAM69150 Homo SapiensMOLE- Human bone marrow200 38
expressed probe encoded
protein SEQ
ID NO: 29456.
1721 AAM56769 Homo SapiensMOLE- Human brain expressed200 38
single
exon probe encoded protein
SEQ ID
NO: 28874.
1721 g14185947Human pol protein 196 38
endogenous
retrovirus
I~
1722 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 615 60
1722 gi18676710 Homo sapiens FLJ00254 protein 592 60
1722 gi~20469453~Homo Sapienssimilar to FLJ00254 283 50
protein
ref~XP_1140
40.1
1723 g113881755Mycobacteriumcation efflux system 74 30
protein
tuberculosis
CDC1551
1724 AAG78866 Homo sapiens SHAN- Human zinc finger 141 68
protein 15.
1724 ABB17928 Homo sapiens HUMA- Human nervous 99 53
system related
polypeptide SEQ ID NO
6585.
1724 gi~21295712~Anopheles agCP1631 75 26
gb~EAA078gambiae str.
57.1 ~ PEST
1725 121104340Homo Sapiensobscurin 1586 83
1725 gi7024535 Gallus gallus structural muscle protein 207 24
titin
1725 g11513030Gallus gallusconnectin/titin 207 24
1727 AAE19162 Homo sapiens THOR/ Human kinase 1096 99
polypeptide
(PKIN-20).
1727 gi2736151Rattus mytonic dystrophy kinase-related902 78
norvegicus Cdc42-binding kinase
1727 gi1695873 Homo sapiens ser-thr protein kinase 896 77
PK428
1728 AAY99411 Homo sapiens GETH Human PRO1487 (UNQ756) 862 67
amino acid sequence SEQ
ID NO:260.
1728 115617453Homo sapienschondroitin synthase 862 67
1728 AAE15959 Homo SapiensEUMO- Human 4589624/92-303761 79
protein, member of Fringe
and Brainiac
family.
1729 gi|15804980| Escherichia Uncharacterized conserved 71 33
coli protein
ref|NP_290960.1 O157:H7
EDL933
1731 114268490Musca domesticahunchback 82 33
1731 AAM93401 Homo SapiensHELI- Human polypeptide,76 27
SEQ ID
NO: 3002.
1731 12076606 Musca domesticahunchback zinc finger 73 30
protein
1732 AAY91949 Homo SapiensINCY- Human cytoskeleton1047 57
associated
protein 4 (CYSKP-4).
1732 ABB90754 Homo SapiensUYJO Human Tumour Endothelial1043 57
Marker polypeptide SEQ
ID NO 240.
1732 gi619577 Gallus gallus cardiac muscle tensin 1043 56
1733 g13090889Homo Sapienssynapsin IIIa 70 38
1733 gi6572355 Homo sapiens cE86D10.1 (synapsin III) 70 38
1733 gi~19924105~Homo Sapienssynapsin III, isoform 70 38
IIIa
ref~NP
0034
81.2
1734 AAB85144 Homo SapiensHUMA- Human NKCR polypeptide1506 93
(clone ID HMSOM53).
1734 gi4973126 Mus musculus high affinity immunoglobulin 490 39
gamma
castaneus Fc receptor I
1734 g14973124Mus musculushigh affinity immunoglobulin489 39
gamma
Fc receptor I
1735 gi|15597595| Pseudomonas pyoverdine synthetase 69 30
D
ref|NP_251089.1| aeruginosa
1736 114488302 Oryza sativa Putative transposon protein 81 24
1736 g13851516Phytophthoracyst germination specific72 33
acidic repeat
infestans protein precursor
1736 gi~14488302~Oryza sativaPutative transposon protein81 24
gb~AAK638
83.1 ~AC074
105 12
1737 AAB85357 Homo sapiens INCY- Human phosphatase 1591 100
(PP) (clone
ID 3402521CD1).
1737 g121205864Homo SapiensT-cell activation protein1591 100
phosphatase
2C; TA-PP2C
1737 g121464366Drosophila RE06653p 758 52
melanogaster
1738 g17271811Drosophila GTPase activating protein292 38
melanogaster
1738 AAM76430 Homo SapiensMOLE- Human bone marrow 246 100
expressed probe encoded
protein SEQ
ID NO: 36736.
1738 AAM63615 Homo sapiens MOLE- Human brain expressed 246 100
single
exon probe encoded protein
SEQ ID
NO: 35720.
1739 ABB50365 Homo SapiensHUMA- Human secreted 272 87
protein
encoded by gene 65 SEQ
ID N0:313.
1739 AAW88598 Homo SapiensHUMA- Secreted protein 272 87
encoded by
gene 65 clone HFVHY45.
1739 ABB50764 Homo SapiensHUMA- Human secreted 143 92
protein
encoded by gene 65 SEQ
ID NO:716.
1740 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 1210 58
1740 gi~10834720~Homo SapiensPP565 274 80
gb~AAG237
90.1 ~AF258
587 1
1740 gi~385615~gbMus sp. fibulin gene homolog 248 75
~AAB26708.
1~
1741 ABB90748 Homo SapiensUYJO Human Tumour Endothelial2116 97
Marker polypeptide SEQ
ID NO 228.
1741 115987493Homo Sapienstumor endothelial marker2116 97
6
1741 ABB90754 Homo SapiensUYJO Human Tumour Endothelial530 37
Marker polypeptide SEQ
ID NO 240.
1742 ABB11753 Homo sapiens HYSE- Human NOV/plexin-A1 291 90
homologue, SEQ ID NO:2123.
1742 g11665757Mus musculusplexin 1 291 90
1742 16010217 Homo sapiens NOV/plexin-A1 protein 291 90
1743 AAM79514 Homo SapiensHYSE- Human protein SEQ 149 90
ID NO
3160.
1743 AAM78530 Homo SapiensHYSE- Human protein SEQ 149 90
ID NO
1192.
1743 gi1244510 Homo sapiens p311 protein 149 90
1744 AAG93324 Homo SapiensNISC- Human protein HP 83 41
10370.
1744 g121064771Drosophila RH61467p 83 46
melanogaster
1744 gi18676554 Homo sapiens FLJ00174 protein 77 41
1745 14128039 Homo sapiens TL132 protein 81 29
1745 g117983118Brucella METAL DEPENDENT HYDROLASE74 23
melitensis
1745 AAU75578 Homo SapiensUYNA- Human ubiquitin 71 31
specific
protease 10 (USP 10).
1746 g115074154SinorhizobiumPUTATIVE FATTY 76 25
meliloti ACID/PHOSPHOLIPID SYNTHESIS
PROTEIN
1746 g11869833human myristylated tegument 75 27
protein
herpesvirus
2
1746 g120516045ThermoanaerobaChemotaxis response regulator69 20
CheB,
cter consists of CheY-like
receiver domain
tengcongensisand a methylesterase
(demethylase)
domain
1747 g118025496cercopithicineEBNA-1 124 37
herpesvirus
15
1747 g15821153Homo SapiensRNA binding protein 123 29
1747 g16649242Homo Sapienssplicing coactivator 123 29
subunit SRm300
1748 gi~4321764~gMus musculusMAP kinase kinase 7 alpha65 30
2
b~AAD
1581
9.1~
1748 gi~20859704~Mus musculusmitogen activated protein65 30
kinase kinase
ref~XP'1339 7
86.1
1748 gi~4321768~gMus musculusMAP kinase kinase 7 beta65 30
2
b~AAD
1582
1.1~
1749 AAB50964 Homo sapiens GETH Human PRO1313 protein. 439 89
1749 AAB47290 Homo sapiens GETH PRO1313 polypeptide. 439 89
1749 AAB24431 Homo sapiens GETH Human PRO1313 protein 439 89
sequence SEQ ID NO:216.
1750 AAU00502 Homo sapiens MILL- Human TANGO 437 115 91
protein.
1750 gi20384654 Homo sapiens two-pore calcium channel 115 91
protein 2
1750 AAM91059 Homo SapiensHUMA- Human 93 64
immune/haematopoietic
antigen SEQ
ID N0:18652.
1751 gi10440494 Homo sapiens FLJ00092 protein 252 97
1751 AAM40956 Homo SapiensHYSE- Human polypeptide 80 30
SEQ ID
NO 5887.
1751 gi~10440494~Homo SapiensFLJ00092 protein 252 97
dbj ~BAB
157
80.1
1752 gi15980036 Yersinia 2-dehydro-3-deoxyphosphooctonate 77 46
pestis
aldolase
1752 gi11322261 Diceros alpha adrenergic receptor 74 26
bicornis 2B
1752 g120516240Thermoanaerobamethylaspartate mutase 73 25
cter
tengcongensis
1753 g119684014Homo Sapienssimilar to brain-specific1387 99
angiogenesis
inhibitor 3 (H. sapiens)
1753 AAB88367 Homo SapiensHELI- Human membrane 1380 99
or secretory
protein clone PSEC0101.
1753 11469936 Mus musculusFGF-binding protein 158 29
1754 AAB01397 Homo SapiensINCY- Neuron-associated 435 92
protein.
1754 g121218140Homo Sapiensrab effector MYRIP 435 92
1754 g121320161Mus musculusexophilin 8 378 77
1755 AAM74815 Homo SapiensMOLE- Human bone marrow 253 75
expressed probe encoded
protein SEQ
ID NO: 35121.
1755 AAM62013 Homo SapiensMOLE- Human brain expressed253 75
single
exon probe encoded protein
SEQ ID
NO: 34118.
1755 AAM70390 Homo sapiensMOLE- Human bone marrow 228 62
expressed probe encoded
protein SEQ
ID NO: 30696.
1756 g16460201Deinococcusphenylacetic acid degradation85 27
protein
radioduransPaaA
1756 gi3309543 Takifugu MLL 79 34
rubripes
1756 AAT10059_Homo SapiensUSSH erbB-3 cDNA clone 74 31
E3-16.
aal
1757 118676406 Homo sapiens FLJ00021 protein 70 36
1758 g113423395CaulobacterNADH dehydrogenase I, 78 37
M subunit
crescentus
CB 15
1758 gi~17506337~CaenorhabditisD1007.15.p 82 24
ref~NP-4913elegans
90.1 ~
1758 gi~16126181~CaulobacterNADH dehydrogenase I, 78 37
M subunit
ref~NP_4207crescentus
CB 15
45.1
1759 gi19881193chimpanzee transcriptional transactivator83 29
TRS1
cytomegalovirus
1759 gi19881161chimpanzee transcriptional transactivator83 29
IRS1
cytomegalovirus
1759 1556297 Mus musculus alpha-1 type IV collagen 81 33
1760 118033185 Danio rerio UNC45-related protein 702 79
1760 AAG77802 Homo SapiensHUMA- Human HOGEN50 603 65
serine/threonine phosphatase
protein
sequence.
1760 AAM40290 Homo SapiensHYSE- Human polypeptide 603 65
SEQ ID
NO 3435.
1761 g16634123Drosophila SoxNeuro 70 24
melanogaster
1762 gi~14245700~Giardia kinesin-like protein 69 26
4
dbj~BAB561intestinalis
42.1
1762 gi~165011~gbOryctolaguseucaryotic release factor69 24
(eRF)
~AAA31246.cuniculus
1~ ,
1762 gi~15559188~Homo SapiensdJ45P21.3 (butyrophilin,69 26
subfamily 3,
emb~CAC03 member A1)
424.2
1763 AAM93661 Homo SapiensHELI- Human polypeptide,186 80
SEQ ID
NO: 3536.
1763 AAM64398 Homo SapiensMOLE- Human brain expressed154 76
single
exon probe encoded protein
SEQ ID
NO: 36503.
1763 gi~20556958~Homo Sapienssimilar to PAM COOH-terminal73 43
ref~XP_0615 interactor protein 1
62.5
1764 AAU17223 Homo SapiensHUMA- Novel signal transduction211 87
pathway protein, Seq ID
788.
1765 g11334546Podospora Dod COI 113 grp IB protein71 37
anserina
1765 15679307 Mus musculus ROR gamma t 70 27
1765 gi4186077 Mus musculus ROR gamma T protein 70 27
1766 gi17864081 Mus musculus PPAR gamma coactivator-1beta 74 26
protein
1766 g144795 Methanococcuspolyferredoxin 71 28
voltae
1766 g114279670Lycopersiconverticillium wilt disease71 31
resistance
esculentum protein
1768 AAE06588 Homo SapiensSAGA Human protein having165 100
hydrophobic domain, HP
10778.
1768 AAM40979 Homo SapiensHYSE- Human polypeptide 165 100
SEQ ID
NO 5910.
1768 AAB24542 Homo SapiensHUMA- Human secreted 73 30
protein
sequence encoded by gene
27 SEQ ID
N0:168.
1769 gi6174840 Achromobacter low-specificity D-threonine 78 33
aldolase
xylosoxidans
subsp.
xylosoxidans
1769 gi16769806Drosophila SD02660p 75 23
melanogaster
1769 gi1098473Rattus insulin-like growth 73 31
factor binding
norvegicus protein
1770 AAP94684 Homo SapiensCHIL Amino acid sequence79 56
encoded
by part of human mannose
binding
protein(hMBP) genomic
DNA.
1770 gi|15790548| Halobacterium cobyric acid synthase; 69 36
CbiP
ref|NP_280372.1| sp. NRC-1
1770 gi|11467609| Guillardia Clp protease ATP binding 69 27
theta subunit
ref|NP_050661.1|
1772 gi5532460 Shigella ShiF 66 32
flexneri
1773 gi11544663 Arabidopsis PTPKIS1 75 42
thaliana
1773 gi11595504Arabidopsis PTPKIS1 protein 75 42
thaliana
1773 gi18389331 Mus musculus 2',5'-oligoadenylate 73 42
synthetase-like 10
1774 AAM06519 Homo SapiensHYSE- Human foetal protein,414 90
SEQ ID
NO: 250.
1774 gi|18552248| Homo sapiens similar to latent transforming 69 37
growth
ref|XP_092510.1 factor beta binding
protein 1; latent
TGF beta binding protein
1775 gi4884924Rangiferine glycoprotein C 67 60
herpesvirus
1
1775 AAB94152 Homo sapiensHELI- Human protein 65 34
sequence SEQ
ID N0:14435.
1775 AAB93253 Homo SapiensHELI- Human protein 65 34
sequence SEQ
ID N0:12271.
1776 gi13424176Caulobacter N-carbamyl-L-amino acid89 24
crescentus amidohydrolase
CB 15
1776 gi514267 Homo Sapiensproto-oncogene tyrosine-protein86 29
kinase
1776 128237 Homo Sapiens150 protein (AA 1-1130)84 28
1777 gi63370 Gallus gallus dystrophin (AA 1 - 3660) 68 31
1777 gij3046783jeScyliorhinusdystrophin 67 29
mb~CAA680canicula
33.1j
1777 gi~2342682jgArabidopsis Contains similarity 67 31
to Rattus AMP-
bjAAB7040thaliana activated protein kinase
(gbjX95577).
6.1j
1778 AAE16176 Homo sapiens INCY- Human G-protein 1419 100
coupled
receptor 7 (GCREC-7)
protein.
1778 AAE18021 Homo sapiens CUBA- Human G-protein 1419 100
coupled
receptor-8a (GPCR-8a)
protein.
1778 AAG72411 Homo sapiens VEDA Human OR-like polypeptide 1419 100
query sequence, SEQ
ID NO: 2092.
1779 AAM76040 Homo SapiensMOLE- Human bone marrow93 48
expressed probe encoded
protein SEQ
ID NO: 36346.
1779 AAM63227 Homo SapiensMOLE- Human brain expressed93 48
single
exon probe encoded protein
SEQ ID
NO: 35332.
1779 gi12620576 Bradyrhizobium ID342 87 24
japonicum
1780 gi2459833Rattus Maxpl 81 31
norvegicus
1780 AAB65650 Homo sapiens SUGE- Novel protein kinase, 80 35
SEQ ID
NO: 177.
1780 AAM39805 Homo sapiensHYSE- Human polypeptide 80 36
SEQ ID
NO 2950.
1781 14877963 Mus musculus NF-kappaB inducing kinase 69 39
1781 115077865 Mus musculus bullous pemphigoid antigen 67 35
1-b
1781 gi15077863 Mus musculus bullous pemphigoid antigen 67 35
1-a
1782 g14138265Nicotiana Avr9 elicitor response 76 27
protein
tabacum
1782 gi12725153 Lactococcus 50S ribosomal protein 75 32
L3
lactis subsp.
lactis
1782 AAB21008 Homo SapiensINCY- Human nucleic acid-binding73 32
protein, NuABP-12.
1783 g13947714Streptococcusinitiation factor IF2 86 20
agalactiae
1783 gi9558387 Streptococcus initiation factor 2 86 20
agalactiae
1783 gi9558369 Streptococcus initiation Factor 2 86 20
agalactiae
1786 gi435855 Mus sp. CREB-binding protein; 75 22
CBP
1786 g12911464Leishmania sodium stibogluconate 75 34
resistance
tarentolae protein
1786 gi19547887 Mus musculus CREB-binding protein 75 22
1787 13747099 Mus musculus C1q-related factor 616 61
1787 114278927 Mus musculus gliacolin 615 64
1787 gi10566471 Mus musculus Gliacolin 615 64
1788 gi~21291197~Anopheles agCP7579 71 20
gb~EAA033gambiae
str.
42.1 ~ PEST
1788 gi~20803964~MesorhizobiumHYPOTHETICAL PROTEIN 69 43
emb~CAD31loti
541.1
1789 AAM41125 Homo SapiensHYSE- Human polypeptide 320 80
SEQ ID
NO 6056.
1789 AAM39339 Homo SapiensHYSE- Human polypeptide 320 80
SEQ ID
NO 2484.
1789 AAM79857 Homo SapiensHYSE- Human protein SEQ 320 80
ID NO
3503.
1790 g11143585Paracentrotus2 alpha fibrillar collagen69 23
lividus
1791 g19837427Lytechinus embryonic blastocoelar 116 34
extracellular
variegatus matrix protein precursor
1791 g114089698Mycoplasma OLIGOPEPTIDE ABC 71 23
pulmonis TRANSPORTER PERMEASE
PROTEIN
1791 g16572111Bartonella riboflavin synthase alpha69 29
chain
quintana
1792 gi~4506023~rHomo Sapiensprotein phosphatase 68 39
2, regulatory
ef~NP_0027 subunit B (B56), gamma
isoform
10.1
1793 AAM71170 Homo SapiensMOLE- Human bone marrow180 82
expressed probe encoded
protein SEQ
ID NO: 31476.
1793 AAM58664 Homo SapiensMOLE- Human brain expressed180 82
single
exon probe encoded protein
SEQ ID
NO: 30769.
1793 AAM65679 Homo SapiensMOLE- Human brain expressed168 71
single
exon probe encoded protein
SEQ ID
NO: 37784.
1794 AAG00072 Homo SapiensGEST Human secreted 125 80
protein, SEQ ID
NO: 4153.
1794 AAW34618 Homo SapiensIMUT- Human C3 protein 125 80
mutant DV-
7N.
1794 AAW34617 Homo sapiensIMUT- Human C3 protein 125 80
mutant DV-
6.
1795 AAY05069 Homo SapiensSMIK Human PIGR-2 protein1055 85
sequence.
1795 gi396170 Homo sapiens CMRF-35 antigen 406 45
1795 gi18490143Homo SapiensCMRF35 leukocyte immunoglobulin-406 45
like receptor
1796 gi~6723273~dBaboon gag-pol precursor polyprotein421 41
bj~BAA8965endogenous
9.1~ virus strain
M7
1796 gi~13940448~Murine leukemiapol precursor protein 421 41
gb~AAK503virus
81.1 ~U43202
2
1796 gi~331995~gbAKV murine gag-pol polyprotein 421 41
(tag amber codon
~AAB03091.leukemia at 2250-2252 inserts
virus Gln in Mo-MuLV)
1
1797 121411325Homo SapiensSimilar to LOC205103 260 73
1797 gi~4835878~gHomo Sapiensendocytic receptor Endo18077 31
b~AAD3028
O.1~AF1348
38 1
1797 gi~16076075~Leishmania trypanothione reductase70 30
emb~CAC94donovani
295.1 donovani
1798 gi927721 Saccharomyces Sip1p: SNF1 protein kinase 72 34
substrate;
cerevisiae YDR422C; CAI: 0.13
1798 g1172604 Saccharomycesprotein kinase 72 34
cerevisiae
1798 gi|6320630| Saccharomyces SNF1 protein kinase substrate; 72 34
Sip1p
ref|NP_010710.1 cerevisiae
1799 gi~20839768~Mus musculussimilar to GDP-fucose 71 29
transporter 1
ref~XP_1303
11.1
1801 gi~17461642~Homo Sapienssimilar to Ig kappa 78 23
chain
reflXP
0662
49.1 ~
1801 gi~6325342~rSaccharomycesProtein required for 76 22
cell viability;
0154 cerevisiae Ypr085cp
ef~NP
_
10.1
1801 gi~9635081~rGallid UL47 74 26
ef~NP_0578herpesvirus
2
09.1 ~
1802 AAB94148 Homo SapiensHELI- Human protein sequence250 56
SEQ
ID N0:14427.
1802 AAG64564 Homo SapiensSHAN- Human zinc-finger 250 56
protein 60.
1802 AAM79356 Homo SapiensHYSE- Human protein SEQ 250 56
ID NO
3002.
1803 AAW81754 Homo SapiensBOEF Human Fanconi anaemia-631 85
associated gene II protein.
1803 g12407911Homo Sapiensdifferentially expressed555 74
in Fanconi
anemia
1803 16013073 Mus musculusHemT-3 protein 89 24
1805 g114189735Homo sapiensATP-binding cassette 1508 90
transporter
family A member 12
1805 11943947 Bos taurus ABC transporter 404 31
1805 AAZ94734_Homo SapiensFARB Human ATP binding 395 33
cassette
aal ABCAl (ABC1) cDNA.
1806 AAU12234 Homo sapiens GETH Human PRO4350 polypeptide 859 100
sequence.
1806 AAA96344_ Homo sapiens GETH cDNA encoding a 498 48
novel
aal polypeptide designated PRO4357.
1806 AAU12445 Homo sapiens GETH Human PRO4357 polypeptide 498 48
sequence.
1807 1190396 Homo sapiens profilaggrin 76 29
1808 AAB88367 Homo SapiensHELI- Human membrane 74 30
or secretory
protein clone PSEC0101.
1808 g119684014Homo Sapienssimilar to brain-specific74 30
angiogenesis
inhibitor 3 (H. Sapiens)
1808 gi~18576362~Homo Sapienssimilar to fibroblast 74 30
growth factor
ref~XP_0844 binding protein 1
81.1
1809 g1530876 Chlamydomonasamino acid feature: Rod 126 35
protein
reinhardtii domain, aa 266 .. 468;
amino acid
feature: globular protein
domain, aa 32
.. 265
1809 g16578849Myxococcus FrgA 126 29
xanthus
1809 12429362 Santalum proline rich protein 122 27
album
1810 g117428288Ralstonia PROBABLE CATION- 75 28
solanacearumTRANSPORTING ATPASE
LIPOPROTEIN TRANSMEMBRANE
1810 g121483422Drosophila LD34142p 71 29
melanogaster
1810 ABB90042 Homo SapiensHUMA- Human polypeptide 70 32
SEQ ID
NO 2418.
1811 gi~20915248~Mus musculussimilar to Collagen alpha148 74
1(VI) chain
ref~XP_1451 precursor
60.1
1812 gi2104558 Rattus CCA3 1150 90
norvegicus
1812 AAB64963 Homo SapiensROSE/ Human secreted 172 37
protein
sequence encoded by gene
24 SEQ ID
NO:141.
1812 gi12963869Mus musculusgene trap ankyrin repeat172 37
containing
protein
1813 AAB65201 Homo sapiens GETH Human PRO1009 (UNQ493) 208 100
protein sequence SEQ ID
NO:194.
1813 AAY66678 Homo sapiens GETH Membrane-bound protein 208 100
PRO1009.
1813 AAB24068 Homo sapiens GETH Human PRO1009 protein 208 100
sequence SEQ ID NO:36.
1815 AAG89314 Homo sapiens GEST Human secreted 191 100
protein, SEQ ID
NO: 434.
1815 gi6460052Deinococcusdipeptidyl peptidase 66 60
IV-related protein
radiodurans
1816 gi1052594Drosophila trithorax protein trxI 75 26
melanogaster
1816 gi1052593Drosophila trithorax protein trxII 75 26
melanogaster
1816 gi158818 Drosophila zinc-binding protein 75 26
melanogaster
1817 AAB49765 Homo SapiensHELI- Human proliferation229 94
differentiation factor
amino acid
sequence.
1817 AAB88393 Homo SapiensHELI- Human membrane 229 94
or secretory
protein clone PSEC0137.
1817 gi18446895Drosophila AT05866p 73 25
melanogaster
1818 gi6573212Giardia variant-specific surface73 32
protein H7-1
intestinalis
1818 gi159143 Giardia variant-specific surface73 32
protein H7
intestinalis
1818 gi15144254Micrurus neurotoxin homologue 72 32
8
corallinus
1819 gi161857 Tetrahymenasurface antigen 69 35
thermophila
1821 gi913964 Carcinoscorpiusfactor C 80 26
rotundicauda
1821 gi217397 Tachypleus limulus factor C precursor80 26
tridentatus
1821 gi18542425Tachypleus factor C precursor 80 26
tridentatus
1822 19309473 Mus musculusDNMT1 associated protein-174 37
1822 g11666895Homo sa CHL1 protein 74 23
iens
1822 g116923930Mus musculusMAT1-mediated transcriptional74 37
repressor
1823 g19058659Canis familiarisskeletal muscle chloride73 34
channel C1C-1
1823 g1433182 Drosophila receptor protein tyrosine72 26
phosphatase
melanogaster
1823 g120429105Paracoccus decaprenyl diphosphate 72 27
synthase
zeaxanthinifacie
ns
1824 gi13374178 Mus musculus TAFII140 protein 612 88
1824 gi17861888 Drosophila GM10839p 246 49
melanogaster
1824 gi6634096 Drosophila BIP2 protein 242 48
melanogaster
1825 gi16605480 Homo sapiens G6b-C protein 1159 100
1825 116605484 Homo sapiens G6b-E protein 1009 90
1825 gi5304877 Homo sapiens immunoglobulin receptor 1003 83
1826 AAB94636 Homo SapiensHELI- Human protein sequence105 37
SEQ
ID N0:15515.
1826 AAU15903 Homo SapiensHUMA- Human novel secreted105 37
protein,
Seq ID 856.
1826 gi21430928Drosophila SD27341p 93 39
melanogaster
1827 AAR33270 Homo SapiensWIST- T cell receptor 329 92
alpha chain
clone alphal.3.
1827 gi1806100 Homo sapiens T cell receptor alpha 329 92
chain
1827 gi2358032Homo SapiensTCRAV8S3 329 92
1828 gi20513851 Hordeum BPM 73 45
vulgare
1828 AA001897 Homo SapiensHYSE- Human polypeptide 70 35
SEQ ID
NO 15789.
1828 AAE16477 Homo SapiensOSTE- Human collagen 69 31
alphal (II)
protein.
1829 AAG66837 Homo SapiensSHAN- Human ATP-dependent356 100
serine
proteinase 31.
1829 AAG66838 Homo SapiensSHAN- Human ATP-dependent89 100
serine
proteinase 31 N-terminal
peptide.
1829 gi5881591Gallus gallushomeodomain protein 77 38
1830 AAB94294 Homo SapiensHELI- Human protein sequence951 99
SEQ
ID N0:14745.
1830 gi10504968Drosophila rho guanine nucleotide 180 22
exchange factor
melanogaster 4
1830 gi16197921 Drosophila LD03170p 180 22
melanogaster
1831 ABB12353 Homo sapiens HYSE- Human bone marrow 199 30
expressed
protein SEQ ID NO: 107.
1831 120452161 Canis familiaris retinitis pigmentosa GTPase 143 24
regulator
1831 gi2062609 Xenopus middle molecular weight 140 24
laevis neurofilament
protein NF-M(1)
1832 AAB29778 Homo SapiensRHOD- Human MSF-derived 148 18
tribonectin.
1832 gi142161 Anaplasma surface antigen Amf105 141 25
marginale
1832 gi4808177Drosophila largest subunit of the 141 20
RNA polymerase
subobscura II complex
1833 AAM66321 Homo SapiensMOLE- Human bone marrow 424 51
expressed probe encoded
protein SEQ
ID NO: 26627.
1833 AAM53933 Homo SapiensMOLE- Human brain expressed424 51
single
exon probe encoded protein
SEQ ID
NO: 26038.
1833 gi~6723273~dBaboon gag-pol precursor polyprotein357 47
bj~BAA8965endogenous
9.1 virus strain
M7
1834 AAM88756 Homo SapiensHUMA- Human 208 100
immune/haematopoietic
antigen SEQ
ID N0:16349.
1834 gi20417 Persea americanacellulase 77 34
1834 gi153337 Streptomyceskanamycin-apramycin resistance69 26
tenebrariusmethylase
1837 AAY02893 Homo sapiens HUMA- Fragment of human 76 41
secreted
protein encoded by gene
92.
1837 AAY99429 Homo sapiens GETH Human PRO1563 (UNQ769) 73 35
amino acid sequence SEQ
ID NO:317.
1837 gi6634084Drosophila malate dehydrogenase 73 39
(NADP-
melanogasterdependent oxaloacetate
decarboxylating), malic
enzyme
1838 gi2865602SaccharopolyspoSapI M2 methyltransferase77 37
ra Sp.
1838 gi3089358Rattus MARRLC2A 75 33
norvegicus
1838 gi~2865602~gSaccharopolyspoSapI M2 methyltransferase77 37
b~AAC9718ra Sp.
2.1~
1839 AAM69149 Homo SapiensMOLE- Human bone marrow 154 96
expressed probe encoded
protein SEQ
ID NO: 29455.
1839 AAM56768 Homo SapiensMOLE- Human brain expressed154 96
single
exon probe encoded protein
SEQ ID
NO: 28873.
1839 AAW96209 Homo SapiensSMIK Amyloid precursor 102 78
protein
(APP) C-terminal fragment.
1840 gi9946563Pseudomonasprobable type II secretion81 36
system
aeruginosa protein
1840 gi21108565Xanthomonaspseudouridylate synthase75 35
axonopodis
pv.
citri str.
306
1840 ABB04714 Homo sapiensSHAN- Human PP1744 protein74 31
SEQ
ID N0:23.
1841 gi1491949Molluscum MC006L 85 30
contagiosum
virus subtype 1
1841 AAM42085 Homo SapiensHYSE- Human polypeptide 81 27
SEQ ID
NO 7016.
1841 AAM40299 Homo SapiensHYSE- Human polypeptide 81 27
SEQ ID
NO 3444.
1842 120381413Homo sapiensSimilar to LOC160680 216 44
1842 g113592175Leishmania ppg3 144 24
major
1842 gi5420387 Leishmania proteophosphoglycan 140 23
major
1843 AAB87181 Homo SapiensMILL- Human secreted 278 42
protein
MANGO 349 E41D variant,
SEQ ID
N0:231.
1843 AAB87128 Homo sapiensMILL- Human secreted 278 42
protein
MANGO 349, SEQ ID N0:130.
1843 AAB87179 Homo SapiensMILL- Human secreted 276 41
protein
MANGO 349 I21K variant,
SEQ ID
NO:227.
1844 AAE14341 Homo sapiensINCY- Human protease 886 93
PRTS-6
protein.
1844 gi16768276Drosophila GH27809p 290 41
melanogaster
1844 gi2655204Mus musculusubiquitin-specific protease258 35
1846 AAY88300 Homo SapiensMILL- Human TANGO 187-3 1334 90
protein.
1846 gi13097780Homo SapiensSimilar to RIKEN cDNA 1326 90
2810037014
gene
1846 AAY88296 Homo SapiensMILL- Human TANGO 187-2/31312 87
protein.
1847 AAG74984 Homo SapiensHUMA- Human colon cancer75 32
antigen
protein SEQ ID N0:5748.
1847 gi17352449Rattus ErbB3/Her3 precursor 74 38
norvegicus
1847 gi~20860870~Mus musculussimilar to H4(D10S170) 75 32
protein
ref~XP_1256
64.1 ~
1848 gi3123530 Fowlpox I3L, orthologue of vaccinia 75 27
virus I3L
1848 gi5902659Drosophila ring canal protein 70 27
melanogaster
1848 gi~18110218~Drosophila kel-P2 70 27
ref~NP-4765melanogaster
89.2
1849 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 614 78
1849 AAM65715 Homo SapiensMOLE- Human bone marrow 548 73
expressed probe encoded
protein SEQ
ID NO: 26021.
1849 AAM53338 Homo SapiensMOLE- Human brain expressed548 73
single
exon probe encoded protein
SEQ ID
NO: 25443.
1850 gi10999071LophognathusNADH dehydrogenase subunit74 23
2
longirostris
1850 gi18537243Human envelope glycoprotein 74 29
immunodeficiency
virus
type 1
1850 gi~10999071~Lophognathus NADH dehydrogenase subunit 74 23
2
gb~AAG006longirostris
22.2~AF
128
462 2
1851 gi~17448210~Homo Sapienssimilar to 60 kDa heat 72 28
shock protein,
ref~XP_0685 mitochondrial precursor
(Hsp60) (60
03.1 kDa chaperonin) (CPN60)
(Heat shock
protein 60) (HSP-60)
(Mitochondrial
matrix protein Pl) (P60
lymphocyte
protein) (HuCHA60)
1852 gi1164937SaccharomycesYOR3160w 74 31
cerevisiae
1852 gi3176662ArabidopsisSimilar to mannosyl-oligosaccharide73 31
thaliana glucosidase gb~X87237
from Homo
sapiens.
1852 gi13398928Arabidopsisalpha-glucosidase 1 73 31
thaliana
1853 gi~20889364~Mus musculus similar to hepatitis 76 36
A virus cellular
1384 receptor 1; T cell immunoglobin
ref~XP
_ domain and mucin domain
29.1 ~ protein 1
1853 gi~21288202~Anopheles agCP9342 71 32
gb~EAA005gambiae
str.
23.1 ~ PEST
1854 AAB88481 Homo SapiensHELI- Human membrane 776 99
or secretory
protein clone PSEC0251.
1854 AAE03835 Homo sapiens HUMA- Human gene 18 776 99
encoded
secreted protein HFKHW50,
SEQ ID
NO: 81.
1854 AAE03863 Homo sapiens HUMA- Human gene 18 716 97
encoded
secreted protein HFKHW50,
SEQ ID
N0:109.
1855 gi1663748Chlamydomonasdynein heavy chain 7 82 29
reinhardtii
1855 gi1663744Chlamydomonasdynein heavy chain 5 80 28
reinhardtii
1855 gi1663738Chlamydomonasdynein heavy chain 2 80 27
reinhardtii
1856 gi18032120Gallus gallusshal-like voltage-gated 75 23
potassium
channel
1856 gi1408569Haemophilusadhesion and penetration71 28
protein
influenzae
1856 gi~18032120~Gallus gallus shal-like voltage-gated 75 23
potassium
gb~AAL566 channel
33.1 ~AF075
160 1
1857 AAM67180 Homo SapiensMOLE- Human bone marrow 129 44
expressed probe encoded
protein SEQ
ID NO: 27486.
1857 AAM54795 Homo sapiensMOLE- Human brain expressed129 44
single
exon probe encoded protein
SEQ ID
NO: 26900.
1857 gi~21040255~Homo Sapienssplicing factor, arginine/serine-rich109 29
12
re~NP_6319
07.1 ~
1858 gi21392190Drosophila RE74758p 71 39
melanogaster
1858 gi9954108TrypanosomaRNA binding protein RGGm68 40
cruzi
1858 gi20302994Medicago nodule-specific glycine-rich66 32
protein 1C
truncatula
1859 gi~20536244~Homo Sapienssimilar to autoantigen 72 30
La
ref~XP_0605
05.4
1860 gi~17541362~CaenorhabditisK08E7.S.p 103 29
ref~NP_5024 elegans
09.1
1860 gi~17446900~Homo Sapienssimilar to DNA-directed 100 34
RNA
re~XP_0658 polymerase (EC 2.7.7.6)
II largest
33.1 ~ chain - Mastigamoeba
invertens
(fragment)
1860 gi~9628166~rAfrican CD2 homolog 98 30
swine
eflNP fever virus
0427
52.1
1861 AAY70691 Homo sapiens DAND Human membrane attractin-2. 162 40
1861 AAY70690 Homo SapiensDAND Human membrane attractin-1.162 40
1861 gi12275390Rattus membrane attractin 162 40
norvegicus
1862 gi10039425Equus caballusALR protein 81 28
1862 gi13529521Mus musculusSimilar to elastin microfibril80 32
interface
located protein
1862 AAM40414 Homo SapiensHYSE- Human polypeptide 79 39
SEQ ID
NO 3559.
1863 gi~16588389~Homo SapiensB lymphocyte activation-related247 52
protein
gb~AAL267 BC-1514
87.1 ~AF304
442 1
1863 gi~20479028~Homo Sapienssimilar to B lymphocyte 117 68
activation-
re~XP_1137 related protein BC-1514
29.1
1863 gi~21301715~Anopheles agCP8366 85 41
gb~EAA138gambiae
str.
60.1 ~ PEST
1864 AAU15851 Homo SapiensHUMA- Human novel secreted1275 78
protein,
Seq ID 804.
1864 AAU16312 Homo sapiensHUMA- Human novel secreted1123 76
protein,
Seq ID 1265.
1864 AAG02054 Homo SapiensGEST Human secreted protein,308 91
SEQ ID
NO: 6135.
1865 AAB94953 Homo SapiensHELI- Human protein sequence86 29
SEQ
ID N0:16485.
1865 13746787 Homo SapiensSYT interacting protein 86 29
SIP
1865 g115022507Homo sapienscoactivator activator 86 29
1866 g117133332Nostoc Sp. preprotein translocase 68 43
PCC Sect subunit
7120
1866 gi~13489110~Homo Sapiensgap junction protein, 66 40
alpha 3, 46kD
ref~NP-0687 (connexin 46)
73.1
1867 gi706930 Rattus norvegicus cyclic GMP stimulated phosphodiesterase 191 95
1867 AAV54762_aa1 Homo sapiens UNIW Human cGS-PDE cDNA DNA sequence. 137 100
1867 AAV36157_aa1 Homo sapiens UNIW Human cyclic-GMP-nucleotide phosphodiesterase cDNA. 137 100
1868 AAB95695 Homo sapiens HELI- Human protein sequence SEQ ID NO:18516. 112 27
1868 AAY91447 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:168. 112 27
1868 AAY91393 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:114. 112 27
1870 AAU07886 Homo sapiens WHED Polypeptide sequence for human hspGlS. 1454 94
1870 gi13603891 Homo sapiens MOV10-like 1 1454 94
1870 gi13603857 Mus musculus MOV10-like 1 954 77
1871 AAM96652 Homo SapiensHUMA- Human reproductive484 96
system
related antigen SEQ ID
NO: 5310.
1871 gi18676652 Homo sapiens FLJ00225 protein 433 95
1871 gi21386760 Berneuxia thibetica maturase R 70 32
1872 AAQ90304_aa1 Homo sapiens NISR Human thyroid peroxidase gene. 73 29
1872 AAW48781 Homo sapiens RSRR- Thyroid peroxidase. 73 29
1872 AAR75689 Homo sapiens NISR Human thyroid peroxidase. 73 29
1873 AAG03774 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7855. 228 90
1873 1338288 Homo sapiens preprosomatostatin I 228 90
1873 gi342299 Macaca fascicularis preprosomatostatin 228 90
1875 AAR30418 Homo sapiens DAND Nearly complete p107 protein. 76 30
1875 gi347378 Homo sapiens p107 76 30
1875 gi157871 Drosophila melanogaster P glycoprotein 76 24
1876 ABB17955 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6612. 186 40
1876 AAS17764_aa1 Homo sapiens GENA- Human Genomic DNA for CRYBB1. 167 39
1876 AAO02331 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16223. 165 42
1877 gi|59977|emb|CAA78662.1 Human endogenous retrovirus tripartite fusion transcript PLA2L 224 76
1878 ABB84943 Homo sapiens GETH Human PRO1556 protein sequence SEQ ID NO:254. 1056 93
1878 AAB31670 Homo sapiens PROT- Amino acid sequence of a human protein having a hydrophobic domain. 1056 93
1878 AAB47295 Homo sapiens GETH PRO1556 polypeptide. 1056 93
1879 ABB15861 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 4518. 73 36
1880 AAU83117 Homo sapiens ZYMO Novel secreted protein Z799543G2P. 66 54
1880 gi12723186 Lactococcus lactis subsp. lactis outer membrane lipoprotein precursor 66 26
1881 1609624 Vibrio choleraeE SC 73 29
1882 gi12667456 Rattus norvegicus synaptotagmin VIId 86 32
1882 gi12667454 Rattus norvegicus synaptotagmin VIII 85 33
1882 gi334072 Pseudorabies virus ORF-3 protein 83 35
1883 gi1747 Oryctolagus cuniculus trichohyalin 119 29
1883 gi2072290 Xenopus laevis XL-INCENP 100 27
1883 gi12584554 Human coxsackievirus B3 polyprotein 96 25
1884 gi~15601413~Vibrio choleraesucrose-6-phosphate dehydrogenase65 55
ref~NP
2330
44.1 ~
1885 gi16878287 Homo sapiens Similar to C-terminal modulator protein 74 35
1885 gi15866714 Homo sapiens C-terminal modulator protein 74 35
1885 AAO06984 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20876. 70 60
1887 AAW25939 Homo sapiens CNRS T-cell receptor V-beta-5.1 peptide fragment. 601 99
1887 gi36973 Homo sapiens T-cell receptor beta-chain 601 99
1887 gi1552498 Homo sapiens V_segment translation product 600 100
1888 gi18874468 Homo sapiens partitioning-defective 3-like protein splice variant c 198 73
1888 gi16903870 Homo sapiens partitioning-defective 3-like protein splice variant b 198 73
1888 gi16903868 Homo sapiens partitioning-defective 3-like protein splice variant a 198 73
1889 gi21489377 Homo sapiens MAPA protein 1620 99
1889 gi21489330 Bos taurus MAPA protein 833 56
1889 gi21489379 Mus musculus MAPA protein 630 48
1890 AAY10874 Homo sapiens HUMA- Amino acid sequence of a human secreted protein. 503 100
1890 gi17429674 Ralstonia solanacearum PROBABLE LIPOPROTEIN 73 44
1891 gi15723141 Homo sapiens c349E10.1.1 (novel protein, isoform 1) 180 46
1891 AAB59006 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 714. 174 47
1891 gi19353342 Mus musculus RIKEN cDNA 9530058B02 gene 162 47
1892 AAM86086 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:13679. 95 53
1892 AAO05973 Homo sapiens HYSE- Human polypeptide SEQ ID NO 19865. 94 82
1892 AAO09418 Homo sapiens HYSE- Human polypeptide SEQ ID NO 23310. 91 70
1893 gi8778607 Arabidopsis thaliana F5M15.23 71 25
1894 AAM65951 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26257. 69 38
1894 AAM53568 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25673. 69 38
1894 gi|20832567|ref|XP_133524.1| Mus musculus similar to Heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) (D10S102) 163 76
1895 AAM66299 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26605. 440 83
1895 AAM53913 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26018. 440 83
1895 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 270 45
1896 gi4883988 Bartonella clarridgeiae cell division protein FtsZ 68 28
1897 AAO13209 Homo sapiens HYSE- Human polypeptide SEQ ID NO 27101. 142 54
1897 AAM66708 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27014. 124 46
1897 AAM54310 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26415. 124 46
1898 gi2565268 Drosophila virilis pore-forming protein MIP family 75 27
1898 gi7453547 Homo sapiens glioma tumor suppressor candidate region protein 1 75 31
1898 gi3218331 Metarhizium anisopliae nitrogen response regulator 74 26
1899 19656609 Vibrio cholerae chemotaxis protein CheA 73 32
1899 gi|20908537|ref|XP_127414.1 Mus musculus RIKEN cDNA 1700001L19 443 80
1899 gi|15642063|ref|NP_231695.1 Vibrio cholerae chemotaxis protein CheA 73 32
1900 gi|18586105|ref|XP_091400.1| Homo sapiens similar to sca1 203 84
1900 gi|20888279|ref|XP_146508.1 Mus musculus similar to spinocerebellar ataxia type 1 199 82
1901 gi338033 Homo sapiens serum protein 90 32
1901 gi4808221 Homo sapiens dJ1177I5.2 (serum constituent protein MSE55) 90 32
1901 gi4098993 Mus musculus polyhomeotic 2 88 30
1902 AAB19933 Homo sapiens INCY- Human oxidoreductase OXRD-8. 250 100
1902 gi19713043 Fusobacterium nucleatum subsp. nucleatum ATCC 25586 iron/zinc/copper-binding protein 73 22
1902 gi|20342079|ref|XP_110614.1 Mus musculus RIKEN cDNA 1700003E16 77 25
1903 gi342279 Macaca nemestrina opiomelanocortin 231 49
1903 128342 Homo sapiens proopiomelanocortin 230 49
1903 gi190183 Homo sapiens opiomelanocortin 230 49
1904 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 180 53
1905 gi5360984 Homo sapiens dJ228H13.1 (similar to Ribosomal protein L21e) 152 72
1905 AAB44126 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1571. 150 83
1905 gi550015 Homo sapiens ribosomal protein L21 150 83
1906 gi2654610 Pseudomonas aeruginosa arginine/ornithine succinyltransferase AI subunit 79 25
1906 gi17226812 Botryotinia fuckeliana histidine kinase 72 33
1906 gi16904238 Botryotinia fuckeliana two-component osmosensing histidine kinase BOS1 72 33
1908 gi330359 Human herpesvirus 4 nuclear antigen precursor 91 37
1908 gi1632793 Human herpesvirus 4 EBNA3C (EBNA 4B) latent protein 91 37
1908 11184677 Candida albicans hyphal wall protein 1 90 38
1909 gi13177635 Rattus norvegicus phospholipase C beta-3 72 26
1909 gi1150880 Mus musculus phospholipase C beta3 71 26
1909 gi17105044 Simian adenovirus 25 10.1 kDa 71 31
1910 gi9857054 Leishmania major possible CG7055 protein 71 47
1910 gi1617560 Leishmania major LCFACASS; L5701.2 67 33
1910 gi|9857054|emb|CAC04011.1 Leishmania major possible CG7055 protein 71 47
1911 AAY87278 Homo sapiens INCY- Human signal peptide containing protein HSPP-55 SEQ ID NO:55. 501 82
1911 AAB18912 Homo sapiens GETH A novel polypeptide designated PRO1889. 501 82
1911 AAU27659 Homo sapiens ZYMO Human protein AFP513481. 416 77
1912 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 434 80
1912 gi|18676710|dbj|BAB85007.1 Homo sapiens FLJ00254 protein 270 64
1913 gi5713196 Caenorhabditis elegans liprin-alpha homolog SYD-2 479 38
1913 gi930343 Homo sapiens LAR-interacting protein 1b 467 39
1913 gi930341 Homo sapiens LAR-interacting protein 1a 467 39
1914 gi6651021 Mus musculus semaphorin cytoplasmic domain-associated protein 3B 274 63
1914 gi6651019 Mus musculus semaphorin cytoplasmic domain-associated protein 3A 274 63
1914 AAM25720 Homo sapiens HYSE- Human protein sequence SEQ ID NO:1235. 266 61
1915 gi902214 Zea mays RNA polymerase beta' subunit-2 72 24
1915 gi12482 Zea mays RNA polymerase beta-2 subunit (AA 1-1527) 72 24
1915 gi|11467184|ref|NP_043017.1 Zea mays RNA polymerase beta' subunit-2 72 24
1916 gi1655432 Mus musculus plexin 2 1135 58
1916 AAM93435 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3070. 1132 57
1916 gi961515 Xenopus laevis plexin 1126 54
1917 gi15559064 Mus musculus SNAG1 86 38
1917 gi|20863586|ref|XP_141581.1 Mus musculus similar to dJ551D2.5 (novel protein) 88 30
1917 gi|18644890|ref|NP_570614.1 Mus musculus sorting nexin associated golgi protein 1 86 38
1918 gi19528383 Drosophila melanogaster RE04404p 67 32
1919 AAM77461 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37767. 189 79
1919 AAM64684 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36789. 189 79
1919 gi|17477135|ref|XP_063415.1 Homo sapiens similar to embryonal stem cell specific gene 1 263 75
1920 gi2623757 Rattus norvegicus neurabin 172 97
1920 12827450 Gallus gallus KS5 protein 154 88
1920 113991829 Xenopus laevis neurabin 145 83
1923 gi5532302 Heterocapsa triquetra PSII CP47 apoprotein 75 29
1923 gi1881335 Bacillus subtilis SIMILAR TO YQFU, YXKD, YITB OF B. SUBTILIS. 68 38
1923 gi|5532302|gb|AAD44701.1| Heterocapsa triquetra PSII CP47 apoprotein 75 29
1924 gi6855429 Leishmania major possible mucin 1 precursor 77 33
1924 gi5832816 Caenorhabditis elegans contains similarity to Pfam domain: PF01694 (Rhomboid family), Score=61.7, E-value=5.1e-15, N=1 74 34
1924 AAB51976 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:108. 72 38
1925 AAB51635 Homo sapiens ROSE/ Human secreted protein sequence encoded by gene 16 SEQ ID NO:75. 205 31
1925 AAB47128 Homo sapiens INCY- CDIFF-6, Incyte ID No. 2009435CD1. 199 34
1925 ABB55766 Homo sapiens FECH/ Human polypeptide SEQ ID NO 138. 197 38
1926 AAG89279 Homo sapiens GEST Human secreted protein, SEQ ID NO: 399. 330 44
1926 AAB70690 Homo sapiens SREN- Human hDPP protein sequence SEQ ID NO:7. 319 44
1926 gi13182757 Homo sapiens HTPAP 319 44
1927 gi13177290 Ectocarpus siliculosus virus EsV-1-8 69 36
1928 gi18700171 Arabidopsis thaliana AT5g20480/F7C8_70 86 39
1928 gi915207 Sus scrofa gastric mucin 83 29
1928 gi532113 Caenorhabditis elegans homeotic region most like HMPB_DROME: homeotic proboscipedia protein 79 27
1929 ABB12295 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2665. 135 59
1929 AAG04080 Homo sapiens GEST Human secreted protein, SEQ ID NO: 8161. 78 38
1929 gi9279807 Drosophila melanogaster cortactin 77 27
1930 AAV81204_aa1 Homo sapiens GEHO Human CD7 cDNA. 872 73
1930 AAB36657 Homo sapiens IMMV Human CD7 protein sequence SEQ ID NO:2. 872 73
1930 AAU02438 Homo sapiens GEHO Human lymphocyte cell surface antigen CD7 polypeptide. 872 73
1931 gi2636248 Bacillus subtilis similar to transaldolase (pentose phosphate) 73 29
1931 gi|21398633|ref|NP_654618.1 Bacillus anthracis A2012 Transaldolase, Transaldolase [Bacillus 74 29
1931 gi|16080764|ref|NP_391592.1 Bacillus subtilis similar to transaldolase (pentose phosphate) 73 29
1932 AAB43545 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:990. 73 46
1932 AAM40234 Homo sapiens HYSE- Human polypeptide SEQ ID NO 3379. 71 26
1934 gi3129962 Gallus gallus B locus Lectin like Natural Killer cell surface protein 82 30
1934 AAB93791 Homo sapiens HELI- Human protein sequence SEQ ID NO:13545. 77 38
1934 gi2541864 Drosophila melanogaster DAD polypeptide 77 32
1935 gi|4959869|gb|AAD34536.1| Murine leukemia virus polymerase 335 52
1935 gi|6524624|gb|AAF15098.1| Phascolarctos cinereus pol protein 331 52
1935 gi|9630313|ref|NP_056790.1 Gibbon ape leukemia virus pol polyprotein 328 52
1936 gi6562332 Arabidopsis thaliana diaminopimelate decarboxylase 86 30
1936 gi7573355 Arabidopsis thaliana diaminopimelate decarboxylase-like protein 86 30
1936 gi15146250 Arabidopsis thaliana AT5g11880/F14F18_50 86 30
1939 AAU07442 Homo sapiens GETH Human Wnt1 Upregulated protein 2 (WUP2). 300 100
1939 AAU07441 Homo sapiens GETH Human Wnt1 Upregulated protein 1 (WUP1). 300 100
1939 AAB56802 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1380. 300 100
1940 15802814 Homo sapiens Gag-Pro-Pol-Env protein 587 57
1940 gi4185939 Human endogenous retrovirus K pol protein 586 57
1940 15802821 Homo sapiens Gag-Pro-Pol protein 586 57
1941 AAU83088 Homo sapiens ZYMO Novel secreted protein Z2812G3P. 586 100
1941 AAB20275 Homo sapiens SCHE Human interleukin DNAX 80. 535 76
1941 AAB20277 Homo sapiens SCHE Human interleukin DNAX 80 variant. 529 76
1942 AAM06866 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 1074. 994 100
1942 gi17426446 Homo sapiens bA351K23.5 (novel protein) 933 54
1942 115099951 Mus musculus diacylglycerol acyltransferase 2 915 55
1943 AAM06596 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 327. 406 98
1943 gi|15640499|ref|NP_230126.1| Vibrio cholerae S-adenosylmethionine synthase 67 51
1945 AAG75561 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6325. 327 100
1945 gi16416764 Homo sapiens FKSG16 327 100
1945 gi13905212 Mus musculus RIKEN cDNA 1200006F02 gene 261 79
1946 gi288174 Mus musculus Oct2b 97 85
1946 gi53490 Mus musculus Oct2.5 transcription factor 97 85
1946 gi9937478 Drosophila melanogaster thyroid hormone receptor-associated protein TRAP 170 72 39
1947 AAM66980 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27286. 170 69
1947 AAM54574 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26679. 170 69
1947 AAM75189 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35495. 159 86
1948 AAY10874 Homo sapiens HUMA- Amino acid sequence of a human secreted protein. 100 100
1949 AAA27155_aa1 Homo sapiens GENE- Human P2 DNA. 100 100
1949 AAY94475 Homo sapiens GENE- Predicted translation product of human P2 splice isoform, P2-B. 100 100
1949 AAY94474 Homo sapiens GENE- Human P2 protein. 100 100
1950 19502082 Homo sapiens tubby super-family protein 80 40
1950 19502080 Mus musculus tubby super-family protein 77 41
1950 18118432 Oryza sativa beta-expansin 73 35
1951 gi4808994 walleye epidermal hyperplasia virus type 1 envelope polyprotein 69 46
1951 gi|15642893|ref|NP_227934.1 Thermotoga maritima ribonucleotide reductase, B12-dependent 66 46
1952 AAB80264 Homo sapiens GETH Human PRO332 protein. 577 61
1952 AAB33425 Homo sapiens GETH Human PRO332 protein UNQ293 SEQ ID NO:57. 577 61
1952 AAY13396 Homo sapiens GETH Amino acid sequence of protein PRO332. 577 61
1953 gi16648392 Drosophila melanogaster LD39243p 449 61
1953 AAG73684 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:4448. 371 55
1953 AAY48312 Homo sapiens META- Human prostate cancer-associated protein 9. 371 55
1954 AAU84348 Homo sapiens BARK/ Protein MMP2 differentially expressed in breast cancer tissue. 2068 94
1954 ABB90738 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 208. 2068 94
1954 AAB84607 Homo sapiens PFIZ Amino acid sequence of matrix metalloproteinase gelatinase A. 2068 94
1955 gi16769680 Drosophila melanogaster LD46678p 245 35
1955 AAM66797 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27103. 148 80
1955 AAM54396 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26501. 148 80
1957 AAB80242 Homo sapiens GETH Human PRO236 protein. 648 97
1957 AAM93378 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 2955. 648 97
1957 AAB12157 Homo sapiens PROT- Hydrophobic domain protein from clone HP03165 isolated from KB cells. 648 97
1958 AAM41696 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6627. 234 47
1958 AAU17119 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 684. 229 46
1958 gi16741621 Homo sapiens Similar to RAB37, member of RAS oncogene family 228 47
1959 gi18025526 cercopithecine herpesvirus 15 LF3 140 30
1959 gi3153821 Mus musculus plenty-of-prolines-101; POP101; SH3-philo-protein 137 25
1959 gi39255 Actinomyces viscosus sialidase 129 28
1960 ABB12366 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 120. 400 90
1960 AAO12936 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26828. 115 95
1960 AAM84898 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:12491. 113 82
1961 gi19110438 Homo sapiens polycystin-1L1 190 94
1961 gi3115393 Rana pipiens guanylate cyclase inhibitory protein 80 35
1961 gi3462887 Rattus norvegicus alpha-fodrin 68 31
1962 AAU83130 Homo sapiens ZYMO Novel secreted protein 1076 100
Z835892G6P.
1962 11890354 Brassica napus L-ascorbate peroxidase 80 33
1962 gi7529611 Leishmania major hypothetical protein L787.06 79 31
1963 AAG78679 Homo sapiens BODE- Human thrombotic protein 46. 467 86
1963 AAY87347 Homo sapiens INCY- Human signal peptide containing protein HSPP-124 SEQ ID NO:124. 467 86
1963 AAB01431 Homo sapiens MILL- Human TANGO 224 (form 2). 467 86
1964 gi3413504 Rattus norvegicus Bassoon 81 26
1964 gi330452 human herpesvirus 5 DNA polymerase 79 28
1964 AAV69717_aa1 Homo sapiens LUDW- Tumour rejection antigen precursor MAGE-C1 cDNA. 73 33
1965 gi|2323287|gb|AAB66528.1| multiple sclerosis associated retrovirus polyprotein 286 64
1965 gi|2351212|dbj|BAA22064.1| Friend murine leukemia virus gag-pol polyprotein (precursor protein) 179 47
1965 gi|9629516|ref|NP_044738.1 Rauscher murine leukemia virus Pol 179 47
1966 gi|2323287|gb|AAB66528.1| multiple sclerosis associated retrovirus polyprotein 476 65
1966 gi|2281588|gb|AAB64160.1| synthetic construct Pol 323 51
1966 gi|9626961|ref|NP_057933.1 Murine leukemia virus Pr180 323 51
1967 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 518 73
1967 AAM65715 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26021. 464 69
1967 AAM53338 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25443. 464 69
1968 AAG78149 Homo sapiens BODE- Human polypeptide-cytochrome b5-13. 388 82
1968 gi3150438 Human endogenous retrovirus K pol-env 345 55
1968 gi1469243 Human endogenous retrovirus K pol/env 345 55
1969 gi21113108 Xanthomonas campestris pv. campestris str. ATCC 33913 TonB-dependent receptor 78 31
1969 gi476274 Homo sapiens R kappa B 77 23
1969 gi4206769 Acanthamoeba castellanii myosin I heavy chain kinase 76 27
1970 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 244 77
1970 gi|8272468|gb|AAF74215.1|AF156963_1 Homo sapiens envelope protein 219 81
1970 gi|21103962|gb|AAM33141.1 Homo sapiens enverin-2 219 77
1971 AAU83621 Homo sapiens GETH Human PRO protein, Seq ID No 60. 320 100
1971 AAO05826 Homo sapiens HYSE- Human polypeptide SEQ ID NO 19718. 295 93
1971 AAM39560 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2705. 194 56
1972 gi6456112 Mus musculus F-box protein FBX15 128 44
1972 gi21428946 Drosophila melanogaster GH22104p 74 31
1972 gi|6456112|gb|AAF09139.1| Mus musculus F-box protein FBX15 128 44
1973 1148270 Escherichia coli lambda-integrase 550 94
1973 gi1790244 Escherichia coli K12 site-specific recombinase, acts on cer sequence of ColE1, effects chromosome segregation at cell division 550 94
1973 gi13364217 Escherichia coli O157:H7 site-specific recombinase XerC 544 92
1974 gi1805552 Escherichia coli FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR. 887 88
1974 11616960 Escherichia coli HyfR 887 88
1974 gi7920396 Salmonella typhimurium formate hydrogenlyase activator protein 522 54
1975 1409795 Escherichia coli No definition line found 1175 99
1975 gi15074592 Sinorhizobium meliloti HYPOTHETICAL TRANSMEMBRANE PROTEIN 378 33
1975 gi17740718 Agrobacterium tumefaciens str. C58 (U. Washington) Na+/Pi-cotransporter 372 34
1976 AAB82047 Homo sapiens IGAK- Human mast cell surface antigen. 163 23
1976 gi12654783 Homo sapiens Similar to loss of heterozygosity, 11, chromosomal region 2, gene A 163 23
1976 AAZ45690_aa1 Homo sapiens REGC cDNA sequence encoding the human minor vault protein 193. 108 25
1977 ABB56523 Homo sapiens MERI Human NMDA receptor subunit SEQ ID NO 44. 73 28
1977 AAW87504 Homo sapiens SIBI- Human N-methyl-D-aspartate receptor subunit encoded by clone NMDA24. 73 28
1978 AAG00471 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4552. 285 93
1978 gi298489 Papio hamadryas SP-10 133 34
1978 gi452582 Vulpes vulpes fox sperm acrosomal protein FSA-Acr.1 132 34
1979 AAB87128 Homo sapiens MILL- Human secreted protein MANGO 349, SEQ ID NO:130. 490 86
1979 AAB87179 Homo sapiens MILL- Human secreted protein MANGO 349 I21K variant, SEQ ID NO:227. 488 85
1979 AAB87181 Homo sapiens MILL- Human secreted protein MANGO 349 E41D variant, SEQ ID NO:231. 487 85
1982 AAM75035 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35341. 109 67
1982 AAM62231 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34336. 109 67
1982 gi11967423 Mus musculus vomeronasal receptor V1RC5 105 76
1983 AAG89276 Homo sapiens GEST Human secreted protein, SEQ ID NO: 396. 224 46
1983 AAB56565 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1143. 99 40
1983 AAY44987 Homo sapiens INCY- Human epidermal protein-4. 78 28
1984 AAB95089 Homo sapiens HELI- Human protein sequence SEQ ID NO:17025. 498 97
1984 AAM06608 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 339. 495 96
1984 gi497890 unidentified nitrogen-fixing bacteria alpha subunit of dinitrogenase reductase (Fe protein) 73 24
1985 gi|17455728|ref|XP_063594.1| Homo sapiens similar to Zinc-finger protein ubi-d4 (Requiem) (Apoptosis response zinc finger protein) 71 37
1986 gi21428886 Drosophila melanogaster GH12469p 69 34
1987 17767529 Bos taurus cyclophilin I 364 75
1987 18699209 Canis familiaris cyclophilin A 361 88
1987 111641132 Sus scrofa cyclophilin 361 88
1988 gi15073168 Sinorhizobium meliloti PROBABLE TRANSLATION INITIATION FACTOR IF-2 PROTEIN 81 37
1988 gi1181352 Paramecium bursaria Chlorella virus 1 Pro-rich protein; PIPG (8X) 78 25
1988 gi493242 Feline herpesvirus 1 Feline herpesvirus type 1 immediate early protein 77 20
1989 AAM65707 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26013. 134 66
1989 AAM53330 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25435. 134 66
1989 gi|20475216|ref|XP_114802.1| Homo sapiens similar to synapsin I 228 59
1990 AAM71181 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31487. 110 64
1990 AAM58674 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30779. 110 64
1990 gi21323636 Corynebacterium glutamicum ATCC 13032 Sulfate permease and related transporters (MFS superfamily) 75 26
1991 gi1932813 Xenopus laevis dsRNA adenosine deaminase 96 34
1991 AAE10203 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 68. 83 25
1991 gi3242649 Rana catesbeiana alpha 1 type I collagen 80 30
1992 gi1181423 Paramecium bursaria Chlorella virus 1 PBCV-1 chitinase 71 41
1992 gi|21300897|gb|EAA13042.1| Anopheles gambiae str. PEST agCP14405 72 37
1992 gi|9631828|ref|NP_048613.1 Paramecium bursaria Chlorella virus 1 PBCV-1 chitinase 71 41
1994 gi8248755 Plasmodium falciparum 3D7 protein phosphatase 72 25
1994 gi4104348 Campylobacter rectus S-layer-RTX protein 70 38
1994 gi|8248755|emb|CAB62878.2 Plasmodium falciparum 3D7 protein phosphatase 72 25
1995 gi21324402 Corynebacterium glutamicum ATCC 13032 Uncharacterized ATPase related to the helicase subunit of the Holliday junction resolvase 73 38
1995 gi|19552845|ref|NP_600847.1 Corynebacterium glutamicum COG2256:Uncharacterized ATPase related to the helicase subunit of the Holliday junction resolvase 73 38
1995 gi|17533213|ref|NP_495777.1| Caenorhabditis elegans F14E5.5.p 73 30
1996 11871223 Rickettsia crystalline surface 92 30
hi layer rotein
1996 gi6969926 Rickettsia aeschlimannii OmpB 79 25
1996 gi14670347 Rickettsia felis OmpB 78 25
1997 gi|20548733|ref|XP_055641.2 Homo sapiens similar to gag protein 256 58
1997 gi|9739120|gb|AAF97916.1 Bovine leukemia virus gag 186 34
1997 gi|9626226|ref|NP_056897.1 Bovine leukemia virus Pr44 185 34
1998 AAM79834 Homo sapiens HYSE- Human protein SEQ ID NO 3480. 279 71
1998 AAM78850 Homo sapiens HYSE- Human protein SEQ ID NO 1512. 279 71
1998 AAM79204 Homo sapiens HYSE- Human protein SEQ ID NO 1866. 272 71
1999 AAM73176 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33482. 168 48
1999 AAM60521 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32626. 168 48
1999 gi|13929148|ref|NP_113997.1| Rattus norvegicus cyclic nucleotide-gated channel beta subunit 1 163 47
2000 gi1869859 human herpesvirus 2 very large tegument protein 73 30
2000 gi7380253 Neisseria meningitidis Z2491 2-keto-4-hydroxyglutarate aldolase 70 37
2000 gi7226633 Neisseria meningitidis MC58 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxyphosphogluconate aldolase 70 37
2001 gi17016969 Mus musculus NUANCE 138 36
2001 gi6273778 Homo sapiens trabeculin-alpha 137 33
2001 gi1675222 Mus musculus ACF7 neural isoform 1 136 42
2002 AAM39256 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2401. 81 29
2002 1840789 Homo sapiens binding regulatory factor 81 29
2002 gi17028337 Homo sapiens regulatory factor X, 5 (influences HLA class II expression) 81 29
2003 gi2252814 Mus musculus FOG 172 64
2003 AAR58815 Homo sapiens USSH Human c-myc far upstream element (FUSE) binding protein (FBP) variant from HL60 clone 3-1. 103 42
2003 gi3598974 Rattus norvegicus protein tyrosine phosphatase TD14 103 26
2004 gi11994696 Arabidopsis thaliana contains similarity to DNA repair protein gene_id:K7M2.11 77 28
2004 17209527 Mus musculus testis-specific gene 73 24
2004 gi|17451912|ref|XP_071083.1 Homo sapiens similar to DNA-binding protein B 234 97
2005 AAE12023 Homo sapiens INCY- Human G-protein coupled receptor, GCREC-2. 173 100
2005 AAG65832 Homo sapiens FARB Human G protein-coupled receptor (GPCR). 173 100
2005 AAG68126 Homo sapiens FARB Human 7TM-GPCR protein sequence SEQ ID NO:6. 105 78
2006 gi20068811 Homo sapiens Rab-coupling protein 130 43
2006 gi15822596 Homo sapiens nRip11 104 45
2006 gi13377897 Homo sapiens Rab11 interacting protein Rip11a 83 40
2007 gi|17539708|ref|NP_501489.1 Caenorhabditis elegans F08B4.5.p 78 42
2008 AAE10350 Homo sapiens PFIZ Human ADAMTS-J1.4 variant protein. 504 97
2008 AAE10349 Homo sapiens PFIZ Human ADAMTS-J1.3 variant protein. 504 97
2008 AAE10347 Homo sapiens PFIZ Human ADAMTS-J1.1 variant protein. 504 97
2009 AAV31720_aa1 Homo sapiens MOUN Nucleotide sequence of the PUR-alpha gene. 87 29
2009 AAT99264_aa1 Homo sapiens MOUN Human PUR-alpha gene. 87 29
2009 AAQ44800_aa1 Homo sapiens MOUN Encodes single-stranded DNA binding (PUR) protein. 87 29
2010 gi170444 Lycopersicon esculentum extensin (class II) 123 27
2010 gi4662641 Arabidopsis thaliana expressed protein 116 30
2010 gi188864 Homo sapiens mucin 115 28
2011 AAY93650 Homo sapiens HUMA- Amino acid sequence of a human prostacyclin-stimulating factor-2. 1677 100
2011 AAS15723_aa1 Homo sapiens CURA- DNA encoding insulin-like growth factor family related protein, NOV3. 1673 99
2011 AAE17599 Homo sapiens INCY- Human extracellular messenger (XMES)-1 protein. 1673 99
2012 gi10440434 Homo sapiens FLJ00052 protein 336 69
2012 gi20502870 Mus musculus SDS3 333 68
2012 gi21430678 Drosophila melanogaster RE74901p 170 36
2013 AAH77293_aa1 Homo sapiens MILL- Human ion channel protein IC32391 cDNA coding region. 214 93
2013 AAE13278 Homo sapiens INCY- Human transporters and ion channels (TRICH)-5. 214 93
2013 AAG77969 Homo sapiens MILL- Human ion channel protein IC32391. 214 93
2014 gi4894768 Xenopus laevis ephrin-B2 precursor 78 30
2015 AAU77498 Homo sapiens INCY- Human lipid metabolism enzyme, LMM-6. 1291 100
2015 ABB08205 Homo sapiens INCY- Human lipid metabolism enzyme-5 (LME-5). 1122 100
2015 ABB07493 Homo sapiens INCY- Human lipid metabolism molecule (LMM) polypeptide (ID: 2965233CD1). 864 75
2016 gi|14769015|ref|XP_041569.1| Homo sapiens fibrillin3 68 36
2017 gi2313786 Helicobacter pylori 26695 chorismate synthase (aroC) 78 33
2017 gi4155160 Helicobacter pylori J99 CHORISMATE SYNTHASE 72 32
2017 gi|15645287|ref|NP_207457.1 Helicobacter pylori 26695 chorismate synthase (aroC) 78 33
2018 gi15485622 Homo sapiens Q9H4T4 like 1068 100
2018 ABB14744 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3401. 694 98
2018 AAB95100 Homo sapiens HELI- Human protein sequence SEQ ID NO:17064. 101 24
2019 18050556 Gorilla gorilla carboxyl-ester lipase 223 42
2019 AAU09894 Homo sapiens MONS Bile Salt Stimulated Lipase (BSSL). 217 39
2019 ABB04676 Homo sapiens MONS Human milk bile salt-stimulated lipase (BSSL) protein SEQ ID NO:2. 217 39
2020 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 515 74
2020 gi|385615|gb|AAB26708.1| Mus sp. fibulin gene homolog 300 75
2020 gi|13194728|gb|AAK15526.1|AF329451_1 Gallus gallus pol-like protein ENS-3 170 33
2021 AAM66980 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27286. 170 75
2021 AAM54574 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26679. 170 75
2021 AAM75189 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35495. 159 86
2022 AAD29146_aa1 Homo sapiens ZYMO Human Zcyto21 consensus cDNA. 649 83
2022 AAU83208 Homo sapiens ZYMO Novel secreted protein Z908463G2P. 649 83
2022 AAE18311 Homo sapiens ZYMO Human Zcyto21 consensus protein. 649 83
2024 gi14336750 Homo sapiens Ce protein similar to Dm Cys3His finger protein 84 34
2024 AAB50363 Homo sapiens UYSL- Human SRCAP. 83 34
2024 AAB95541 Homo sapiens HELI- Human protein sequence SEQ ID NO:18149. 83 34
2025 gi18676682 Homo sapiens FLJ00240 protein 470 45
2025 gi14701866 Dictyostelium discoideum carmil 221 29
2025 gi1881738 Acanthamoeba castellanii myosin-I binding protein Acan125 219 29
2026 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 212 78
2027 AAU83147 Homo sapiens ZYMO Novel secreted protein Z846363G2P. 1153 100
2027 gi|21287755|gb|EAA00076.1| Anopheles gambiae str. PEST ebiP4780 205 51
2027 gi|17552028|ref|NP_498407.1| Caenorhabditis elegans C05D11.8.p 91 38
2028 gi1510143 Homo sapiens similar to C.elegans protein encoded in cosmid T20D3 (Z68220). 323 57
2028 gi3879942 Caenorhabditis elegans T20D3.11 124 27
2028 gi5869818 Globodera pallida NADH-ubiquinone oxidoreductase subunit 6 82 27
2029 AAE13288 Homo sapiens INCY- Human transporters and ion channels (TRICH)-15. 75 31
2029 gi3252893 Thermotoga neapolitana ABC transporter 74 37
2029 gi|18403965|ref|NP_565826.1| Arabidopsis thaliana expressed protein 70 29
2030 AAB97908 Homo sapiens SHAN- Human GTP-binding protein 17 SEQ ID NO:2. 79 27
2030 AAM42129 Homo sapiens HYSE- Human polypeptide SEQ ID NO 7060. 79 27
2030 gi9971156 Mus musculus GTP-binding like protein 2 79 27
2031 gi|20864803|ref|XP_130800.1| Mus musculus RIKEN cDNA 4930503K02 89 25
2031 gi|21262152|emb|CAD32690.1| Oryza sativa SMC4 protein 77 28
2031 gi|1507705|gb|AAB06568.1| Borrelia burgdorferi outer surface protein 74 33
2032 AAG65898 Homo sapiens SMIK Amino acid sequence of GSK gene Id 18525. 481 100
2032 AAU83670 Homo sapiens GETH Human PRO protein, Seq ID No 158. 471 97
2032 ABB84896 Homo sapiens GETH Human PRO1309 protein sequence SEQ ID NO:160. 471 97
2034 gi6723273 Baboon endogenous virus strain M7 gag-pol precursor polyprotein 687 43
2034 gi18448744 Moloney murine leukemia virus Pr180 gag-pro-pol polyprotein 685 42
2034 gi2801471 Moloney murine leukemia virus Pr180 682 42
2035 gi|17554696|ref|NP_497670.1| Caenorhabditis elegans R148.7.p 68 32
2035 gi|16127996|ref|NP_414543.1| Escherichia coli K12 aspartokinase I, homoserine dehydrogenase I 68 43
2035 gi|19548975|gb|AAL90885.1|AF487900_1 Escherichia coli aspartokinase I-homoserine dehydrogenase I 68 43
2036 gi13424459 Caulobacter crescentus CB15 methyl-accepting chemotaxis protein McpI 72 32
Table 2
SEQ ID NO: Accession No. Species Description Score Identity
2036 gi|16877133|gb|AAH16838.1|AAH16838 Homo sapiens carboxypeptidase, vitellogenic-like 69 30
2037 AAB67055 Homo sapiens INCY- Human immune response molecule (IMUN) protein SEQ ID NO: 9. 532 75
2037 AAO01862 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15754. 403 67
2037 gi|6753924|ref|NP_034374.1| Mus musculus Friend virus susceptibility 1 240 39
2039 AAB38447 Homo sapiens HUMA- Fragment of human secreted protein encoded by gene 20 clone HLTFBY 15. 80 27
2039 gi11527799 Mus musculus GTP-binding protein like 1 73 30
2039 gi695237 Equine herpesvirus 2 tegument protein 73 33
2040 gi|20544038|ref|XP_089612.4| Homo sapiens similar to PER-HEXAMER REPEAT PROTEIN 5 68 41
2042 AAM77922 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 38228. 642 85
2042 AAM65219 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37324. 642 85
2042 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 139 26
2043 gi48507 Wolinella succinogenes formate dehydrogenase 80 27
2043 gi12381857 Danio rerio c-Maf 78 42
2043 gi|18594822|ref|XP_092995.1| Homo sapiens zinc finger protein 21 (KOX 14) 306 100
2044 13132272 Sus scrofa WT1 homologue 99 47
2044 AAG78446 Homo sapiens MASI Predicted WT1 Wilm's tumour polypeptide of humans. 96 45
2044 AAG62154 Homo sapiens CORI- Human WT1/PSA fusion protein SEQ ID NO: 357. 96 45
2046 gi21483222 Drosophila melanogaster AT16994p 86 33
2046 gi21111736 Xanthomonas campestris pv. campestris str. ATCC 33913 cell division protein 79 30
2046 gi12653493 Homo sapiens Similar to brain acid-soluble protein 1 79 36
2047 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 200 83
2047 gi|20837783|ref|XP_145921.1| Mus musculus similar to 40S ribosomal protein S11 73 35
Table 2
SEQ ID NO: Accession No. Species Description Score Identity
2047 gi|6002932|gb|AAF00209.1|AF164960_5 Streptomyces fradiae glycosyl transferase 71 35
2048 AAB59012 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 720. 103 32
2048 gi2429362 Santalum album proline rich protein 99 31
2048 gi17945382 Drosophila melanogaster RE17165p 98 25
2051 gi15625542 Hepatitis B virus S antigen 71 31
2051 gi|4884886|gb|AAD31857.1|AF134140_1 Hepatitis B virus surface antigen 68 30
2052 AAB28764 Homo sapiens HUMA- Sequence homologous to protein fragment encoded by gene 21. 693 78
2052 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 693 78
2052 AAB73606 Homo sapiens SHAN- Human dUTP pyrophosphatase 26. 668 77
2053 gi9945983 Pseudomonas aeruginosa transcriptional regulator PcaQ 83 34
2053 gi13874427 Homo sapiens cerebral protein-5 76 35
2053 gi12803205 Homo sapiens CAAX box 1 76 35
2054 gi21307831 Aplysia californica CREB-binding protein 76 26
2054 gi16755887 Drosophila melanogaster guanine nucleotide exchange factor 76 26
2054 gi|21307831|gb|AAL54859.1| Aplysia californica CREB-binding protein 76 26
2055 gi16588389 Homo sapiens B lymphocyte activation-related protein BC-1514 437 71
2055 AAB92981 Homo sapiens HELI- Human protein sequence SEQ ID NO:11698. 407 68
2055 AAM48325 Homo sapiens SHAN- Human urine receptor 21.23. 398 74
2056 gi|2072969|gb|AAC51274.1| Homo sapiens p40 134 47
2056 gi|7959889|gb|AAF71115.1|AF116721_95 Homo sapiens PRO2221 123 43
2056 gi|2072974|gb|AAC51277.1| Homo sapiens p40 122 44
2057 gi19171178 Homo sapiens metalloprotease disintegrin 16 with thrombospondin type I motif 518 98
2057 gi19171150 Homo sapiens ADAMTS18 protein 168 35
2057 AAM39212 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2357. 128 76
2058 gi|4959869|gb|AAD34536.1| Murine leukemia virus polymerase 336 50
Table 2
SEQ ID NO: Accession No. Species Description Score Identity
2058 gi|9630313|ref|NP_056790.1| Gibbon ape leukemia virus pol polyprotein 331 46
2058 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 329 49
2059 gi|20546404|ref|XP_116466.1| Homo sapiens similar to nuclear receptor coactivator 4; RET-activating gene ELE1 179 91
2060 gi|6731237|gb|AAF27177.1|AF182317_1 Homo sapiens myoferlin 112 79
2060 gi|798799|gb|AAC37713.1| Mus musculus immunoglobulin heavy chain 72 55
2060 gi|20819487|ref|XP_145357.1| Mus musculus similar to LYRIC 72 27
2061 gi415738 Euglena gracilis PSII D1-polypeptide 75 27
2061 gi11491 Euglena gracilis 32 kd protein 75 27
2061 gi11488 Euglena gracilis 32-Kda thylakoid membrane protein 75 27
2062 gi21360549 Arabidopsis thaliana AT3g01480/F4P13_3 79 29
2062 gi3337366 Arabidopsis thaliana nodulin-like protein 68 36
2063 17959778 Homo sapiens PRO1546 121 42
2063 AAG02639 Homo sapiens GEST Human secreted protein, SEQ ID NO: 6720. 119 53
2063 AAG02753 Homo sapiens GEST Human secreted protein, SEQ ID NO: 6834. 110 45
2064 gi15077406 Antheraea yamamai fibroin 109 30
2064 AAB82806 Homo sapiens BOST- Human low density lipoprotein binding protein 2 (LBP-2). 92 24
2064 AAO01059 Homo sapiens HYSE- Human polypeptide SEQ ID NO 14951. 90 30
2065 gi200964 Mus musculus serine 2 ultra high sulfur protein 80 30
2065 gi200962 Mus musculus serine 1 ultra high sulfur protein 80 30
2065 AAM99918 Homo sapiens HUMA- Human polypeptide SEQ ID NO 34. 75 28
2066 gi544724 Cavia cholecystokinin A receptor; CCK-A receptor 69 29
2066 gi2541920 Rattus norvegicus cholecystokinin type-A receptor 69 29
2066 12114152 Mus musculus cholecystokinin type-A receptor 69 29
2067 gi2828586 Pongo pygmaeus BRCA1 73 22
2068 AAM40813 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5744. 75 29
2068 AAM39027 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2172. 75 29
2068 AAY25768 Homo sapiens HUMA- Human secreted protein encoded from gene 58. 75 29
2070 11334150 Mus musculus unidentified reading frame (first ATG at pos. 210) 169 28
Table 2
SEQ ID NO: Accession No. Species Description Score Identity
2070 gi557822 Saccharomyces cerevisiae mal5, sta1, len: 1367, CAI: 0.3, AMYH_YEAST P08640 GLUCOAMYLASE S1 (EC 3.2.1.3) 133 20
2070 gi1304387 Saccharomyces cerevisiae var. diastaticus glucoamylase 133 20
2071 gi17983056 Brucella melitensis BETA-HEXOSAMINIDASE A 88 29
2071 gi1573917 Haemophilus influenzae Rd multidrug resistance protein A (emrA) 81 33
2071 gi17982813 Brucella melitensis NITROGEN REGULATION PROTEIN NTRB 80 26
2073 gi|17532255|ref|NP_496431.1| Caenorhabditis elegans ankyrin and proline rich domains 67 29
2074 gi19919730 Homo sapiens BTEBS 704 97
2074 gi13195441 Homo sapiens BTE-binding protein 4 478 64
2074 gi14549656 Mus musculus dopamine receptor regulating factor 452 76
2076 AAE17482 Homo sapiens ZYMO Human leucine-rich repeat-7 (ZLRR7) protein. 1326 100
2076 AAU83190 Homo sapiens ZYMO Novel secreted protein Z887300G2P. 1326 100
2076 ABB11242 Homo sapiens HYSE- Human SLIT-2 homologue, SEQ ID NO:1612. 568 99
2077 gi18893729 Pyrococcus furiosus DSM 3638 protease iv 74 34
2077 AAB94745 Homo sapiens HELI- Human protein sequence SEQ ID NO:15792. 71 34
2077 gi16413096 Listeria innocua lin0656 68 35
2078 gi60675 Beet ringspot virus polyprotein 75 37
2078 gi|14743288|ref|XP_047191.1| Homo sapiens similar to Alu subfamily J sequence contamination warning entry 92 58
2078 gi|20260801|ref|NP_620113.1| Beet ringspot virus polyprotein 75 37
2079 gi3834629 Mus musculus diaphanous-related formin; p134 mDia2 208 67
2079 AAG74400 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:5164. 71 36
2079 13171906 Homo sapiens DIA-156 protein 71 36
2080 gi17298315 Homo sapiens candidate tumor suppressor protein 125 100
2080 gi7861733 Homo sapiens low density lipoprotein receptor related protein-deleted in tumor 125 100
2080 gi8926243 Mus musculus low density lipoprotein receptor related protein LRP1B/LRP-DIT 90 63
2081 gi4574224 Fundulus heteroclitus multidrug resistance transporter homolog 343 55
2081 gi16304396 Pseudopleuronectes americanus multidrug resistance transporter-like protein 340 52
2081 gi3355757 Gallus gallus ABC transporter protein 328 53
Table 2
SEQ ID NO: Accession No. Species Description Score Identity
2082 gi7532975 bacteriophage phi-8 P10 67 27
Table 3
SEQ ID NO: Database entry ID Description *Results
1059 BL00349 CTF/NF-I proteins. BL00349H 15.70 9.710e-09 8-45
1061 DM00215PROLINE-RICH PROTEIN DM00215 19.43 6.143e-10
3. 29-61
DM00215 19.43 8.322e-09
40-72
1062 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 6.092e-12
II 80-99
ORF2.
1063 PR00944COPPER ION BINDING PROTEINPR00944E 9.18 7.132e-09
33-46
SIGNATURE
1076 PD00078REPEAT PROTEIN ANK PD00078B 13.14 9.217e-09
23-35
NUCLEAR ANKYR.
1089 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 8.754e-10
16-25
SIGNATURE
1089 PR00456RIBOSOMAL PROTEIN P2 PR00456E 3.06 9.658e-09
16-30
SIGNATURE
1089 PR00341PRION PROTEIN SIGNATUREPR00341E 3.32 9.898e-09
24-43
1099 PR00886 HIGH MOBILITY GROUP (HMG1/HMG2) PROTEIN SIGNATURE PR00886C 11.84 1.141e-12 28-46
1107 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.077e-09
PI 51-65
SIGNATURE
1118 BL00472Small cytokines BL00472A 7.45 5.655e-09
1-12
(intercrine/chemokine)
C-C
subfamily signatur.
1118 PR00655AUXIN BINDING PROTEIN PR00655E 8.06 9.000e-09
88-103
SIGNATURE
1119 BL00970Nuclear transition proteinBL00970C 14.80 8.183e-12
2 proteins. 99-136
1119 BL00826 MARCKS family proteins. BL00826B 12.51 4.279e-09 92-143
1119 BL00348p53 tumor antigen proteins.BL00348F 23.19 5.881e-10
93-135
BL00348F 23.19 6.857e-09
91-133
1119 PD01457RIBOSOMAL PROTEIN 40S PD01457A 16.51 8.216e-09
ZINC- 73-117
FINGER METAL.
1119 BL00752XPA protein. BL00752B 19.17 7.866e-09
100-143
BL00752B 19.17 8.979e-09
63-106
1119 DM01269303 kw ACTIVATING RAN DM01269A 23.35 9.446e-09
109-136
GTPASE ISOZYME.
1124 DM01813EGG-LAYING HORMONE. DM01813A 15.31 5.215e-09
15-42
1127 BL00452Guanylate cyclases proteins.BL00452A 17.52 1.170e-09
6-27
1131 BL00113 Adenylate kinase proteins. BL00113B 20.49 9.897e-09 157-200
1162 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.000e-35
ZINC- 24-62
FINGER METAL-BINDING
NU.
1163 BL00407Connexins proteins. BL00407B 14.23 9.775e-30
21-51
BL00407C 14.61 2.500e-24
52-79
1163 PR00206CONNEXIN SIGNATURE PR00206B 13.75 1.957e-24
33-55
PR00206A 11.35 6.559e-23
2-26
PR00206C 15.16 7.469e-20
58-78
1171 PD01066PROTEIN ZINC FINGER PD01066 19.43 8.500e-28
ZINC- 35-73
FINGER METAL-BINDING
NU.
1177 DM018031 HERPESVIRUS DM01803C 7.00 7.240e-09
46-55
GLYCOPROTEIN H.
1190 PR00774GUANYLIN PRECURSOR PR00774A 6.49 8.579e-10
69-81
SIGNATURE
1195 PD02059 CORE POLYPROTEIN PROTEIN GAG CONTAINS: P. PD02059C 21.58 8.031e-09 100-140
1197 BL00472Small cytokines BL00472A 7.45 8.000e-14
1-12
(intercrine/chemokine)
C-C
subfamily signatur.
1213 PR00437 SMALL CXC CYTOKINE FAMILY SIGNATURE PR00437C 14.85 1.310e-16 33-51
Table 3
SEQ ID NO: Database entry ID Description *Results
1213 BL00471Small cytokines BL00471 23.92 7.960e-10
6-53
(intercrine/chemokine)
C-x-C
subfamily signat.
1216 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 5.208e-09
183-192
SIGNATURE
1222 PF00852Fucosyl transferase. PF00852F 15.97 1.409e-15
195-231
1224 BL00299 Ubiquitin domain proteins. BL00299 28.84 6.301e-11 47-98
1230 PR00540MUSCARINIC M3 RECEPTOR PR00540A 10.24 7.174e-09
134-153
SIGNATURE
1240 BL00290 Immunoglobulins and major histocompatibility complex proteins. BL00290A 20.89 7.480e-10 160-182
BL00290B 13.17 2.875e-09 226-243
1258 PR00792 PEPSIN (A1) ASPARTIC PROTEASE FAMILY SIGNATURE PR00792A 11.54 5.500e-18 80-100
1258 BL00141 Eukaryotic and viral aspartyl proteases proteins. BL00141A 12.10 4.789e-15 87-102
BL00141B 12.14 2.929e-10 228-239
1300 BL00616 Histidine acid phosphatases phosphohistidine proteins. BL00616A 11.86 1.000e-09 136-143
1301 DM014176 kw INDUCING XPMC2 DM01417C 12.93 9.325e-12
361-372
MUSHROOM SPAC22G7.04. DM01417D 11.08 9.820e-12
400-415
1302 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 6.067e-11
324-338
SIGNATURE
1311 BL00926 Lysyl oxidase copper-binding region proteins. BL00926B 13.84 7.453e-09 84-121
1320 PR00830 ENDOPEPTIDASE LA (LON) SERINE PROTEASE (S16) SIGNATURE PR00830A 8.41 3.712e-09 29-48
1325 BL00048Protamine P1 proteins. BL00048 6.39 4.671e-10
58-84
BL00048 6.39 4.908e-10
60-86
BL00048 6.39 2.913e-09
59-85
BL00048 6.39 5.950e-09
57-83
1345 PF00424REV protein (anti-repressionPF00424A 14.34 2.436e-09
184-215
transactivator protein).
1345 BL00048Protamine P1 proteins. BL00048 6.39 4.553e-10
178-204
BL00048 6.39 6.513e-09
179-205
1353 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 2.857e-15
II 82-101
ORF2.
1363 PF00850Histone deacetylase PF00850B 10.13 5.154e-14
family. 95-109
PF00850C 14.55 9.063e-11
132-148
1389 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.423e-09
PI 50-64
SIGNATURE
1389 PD00306 PROTEIN GLYCOPROTEIN PRECURSOR RE. PD00306B 5.57 7.000e-09 59-69
1396 BL00427 Disintegrins proteins. BL00427 13.93 7.698e-17 260-314
1396 PR00289 DISINTEGRIN SIGNATURE PR00289A 13.62 5.667e-14 274-293
1416 BL00419 Photosystem I psaA and psaB proteins. BL00419B 22.23 9.489e-09 18-51
1434 PF00075RNase H. PF00075I 16.21 7.375e-11
167-173
1440 BL00598Chromo domain proteins.BL00598 14.45 1.500e-15
112-133
1440 PR00504 CHROMODOMAIN SIGNATURE PR00504B 9.12 5.200e-13 106-120
PR00504C 11.19 6.510e-09 121-133
1450 PF00622 Domain in SPla and the RYanodine Receptor. PF00622B 21.00 2.227e-09 93-114
1451 PD02935FATTY ACID PD02935C 16.62 4.375e-16
59-86
OXIDOREDUCTASE BIOSYNT.
1467 BL00479 Phorbol esters / diacylglycerol binding domain proteins. BL00479A 19.86 3.000e-11 130-152
BL00479B 12.57 3.340e-10 156-171
Table 3
SEQ ID NO: Database entry ID Description *Results
1468 PF00992 Troponin. PF00992A 16.67 5.563e-10 139-173
1468 BL00795Involucrin proteins. BL00795C 17.06 3.600e-09
193-237
1468 PR00042FOS TRANSFORMING PROTEINPR00042D 8.97 7.554e-09
141-162
SIGNATURE
1474 BL00107Protein kinases ATP-bindingBL00107A 18.39 9.308e-12
region 62-92
proteins.
1474 PR00109TYROSINE KINASE CATALYTICPR00109B 12.27 1.563e-09
62-80
DOMAIN SIGNATURE
1474 BL00239Receptor tyrosine kinaseBL00239C 18.75 4.205e-09
class II 49-71
proteins.
1475 BL00456 Sodium:solute symporter family proteins. BL00456C 24.55 4.886e-28 15-69
1480 BL00983 Ly-6 / u-PAR domain proteins. BL00983C 12.69 1.346e-09 36-51
1482 BL00979 G-protein coupled receptors family 3 proteins. BL00979A 19.66 9.633e-12 74-121
1502 PD02561 DETHIOBIOTIN SYNTHETASE SYNTHASE. PD02561B 12.71 9.308e-09 176-182
1506 BL00297Heat shock hsp70 proteinsBL00297H 15.46 9.625e-23
family 302-355
proteins. BL00297D 11.95 6.063e-21
166-205
BL00297E 18.56 6.077e-21
226-269
BL00297C 9.51 9.667e-15
105-156
1506 PR0030170 KD HEAT SHOCK PROTEINPR00301I 12.76 3.208e-11
320-336
SIGNATURE
1513 PR00130DNASE I SIGNATURE PR00130E 14.66 5.046e-09
237-266
1515 DM01242 3 THREONINE--TRNA LIGASE. DM01242A 20.32 5.286e-20 163-206
1517 BL00983 Ly-6 / u-PAR domain proteins. BL00983B 8.19 5.935e-10 40-49
1520 BL00415 Synapsins proteins. BL00415P 2.37 3.914e-10 138-173
1520 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 3.746e-09
124-138
SIGNATURE PR00049D 0.00 1.000e-08
123-137
1530 PF00075RNase H. PF00075F 12.87 5.500e-10
127-137
1537 PR00463E-CLASS P450 GROUP I PR00463F 17.63 5.219e-13
288-306
SIGNATURE PR00463A 11.40 8.714e-12
52-71
PR00463B 17.50 5.041e-10
76-97
1537 PR00385P450 SUPERFAMILY PR00385C 16.94 6.318e-09
289-300
SIGNATURE
1538 PR00709AVIDIN SIGNATURE PR00709A 4.60 5.585e-09
19-37
1553 DM01354kw TRANSCRIPTASE REVERSEDM01354Y 10.69 6.423e-16
II 113-152
ORF2.
1558 PD01066PROTEIN ZINC FINGER PD01066 19.43 6.400e-25
ZINC- 70-108
FINGER METAL-BINDING
NU.
1564 PF00589Phage integrase family.PF00589B 16.17 1.621e-11
158-171
PF00589C 14.62 9.609e-10
183-194
1566 BL00908Mandelate racemase / BL00908B 37.71 6.455e-13
muconate 191-245
lactonizing enzyme family
signa.
1567 PR00702ACRIFLAVIN RESISTANCE PR00702A 14.92 2.421e-25
8-32
PROTEIN FAMILY SIGNATUREPR00702B 12.77 9.690e-18
36-54
1570 BL01047Heavy-metal-associated BL01047A 13.50 5.125e-17
domain 75-97
proteins.
1575 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 9.429e-15
II 80-99
ORF2.
1606 PF00642Zinc finger C-x8-C-x5-C-x3-HPF00642 11.59 2.575e-11
type 197-207
(and similar).
1610 DM01354 kw TRANSCRIPTASE REVERSE II ORF2. DM01354I 15.55 7.702e-34 348-388
DM01354G 11.57 3.625e-32 277-307
DM01354H 18.00 2.528e-23 308-347
DM01354F 14.56 4.088e-11 241-276
Table 3
SEQ ID NO: Database entry ID Description *Results
1616 PD02929 ADHESION GLYCOPROTEIN PRECURSORI. PD02929A 28.27 2.263e-25 32-85
1627 PR00121 SODIUM/POTASSIUM-TRANSPORTING ATPASE SIGNATURE PR00121A 6.71 1.000e-08 15-29
1630 PR00824 HEPATIC LIPASE SIGNATUREPR00824A 7.81 7.214e-22
6-24
1640 BL00359 Ribosomal protein L11 BL00359C 22.18 1.155e-11
proteins. 93-126
1641 PR00080 ALCOHOL DEHYDROGENASE PR00080A 9.32 8.839e-10
134-145
SUPERFAMILY SIGNATURE
1641 PR00081 GLUCOSE/RIBITOL PR00081A 10.53 2.000e-12
45-62
DEHYDROGENASE FAMILY PR00081E 17.54 1.783e-10
238-255
SIGNATURE PR00081B 10.38 2.227e-09
134-145
1641 BL00061 Short-chain dehydrogenases/reductases family proteins. BL00061A 9.41 9.053e-10 134-144
BL00061B 25.79 6.860e-09 197-234
1666 BL01257 Ribosomal protein L10e proteins. BL01257D 18.80 2.973e-15 59-98
1667 BL01241 Link domain proteins. BL01241 35.81 8.579e-37 180-232
BL01241 35.81 7.835e-14 289-341
1667 BL00086 Cytochrome P450 cysteine heme-iron ligand proteins. BL00086 20.87 3.377e-09 283-314
1668 PR00671 INHIBIN BETA B CHAIN PR00671A 8.36 8.088e-09
4-22
SIGNATURE
1672 BL00674 AAA-protein family BL00674E 15.24 5.680e-15
proteins. 31-50
1682 PF00075 RNase H. PF00075A 14.44 4.400e-13
73-89
PF00075C 11.58 8.442e-09
152-163
1689 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 6.471e-27 268-306
1689 PR00788 NITROPHORIN SIGNATURE PR00788A 9.79 6.108e-09 3-15
1692 BL00299 Ubiquitin domain proteins.BL00299 28.84 4.759e-10
32-83
1697 PR00423 CELL DIVISION PROTEIN PR00423E 7.36 4.038e-09
FTSZ 20-41
SIGNATURE
1706 BL00795 Involucrin proteins. BL00795C 17.06 5.395e-10
185-229
1709 BL00514 Fibrinogen beta and BL00514C 17.41 3.618e-25
gamma chains 68-104
C-terminal domain proteins.BL00514H 14.95 6.745e-16
230-254
BL00514G 15.98 6.566e-14
198-227
BL00514E 14.28 8.286e-14
128-144
BL00514D 15.35 2.915e-12
109-121
1714 PF00878 Cation-independent mannose-6-phosphate receptor repeat proteins. PF00878T 17.51 3.818e-09 41-67
1715 PF01140 Matrix protein (MA), p15. PF01140D 15.54 4.872e-09 123-157
1715 PF00992 Troponin. PF00992A 16.67 6.451e-10
109-143
PF00992A 16.67 3.724e-09
98-132
PF00992A 16.67 6.684e-09
96-130
1718 PD02474 SYNTHASE SMALL SUBUNITPD02474B 21.08 7.940e-10
92-130
ACETOLACT.
1725 BL00412 Neuromodulin (GAP-43) BL00412B 10.60 1.000e-10
proteins. 46-82
1725 PR00215 NEUROMODULIN SIGNATUREPR00215C 13.98 6.116e-10
54-74
1725 DM01688 2 POLY-IG RECEPTOR. DM01688G 16.45 3.160e-09
119-150
DM01688I 14.97 6.885e-09
107-154
1725 PD02870 RECEPTOR INTERLEUKIN-1PD02870B 18.83 8.564e-09
303-335
PRECURSOR.
1727 BL00107 Protein kinases ATP-bindingBL00107A 18.39 7.750e-21
region 185-215
proteins.
1727 PR00109 TYROSINE KINASE CATALYTICPR00109B 12.27 7.176e-12
185-203
DOMAIN SIGNATURE
Table 3
SEQ ID NO: Database entry ID Description *Results
1727 BL00239 Receptor tyrosine kinase class II proteins. BL00239B 25.15 4.387e-09 119-166
1728 BL00415 Synapsins proteins. BL00415Q 2.23 8.115e-09
52-87
1734 PD01270 RECEPTOR FC PD01270B 22.18 5.567e-18
75-111
IMMUNOGLOBULIN AFFIN. PD01270C 19.54 1.167e-17
118-146
PD01270A 17.22 4.960e-14
21-60
PD01270D 24.66 4.284e-09
152-187
1736 PD02346 PHOTOSYSTEM II PROTEINPD02346A 9.24 8.851e-09
6-17
PRECURSOR PHOTOSYNTHESIS.
1741 BL00415 Synapsins proteins. BL00415Q 2.23 6.777e-09 317-352
1744 BL00479 Phorbol esters / diacylglycerolBL00479B 12.57 1.000e-08
33-48
binding domain proteins.
1750 PR00763 COAGULIN SIGNATURE PR00763B 8.39 6.457e-09
41-60
1754 PR00276 INSULIN A CHAIN SIGNATUREPR00276A 11.84 7.840e-09
46-55
1755 PR00042 FOS TRANSFORMING PROTEINPR00042D 8.97 2.565e-09
164-185
SIGNATURE
1755 PF00922 Vesiculovirus phosphoprotein. PF00922A 19.17 5.759e-09 99-132
1778 PR00245 OLFACTORY RECEPTOR PR00245A 18.03 9.836e-14
59-80
SIGNATURE PR00245C 7.84 1.540e-13
237-252
PR00245B 10.38 2.125e-13
176-190
1778 BL00237 G-protein coupled receptorsBL00237A 27.68 1.474e-12
proteins. 90-129
1778 PR00534 MELANOCORTIN RECEPTOR PR00534A 11.49 4.729e-09
51-63
FAMILY SIGNATURE
1778 PR00237 RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE PR00237A 11.48 3.613e-09 26-50
PR00237C 15.69 7.525e-09 104-126
1787 PR00007 COMPLEMENT C1Q DOMAIN PR00007B 14.16 5.114e-15
146-165
SIGNATURE PR00007A 19.33 7.052e-10
119-145
1787 PR00524 CHOLECYSTOKININ TYPE PR00524F 5.36 4.351e-09
A 70-83
RECEPTOR SIGNATURE
1787 DM00250 kw ANNEXIN ANTIGEN DM00250B 13.84 6.595e-09
82-105
PROLINE TUMOR.
1787 BL00415 Synapsins proteins. BL00415N 4.29 7.372e-09 62-105
1787 BL01113 C1q domain proteins. BL01113B 18.26 3.786e-23 125-160
BL01113A 17.99 7.968e-15
73-99
BL01113A 17.99 5.091e-14
70-96
BL01113A 17.99 5.295e-11
64-90
BL01113A 17.99 8.568e-11
79-105
BL01113A 17.99 8.977e-11
67-93
BL01113A 17.99 4.635e-09
82-108
BL01113A 17.99 6.192e-09
76-102
BL01113A 17.99 7.750e-09
61-87
1787 BL00420 Speract receptor repeat proteins domain proteins. BL00420A 20.42 8.691e-11 73-101
BL00420A 20.42 9.673e-11 70-98
BL00420A 20.42 2.180e-10
55-83
BL00420A 20.42 8.062e-09
52-80
1789 DM01930 2 kw FINGER SMCX SMCY DM01930E 15.41 2.964e-33
45-89
YDR096W.
1795 DM01688 2 POLY-IG RECEPTOR. DM01688I 14.97 7.480e-10
107-154
DM01688J 14.69 4.455e-09
60-96
1796 PF00075 RNase H. PF00075J 15.78 4.115e-13 115-132
1802 PD00066 PROTEIN ZINC-FINGER PD00066 13.92 4.130e-11
METAL- 86-98
BINDI.
1802 BL00028 Zinc finger, C2H2 type,BL00028 16.07 1.600e-10
domain 110-126
proteins. BL00028 16.07 6.100e-10
70-86
1802 PR00048 C2H2-TYPE ZINC FINGER PR00048B 6.02 9.438e-10
83-92
SIGNATURE
Table 3
SEQ ID NO: Database entry ID Description *Results
1812 PD00078 REPEAT PROTEIN ANK NUCLEAR ANKYR. PD00078B 13.14 4.130e-09 157-169
1824 PF00628PHD-finger. PF00628 15.84 5.500e-13
78-92
1833 PF00075RNase H. PF00075B 12.56 4.732e-10
156-166
1833 PR00939C2HC-TYPE ZINC-FINGER PR00939A 8.95 3.045e-09
137-146
SIGNATURE
1842 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.192e-09
PI 244-258
SIGNATURE
1844 BL00972Ubiquitin carboxyl-terminalBL00972D 22.55 3.348e-11
168-192
hydrolases family 2
proteins.
1857 PF00424 REV protein (anti-repression transactivator protein). PF00424A 14.34 8.085e-09 71-102
1860 PR00221CAULIMOVIRUS COAT PROTEINPR00221H 12.82 2.410e-09
184-197
SIGNATURE
1864 BL01282 BIR repeat proteins. BL01282B 30.49 1.136e-10 214-252
1866 BL00155Cutinase, serine proteins.BL00155D 26.87 5.337e-09
19-67
1895 PF00075RNase H. PF00075F 12.87 7.353e-10
93-103
1911 BL00983 Ly-6 / u-PAR domain proteins. BL00983C 12.69 6.365e-09 101-116
1911 BL00272 Snake toxins proteins. BL00272C 8.27 1.000e-08 105-116
1925 PR00308TYPE I ANTIFREEZE PROTEINPR00308A 5.90 6.795e-11
64-78
SIGNATURE PR00308C 3.83 2.385e-10
67-76
1925 PR00456RIBOSOMAL PROTEIN P2 PR00456E 3.06 9.438e-10
57-71
SIGNATURE
1925 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.654e-09
PI 59-73
SIGNATURE
1930 DM00179 kw KINASE ALPHA ADHESION T-CELL. DM00179 13.97 5.263e-10 107-116
1935 PF00075RNase H. PF00075J 15.78 2.309e-12
81-98
1940 PF00075RNase H. PF00075F 12.87 3.864e-09
74-84
1952 PR00019LEUCINE-RICH REPEAT PR00019B 11.36 3.250e-10
184-197
SIGNATURE PR00019A 11.19 5.667e-09
187-200
1954 BL00546Matrixins cysteine switch.BL00546A 19.62 8.105e-30
77-106
1954 BL00023 Type II fibronectin collagen-binding domain proteins. BL00023 24.31 4.682e-35 340-376
BL00023 24.31 2.969e-28 282-318
BL00023 24.31 9.526e-24 224-260
1954 PR00138MATRIXIN SIGNATURE PR00138B 15.82 5.500e-18
144-159
PR00138A 15.14 8.773e-16
97-110
1954 BL00024Hemopexin domain proteins.BL00024B 21.53 9.591e-33
118-151
BL00024A 11.49 2.800e-13
97-107
BL00024C 22.98 7.796e-11
164-212
1954 PR00013FIBRONECTIN TYPE II PR00013C 12.29 1.000e-20
REPEAT 372-387
SIGNATURE PR00013C 12.29 3.571e-15
314-329
PR00013C 12.29 7.800e-14
256-271
PR00013A 12.26 5.500e-13
344-353
PR00013B 14.75 1.237e-11
355-367
PR00013B 14.75 4.000e-09
297-309
PR00013A 12.26 5.333e-09
286-295
PR00013A 12.26 7.833e-09
228-237
1957 BL01182Glycosyl hydrolases BL01182A 21.39 3.357e-34
family 35 77-119
proteins.
1957 PR00742GLYCOSYL HYDROLASE PR00742B 15.52 2.653e-14
78-96
FAMILY 35 SIGNATURE PR00742A 13.75 6.914e-10
57-74
1958 PR00449TRANSFORMING PROTEIN PR00449A 13.20 8.200e-15
P21 214-235
RAS SIGNATURE
1964 PR00727 BACTERIAL LEADER PEPTIDASE 1 (S26) FAMILY SIGNATURE PR00727A 12.93 7.000e-09 9-25
Table 3
SEQ ID NO: Database entry ID Description *Results
1965 PF00075RNase H. PF00075D 10.71 7.188e-09
71-81
1966 PF00075RNase H. PF00075C 11.58 9.786e-11
110-121
PF00075B 12.56 1.878e-10
78-88
1968 DM00892 3 RETROVIRAL PROTEINASE. DM00892C 23.55 4.082e-11 314-347
1970 PF00075RNase H. PF00075J 15.78 8.571e-10
335-352
1973 PF00589 Phage integrase family. PF00589B 16.17 1.450e-14 101-114
1974 BL00675 Sigma-54 interaction domain proteins ATP-binding region A proteins. BL00675B 24.07 1.000e-24 118-172
BL00675C 13.51 6.400e-24 183-210
BL00675D 12.03 1.750e-09 245-254
1987 PR00153 CYCLOPHILIN PEPTIDYL-PROLYL CIS-TRANS ISOMERASE SIGNATURE PR00153B 11.57 1.500e-17 52-64
PR00153A 12.98 4.255e-10 23-38
1987 BL00170Cyclophilin-type peptidyl-prolylBL00170B 20.97 6.250e-33
cis- 47-86
trans isomerase signatur.BL00170A 17.08 2.309e-09
17-43
1998 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.750e-37
ZINC- 27-65
FINGER METAL-BINDING PD01066 19.43 8.863e-11
NU. 68-106
1999 PF00992 Troponin. PF00992A 16.67 3.487e-09 108-142
1999 BL00224Clathrin light chain BL00224B 16.94 7.055e-09
proteins. 96-148
1999 BL00422Granins proteins. BL00422C 16.18 8.059e-09
117-144
2001 BL00019 Actinin-type actin-binding domain proteins. BL00019B 13.34 7.158e-14 261-283
2001 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 3.500e-13
II 345-364
ORF2.
2008 PD01719PRECURSOR GLYCOPROTEIN PD01719A 12.89 3.483e-16
63-90
SIGNAL RE.
2011 BL00282Kazal serine protease BL00282 16.88 6.577e-10
inhibitors 127-149
family proteins.
2011 BL00222Insulin-like growth BL00222B 11.09 6.940e-10
factor binding 74-89
proteins.
2011 BL00621Tissue factor proteins.BL00621A 8.69 6.473e-09
5-22
2012 PD02563PROTEIN NONSTRUCTURAL PD02563C 13.51 9.634e-10
C 74-128
VP18.
2013 PR00124ATP SYNTHASE C SUBUNIT PR00124A 8.81 5.655e-09
58-77
SIGNATURE
2013 PR00783MAJOR INTRINSIC PROTEINPR00783C 13.54 8.981e-09
48-67
FAMILY SIGNATURE
2034 PF00075RNase H. PF00075F 12.87 6.523e-09
183-193
2037 BL00326 Tropomyosins proteins. BL00326D 8.76 9.327e-09 115-155
2048 PR00671 INHIBIN BETA B CHAIN SIGNATURE PR00671B 4.29 8.767e-10 138-157
2052 PD02455 ELEMENT TRANSPOSABLE INSERTION PROTEIN TRANSPOSITION DNA. PD02455C 29.23 5.230e-09 225-276
2058 PF00075 RNase H. PF00075J 15.78 9.000e-10 81-98
2074 PD00066 PROTEIN ZINC-FINGER METAL-BINDI. PD00066 13.92 4.000e-13 62-74
2074 PR00048C2H2-TYPE ZINC FINGER PR00048B 6.02 4.462e-11
59-68
SIGNATURE PR00048B 6.02 1.000e-10
89-98
PR00048A 10.52 9.609e-10
101-114
2074 BL00028 Zinc finger, C2H2 type, domain proteins. BL00028 16.07 9.100e-13 104-120
BL00028 16.07 1.000e-08 46-62
2076 PR00019LEUCINE-RICH REPEAT PR00019A 11.19 1.900e-11
106-119
SIGNATURE
Table 3
* Results include in order: Accession No., subtype, e-value, and amino acid
position of the signature in the
corresponding polypeptide
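As a reading aid for the Results column of Table 3, the following is a minimal Python sketch that splits one entry such as "BL00349H 15.70 9.710e-09 8-45" into its parts. It is an illustration only and not part of the described method. The function name `parse_result` and the type `SignatureHit` are hypothetical. The footnote names four items, while each entry carries a further numeric field between the subtype and the e-value; calling that field `score` is an assumption.

```python
import re
from typing import NamedTuple, Optional


class SignatureHit(NamedTuple):
    accession: str          # database accession, e.g. "BL00349"
    subtype: Optional[str]  # trailing letter, e.g. "H"; None when absent
    score: float            # unnamed numeric field (the name is an assumption)
    e_value: float          # e-value reported for the signature
    start: int              # first amino acid position of the signature
    end: int                # last amino acid position of the signature


# Two letters plus five digits, an optional subtype letter, then the
# numeric field, the e-value, and a "start-end" position range.
_RESULT = re.compile(
    r"^(?P<acc>[A-Z]{2}\d{5})(?P<sub>[A-Z])?\s+"
    r"(?P<score>\d+(?:\.\d+)?)\s+"
    r"(?P<evalue>\d+(?:\.\d+)?e[-+]?\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_result(entry: str) -> SignatureHit:
    """Parse one Results entry of Table 3 into its component fields."""
    match = _RESULT.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognised Results entry: {entry!r}")
    return SignatureHit(
        accession=match.group("acc"),
        subtype=match.group("sub"),
        score=float(match.group("score")),
        e_value=float(match.group("evalue")),
        start=int(match.group("start")),
        end=int(match.group("end")),
    )


hit = parse_result("BL00349H 15.70 9.710e-09 8-45")
print(hit.accession, hit.subtype, hit.e_value, hit.start, hit.end)
```

Entries without a subtype letter, such as "PD01066 19.43 7.000e-35 24-62", are returned with `subtype` set to `None`.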
Table 4
SEQ ID NO: Pfam Model Description E-value Score No. of Pfam Domains Position of the Domain
1050 FAA_hydrolase Fumarylacetoacetate (FAA) hydrolase fam 0.64 -89.1 1 22-143
1066 rubredoxin Rubredoxin 7.2 -11.1 1 4-37
1076 ank Ankyrin repeat 0.01 22.5 1 25-57
1076 sodfe_C Iron/manganese superoxide dismutases, C-term 3.9 -67.9 1 38-124
1076 DUF232 Putative transcriptional regulator 8.1 -29.1 1 134-254
1099 HMG_box HMG (high mobility group) box 8 -22.4 1 17-61
1109 UPAR_LY6 u-PAR/Ly-6 domain 0.21 -6.2 1 34-112
1110 ldl_recept_a Low-density lipoprotein receptor domain 8.8e-07 36.0 1 196-240
1110 CUB CUB domain 0.38 -27.8 1 52-161
1118 rvt Reverse transcriptase 0.95 -46.1 1 38-207
1125 adenylatekinase Adenylate kinase 0.00037 -77.6 1 13-103
1162 KRAB KRAB box 1.1e-23 92.1 1 22-62
1163 connexin Connexin 3.1e-23 90.6 1 1-130
1171 KRAB KRAB box 6.6e-22 86.2 1 33-73
1193 MHC_I Class I Histocompatibility antigen, domains 2e-06 1.1 1 29-205
1209 DOMON DOMON domain 1.9e-12 54.8 1 102-215
1213 IL8 Small cytokines (intecrine/chemokine), inter 0.59 -7.8 1 18-55
1218 cys_rich_FGFR Cysteine rich repeat 4.4 -11.0 1 28-76
1222 Glyco_transf Glycosyltransferase family 10 6.6e-06 -54.1 1 1-322
1240 ig Immunoglobulin domain 1.6e-06 35.1 2 41-124:156-230
1258 asp Eukaryotic aspartyl protease 8e-06 -110.8 1 19-241
1280 DOMON DOMON domain 8.9 -16.6 1 35-117
1288 PDZ PDZ domain (Also known as DHR or GLGF) 1.1 0.4 1 7-73
1301 Exonuclease Exonuclease 3.4e-33 123.7 1 322-479
1311 Gemini_mov Geminivirus putative movement protein 5.7 -40.5 1 15-79
1341 fn3 Fibronectin type III domain 6.6e-36 132.7 2 109-200:212-301
1345 Collagen Collagen triple helix repeat (20 copies) 7.3 -65.8 1 185-243
1365 Amidase Amidase 0.017 -178.9 1 68-276
1375 Galactosyl_T Galactosyltransferase 7.1e-44 159.2 1 113-309
1375 Glyco_transf_25 Glycosyltransferase family 25 3 -77.1 1 146-293
1381 GRAM GRAM domain 6.6e-14 59.6 1 65-116
1396 Pep_M12B_propep Reprolysin family propeptide 1.4e-27 105.1 1 75-191
1396 disintegrin Disintegrin 2.6e-10 47.7 1 243-318
1398 SK_channel Calcium-activated SK potassium channel 1.8e-06 34.9 1 1-57
1413 ig Immunoglobulin domain 5.4 9.1 1 29-88
1416 dUTPase dUTPase 0.00044 9.6 1 111-237
1420 Folate_rec Folate receptor family 1.7 -111.2 1 14-175
1434 lectin_c Lectin C-type domain 1.5e-05 28.0 1 233-319
1440 chromo 'chromo' (CHRromatin Organization Modifier) 4.6e-11 50.2 1 92-133
1449 PMSR Peptide methionine sulfoxide reductase 0.0089 -65.8 1 4-79
1450 SPRY SPRY domain 9e-26 99.0 1 109-240
Table 4
SEQ ID NO: Pfam Model Description E-value Score No. of Pfam Domains Position of the Domain
1451 MaoC_dehydratas MaoC like domain 2.1e-15 64.6 1 31-152
1463 NTP_transf_2 Nucleotidyltransferase domain 2.6e-12 54.3 1 121-234
1467 DAG_PE-bind Phorbol esters/diacylglycerol binding dom 8.7e-05 27.4 1 130-180
1467 DC1 DC1 domain 0.66 11.2 1 141-172
1470 jmjC jmjC domain 0.46 -18.2 1 166-262
1474 pkinase Protein kinase domain 0.0019 -85.7 1 2-187
1475 SSF Sodium:solute symporter family 0.13 -177.1 1 1-311
1478 dUTPase dUTPase 7.6 -37.5 1 2-98
1479 fn3 Fibronectin type III domain 1.1e-19 78.9 1 14-100
1485 rnaseH RNase H 0.36 -28.0 1 59-175
1488 NTR NTR/C345C module 0.044 -6.1 1 293-398
1506 HSP70 Hsp70 protein 1.6e-13 38.3 1 61-424
1517 UPAR_LY6 u-PAR/Ly-6 domain 0.33 -8.2 1 44-106
1530 rnaseH RNase H 0.011 -11.7 1 64-155
1537 p450 Cytochrome P450 2.1 -176.6 1 31-316
1537 DNA_ligase_OB NAD-dependent DNA ligase OB-fold domain 9.2 -42.9 1 200-256
1558 KRAB KRAB box 1.8e-18 74.8 1 68-108
1564 Phage_integrase Phage integrase family 1.2e-09 45.5 1 39-204
1566 MR_MLE Mandelate racemase / muconate lactonizing en 0.00079 -24.5 1 153-352
1570 HMA Heavy-metal-associated domain 6.6e-13 56.3 1 71-131
1580 ig Immunoglobulin domain 0.99 15.2 1 23-131
1601 WD40 WD domain, G-beta repeat 2e-08 41.5 3 39-75:83-118:126-162
1606 zf-CCCH Zinc finger C-x8-C-x5-C-x3-H type 0.094 19.3 3 105-129:141-173:183-209
1612 zf-CCHC Zinc knuckle 2.1e-05 31.4 2 167-184:202-219
1618 rnaseH RNase H 6.3e-14 59.7 1 24-144
1618 Integrase_Zn Integrase Zinc binding domain 3.8e-07 37.2 1 146-185
1618 DUF224 Domain of unknown function (DUF224) 9.3 -7.0 1 104-186
1641 adh_short short chain dehydrogenase 4.6e-32 119.9 1 42-309
1667 Xlink Extracellular link domain 2.9e-83 290.0 2 162-267:273-364
1667 ig Immunoglobulin domain 0.0015 25.2 1 61-145
1682 rvt Reverse transcriptase 3.1e-31 117.2 1 56-238
1683 Gag_p30 Gag P30 core shell protein 2.9e-33 124.0 1 8-197
1689 KRAB KRAB box 4.9e-22 86.6 1 266-306
1692 ubiquitin Ubiquitin family 0.00061 26.5 1 17-91
1709 fibrinogen_C Fibrinogen beta and gamma chains, C-term 7.9e-85 295.2 1 37-255
1713 HOK_GEF Hok/gef family 2.4 -7.8 1 7-54
1716 Gag_p30 Gag P30 core shell protein 0.0036 -49.7 1 64-229
1721 rnaseH RNase H 0.011 -11.7 1 207-350
1722 dUTPase dUTPase 0.37 -22.9 1 93-217
1725 ig Immunoglobulin domain 4.2e-13 57.0 2 80-141:259-320
1725 IQ IQ calmodulin-binding motif 4.3e-05 30.4 1 49-69
1727 pkinase Protein kinase domain 3e-21 84.0 1 71-267
1728 Fringe Fringe-like 5.9 -112.6 1 165-370
1734 ig Immunoglobulin domain 0.014 22.0 1 117-170
1737 PP2C Protein phosphatase 2C 0.0067 -50.5 1 37-273
1738 SH3 SH3 domain 1.7e-05 31.7 1 102-159
1740 rnaseH RNase H 0.0042 -7.3 1 126-270
1744 DAG_PE-bind Phorbol esters/diacylglycerol binding dom 2.9 -11.1 1 26-55
1744 PHD PHD-finger 3.3 -14.7 1 9-61
1760 GARS_N Phosphoribosylglycinamide synthetase, N 8.2 -62.0 1 35-95
1760 Armadillo_seg Armadillo/beta-catenin-like repeat 9.1 8.7 2 44-84:131-171
1778 7tm_1 7 transmembrane receptor (rhodopsin family) 1e-12 55.7 1 41-276
1778 YCF9 YCF9 3.1 -18.5 1 203-258
1787 C1q C1q domain 1e-05 13.2 1 111-230
1787 Collagen Collagen triple helix repeat (20 copies) 0.0043 -3.0 1 50-107
1789 jmjC jmjC domain 0.00078 12.0 1 52-241
1795 ig Immunoglobulin domain 0.0037 23.9 1 64-141
1796 rve Integrase core domain 2.6e-28 107.5 1 20-174
1802 zf-C2H2 Zinc finger, C2H2 type 6e-15 63.1 2 68-90:108-130
1806 Filamin Filamin/ABP280 repeat 0.00054 18.6 1 26-131
1812 ank Ankyrin repeat 3.6e-23 90.4 3 159-191:205-237:244-276
1824 PHD PHD-finger 1.1e-12 55.6 1 62-110
1826 PAP_assoc PAP/25A associated domain 1.5e-06 35.2 1 101-155
1827 ig Immunoglobulin domain 1.6 13.4 1 29-102
1830 RhoGEF RhoGEF domain 3.3e-06 24.0 1 110-280
1830 PH PH domain 2.8 6.7 1 356-451
1833 zf-CCHC Zinc knuckle 2.1e-06 34.7 1 137-154
1833 rvt Reverse transcriptase 7.7e-06 25.9 1 84-277
1844 UCH-2 Ubiquitin carboxyl-terminal hydrolase family 0.15 -8.5 1 165-238
1846 Armadillo_seg Armadillo/beta-catenin-like repeat 0.28 17.7 2 50-91:92-132
1860 zf-CCHC Zinc knuckle 3.2e-05 30.8 1 179-196
1864 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 0.0022 23.3 1 218-256
1887 ig Immunoglobulin domain 4e-08 40.4 1 35-112
1889 LRR Leucine Rich Repeat 0.051 20.1 1 62-85
1895 rnaseH RNase H 3.4e-06 25.8 1 47-177
1899 Brevenin Brevenin/esculentin/gaegurin/rugosin family 7.5 -2.9 1 1-51
1911 UPAR_LY6 u-PAR/Ly-6 domain 1.3e-06 35.4 1 44-117
1911 toxin Snake toxin 3 -19.5 1 66-117
1911 Activin_recp Activin types I and II receptor domain 9.5 -14.0 1 30-118
1912 rvp Retroviral aspartyl protease 7 -26.3 1 42-142
1913 SAM SAM domain (Sterile alpha motif) 3.9e-13 57.1 2 105-170:183-247
1916 Sema Sema domain 1.4e-14 54.6 1 51-434
1926 PAP2 PAP2 superfamily 2.9e-07 37.6 1 48-142
1930 ig Immunoglobulin domain 2.7e-07 37.6 1 41-116
1935 rve Integrase core domain 2.5e-13 57.7 1 1-138
1940 rnaseH RNase H 1.1e-26 102.0 1 24-153
1940 Integrase_Zn Integrase Zinc binding domain 4.7e-12 53.5 1 155-194
1952 LRRNT Leucine rich repeat N-terminal domain 0.0027 24.4 1 67-95
1953 UQ_con Ubiquitin-conjugating enzyme 2.8e-08 40.9 1 78-219
1954 Peptidase_M10 Matrixin 6.7e-86 298.8 1 53-212
1954 fn2 Fibronectin type II domain 1e-79 278.2 3 231-272:289-330:347-388
1958 ras Ras family 1.9 -132.0 1 215-284
1963 tsp_1 Thrombospondin type 1 domain 0.083 8.0 1 20-63
1966 rvt Reverse transcriptase 1.5e-05 21.9 1 2-196
1968 G-patch G-patch domain 0.3 6.0 1 307-352
1968 rvp Retroviral aspartyl protease 1.4 -19.9 1 274-385
1970 rve Integrase core domain 0.78 -16.8 1 265-395
1973 Phage_integrase Phage integrase family 5.7e-08 39.9 1 1-153
1974 Sigma54_activat Sigma-54 interaction domain 3.1e-37 137.2 1 63-253
1975 Na_Pi_cotrans Na+/Pi-cotransporter 0.0085 -99.2 1 1-146
1975 signal His Kinase A (phosphoacceptor) domain 7 -7.7 1 85-147
1978 UPAR_LY6 u-PAR/Ly-6 domain 1.8 -16.0 1 21-96
1978 Zn_clus Fungal Zn(2)-Cys(6) binuclear cluster domain 5.1 -5.7 1 21-60
1987 pro_isomerase Cyclophilin type peptidyl-prolyl cis-tr 1.2e-18 75.4 1 4-171
1997 zf-CCHC Zinc knuckle 1.9e-05 31.5 2 181-198:204-220
1997 TFIID-31 Transcription initiation factor IID, 31kD su 7.9 -633 1 75-187
1997 Gag_p12 Gag polyprotein, inner coat protein p12 8.9 -9.5 1 155-229
1998 KRAB KRAB box 2e-23 91.2 1 27-65
2001 CH Calponin homology (CH) domain 0.019 10.8 1 230-330
2001 SAM SAM domain (Sterile alpha motif) 0.9 6.5 1 248-311
2008 tsp_1 Thrombospondin type 1 domain 0.013 15.1 1 64-98
2011 ig Immunoglobulin domain 1.7e-05 31.7 1 186-255
2011 kazal Kazal-type serine protease inhibitor domain 0.00028 27.6 1 121-168
2011 IGFBP Insulin-like growth factor binding protein 0.17 2.5 1 53-113
2011 zf-UBR1 Putative zinc finger in N-recognin 8.3 -24.0 1 54-112
2015 PH PH domain 0.0002 28.1 1 174-281
2015 efhand EF hand 0.00031 27.5 1 339-367
2018 RPEL RPEL repeat 1.3 11.8 1 25-50
2034 rnaseH RNase H 4e-27 103.6 1 122-267
2038 granulin Granulin 7.7 -17.8 1 62-91
2052 rve Integrase core domain 2.6e-24 94.2 1 160-314
2057 Pep_M12B_propep Reprolysin family propeptide 0.44 -29.3 1 179-263
2058 rve Integrase core domain 8.7e-14 59.2 1 1-140
2074 zf-C2H2 Zinc finger, C2H2 type 5.5e-22 86.5 3 42-66:72-96:102-124
2074 zf-BED BED zinc finger 0.94 1.8 1 91-129
2074 TP1 Nuclear transition protein 7.5 2.2 1 21-76
2076 LRR Leucine Rich Repeat 3.2e-20 80.6 5 57-80:81-104:105-128:129-152:153-176
2076 LRRNT Leucine rich repeat N-terminal domain 0.00013 28.8 1 27-55
2076 LRRCT Leucine rich repeat C-terminal domain 0.047 18.0 1 186-234