Patent 2456955 Summary

(12) Patent Application:	(11) CA 2456955
(54) English Title:	NOVEL NUCLEIC ACIDS AND SECRETED POLYPEPTIDES
(54) French Title:	NOUVEAUX ACIDES NUCLEIQUES ET POLYPEPTIDES SECRETES
Status:	Dead

Bibliographic Data

(51) International Patent Classification (IPC):	C12N 15/12 (2006.01) A61K 38/17 (2006.01) C07K 14/435 (2006.01) C07K 14/47 (2006.01) C07K 16/18 (2006.01) C12N 15/63 (2006.01) C12P 21/02 (2006.01) C12Q 1/68 (2006.01) G01N 33/53 (2006.01)
(72) Inventors :	TANG, Y. TOM (United States of America) YANG, YONGHONG (United States of America) WANG, ZHIWEI (United States of America) WENG, GEZHI (United States of America) MA, YUNQING (United States of America)
(73) Owners :	NUVELO, INC. (United States of America)
(71) Applicants :	NUVELO, INC. (United States of America)
(74) Agent:	SMART & BIGGAR
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date:	2002-08-09
(87) Open to Public Inspection:	2003-10-02
Availability of licence:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	Yes
(86) PCT Filing Number:	PCT/US2002/025485
(87) International Publication Number:	WO2003/080795
(85) National Entry:	2004-02-09

(30) Application Priority Data:

Application No.	Country/Territory	Date
60/311,261	United States of America	2001-08-09

Abstracts

English Abstract

The present invention provides novel nucleic acids, novel polypeptide
sequences encoded by these nucleic acids and uses thereof.

French Abstract

L'invention porte sur de nouveaux acides nucléiques, de nouvelles séquences de polypeptide codées par ces acides nucléiques et sur leurs utilisations correspondantes

Claims

Note: Claims are shown in the official language in which they were submitted.

WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from
the group
consisting of SEQ ID NO: 1-1041.

2. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein
said polynucleotide hybridizes to the polynucleotide of claim 1 under
stringent hybridization
conditions.

3. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein
said polynucleotide has greater than about 99% sequence identity with the
polynucleotide of
claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises
the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim
1.

9. A host cell genetically engineered to comprise the polynucleotide of claim
1
operatively associated with a regulatory sequence that modulates expression of
the
polynucleotide in the host cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the
group consisting
of:
(a) a polypeptide encoded by any one of the polynucleotides of claim 1;
and
(b) a polypeptide encoded by a polynucleotide hybridizing under
stringent conditions with any one of SEQ ID NO: 1-1041.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample,
comprising:
a) contacting the sample with a compound that binds to and forms a
complex with the polynucleotide of claim 1 for a period sufficient to form the
complex; and
b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample,
comprising:
a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;
b) amplifying a product comprising at least a portion of the
polynucleotide of claim 1; and
c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and
the
method further comprises reverse transcribing an annealed RNA molecule into a
cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample,
comprising:
a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to
form the
complex; and
b) detecting formation of the complex, so that if a complex formation is
detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim
10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and
b) detecting the complex, so that if the polypeptide/compound complex
is detected, a compound that binds to the polypeptide of claim 10 is
identified.

18. A method for identifying a compound that binds to the polypeptide of claim
10,
comprising:

a) contacting the compound with the polypeptide of claim 10, in a cell,
under conditions sufficient to form a polypeptide/compound complex, wherein
the complex
drives expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence expression,
so that if the polypeptide/compound complex is detected, a compound that binds
to the
polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising,
a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of any of the polynucleotides from SEQ ID NO: 1-
1041, under
conditions sufficient to express the polypeptide in said cell; and
b) isolating the polypeptide from the cell culture or cells of step (a).

20. An isolated polypeptide comprising an amino acid sequence selected from
the group
consisting of any one of the polypeptides SEQ ID NO: 1042-2082.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a
polypeptide
array.

22. A collection of polynucleotides, wherein the collection comprising of at
least one of
SEQ ID NO: 1-1041.

23. The collection of claim 22, wherein the collection is provided on a
nucleic acid array.

24. The collection of claim 23, wherein the array detects full-matches to any
one of the
polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any
one of the
polynucleotides in the collection.

26. The collection of claim 22, wherein the collection is provided in a
computer-readable
format.

Description

Note: Descriptions are shown in the official language in which they were submitted.

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 253
NOTE: For additional volumes, please contact the Canadian Patent Office

CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
1
NOVEL NUCLEIC ACIDS AND SECRETED
POLYPEPTIDES
1. CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part application of U.S. Application
Serial No.
09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries",
Attorney Docket No. 784CIP, which in turn is a continuation-in-part
application of U.S.
Application Serial No. 09/488,725 filed January 21, 2000 entitled "Novel
Contigs Obtained
from Various Libraries", Attorney Docket No. 784; U.S. Application Serial No.
09/491,404
filed January 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney
Docket No. 785; U.S. Application Serial No. 09/560,875 filed April 27, 2000
entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in
turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914
filed February 03,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn
is a
continuation-in-part application of U.S. Application Serial No. 09/515,126
filed February 28,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 788;
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 789CIP which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/519,705
filed March 07,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 789;
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/540,217
filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP, which is in turn
a
continuation-in-part application of U.S. Application Serial No. 09/552,929
filed April 18,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 792; all of which are
incorporated
herein by reference in their entirety.

2. BACKGROUND OF THE INVENTION
2.1 TECHNICAL FIELD
The present invention provides novel polynucleotides and proteins encoded by
such
polynucleotides, along with uses for these polynucleotides and proteins, for
example in
therapeutic, diagnostic and research methods.
2.2 BACKGROUND
Technology aimed at the discovery of protein factors (including e.g.,
cytokines, such
as lymphokines, interferons, circulating soluble factors, chemokines, and
interleukins) has
matured rapidly over the past decade. The now routine hybridization cloning
and expression
cloning techniques clone novel polynucleotides "directly" in the sense that
they rely on
information directly related to the discovered protein (i.e., partial
DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein
in the case of
expression cloning). More recent "indirect" cloning techniques such as signal
sequence
cloning, which isolates DNA sequences based on the presence of a now well-
recognized
secretory leader sequence motif, as well as various PCR-based or low
stringency
hybridization-based cloning techniques, have advanced the state of the art by
making
available large numbers of DNA/amino acid sequences for proteins that are
known to have
biological activity, for example, by virtue of their secreted nature in the
case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-
based
techniques, or by virtue of structural similarity to other genes of known
biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications
in, for example, diagnostics, forensics, gene mapping, identification of
mutations responsible for genetic disorders or other traits, assessment of
biodiversity, and production of many other types of data and products
dependent on DNA and amino acid sequences.
3. SUMMARY OF THE INVENTION
The compositions of the present invention include novel isolated polypeptides,
novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules,
cloned genes or degenerate variants thereof, especially naturally occurring
variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize
one or more epitopes present on such polypeptides, as well as hybridomas
producing such
antibodies.
The compositions of the present invention additionally include vectors,
including
expression vectors, containing the polynucleotides of the invention, cells
genetically engineered
to contain such polynucleotides and cells genetically engineered to express
such
polynucleotides.
The present invention relates to a collection or library of at least one novel
nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more
public
databases. The invention relates also to the proteins encoded by such
polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides
and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and
are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence
Listing, A is adenine; C
is cytosine; G is guanine; T is thymine; and N is any of the four bases or
unknown. In the
amino acids provided in the Sequence Listing, * corresponds to the stop codon.
The nucleic acid sequences of the present invention also include nucleic acid
sequences
that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under
stringent
hybridization conditions; nucleic acid sequences which are allelic variants or
species
homologues of any of the nucleic acid sequences recited above, or nucleic acid
sequences that
encode a peptide comprising a specific domain or truncation of the peptides
encoded by SEQ
ID NO: 1-1041, or 2083-2534. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of
SEQ ID NO: 1-1041, or 2083-2534, or a degenerate variant or fragment thereof.
The identifying sequence can be 100 base pairs in length.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-
2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-
2534 that
uniquely identifies or represents the sequence information of SEQ ID NO: 1-
1041, or 2083-
2534.
A collection as used in this application can be a collection of only one
polynucleotide.
The collection of sequence information or identifying information of each
sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence
information are
provided on a nucleic acid array to detect the polynucleotide that contains
the segment. The
array can be designed to detect full-match or mismatch to the polynucleotide
that contains the
segment. The collection can also be provided in a computer-readable format.
This invention also includes the reverse or direct complement of any of the
nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic
acid sequences;
and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences
(or their reverse or direct complements) according to the invention have
numerous applications
in a variety of techniques known to those skilled in the art of molecular
biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-
readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use
in the
recombinant production of protein, and use in the generation of anti-sense DNA
or RNA, their
chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or
2083-2534 or novel segments or parts of the nucleic acids of the invention are
used as primers in expression assays that are well known in the art. In a
particularly preferred embodiment, the nucleic acid sequences of SEQ ID NO:
1-1041, or 2083-2534 or novel segments or parts of the
nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992),
as expressed
sequence tags for physical mapping of the human genome.
The isolated polynucleotides of the invention include, but are not limited to,
a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ
ID NO: 1-1041, or 2083-2534; a polynucleotide comprising any of the full length
protein coding sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising any of the nucleotide sequences of the mature protein
coding sequences of SEQ ID NO: 1-1041, or 2083-2534. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that
hybridizes under stringent hybridization conditions to (a) the complement of
any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or
2083-2534; (b) a nucleotide sequence encoding any one of the amino acid
sequences set forth in SEQ ID NO: 1-1041, or 2083-2534; (c) a polynucleotide
which is an allelic variant of any
polynucleotides
recited above; (d) a polynucleotide which encodes a species homolog (e.g.
orthologs) of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a
specific domain or truncation of any of the polypeptides comprising an amino
acid sequence set
forth in SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing;
or the
corresponding full length or mature protein. Polypeptides of the invention
also include
polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b)
polynucleotides
that hybridize to the complement of the polynucleotides of (a) under stringent
hybridization
conditions. Biologically active variants of any of the polypeptide sequences
in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%,
70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain
biological
activity are also contemplated. The polypeptides of the invention may be
wholly or partially
chemically synthesized but are preferably produced by recombinant means using
the genetically
engineered cells (e.g. host cells) of the invention.
The invention also provides compositions comprising a polypeptide of the
invention.
Polypeptide compositions of the invention may further comprise an acceptable
carrier, such
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a
polynucleotide of the invention.
The invention also relates to methods for producing a polypeptide of the
invention
comprising growing a culture of the host cells of the invention in a suitable
culture medium
under conditions permitting expression of the desired polypeptide, and
purifying the
polypeptide from the culture or from the host cells. Preferred embodiments
include those in
which the protein produced by such processes is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a
variety
of techniques known to those skilled in the art of molecular biology. These
techniques
include use as hybridization probes, use as oligomers, or primers, for PCR,
use for
chromosome and gene mapping, use in the recombinant production of protein, and
use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For
example,
when the expression of an mRNA is largely restricted to a particular cell or
tissue type,
polynucleotides of the invention can be used as hybridization probes to detect
the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in
the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for
physical mapping of the human genome.
The polypeptides according to the invention can be used in a variety of
conventional
procedures and methods that are currently applied to other proteins. For
example, a
polypeptide of the invention can be used to generate an antibody that
specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful
for detecting or
quantitating the polypeptide in tissue. The polypeptides of the invention can
also be used as
molecular weight markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of
the present
invention and a pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be
utilized,
for example, in methods for the prevention and/or treatment of disorders
involving aberrant
protein expression or biological activity.
The present invention further relates to methods for detecting the presence of
the
polynucleotides or polypeptides of the invention in a sample. Such methods
can, for
example, be utilized as part of prognostic and diagnostic evaluation of
disorders as recited
herein and for the identification of subjects exhibiting a predisposition to
such conditions.
The invention provides a method for detecting the polynucleotides of the
invention in a
sample, comprising contacting the sample with a compound that binds to and
forms a
complex with the polynucleotide of interest for a period sufficient to form
the complex and
under conditions sufficient to form a complex and detecting the complex such
that if a
complex is detected, the polynucleotide of interest is detected. The invention
also provides a
method for detecting the polypeptides of the invention in a sample comprising
contacting the
sample with a compound that binds to and forms a complex with the polypeptide
under
conditions and for a period sufficient to form the complex and detecting the
formation of the
complex such that if a complex is formed, the polypeptide is detected.
The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out
methods of the
invention. Furthermore, the invention provides methods for evaluating the
efficacy of drugs,
and monitoring the progress of patients, involved in clinical trials for the
treatment of
disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the
polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for
the
identification of compounds that can ameliorate symptoms of disorders as
recited herein.
Such methods can include, but are not limited to, assays for identifying
compounds and
other substances that interact with (e.g., bind to) the polypeptides of the
invention. The
invention provides a method for identifying a compound that binds to the
polypeptides of the
invention comprising contacting the compound with a polypeptide of the
invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives
expression of a reporter gene sequence in the cell; and detecting the complex
by detecting
the reporter gene sequence expression such that if expression of the reporter
gene is detected
the compound that binds to a polypeptide of the invention is identified.
The methods of the invention also provide methods for treatment which involve
the
administration of the polynucleotides or polypeptides of the invention to
individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses
methods for
treating diseases or disorders as recited herein comprising administering
compounds and
other substances that modulate the overall activity of the target gene
products. Compounds
and other substances can affect such modulation either on the level of target
gene/protein
expression or target protein activity.
The polypeptides of the present invention and the polynucleotides encoding
them are
also useful for the same functions known to one of skill in the art as the
polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which
they have a
signature region (as set forth in Table 3); or for which they have homology to
a gene family
(as set forth in Table 4). If no homology is set forth for a sequence, then
the polypeptides
and polynucleotides of the present invention are useful for a variety of
applications, as
described herein, including use in arrays for detection.
4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS
It must be noted that as used herein and in the appended claims, the singular
forms
"a", "an" and "the" include plural references unless the context clearly
dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the
biologic
and/or immunologic activities of any naturally occurring polypeptide. According
to the
invention, the terms "biologically active" or "biological activity" refer to a
protein or peptide
having structural, regulatory or biochemical functions of a naturally
occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the
capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune
response in
appropriate animals or cells and to bind with specific antibodies.
The term "activated cells" as used in this application are those cells which
are
engaged in extracellular or intracellular membrane trafficking, including the
export of
secretory or enzymatic molecules as part of a normal or disease process.
The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to
the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic
acids bind or it
may be "complete" such that total complementarity exists between the single
stranded
molecules. The degree of complementarity between the nucleic acid strands has
significant
effects on the efficiency and strength of the hybridization between the
nucleic acid strands.
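
The base-pairing rule in the example above can be illustrated with a minimal Python sketch. The function names and the simple "fraction of paired positions" score are assumptions introduced here for illustration only; they are not part of the disclosure, and the score does not model hybridization strength.

```python
# Watson-Crick pairing for DNA. N (unknown base) is mapped to N when building
# a complement, but is never counted as a confirmed pair (illustrative choice).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def complement(seq_5to3: str) -> str:
    """Return the complementary strand, written 3'->5' (position by position)."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def reverse_complement(seq_5to3: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return complement(seq_5to3)[::-1]


def paired_fraction(strand_5to3: str, other_3to5: str) -> float:
    """Fraction of aligned positions that form a Watson-Crick pair.

    1.0 corresponds to "complete" complementarity; a value between 0 and 1
    corresponds to "partial" complementarity. Strands must be non-empty and
    of equal length.
    """
    if not strand_5to3 or len(strand_5to3) != len(other_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    hits = sum(
        1
        for a, b in zip(strand_5to3.upper(), other_3to5.upper())
        if a != "N" and PAIRS.get(a) == b
    )
    return hits / len(strand_5to3)
```

Under these assumptions, `complement("AGT")` gives `"TCA"`, which matches the 5'-AGT-3' / 3'-TCA-5' example in the text.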
The term "embryonic stem cells (ES)" refers to a cell that can give rise to
many
differentiated cell types in an embryo or an adult, including the germ cells.
The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem
cells that provide a
steady and continuous source of germ cells for the production of gametes. The
term
"primordial germ cells (PGCs)" refers to a small population of cells set aside
from other cell
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis
that have the potential to differentiate into germ cells and other cells. PGCs
are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells
are
capable of self renewal. Thus these cells not only populate the germ line and
give rise to a
plurality of terminally differentiated cells that comprise the adult
specialized organs, but are
able to regenerate themselves.
The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.
As used herein, a sequence is said to "modulate the expression of an operably
linked
sequence" when the expression of the sequence is altered by the presence of
the EMF.
EMFs include, but are not limited to, promoters, and promoter modulating
sequences
(inducible elements). One class of EMFs are nucleic acid fragments which
induce the
expression of an operably linked ORF in response to a specific regulatory
factor or
physiological event.
The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of
nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of
genomic or
synthetic origin which may be single-stranded or double-stranded and may
represent the
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-
like or RNA-like
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the
polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U
(uracil).
Generally, nucleic acid segments provided by this invention may be assembled
from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic
nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising
regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene.
The terms "oligonucleotide fragment" or a "polynucleotide fragment",
"portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a
sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at
least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at
least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably
less than about 100 nucleotides, more preferably less than about 50
nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6
nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more
preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25
nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR),
various
hybridization procedures or microarray procedures to identify or amplify
identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify
each
polynucleotide sequence of the present invention. Preferably the fragment
comprises a
sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534.
Probes may, for example, be used to determine whether specific mRNA molecules
are present in a cell or tissue or to isolate similar nucleic acid sequences
from chromosomal
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl
1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or
other methods
well known in the art. Probes of the present invention, their preparation
and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory
Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current
Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are
incorporated
herein by reference in their entirety.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or
2083-2534
that uniquely identifies or represents the sequence information of that
sequence of SEQ ID
NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and
8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that
a twenty-mer is fully matched in the human genome is 1 in 300. In the human genome,
there are three
billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there
are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully
matched in the
human genome is approximately 1 in 5. When these segments are used in arrays
for
expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is
fully matched in the expressed sequences is also approximately one in five
because
expressed sequences comprise less than approximately 5% of the entire genome
sequence.
Similarly, when using sequence information for detecting a single mismatch, a
segment
can be a twenty-five mer. The probability that the twenty-five mer would
appear in a human
genome with a single mismatch is calculated by multiplying the probability for
a full match
(1/4^25) times the increased probability for mismatch at each nucleotide
position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an
array for
expression studies is approximately one in five. The probability that a twenty-
mer with a single
mismatch can be detected in a human genome is approximately one in five.
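The arithmetic behind these figures can be sketched as follows. This is an illustrative calculation only; the function name `kmers_per_site` and the constants are assumptions introduced here, taking the three billion base pair genome size and the 5% expressed fraction from the passage above. For a twenty-mer it yields a ratio of roughly 366, which the text rounds to 300, and for a seventeen-mer a ratio between 5 and 6.

```python
# Illustrative sketch; names and constants are assumptions based on the text above.
GENOME_BP = 3_000_000_000      # base pairs in one set of human chromosomes
EXPRESSED_FRACTION = 0.05      # upper bound on expressed sequence used in the text


def kmers_per_site(k, target_bp, mismatches=0):
    """Ratio of possible k-mers (4**k) to candidate positions in the target.

    A ratio of N corresponds to the "1 in N" chance of a match in the text.
    For a single permitted mismatch, the text multiplies the full-match
    probability by 3 * k (three alternative bases at each of k positions).
    """
    if mismatches not in (0, 1):
        raise ValueError("only 0 or 1 mismatch is modelled in this sketch")
    variants = 1 if mismatches == 0 else 3 * k
    return 4 ** k / (variants * target_bp)


if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    print("20-mer, genome, full match:", kmers_per_site(20, GENOME_BP))
    print("17-mer, genome, full match:", kmers_per_site(17, GENOME_BP))
    print("15-mer, expressed, full match:", kmers_per_site(15, expressed_bp))
    print("20-mer, genome, one mismatch:", kmers_per_site(20, GENOME_BP, 1))
    print("18-mer, expressed, one mismatch:", kmers_per_site(18, expressed_bp, 1))
```

The computed ratios agree with the rounded values in the text only to within the same order of magnitude, which is all the passage claims.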
The term "open reading frame," ORF, means a series of nucleotide triplets
coding for
amino acids without any termination codons and is a sequence translatable into
protein.
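By way of illustration only, the definition above can be expressed as a short routine that splits a DNA string into triplets in a chosen forward frame and reports each maximal run that contains no termination codon. The function name `stop_free_runs` is hypothetical, and the sketch assumes the standard termination codons TAA, TAG and TGA; it does not require an initiating methionine codon, consistent with the definition as worded.

```python
# Illustrative sketch; assumes the standard stop codons on the sense strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(dna, frame=0):
    """Return maximal runs of triplets with no stop codon in a forward frame."""
    dna = dna.upper()
    runs, current = [], []
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            # A termination codon closes the current open run, if any.
            if current:
                runs.append("".join(current))
            current = []
        else:
            current.append(codon)
    if current:
        runs.append("".join(current))
    return runs


if __name__ == "__main__":
    print(stop_free_runs("ATGAAATAGCCC"))           # frame 0
    print(stop_free_runs("ATGAAATAGCCC", frame=2))  # frame 2
```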
The terms "operably linked" or "operably associated" refer to functionally
related
nucleic acid sequences. For example, a promoter is operably associated or
operably linked
with a coding sequence if the promoter controls the transcription of the
coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same
reading
frame, certain genetic elements e.g. repressor genes are not contiguously
linked to the coding
sequence but still control transcription/translation of the coding sequence.
The term "pluripotent" refers to the capability of a cell to differentiate
into a number
of differentiated cell types that are present in an adult organism. A
pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent
cell.
The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and
to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7
amino acids, more preferably at least about 9 amino acids and most preferably
at least about
17 or more amino acids. The peptide preferably is not greater than about 200
amino acids,
more preferably less than 150 amino acids and most preferably less than 100
amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any
polypeptide must have sufficient length to display biological and/or
immunological activity.
The term "naturally occurring polypeptide" refers to polypeptides produced by
cells
that have not been genetically engineered and specifically contemplates
various polypeptides
arising from post-translational modifications of the polypeptide including,
but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and
acylation.
The term "translated protein coding portion" means a sequence which encodes
for the
full-length protein which may include any leader sequence or any processing
sequence.
The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein
portion" means
that portion of the protein which does not include a signal or leader
sequence. The peptide
may have been produced by processing in the cell which removes any
leader/signal
sequence. The mature protein portion may or may not include the initial
methionine residue.
The methionine residue may be removed from the protein during processing in
the cell. The
peptide may be produced synthetically or the protein may have been produced
using a
polynucleotide only encoding for the mature protein coding sequence.
The term "derivative" refers to polypeptides chemically modified by such
techniques
as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and
insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do
not normally
occur in human proteins.
The term "variant" (or "analog") refers to any polypeptide differing from
naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions,
created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues
may be replaced, added or deleted without abolishing activities of interest,
may be found by
comparing the sequence of the particular polypeptide with that of homologous
peptides and
minimizing the number of amino acid sequence changes made in regions of high
homology
(conserved regions) or by replacing amino acids with consensus sequence.
Alternatively, recombinant variants encoding these same or similar
polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic
code. Various
codon substitutions, such as the silent changes which produce various
restriction sites, may
be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may
be
reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such
as ligand-binding
affinities, interchain affinities, or degradation/turnover rate.
Preferably, amino acid "substitutions" are the result of replacing one amino
acid with
another amino acid having similar structural and/or chemical properties, i.e.,
conservative
amino acid replacements. "Conservative" amino acid substitutions may be made
on the
basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the
amphipathic nature of the residues involved. For example, nonpolar
(hydrophobic) amino
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine,
tryptophan, and
methionine; polar neutral amino acids include glycine, serine, threonine,
cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic) amino acids include
arginine, lysine,
and histidine; and negatively charged (acidic) amino acids include aspartic
acid and glutamic
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20
amino acids,
more preferably 1 to 10 amino acids. The variation allowed may be
experimentally
determined by systematically making insertions, deletions, or substitutions of
amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the
resulting
recombinant variants for activity.
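As a minimal sketch of the grouping just described, the four residue classes listed above can be encoded with one-letter amino acid codes and used to test whether a proposed replacement stays within a class. The names `RESIDUE_CLASSES`, `residue_class` and `is_conservative` are introduced here for illustration; the sketch considers class membership only and does not model the other listed properties, such as solubility or amphipathic nature.

```python
# Illustrative sketch; class membership follows the lists given in the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue):
    """Return the class name for a one-letter amino acid code."""
    residue = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"not one of the twenty listed amino acids: {residue!r}")


def is_conservative(original, replacement):
    """True when the replacement differs but belongs to the same class."""
    if original.upper() == replacement.upper():
        return False  # no substitution has been made
    return residue_class(original) == residue_class(replacement)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # leucine to isoleucine
    print(is_conservative("L", "D"))  # leucine to aspartic acid
```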
Alternatively, where alteration of function is desired, insertions, deletions
or
non-conservative alterations can be engineered to produce altered
polypeptides. Such
alterations can, for example, alter one or more of the biological functions or
biochemical
characteristics of the polypeptides of the invention. For example, such
alterations may
change polypeptide characteristics such as ligand-binding affinities,
interchain affinities, or
degradation/turnover rate. Further, such alterations can be selected so as to
generate
polypeptides that are better suited for expression, scale up and the like in
the host cells
chosen for expression. For example, cysteine residues can be deleted or
substituted with
another amino acid residue in order to eliminate disulfide bridges.
The terms "purified" or "substantially purified" as used herein denotes that
the
indicated nucleic acid or polypeptide is present in the substantial absence of
other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight,
more preferably at least 99% by weight, of the indicated biological
macromolecules present
(but water, buffers, and other small molecules, especially molecules having a
molecular
weight of less than 1000 daltons, can be present).
The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated
from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic
acid or polypeptide in its natural source. In one embodiment, the nucleic acid
or polypeptide
is found in the presence of (if anything) only a solvent, buffer, ion, or
other component
normally present in a solution of the same. The terms "isolated" and
"purified" do not
encompass nucleic acids or polypeptides present in their natural source.
The term "recombinant," when used herein to refer to a polypeptide or protein,
means
that a polypeptide or protein is derived from recombinant (e.g., microbial,
insect, or
mammalian) expression systems. "Microbial" refers to recombinant polypeptides
or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant
microbial" defines a polypeptide or protein essentially free of native
endogenous substances
and unaccompanied by associated native glycosylation. Polypeptides or proteins
expressed
in most bacterial cultures, e.g., E. coli, will be free of glycosylation
modifications;
polypeptides or proteins expressed in yeast will have a glycosylation pattern
in general
different from those expressed in mammalian cells.
The term "recombinant expression vehicle or vector" refers to a plasmid or
phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression
vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element
or elements having a regulatory role in gene expression, for example,
promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA
and
translated into protein, and (3) appropriate transcription initiation and
termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems
preferably include
a leader sequence enabling extracellular secretion of translated protein by a
host cell.
Alternatively, where recombinant protein is expressed without a leader or
transport
sequence, it may include an amino terminal methionine residue. This residue
may or may
not be subsequently cleaved from the expressed recombinant protein to provide
a final
product.
The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry
the
recombinant transcriptional unit extrachromosomally. Recombinant expression
systems as
defined herein will express heterologous polypeptides or proteins upon
induction of the
regulatory elements linked to the DNA segment or synthetic gene to be
expressed. This term
also means host cells which have stably integrated a recombinant genetic
element or
elements having a regulatory role in gene expression, for example, promoters
or enhancers.
Recombinant expression systems as defined herein will express polypeptides or
proteins
endogenous to the cell upon induction of the regulatory elements linked to the
endogenous
DNA segment or gene to be expressed. The cells can be prokaryotic or
eukaryotic.
The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino
acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include
without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g.,
receptors) from the cell in
which they are expressed. "Secreted" proteins also include without limitation
proteins that
are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are
also intended to include proteins containing non-typical signal sequences
(e.g. Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and
factors
released from damaged cells (e.g. Interleukin-1 Receptor Antagonist, see
Arend, W.P. et al.
(1995) Annu. Rev. Immunol. 16:27-55).
Where desired, an expression vector may be designed to contain a "signal or
leader
sequence" which will direct the polypeptide through the membrane of a cell.
Such a
sequence may be naturally present on the polypeptides of the present invention
or provided
from heterologous protein sources by recombinant DNA techniques.
The term "stringent" is used to refer to conditions that are commonly
understood in
the art as stringent. Stringent conditions can include highly stringent
conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate
(SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and
moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other
exemplary hybridization
conditions are described herein in the examples.
In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base
oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
As used herein, "substantially equivalent" or "substantially similar" can
refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that
varies from a
reference sequence by one or more substitutions, deletions, or additions, the
net effect of
which does not result in an adverse functional dissimilarity between the
reference and
subject sequences. Typically, such a substantially equivalent sequence
varies from one of
those listed herein by no more than about 35% (i.e., the number of individual
residue
substitutions, additions, and/or deletions in a substantially equivalent
sequence, as compared
to the corresponding reference sequence, divided by the total number of
residues in the
substantially equivalent sequence is about 0.35 or less). Such a sequence is
said to have
65% sequence identity to the listed sequence. In one embodiment, a
substantially
equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more
than 30% (70% sequence identity); in a variation of this embodiment, by no
more than 25%
(75% sequence identity); and in a further variation of this embodiment, by no
more than
20% (80% sequence identity) and in a further variation of this embodiment, by
no more than
10% (90% sequence identity) and in a further variation of this embodiment, by
no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences
according to the invention preferably have at least 80% sequence identity with
a listed amino
acid sequence, more preferably at least 85% sequence identity, more preferably
at least 90%
sequence identity, more preferably at least 95% sequence identity, more
preferably at least
98% sequence identity, and most preferably at least 99% sequence identity.
Substantially
equivalent nucleotide sequences of the invention can have lower percent
sequence identities,
taking into account, for example, the redundancy or degeneracy of the genetic
code.
Preferably, the nucleotide sequence has at least about 65% identity, more
preferably at least
about 75% identity, more preferably at least about 80% sequence identity, more
preferably at
least 85% sequence identity, more preferably at least 90% sequence identity,
more preferably
at least about 95% sequence identity, more preferably at least 98% sequence
identity, and
most preferably at least 99% sequence identity. For the purposes of the
present invention,
sequences having substantially equivalent biological activity and
substantially equivalent
expression characteristics are considered substantially equivalent. For the
purposes of
determining equivalence, truncation of the mature sequence (e.g., via a
mutation which
creates a new stop codon) should be disregarded. Sequence identity may be
determined,
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645).
Identity between sequences can also be determined by other methods known in
the art, e.g.
by varying hybridization conditions.
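The percent variation defined above (the number of substitutions, additions and deletions divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustrative calculation, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool and are supplied as equal-length rows with "-" marking gaps. The function names and the default 0.35 cutoff parameter are introduced here for illustration.

```python
# Illustrative sketch; inputs are the two rows of an existing pairwise alignment.
def variation_fraction(reference_aln, subject_aln, gap="-"):
    """Changed alignment columns divided by residues in the subject sequence."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned rows must have equal length")
    # A column counts as one change if it is a substitution, an addition
    # (gap in the reference) or a deletion (gap in the subject).
    changes = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != gap)
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return changes / subject_residues


def sequence_identity(reference_aln, subject_aln, gap="-"):
    """Identity as used in the text: one minus the variation fraction."""
    return 1.0 - variation_fraction(reference_aln, subject_aln, gap)


def is_substantially_equivalent(reference_aln, subject_aln, max_variation=0.35):
    """Apply the 'no more than about 35%' criterion to an alignment."""
    return variation_fraction(reference_aln, subject_aln) <= max_variation


if __name__ == "__main__":
    ref, sub = "ACGTACGTAC", "ACGTTCGTAC"
    print(variation_fraction(ref, sub), sequence_identity(ref, sub))
```

The stricter embodiments (30%, 25%, 20%, 10%, 5%) correspond to passing a smaller `max_variation`.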
The term "totipotent" refers to the capability of a cell to differentiate into
all of the
cell types of an adult organism.
The term "transformation" means introducing DNA into a suitable host cell so
that
the DNA is replicable, either as an extrachromosomal element, or by
chromosomal
integration. The term "transfection" refers to the taking up of an expression
vector by a
suitable host cell, whether or not any coding sequences are in fact expressed.
The term
"infection" refers to the introduction of nucleic acids into a suitable host
cell by use of a
virus or viral vector.
As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell.
UMFs can be
readily identified using known UMFs as a target sequence or target motif with
the
computer-based systems described below. The presence and activity of a UMF can
be
confirmed by attaching the suspected UMF to a marker sequence. The resulting
nucleic acid
molecule is then incubated with an appropriate host under appropriate
conditions and the
uptake of the marker sequence is determined. As described above, a UMF will
increase the
frequency of uptake of a linked marker sequence.
Each of the above terms is meant to encompass all that is described for each,
unless
the context dictates otherwise.
4.2 NUCLEIC ACIDS OF THE INVENTION
Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide
comprising
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide
encoding
any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising the nucleotide sequence encoding the mature protein
coding
sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534.
The
polynucleotides of the present invention also include, but are not limited to,
a polynucleotide
that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding
any one
of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c)
a
polynucleotide which is an allelic variant of any polynucleotide recited
above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited
above; or (e)
a polynucleotide that encodes a polypeptide comprising a specific domain or
truncation of
the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set
forth in
Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the
encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-
binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof;
domains in
immunoglobulin-like proteins include the variable immunoglobulin-like
domains; domains
in enzyme-like polypeptides include catalytic and substrate binding domains;
and domains in
ligand polypeptides include receptor-binding domains.
The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include entire coding region of the cDNA or may represent
a portion of
the coding region of the cDNA.
The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with
known methods
using the sequence information disclosed herein. Such methods include the
preparation of
probes or primers from the disclosed sequence information for identification
and/or
amplification of genes in appropriate genomic libraries or other sources of
genomic materials.
Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of
SEQ ID NO:
1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic
DNA
libraries under suitable hybridization conditions using any of the
polynucleotides of SEQ ID
NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the
polynucleotides of
SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s)
that allow
identification and/or amplification of genes in appropriate genomic DNA or
cDNA libraries.
The nucleic acid sequences of the invention can be assembled from ESTs and
sequences
(including cDNA and genomic sequences) obtained from one or more public
databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence
information, representative fragment or segment information, or novel segment
information for
the full-length gene.
The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides
recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%,
at least about
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%,
93%, 94%,
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence
identity to a
polynucleotide recited above.
Included within the scope of the nucleic acid sequences of the invention are
nucleic
acid sequence fragments that hybridize under stringent conditions to any of
the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which
fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably
greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of,
e.g. 15, 17, or 20
nucleotides or more that are selective for (i.e. specifically hybridize to)
any one of the
polynucleotides of the invention are contemplated. Probes capable of
specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of
the invention
from other polynucleotide sequences in the same family of genes or can
differentiate human
genes from genes of other species, and are preferably based on unique
nucleotide sequences.
The sequences falling within the scope of the present invention are not
limited to these
specific sequences, but also include allelic and species variations thereof.
Allelic and species
variations can be routinely determined by comparing the sequence provided in
SEQ ID NO: 1-
1041, or 2083-2534, a representative fragment thereof, or a nucleotide
sequence at least 90%
identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a
sequence from
another isolate of the same species. Furthermore, to accommodate codon
variability, the
invention includes nucleic acid molecules coding for the same amino acid
sequences as do the
specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of
one codon for another codon that encodes the same amino acid is expressly
contemplated.
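The codon substitution contemplated above can be illustrated with a short check that two coding sequences translate to the same amino acid sequence. The sketch below assumes the standard genetic code, built from the conventional TCAG-ordered table, and the helper names `translate` and `encodes_same_protein` are hypothetical. It is offered only as an example of how silent codon changes leave the encoded polypeptide unchanged.

```python
# Illustrative sketch; assumes the standard genetic code ('*' marks a stop).
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate complete triplets of a DNA coding sequence, frame 0."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_protein(orf_a, orf_b):
    """True when both sequences encode an identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # CTT and CTG both encode leucine; TCA and AGC both encode serine.
    print(translate("ATGCTTTCA"), translate("ATGCTGAGC"))
    print(encodes_same_protein("ATGCTTTCA", "ATGCTGAGC"))
```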
The nearest neighbor or homology results for the nucleic acids of the present
invention,
including SEQ ID NO: 1-1041, or 2083-2534 can be obtained by searching a
database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search
Tool) program is
used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and
Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively a FASTA
version 3 search
against Genpept, using FASTXY algorithm may be performed.
Species homologs (or orthologs) of the disclosed polynucleotides and proteins
are
also provided by the present invention. Species homologs may be isolated and
identified by
making suitable probes or primers from the sequences provided herein and
screening a
suitable nucleic acid source from the desired species.
The invention also encompasses allelic variants of the disclosed
polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated
polynucleotide which
also encode proteins which are identical, homologous or related to that
encoded by the
polynucleotides.
The nucleic acid sequences of the invention are further directed to sequences
which
encode variants of the described nucleic acids. These amino acid sequence
variants may be
prepared by methods known in the art by introducing appropriate nucleotide
changes into a
native or variant polynucleotide. There are two variables in the construction
of amino acid
sequence variants: the location of the mutation and the nature of the
mutation. Nucleic
acids encoding the amino acid sequence variants are preferably constructed by
mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature.
These
nucleic acid alterations can be made at sites that differ in the nucleic acids
from different
species (variable positions) or in highly conserved regions (constant
regions). Sites at such
locations will typically be modified in series, e.g., by substituting first
with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid)
and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid),
and then
deletions or insertions may be made at the target site. Amino acid sequence
deletions
generally range from about 1 to 30 residues, preferably about 1 to 10
residues, and are
typically contiguous. Amino acid insertions include amino- and/or carboxyl-
terminal
fusions ranging in length from one to one hundred or more residues, as well as
intrasequence
insertions of single or multiple amino acid residues. Intrasequence insertions
may range
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues.
Examples of
terminal insertions include the heterologous signal sequences necessary for
secretion or for
intracellular targeting in different host cells and sequences such as FLAG or
poly-histidine
sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences
are
changed via site-directed mutagenesis. This method uses oligonucleotide
sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as
sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of
the site being changed. In general, the techniques of site-directed
mutagenesis are well
known to those of skill in the art and this technique is exemplified by
publications such as,
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for
producing
site-specific changes in a polynucleotide sequence was published by Zoller
and Smith,
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino
acid
sequence variants of the novel nucleic acids. When small amounts of template
DNA are
used as starting material, primer(s) that differ slightly in sequence from the
corresponding
region in the template DNA can generate the desired amino acid variant. PCR
amplification
results in a population of product DNA fragments that differ from the
polynucleotide
template encoding the polypeptide at the position specified by the primer. The
product DNA
fragments replace the corresponding region in the plasmid and this gives a
polynucleotide
encoding the desired amino acid variant.
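The oligonucleotide design principle described above, a changed codon flanked on both sides by unchanged template sequence, can be sketched as follows. The function name `mutagenic_oligo` and the default flank length of 12 nucleotides are assumptions made for illustration; the text does not specify a flank length, and the sketch returns the sense-strand sequence only, leaving strand choice and duplex stability to the practitioner.

```python
# Illustrative sketch; flank length and function name are assumptions.
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build an oligonucleotide carrying new_codon flanked by template sequence.

    template     coding sequence read in frame 0
    codon_index  zero-based index of the codon to change
    new_codon    three-nucleotide replacement codon
    flank        unchanged nucleotides kept on each side (fewer near the ends)
    """
    template = template.upper()
    new_codon = new_codon.upper()
    start = codon_index * 3
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three nucleotides")
    if codon_index < 0 or start + 3 > len(template):
        raise ValueError("codon index lies outside the template")
    left = template[max(0, start - flank):start]
    right = template[start + 3:start + 3 + flank]
    return left + new_codon + right


if __name__ == "__main__":
    # Change the third codon (AAA, lysine) to GAA (glutamic acid).
    print(mutagenic_oligo("ATGGCTAAAGGTCTGGAA", 2, "GAA", flank=6))
```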
A further technique for generating amino acid variants is the cassette
mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other
mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et
al., supra, and
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of
the genetic code, other DNA sequences which encode substantially the same or a
functionally equivalent amino acid sequence may be used in the practice of the
invention for
the cloning and expression of these novel nucleic acids. Such DNA sequences
include those
which are capable of hybridizing to the appropriate novel nucleic acid
sequence under
stringent conditions.
Polynucleotides encoding preferred polypeptide truncations of the invention
could be
used to generate polynucleotides encoding chimeric or fusion proteins
comprising one or
more domains of the invention and heterologous protein sequences.
The polynucleotides of the invention additionally include the complement of
any of
the polynucleotides recited above. The polynucleotide can be DNA (genomic,
cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include,
for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides
of the desired sequence identities.
In accordance with the invention, polynucleotide sequences comprising the
mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or 2083-2534,
or functional equivalents thereof, may be used to generate recombinant DNA
molecules that
direct the expression of that nucleic acid, or a functional equivalent
thereof, in appropriate
host cells. Also included are the cDNA inserts of any of the clones identified
herein.
A polynucleotide according to the invention can be joined to any of a variety
of other
nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors,
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like,
that are well
known in the art. Accordingly, the invention also provides a vector including
a
polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the
vector contains an origin of replication functional in at least one organism,
convenient
restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to
the invention include expression vectors, replication vectors, probe
generation vectors, and
sequencing vectors. A host cell according to the invention can be a
prokaryotic or
eukaryotic cell and can be a unicellular organism or part of a multicellular
organism.
The present invention further provides recombinant constructs comprising a
nucleic
acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534
or a
fragment thereof or any other polynucleotides of the invention. In one
embodiment, the
recombinant constructs of the present invention comprise a vector, such as a
plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of
SEQ ID NO: 1-1041,
or 2083-2534 or a fragment thereof is inserted, in a forward or reverse
orientation. In
the case of a vector comprising one of the ORFs of the present invention, the
vector may
further comprise regulatory sequences, including for example, a promoter,
operably linked to
the ORF. Large numbers of suitable vectors and promoters are known to those of
skill in the
art and are commercially available for generating the recombinant constructs
of the present
invention. The following vectors are provided by way of example: Bacterial:
pBs,
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia);
Eukaryotic:
pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene) pSVK3, pBPV, pMSG, pSVL
(Pharmacia).
The isolated polynucleotide of the invention may be operably linked to an
expression
control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly.
Many suitable expression control sequences are known in the art. General
methods of
expressing recombinant proteins are also known and are exemplified in R.
Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked"
means
that the isolated polynucleotide of the invention and an expression control
sequence are
situated within a vector or cell in such a way that the protein is expressed
by a host cell
which has been transformed (transfected) with the ligated
polynucleotide/expression control
sequence.
Promoter regions can be selected from any desired gene using CAT
(chloramphenicol acetyltransferase) vectors or other vectors with selectable
markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse
metallothionein-I. Selection of the appropriate vector and promoter is well
within the level
of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of
replication and selectable markers permitting transformation of the host cell,
e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived
from a highly expressed gene to direct transcription of a downstream
structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock
proteins, among
others. The heterologous structural sequence is assembled in appropriate phase
with
translation initiation and termination sequences, and preferably, a leader
sequence capable of
directing secretion of translated protein into the periplasmic space or
extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
amino
terminal identification peptide imparting desired characteristics, e.g.,
stabilization or
simplified purification of expressed recombinant product. Useful expression
vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding
a desired
protein together with suitable translation initiation and termination signals
in operable
reading phase with a functional promoter. The vector will comprise one or more
phenotypic
selectable markers and an origin of replication to ensure maintenance of the
vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts
for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium,
and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although
others may
also be employed as a matter of choice.
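As an illustrative sketch only, and not as part of the disclosed methods, the
requirement that a structural sequence be in operable reading phase with
translation initiation and termination signals can be checked computationally
before a construct is assembled. The function name check_reading_frame and the
example sequences are hypothetical. The sketch assumes a DNA coding strand that
starts at an ATG initiation codon and uses the three standard stop codons.

```python
# Standard termination codons on the DNA coding strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_reading_frame(structural):
    """Return a list of reading-phase problems; an empty list means none found."""
    seq = structural.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    if not seq.startswith("ATG"):
        problems.append("no ATG initiation codon at the first position")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no in-frame termination codon at the end")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature in-frame termination codon")
    return problems


# Hypothetical sequences, for illustration only.
in_phase = "ATGGCTTGGAAATAA"      # ATG GCT TGG AAA TAA
interrupted = "ATGGCTTAGAAATAA"   # ATG GCT TAG AAA TAA
print(check_reading_frame(in_phase))
print(check_reading_frame(interrupted))
```

A check of this kind only examines the insert itself; whether the promoter,
leader sequence, and host are suitable remains a matter of routine selection.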
As a representative but non-limiting example, useful expression vectors for
bacterial
use can comprise a selectable marker and bacterial origin of replication
derived from
commercially available plasmids comprising genetic elements of the well known
cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech,
Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and
the structural sequence to be expressed. Following transformation of a
suitable host strain
and growth of the host strain to an appropriate cell density, the selected
promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical
induction) and cells
are cultured for an additional period. Cells are typically harvested by
centrifugation,
disrupted by physical or chemical means, and the resulting crude extract
retained for further
purification.
Polynucleotides of the invention can also be used to induce immune responses.
For
example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999),
incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies
against the encoded polypeptide following topical administration of naked
plasmid DNA or
following injection, and preferably intra-muscular injection of the DNA. The
nucleic acid
sequences are preferably inserted in a recombinant expression vector and may
be in the form
of naked DNA.
4.3 ANTISENSE
Another aspect of the invention pertains to isolated antisense nucleic acid
molecules
that are hybridizable to or complementary to the nucleic acid molecule
comprising the
nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs
or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide
sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided
that comprise a
sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules
encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO:
1-1041, or
2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID
NO: 1-1041, or 2083-2534 are additionally provided.
In one embodiment, an antisense nucleic acid molecule is antisense to a
"coding
region" of the coding strand of a nucleotide sequence of the invention. The
term "coding
region" refers to the region of the nucleotide sequence comprising codons
which are
translated into amino acid residues. In another embodiment, the antisense
nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a
nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences
that flank the
coding region that are not translated into amino acids (i.e., also referred to
as 5' and 3'
untranslated regions).
Given the coding strand sequences encoding a nucleic acid disclosed herein
(e.g.,
SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can
be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The
antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but
more
preferably is an oligonucleotide that is antisense to only a portion of the
coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be
complementary to
the region surrounding the translation start site of an mRNA. An antisense
oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides
in length. An
antisense nucleic acid of the invention can be constructed using chemical
synthesis or
enzymatic ligation reactions using procedures known in the art. For example,
an antisense
nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to
increase the
biological stability of the molecules or to increase the physical stability of
the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and
acridine substituted nucleotides can be used.
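By way of illustration, Watson-Crick complementarity allows an antisense
oligonucleotide to be derived mechanically from a coding strand sequence. The
following minimal sketch assumes a DNA coding strand written 5' to 3' and
treats the first ATG as the translation start site, which is a simplification.
The function names, the 12-nucleotide window, and the example sequence are
hypothetical and are not taken from the sequence listing.

```python
# Watson-Crick partners for a DNA oligonucleotide; U is accepted so that an
# mRNA sequence can also be supplied as the sense input.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def antisense_oligo(sense, start, length):
    """Return the DNA oligo (5'->3') complementary to sense[start:start+length]."""
    if start < 0:
        raise ValueError("start must be non-negative")
    target = sense.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the sequence")
    # Reverse the target so the result reads 5'->3', then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


def start_site_oligo(sense, upstream=3, length=12):
    """Target a window that begins `upstream` bases before the first ATG.

    Assumption: the first ATG is the translation start site.
    """
    atg = sense.upper().replace("U", "T").find("ATG")
    if atg < 0:
        raise ValueError("no ATG found in the sense sequence")
    return antisense_oligo(sense, max(atg - upstream, 0), length)


# Hypothetical coding strand fragment, for illustration only.
sense_strand = "GGCACCATGGCTTGGAAACTT"
oligo = start_site_oligo(sense_strand)
print(oligo)
```

The sketch covers only sequence selection. Chemical modifications such as the
phosphorothioate derivatives mentioned above affect stability, not the
base-pairing rule used here.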
Examples of modified nucleotides that can be used to generate the antisense
nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-
methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine,
5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine,
uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine,
2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester,
uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil,
(acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can
be produced
biologically using an expression vector into which a nucleic acid has been
subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid
will be of an
antisense orientation to a target nucleic acid of interest, described further
in the following
subsection).
The antisense nucleic acid molecules of the invention are typically
administered to a
subject or generated in situ such that they hybridize with or bind to cellular
mRNA and/or
genomic DNA encoding a protein according to the invention to thereby
inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for
example, in the
case of an antisense nucleic acid molecule that binds to DNA duplexes, through
specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention
includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be
modified to target
selected cells and then administered systemically. For example, for systemic
administration,
antisense molecules can be modified such that they specifically bind to
receptors or antigens
expressed on a selected cell surface, e.g., by linking the antisense nucleic
acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The
antisense nucleic
acid molecules can also be delivered to cells using the vectors described
herein. To achieve
sufficient intracellular concentrations of antisense molecules, vector
constructs in which the
antisense nucleic acid molecule is placed under the control of a strong pol II
or pol III
promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention
is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific
double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148)
or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
4.4 RIBOZYMES AND PNA MOIETIES
In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described
in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically
cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity
for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a
derivative
of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the
active site is complementary to the nucleotide sequence to be cleaved in a
mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific
ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993)
Science
261:1411-1418.
Alternatively, gene expression can be inhibited by targeting nucleotide
sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to
form triple
helical structures that prevent transcription of the gene in target cells. See
generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci.
660:27-36; and Maher (1992) Bioessays 14: 807-15.
In various embodiments, the nucleic acids of the invention can be modified at
the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the
stability,
hybridization, or solubility of the molecule. For example, the deoxyribose
phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic
acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four
natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow
for
specific hybridization to DNA and RNA under conditions of low ionic strength.
The
synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al.
(1996) PNAS 93:
14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications.
For
example, PNAs can be used as antisense or antigene agents for sequence-
specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B.
(1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al.
(1996), above;
Perry-O'Keefe (1996), above).
In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper
groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other
techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that
may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the
DNA
portion while the PNA portion would provide high binding affinity and
specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected
in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling
chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-
thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al.
(1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner
to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn
et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5'
DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5:
1119-11124.
In other embodiments, the oligonucleotide may include other appended groups
such
as peptides (e.g., for targeting host cell receptors in vivo), or agents
facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad.
Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered
cleavage agents
(See, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents. (See, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be
conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking
agent, a transport
agent, a hybridization-triggered cleavage agent, etc.
4.5 HOSTS
The present invention further provides host cells genetically engineered to
contain
the polynucleotides of the invention. For example, such host cells may contain
nucleic acids
of the invention introduced into the host cell using known transformation,
transfection or
infection methods. The present invention still further provides host cells
genetically
engineered to express the polynucleotides of the invention, wherein such
polynucleotides are
in operative association with a regulatory sequence heterologous to the host
cell which
drives expression of the polynucleotides in the cell.
Knowledge of nucleic acid sequences allows for modification of cells to
permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g.,
by
homologous recombination) to provide increased polypeptide expression by
replacing, in
whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous
promoter is
inserted in such a manner that it is operatively linked to the encoding
sequences. See, for
example, PCT International Publication No. WO94/12650, PCT International
Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl
phosphate
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA
may be
inserted along with the heterologous promoter DNA. If linked to the coding
sequence,
amplification of the marker DNA by standard selection methods results in co-
amplification
of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell,
a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell
can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one
of the polynucleotides of the invention can be used in conventional manners
to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can
be used to
produce a heterologous protein under the control of the EMF.
Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts
such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express
the particular
polypeptide or protein or which express the polypeptide or protein at a low
natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other
cells under
the control of appropriate promoters. Cell-free translation systems can also
be employed to
produce such proteins using RNAs derived from the DNA constructs of the
present
invention. Appropriate cloning and expression vectors for use with
prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A
Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of
which is
hereby incorporated by reference.
Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the
COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines
capable of expressing a compatible vector are, for example, the C127, monkey
COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal
A431 cells,
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell
lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue,
primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian
expression
vectors will comprise an origin of replication, a suitable promoter and also
any necessary
ribosome binding sites, polyadenylation site, splice donor and acceptor sites,
transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences
derived
from the SV40 viral genome, for example, SV40 origin, early promoter,
enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed
genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture
are usually
isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous
ion exchange or size exclusion chromatography steps. Protein refolding steps
can be used,
as necessary, in completing configuration of the mature protein. Finally, high
performance
liquid chromatography (HPLC) can be employed for final purification steps.
Microbial cells
employed in expression of proteins can be disrupted by any convenient method,
including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.
Alternatively, it may be possible to produce the protein in lower eukaryotes
such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable
yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces
strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium,
or any bacterial
strain capable of expressing heterologous proteins. If the protein is made
in yeast or
bacteria, it may be necessary to modify the protein produced therein, for
example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain
the functional
protein. Such covalent attachments may be accomplished using known chemical or
enzymatic methods.
In another embodiment of the present invention, cells and tissues may be
engineered
to express an endogenous gene comprising the polynucleotides of the invention
under the
control of inducible regulatory elements, in which case the regulatory
sequences of the
endogenous gene may be replaced by homologous recombination. As described
herein, gene
targeting can be used to replace a gene's existing regulatory region with a
regulatory
sequence isolated from a different gene or a novel regulatory sequence
synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of
promoters,
enhancers, scaffold-attachment regions, negative regulatory elements,
transcriptional
initiation sites, and regulatory protein binding sites or combinations of said
sequences.
Alternatively, sequences which affect the structure or stability of the RNA or
protein
produced may be replaced, removed, added, or otherwise modified by
targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader
sequences for enhancing or modifying transport or secretion properties of the
protein, or
other sequences which alter or improve the function or stability of protein or
RNA
molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different
cell-type specificity than the naturally occurring elements. Here, the
naturally occurring
sequences are deleted and new sequences are added. In all cases, the
identification of the
targeting event may be facilitated by the use of one or more selectable marker
genes that are
contiguous with the targeting DNA, allowing for the selection of cells in
which the
exogenous DNA has integrated into the host cell genome. The identification of
the targeting
event may also be facilitated by the use of one or more marker genes
exhibiting the property
of negative selection, such that the negatively selectable marker is linked to
the exogenous
DNA, but configured such that the negatively selectable marker flanks the
targeting
sequence, and such that a correct homologous recombination event with
sequences in the
host cell genome does not result in the stable integration of the negatively
selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine
kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance
with this aspect of the invention are more particularly described in U.S.
Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated
by
reference herein in its entirety.
4.6 POLYPEPTIDES OF THE INVENTION
The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ
ID NO: 1042-
2082, or 2535-2986 or an amino acid sequence encoded by any one of the
nucleotide
sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or
mature
protein. Polypeptides of the invention also include polypeptides preferably
with biological or
immunological activity that are encoded by: (a) a polynucleotide having any
one of the
nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b)
polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082,
or 2535-2986
or (c) polynucleotides that hybridize to the complement of the
polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also
provides biologically
active or immunologically active variants of any of the amino acid sequences
set forth as
SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature
protein;
and "substantial equivalents" thereof (e.g., with at least about 65%, at least
about 70%, at
least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%,
at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more
typically at least
about 98%, or most typically at least about 99% amino acid identity) that
retain biological
activity. Polypeptides encoded by allelic variants may have a similar,
increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or
2535-2986.
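For illustration, the amino acid identity thresholds recited above can be
evaluated on a pair of pre-aligned sequences. The following is a minimal sketch
under stated assumptions: the two sequences are already aligned to equal length
with "-" marking gaps, identity is counted over all alignment columns other
than gap-gap columns, and the function names and example sequences are
hypothetical. Other conventions for the denominator exist and give different
percentages.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, skipping columns that are all gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1  # identical residues (a gap never matches a residue)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_substantial_equivalent(aligned_a, aligned_b, threshold=65.0):
    """True when identity meets the chosen threshold (65% is the lowest recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments, for illustration only.
reference = "MKTLLVAG-A"
variant = "MKSLLVAGQA"
print(percent_identity(reference, variant))
print(is_substantial_equivalent(reference, variant, threshold=95.0))
```

Identity alone does not establish that a variant retains biological activity;
the threshold is only one of the two conditions stated above.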
Fragments of the proteins of the present invention which are capable of
exhibiting
biological activity are also encompassed by the present invention. Fragments
of the protein
may be in linear form or they may be cyclized using known methods, for
example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in
R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier
molecules such as
immunoglobulins for many purposes, including increasing the valency of protein
binding
sites. Fragments are also identified in Tables 3, 5, 6, and 8.
The present invention also provides both full-length and mature forms (for
example,
without a signal sequence or precursor sequence) of the disclosed proteins.
The protein
coding sequence is identified in the sequence listing by translation of the
disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6.
The mature
form of such protein may be obtained and confirmed by expression of a full-
length
polynucleotide in a suitable mammalian cell or other host cell and sequencing
of the cleaved
product. One of skill in the art will recognize that the actual cleavage site
may be different
than that predicted in Table 6. The sequence of the mature form of the protein
is also
determinable from the amino acid sequence of the full-length form.
Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also
provided. In
such forms, part or all of the regions causing the proteins to be membrane
bound are deleted
so that the proteins are fully secreted from the cell in which they are
expressed.
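As a sketch of how the mature form follows from the full-length form, the
predicted signal sequence can be removed from the amino terminus, and
neighbouring cleavage positions can be listed because the actual cleavage site
may differ from the predicted one. The function names, the window size, and the
example precursor are hypothetical and are not taken from Table 6.

```python
def mature_form(full_length, signal_length):
    """Return the sequence left after removing an N-terminal signal sequence."""
    if not 0 <= signal_length < len(full_length):
        raise ValueError("signal length must be shorter than the protein")
    return full_length[signal_length:]


def candidate_mature_forms(full_length, predicted_signal_length, window=2):
    """Map each offset from the predicted cleavage site to a candidate mature form.

    A negative offset means cleavage earlier than predicted, a positive
    offset means cleavage later than predicted.
    """
    candidates = {}
    for offset in range(-window, window + 1):
        cut = predicted_signal_length + offset
        if 0 <= cut < len(full_length):
            candidates[offset] = mature_form(full_length, cut)
    return candidates


# Hypothetical precursor: a 15-residue signal sequence followed by six residues.
precursor = "MKLLAVALLLSGAQA" + "DTPEKW"
print(mature_form(precursor, 15))
print(candidate_mature_forms(precursor, 15, window=1))
```

Which candidate is correct would be settled experimentally, for example by
sequencing the cleaved product as described above.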
Protein compositions of the present invention may further comprise an
acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The present invention further provides isolated polypeptides encoded by the
nucleic
acid fragments of the present invention or by degenerate variants of the
nucleic acid
fragments of the present invention. By "degenerate variant" is intended
nucleotide
fragments which differ from a nucleic acid fragment of the present invention
(e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode
an identical
polypeptide sequence. Preferred nucleic acid fragments of the present
invention are the
ORFs that encode proteins.
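The definition of a degenerate variant can be illustrated with a short sketch
that translates two ORFs and compares the results. It assumes the standard
genetic code, ORFs supplied in frame on the coding strand, and translation that
stops at the first stop codon. The function names and the two example ORFs are
hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame ORF, stopping at the first stop codon."""
    seq = orf.upper().replace("U", "T")
    residues = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


def is_degenerate_variant(orf_a, orf_b):
    """True when the nucleotide sequences differ but encode the same polypeptide."""
    a = orf_a.upper().replace("U", "T")
    b = orf_b.upper().replace("U", "T")
    return a != b and translate(a) == translate(b)


# Hypothetical ORFs differing at three third-codon positions.
orf_a = "ATGGCTTGGAAATAA"
orf_b = "ATGGCCTGGAAGTGA"
print(translate(orf_a), translate(orf_b))
print(is_degenerate_variant(orf_a, orf_b))
```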
A variety of methodologies known in the art can be utilized to obtain any one
of the
isolated polypeptides or proteins of the present invention. At the simplest
level, the amino
acid sequence can be synthesized using commercially available peptide
synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary,
secondary or
tertiary structural and/or conformational characteristics with proteins may
possess biological
properties in common therewith, including protein activity. This technique is
particularly
useful in producing small peptides and fragments of larger polypeptides.
Fragments are
useful, for example, in generating antibodies against the native polypeptide.
Thus, they may
be employed as biologically active or immunological substitutes for natural,
purified
proteins in screening of therapeutic compounds and in immunological processes
for the
development of antibodies.
The polypeptides and proteins of the present invention can alternatively be
purified
from cells which have been altered to express the desired polypeptide or
protein. As used
herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell,
through genetic manipulation, is made to produce a polypeptide or protein
which it normally
does not produce or which the cell normally produces at a lower level. One
skilled in the art
can readily adapt procedures for introducing and expressing either recombinant
or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell
which produces one
of the polypeptides or proteins of the present invention.
The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium,
and purifying
the protein from the cells or the culture in which the cells are grown. For
example, the
methods of the invention include a process for producing a polypeptide in
which a host cell
containing a suitable expression vector that includes a polynucleotide of the
invention is
cultured under conditions that allow expression of the encoded polypeptide.
The
polypeptide can be recovered from the culture, conveniently from the culture
medium, or
from a lysate prepared from the host cells and further purified. Preferred
embodiments
include those in which the protein produced by such process is a full length
or mature form
of the protein.
In an alternative method, the polypeptide or protein is purified from
bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can
readily follow
known methods for isolating polypeptides and proteins in order to obtain one
of the isolated
polypeptides or proteins of the present invention. These include, but are not
limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et
al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular
Biology. Polypeptide fragments that retain biological/immunological activity
include
fragments comprising greater than about 100 amino acids, or greater than about
200 amino
acids, and fragments that encode specific protein domains.
The purified polypeptides can be used in in vitro binding assays which are
well
known in the art to identify molecules which bind to the polypeptides. These
molecules
include, but are not limited to, small molecules, molecules from
combinatorial
libraries, antibodies or other proteins. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
In addition, the peptides of the invention or molecules capable of binding to
the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other
compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a
tumor or other
cell by the specificity of the binding molecule for SEQ ID NO: 1042-2082, or
2535-2986.
The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or
sheep which are
characterized by somatic or germ cells containing a nucleotide sequence
encoding the
protein.
The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications
are naturally
provided or deliberately engineered. For example, modifications, in the
peptide or DNA
sequence, can be made by those skilled in the art using known techniques.
Modifications of
interest in the protein sequences may include the alteration, substitution,
replacement,
insertion or deletion of a selected amino acid residue in the coding sequence.
For example,
one or more of the cysteine residues may be deleted or replaced with another
amino acid to
alter the conformation of the molecule. Techniques for such alteration,
substitution,
replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or
deletion retains the desired activity of the protein. Regions of the protein
that are important
for the protein function can be determined by various methods known in the
art including the
alanine-scanning method which involves systematic substitution of single or
strings of
amino acids with alanine, followed by testing the resulting alanine-containing
variant for
biological activity. This type of analysis determines the importance of the
substituted amino
acid(s) in biological activity. Regions of the protein that are important for
protein function
may be determined by the eMATRIX program.
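The enumeration step of the alanine-scanning method described above can be sketched in Python as follows. This is only an illustration of how the variant sequences are generated; the function name alanine_scan, the window parameter, and the example peptide are hypothetical, and testing each variant for biological activity remains an experimental step that code cannot replace.

```python
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """List variants in which `window` consecutive residues are replaced by alanine.

    Returns (1-based start position, variant sequence) pairs. Windows that
    already consist only of alanine are skipped because the variant would be
    identical to the starting sequence.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    variants = []
    for start in range(len(protein) - window + 1):
        if set(protein[start:start + window]) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Hypothetical four-residue peptide, scanned one residue at a time.
for position, variant in alanine_scan("MKAC"):
    print(position, variant)
```

Setting window to a value greater than 1 corresponds to substituting strings of amino acids rather than single residues.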
Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for
screening or other
immunological methodologies may also be easily made by those skilled in the
art given the
disclosures herein. Such modifications are encompassed by the present
invention.
The protein may also be produced by operably linking the isolated
polynucleotide of
the invention to suitable control sequences in one or more insect expression
vectors, and
employing an insect expression system. Materials and methods for
baculovirus/insect cell
expression systems are commercially available in kit form from, e.g.,
Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art,
as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No.
1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of
expressing a
polynucleotide of the present invention is "transformed."
The protein of the invention may be prepared by culturing transformed host
cells
under culture conditions suitable to express the recombinant protein. The
resulting
expressed protein may then be purified from such culture (i.e., from
culture medium or cell
extracts) using known purification processes, such as gel filtration and ion
exchange
chromatography. The purification of the protein may also include an affinity
column
containing agents which will bind to the protein; one or more column steps
over such affinity
resins as concanavalin A-agarose, heparin-Toyopearl™ or Cibacron blue 3GA
Sepharose™;
one or more steps involving hydrophobic interaction chromatography using
such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.
Alternatively, the protein of the invention may also be expressed in a form
which will
facilitate purification. For example, it may be expressed as a fusion protein,
such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as
a His tag. Kits for expression and purification of such fusion proteins are
commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway,
N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and
subsequently
purified by using a specific antibody directed to such epitope. One such
epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).
Finally, one or more reverse-phase high performance liquid chromatography (RP-
HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having
pendant
methyl or other aliphatic groups, can be employed to further purify the
protein. Some or all
of the foregoing purification steps, in various combinations, can also be
employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus
purified is
substantially free of other mammalian proteins and is defined in accordance
with the present
invention as an "isolated protein."
The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids has been
deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide
or analog is fused to another moiety or moieties, e.g., targeting moiety or
another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or
stability.
Examples of moieties which may be fused to the polypeptide or an analog
include, for
example, targeting moieties which provide for the delivery of polypeptide to
pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-
cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptor and ligands expressed
on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include
therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such
as
cyclosporin, SK506, azathioprine, CD3 antibodies and steroids. Also,
polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta
interferon.
4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY
Preferred identity and/or similarity are designed to give the largest match
between
the sequences tested. Methods to determine identity and similarity are
codified in computer
programs including, but not limited to, the GCG program package, including
GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA
(Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et
al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix
software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by
reference), eMotif
software (Nevill-Manning et al, ISMB-97, Vol. 4, pp. 202-209, herein
incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322
(1998), herein incorporated by reference) and the Kyte-Doolittle
hydrophobicity prediction
algorithm (J. Mol. Biol., 157, pp. 105-31 (1982), incorporated herein by
reference).
Polypeptide sequences were examined by a proprietary algorithm, SeqLoc, that
separates the
proteins into three sets of locales: intracellular, membrane, or secreted.
This prediction is
based upon three characteristics of each polypeptide, including percentage of
cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein,
and Kyte-
Doolittle scores to calculate the longest hydrophobic stretch of the said
protein. Values of
predicted proteins are compared against the values from a set of 592 proteins
of known
cellular localization from the Swissprot database
(http://www.expasy.ch/sprot). Predictions
are based upon the maximum likelihood estimation.
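The three sequence characteristics named above can be computed along the following lines. This Python sketch is not the SeqLoc algorithm, which is proprietary and not disclosed here; it only illustrates the inputs. The hydropathy values are the published Kyte-Doolittle scale. The use of a mean score over the first 20 residues, the definition of a hydrophobic stretch as a run of residues with a score above zero, and all function names are assumptions made for illustration. The maximum likelihood comparison against reference proteins is not reproduced.

```python
# Kyte-Doolittle hydropathy scale (J. Mol. Biol. 157:105-31 (1982)).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def cysteine_percent(protein: str) -> float:
    """Percentage of residues that are cysteine."""
    return 100.0 * protein.count("C") / len(protein)


def n_terminal_hydropathy(protein: str, n: int = 20) -> float:
    """Mean Kyte-Doolittle score over the first n residues (assumed summary)."""
    window = protein[:n]
    return sum(KYTE_DOOLITTLE[r] for r in window) / len(window)


def longest_hydrophobic_stretch(protein: str, threshold: float = 0.0) -> int:
    """Length of the longest run of residues scoring above threshold (assumed definition)."""
    best = run = 0
    for residue in protein:
        run = run + 1 if KYTE_DOOLITTLE[residue] > threshold else 0
        best = max(best, run)
    return best


# Hypothetical ten-residue sequence used only to exercise the functions.
example = "MLLLAVCKDE"
features = (
    cysteine_percent(example),
    n_terminal_hydropathy(example),
    longest_hydrophobic_stretch(example),
)
print(features)
```

A classifier would then compare such feature values with those of proteins of known localization; how that comparison is parameterized is not specified in the text.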
The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul,
S., et al.
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-
410
(1990)).
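To make the idea of percent identity between two sequences concrete, a minimal Python sketch of a global alignment in the Needleman-Wunsch style follows. It is not the GAP or BLAST implementation; the scoring values (match +1, mismatch -1, linear gap -1), the function names, and the choice of alignment length (gaps included) as the denominator for identity are assumptions made for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences with a simple linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical aligned positions as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


aligned = global_align("ACGT", "ACT")
print(aligned, percent_identity(*aligned))
```

Production tools use substitution matrices and affine gap penalties, so identity values from this sketch will not necessarily match those reported by the programs cited above.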
4.7 CHIMERIC AND FUSION PROTEINS
The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric
protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the
invention can
correspond to all or a portion of a protein according to the invention. In one
embodiment, a
fusion protein comprises at least one biologically active portion of a protein
according to the
invention. In another embodiment, a fusion protein comprises at least two
biologically
active portions of a protein according to the invention. Within the fusion
protein, the term
"operatively linked" is intended to indicate that the polypeptide according to
the invention
and the other polypeptide are fused in-frame to each other. The polypeptide
can be fused to
the N-terminus or C-terminus, or to the middle.
For example, in one embodiment a fusion protein comprises a polypeptide
according
to the invention operably linked to the extracellular domain of a second
protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST
(i.e.,
glutathione S-transferase) sequences.
In another embodiment, the fusion protein is an immunoglobulin fusion protein
in
which the polypeptide sequences according to the invention comprise one or
more domains
fused to sequences derived from a member of the immunoglobulin protein family.
The
immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical
compositions and administered to a subject to inhibit an interaction between a
ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal
transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the
bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful
therapeutically for
both the treatment of proliferative and differentiative disorders, e.g.,
cancer as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the
immunoglobulin
fusion proteins of the invention can be used as immunogens to produce
antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that
inhibit the
interaction of a polypeptide of the invention with a ligand.
A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the
different
polypeptide sequences are ligated together in-frame in accordance with
conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for
ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive
ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by
conventional
techniques including automated DNA synthesizers. Alternatively, PCR
amplification of
gene fragments can be carried out using anchor primers that give rise to
complementary
overhangs between two consecutive gene fragments that can subsequently be
annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et
al. (eds.)
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a
fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be
cloned into such an expression vector such that the fusion moiety is linked in-
frame to the
protein of the invention.
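The in-frame requirement for joining two coding sequences, as described above, can be checked computationally before any cloning is attempted. The Python sketch below is illustrative only: the two short ORFs are hypothetical stand-ins for a fusion moiety and a polypeptide of the invention, and the function names are not drawn from any library.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_orf: str, downstream_orf: str, linker: str = "") -> str:
    """Join two ORFs (and an optional linker) so the reading frame is preserved.

    A terminal stop codon on the upstream ORF is removed so that translation
    continues into the downstream ORF.
    """
    parts = (("upstream", upstream_orf), ("downstream", downstream_orf), ("linker", linker))
    for name, seq in parts:
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3; the frame would shift")
    if upstream_orf[-3:] in STOP_CODONS:
        upstream_orf = upstream_orf[:-3]
    return upstream_orf + linker + downstream_orf


def has_internal_stop(orf: str) -> bool:
    """True if any codon other than the last one is a stop codon."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 3, 3)]
    return any(codon in STOP_CODONS for codon in codons)


# Hypothetical sequences: a short tag ORF and a short insert ORF.
tag_orf = "ATGTCTCCTTAA"
insert_orf = "ATGGCTAAGTGA"
fusion = fuse_in_frame(tag_orf, insert_orf)
print(fusion, has_internal_stop(fusion))
```

Passing the ORFs in the opposite order models a fusion at the other terminus; a linker, if used, must also be a multiple of three nucleotides long.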
4.8 GENE THERAPY
Mutations in the polynucleotides of the invention may result in loss of
normal
function of the encoded protein. The invention thus provides gene therapy to
restore normal
activity of the polypeptides of the invention; or to treat disease states
involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the
invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors,
and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a
retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical
treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998).
For
additional reviews of gene therapy technology see Friedmann, Science, 244:
1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-
460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene
encoding the
polypeptides of the present invention can also be accomplished with
extrachromosomal
substrates (transient expression) or artificial chromosomes (stable
expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention
in order to
proliferate or to produce a desired effect on or activity in such cells.
Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is
contemplated that in other
human disease states, preventing the expression of or inhibiting the activity
of polypeptides
of the invention will be useful in treating the disease states. It is
contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression
of
polypeptides of the invention.
Other methods inhibiting expression of a protein include the introduction of
antisense
molecules to the nucleic acids of the present invention, their complements, or
their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the
present
invention can be inhibited by using targeted deletion methods, or the
insertion of a negative
regulatory element such as a silencer, which is tissue specific.
The present invention still further provides cells genetically engineered in
vivo to
express the polynucleotides of the invention, wherein such polynucleotides are
in operative
association with a regulatory sequence heterologous to the host cell which
drives expression of
the polynucleotides in the cell. These methods can be used to increase or
decrease the
expression of the polynucleotides of the present invention.
Knowledge of DNA sequences provided by the invention allows for modification
of
cells to permit, increase, or decrease, expression of endogenous polypeptide.
Cells can be
modified (e.g., by homologous recombination) to provide increased
polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or
part of a heterologous
promoter so that the cells express the protein at higher levels. The
heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein
encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT
International
Publication No. WO 92/20808, and PCT International Publication No. WO
91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate
synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be
inserted along with
the heterologous promoter DNA. If linked to the desired protein coding
sequence,
amplification of the marker DNA by standard selection methods results in co-amplification of
the desired protein coding sequences in the cells.
In another embodiment of the present invention, cells and tissues may be
engineered to
express an endogenous gene comprising the polynucleotides of the invention
under the control
of inducible regulatory elements, in which case the regulatory sequences of
the endogenous
gene may be replaced by homologous recombination. As described herein, gene
targeting can
be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic
engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-
attachment
regions, negative regulatory elements, transcriptional initiation sites,
regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which
affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or
otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA
stability
elements, splice sites, leader sequences for enhancing or modifying transport
or secretion
properties of the protein, or other sequences which alter or improve the
function or stability of
protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different cell-type
specificity than the naturally occurring elements. Here, the naturally
occurring sequences are
deleted and new sequences are added. In all cases, the identification of the
targeting event may
be facilitated by the use of one or more selectable marker genes that are
contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA
has integrated
into the cell genome. The identification of the targeting event may also be
facilitated by the use
of one or more marker genes exhibiting the property of negative selection,
such that the
negatively selectable marker is linked to the exogenous DNA, but configured
such that the
negatively selectable marker flanks the targeting sequence, and such that a
correct homologous
recombination event with sequences in the host cell genome does not result in
the stable
integration of the negatively selectable marker. Markers useful for this
purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance with
this aspect of the invention are more particularly described in U.S. Patent
No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by
reference herein in its entirety.
4.9 TRANSGENIC ANIMALS
In preferred methods to determine biological functions of the polypeptides of
the
invention in vivo, one or more genes provided by the invention are either over
expressed or
inactivated in the germ line of animals using homologous recombination
[Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the
regulatory
control of exogenous or endogenous promoter elements, are known as transgenic
animals.
Animals in which an endogenous gene has been inactivated by homologous
recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human
mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein
by reference.
Transgenic animals are useful to determine the roles polypeptides of the
invention play in
biological processes, and preferably in disease states. Transgenic animals are
useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic
animals,
preferably non-human mammals, are produced using methods as described in U.S.
Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by
reference.
Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter
the level of
expression of the polypeptides of the invention. Inactivation can be carried
out using
homologous recombination methods described above. Activation can be achieved
by
supplementing or even replacing the homologous promoter to provide for
increased protein
expression. The homologous promoter can be supplemented by insertion of one or
more
heterologous enhancer elements known to confer promoter activation in a
particular tissue.
The polynucleotides of the present invention also make possible the
development,
through, e.g., homologous recombination or knock out strategies, of animals
that fail to
express polypeptides of the invention or that express a variant polypeptide.
Such animals are
useful as models for studying the in vivo activities of polypeptide as well as
for studying
modulators of the polypeptides of the invention.
4.10 USES AND BIOLOGICAL ACTIVITY
The polynucleotides and proteins of the present invention are expected to
exhibit one
or more of the uses or biological activities (including those associated with
assays cited
herein) identified herein. Uses or activities described for proteins of the
present invention
may be provided by administration or use of such proteins or of
polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for
introduction of
DNA). The mechanism underlying the particular condition or pathology will
dictate whether
the polypeptides of the invention, the polynucleotides of the invention or
modulators
(activators or inhibitors) thereof would be beneficial to the subject in need
of treatment.
Thus, "therapeutic compositions of the invention" include compositions
comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and
degenerate
variants thereof) or polypeptides of the invention (including full length
protein, mature
protein and truncations or domains thereof), or compounds and other substances
that
modulate the overall activity of the target gene products, either at the level
of target
gene/protein expression or target protein activity. Such modulators include
polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and
other binding
proteins; chemical compounds that directly or indirectly activate or inhibit
the polypeptides
of the invention (identified, e.g., via drug screening assays as described
herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and
in particular
antibodies or other binding partners that specifically recognize one or more
epitopes of the
polypeptides of the invention.
The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.
4.10.1 RESEARCH USES AND UTILITIES
The polynucleotides provided by the present invention can be used by the
research
community for various purposes. The polynucleotides can be used to express
recombinant
protein for analysis, characterization or therapeutic use; as markers for
tissues in which the
corresponding protein is preferentially expressed (either constitutively or at
a particular stage
of tissue differentiation or development or in disease states); as molecular
weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or
to map
related gene positions; to compare with endogenous DNA sequences in patients
to identify
potential genetic disorders; as probes to hybridize and thus discover novel,
related DNA
sequences; as a source of information to derive PCR primers for genetic
fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other
novel
polynucleotides; for selecting and making oligomers for attachment to a "gene
chip" or other
support, including for examination of expression patterns; to raise anti-
protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA
antibodies or
elicit another immune response. Where the polynucleotide encodes a protein
which binds or
potentially binds to another protein (such as, for example, in a receptor-
ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for
example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the
binding interaction.
The polypeptides provided by the present invention can similarly be used in
assays to
determine biological activity, including in a panel of multiple proteins for
high-throughput
screening; to raise antibodies or to elicit another immune response; as a
reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of
the protein (or
its receptor) in biological fluids; as markers for tissues in which the
corresponding
polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue
differentiation or development or in a disease state); and, of course, to
isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also
be used to
screen for peptide or small molecule inhibitors or agonists of the binding
interaction.
Any or all of these research utilities are capable of being developed into
reagent
grade or kit format for commercialization as research products.
Methods for performing the uses listed above are well known to those skilled
in the
art. References disclosing such methods include without limitation "Molecular
Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J.,
E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to
Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.
4.10.2 NUTRITIONAL USES
Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use
as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and
use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention
can be added to
the feed of a particular organism or can be administered as a separate solid
or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or
capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be
added to the
medium in or on which the microorganism is cultured.
4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY
A polypeptide of the present invention may exhibit activity relating to
cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either
inducing or
inhibiting) activity or may induce production of other cytokines in certain
cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Many protein factors discovered to date, including all known cytokines, have
exhibited
activity in one or more factor-dependent cell proliferation assays, and
hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions
of the present invention is evidenced by any one of a number of routine factor
dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2,
DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can
be used in
the following:
Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al.,
Cellular Immunology
133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J.
Immunol. 152:1756-1761, 1994.
Assays for cytokine production and/or proliferation of spleen cells, lymph
node cells
or thymocytes include, without limitation, those described in: Polyclonal T
cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E.
e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and
lymphopoietic cells
include, without limitation, those described in: Measurement of Human and
Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E.
In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John
Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau
et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938,
1983; Measurement of mouse and human interleukin 6--Nordan, R. In Current
Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons,
Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of
human
Interleukin 11--Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In
Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons,
Toronto. 1991;
Measurement of mouse and human Interleukin 9--Ciarletta, A., Giannotti, J.,
Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1
pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among
others,
proteins that affect APC-T cell interactions as well as direct T-cell effects
by measuring
proliferation and cytokine production) include, without limitation, those
described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6,
Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et
al., Proc.
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun.
11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512,
1988.
4.10.4 STEM CELL GROWTH FACTOR ACTIVITY
A polypeptide of the present invention may exhibit stem cell growth factor
activity
and be involved in the proliferation, differentiation and survival of
pluripotent and totipotent
stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells
and/or germ line stem cells. Administration of the polypeptide of the
invention to stem cells
in vivo or ex vivo is expected to maintain and expand cell populations in a
totipotential or
pluripotential state which would be useful for re-engineering damaged or
diseased tissues,
transplantation, manufacture of bio-pharmaceuticals and the development of
bio-sensors.
The ability to produce large quantities of human cells has important working
applications for
the production of human proteins which currently must be obtained from
non-human sources
or donors, implantation of cells to treat diseases such as Parkinson's,
Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin,
cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea,
neural cells,
gastrointestinal cells and others; and organs for transplantation such as
kidney, liver,
pancreas (including islet cells), heart and lung.
It is contemplated that multiple different exogenous growth factors and/or
cytokines
may be administered in combination with the polypeptide of the invention to
achieve the
desired effect, including any of the growth factors listed herein, other stem
cell maintenance
factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF),
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6
receptor fused to
IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF,
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor
(PDGF),
neural growth factors and basic fibroblast growth factor (bFGF).
Since totipotent stem cells can give rise to virtually any mature cell type,
expansion
of these cells in culture will facilitate the production of large quantities
of mature cells.
Techniques for culturing stem cells are known in the art and administration of
polypeptides
of the invention, optionally with other growth factors and/or cytokines, is
expected to
enhance the survival and proliferation of the stem cell populations. This can
be
accomplished by direct administration of the polypeptide of the invention to
the culture
medium. Alternatively, stroma cells transfected with a polynucleotide that
encodes for the
polypeptide of the invention can be used as a feeder layer for the stem cell
populations in
culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured
embryonic
fibroblasts (see U.S. Patent No. 5,690,926).
Stem cells themselves can be transfected with a polynucleotide of the
invention to
induce autocrine expression of the polypeptide of the invention. This will
allow for
generation of undifferentiated totipotential/pluripotential stem cell lines
that are useful as is
or that can then be differentiated into the desired mature cell types. These
stable cell lines
can also serve as a source of undifferentiated totipotential/pluripotential
mRNA to create
cDNA libraries and templates for polymerase chain reaction experiments. These
studies
would allow for the isolation and identification of differentially expressed
genes in stem cell
populations that regulate stem cell proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful
in the
treatment of many pathological conditions. For example, polypeptides of the
present
invention may be used to manipulate stem cells in culture to give rise to
neuroepithelial cells
that can be used to augment or replace cells damaged by illness, autoimmune
disease,
accidental damage or genetic disorders. The polypeptide of the invention may
be useful for
inducing the proliferation of neural cells and for the regeneration of nerve
and brain tissue,
i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as
well as mechanical and traumatic disorders which involve degeneration, death
or trauma to
neural cells or nerve tissue. In addition, the expanded stem cell populations
can also be
genetically altered for gene therapy purposes and to decrease host rejection
of replacement
tissues after grafting or implantation.
Expression of the polypeptide of the invention and its effect on stem cells
can also be
manipulated to achieve controlled differentiation of the stem cells into more
differentiated
cell types. A broadly applicable method of obtaining pure populations of a
specific
differentiated cell type from undifferentiated stem cell populations involves
the use of a
cell-type specific promoter driving a selectable marker. The selectable marker
allows only cells
of the desired type to survive. For example, stem cells can be induced to
differentiate into
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et
al., J. Clin.
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In:
Principles of
Tissue Engineering eds. Lanza et al., Academic Press (1997)). Alternatively,
directed
differentiation of stem cells can be accomplished by culturing the stem cells
in the presence
of a differentiation factor such as retinoic acid and an antagonist of the
polypeptide of the
invention which would inhibit the effects of endogenous stem cell factor
activity and allow
differentiation to proceed.
In vitro cultures of stem cells can be used to determine if the polypeptide of
the
invention exhibits stem cell growth factor activity. Stem cells are isolated
from any one of
various cell sources (including hematopoietic stem cells and embryonic stem
cells) and
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad.
Sci, U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention
alone or in
combination with other growth factors or cytokines. The ability of the
polypeptide of the
invention to induce stem cell proliferation is determined by colony formation
on semi-solid
support e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
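By way of illustration only, the colony formation readout described above could be summarized as a fold change in colony number between cultures grown with and without the polypeptide. The following is a minimal sketch under stated assumptions: the function name, the plate counts, and the choice of a simple ratio of means are all hypothetical and are not taken from the cited references.

```python
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Ratio of mean colony count in treated plates to mean count in control plates.

    Both arguments are lists of colonies counted per plate (hypothetical data).
    """
    if not treated_counts or not control_counts:
        raise ValueError("each group needs at least one plate count")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control mean is zero; fold change is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical colony counts per plate on semi-solid support.
treated = [42, 38, 46]   # cultured with the polypeptide
control = [20, 22, 21]   # cultured without the polypeptide

fold = colony_fold_change(treated, control)
print(f"fold change in colony number: {fold:.2f}")
```

A ratio above 1 would be consistent with induction of proliferation in this sketch; a real analysis would also need replicate statistics and an appropriate significance test.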
4.10.5 HEMATOPOIESIS REGULATING ACTIVITY
A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell
disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent
cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth
and proliferation of erythroid progenitor cells alone or in combination with
other cytokines,
thereby indicating utility, for example, in treating various anemias or for
use in conjunction
with irradiation/chemotherapy to stimulate the production of erythroid
precursors and/or
erythroid cells; in supporting the growth and proliferation of myeloid cells
such as
granulocytes and monocytes/macrophages (i.e., traditional CSF activity)
useful, for example,
in conjunction with chemotherapy to prevent or treat consequent
myelo-suppression; in
supporting the growth and proliferation of megakaryocytes and consequently of
platelets
thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to
platelet
transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells
which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and
therefore find therapeutic utility in various stem cell disorders (such as
those usually treated
with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell compartment post
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with
bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous
or
heterologous)) as normal cells or genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic
lines are
cited above.
Assays for embryonic stem cell differentiation (which will identify, among
others,
proteins that influence embryonic differentiation hematopoiesis) include,
without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995;
Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915,
1993.
Assays for stem cell survival and differentiation (which will identify, among
others,
proteins that regulate lympho-hematopoiesis) include, without limitation,
those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y.
1994;
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive
hematopoietic
colony forming cells with high proliferative potential, McNiece, I. K. and
Briddell, R. A. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39,
Wiley-Liss, Inc.,
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y.
1994; Long term
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter,
M. and Allen,
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp.
163-179, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term culture initiating cell assay,
Sutherland, H. J. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162,
Wiley-Liss, Inc.,
New York, N.Y. 1994.
4.10.6 TISSUE GROWTH ACTIVITY
A polypeptide of the present invention also may be involved in bone,
cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in
wound healing and
tissue repair and replacement, and in healing of burns, incisions and
ulcers.
A polypeptide of the present invention which induces cartilage and/or bone
growth in
circumstances where bone is not normally formed, has application in the
healing of bone
fractures and cartilage damage or defects in humans and other animals.
Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention
may have
prophylactic use in closed as well as open fracture reduction and also in
the improved
fixation of artificial joints. De novo bone formation induced by an osteogenic
agent
contributes to the repair of congenital, trauma induced, or oncologic
resection induced
craniofacial defects, and also is useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting
bone-forming
cells, stimulating growth of bone-forming cells, or inducing
differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone
degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage
repair or by
blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast
activity, etc.) mediated by inflammatory processes may also be possible using
the
composition of the invention.
Another category of tissue regeneration activity that may involve the
polypeptide of
the present invention is tendon/ligament formation. Induction of
tendon/ligament-like tissue
or other tissue formation in circumstances where such tissue is not normally
formed, has
application in the healing of tendon or ligament tears, deformities and other
tendon or
ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in
preventing
damage to tendon or ligament tissue, as well as use in the improved fixation
of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or
ligament tissue. De
novo tendon/ligament-like tissue formation induced by a composition of the
present
invention contributes to the repair of congenital, trauma induced, or other
tendon or ligament
defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair
of tendons or ligaments. The compositions of the present invention may provide
an environment to attract tendon- or ligament-forming cells, stimulate growth of
tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to
effect tissue repair. The compositions of the invention may also be useful in
the treatment of
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions
may also include an appropriate matrix and/or sequestering agent as a carrier
as is well
known in the art.
The compositions of the present invention may also be useful for proliferation
of
neural cells and for regeneration of nerve and brain tissue, i.e. for the
treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical
and
traumatic disorders, which involve degeneration, death or trauma to neural
cells or nerve
tissue. More specifically, a composition may be used in the treatment of
diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral
neuropathy and
localized neuropathies, and central nervous system diseases, such as
Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager
syndrome. Further conditions which may be treated in accordance with the
present invention
include mechanical and traumatic disorders, such as spinal cord disorders,
head trauma and
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from
chemotherapy or other medical therapies may also be treatable using a
composition of the
invention.
Compositions of the invention may also be useful to promote better or faster
closure
of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with
vascular insufficiency, surgical and traumatic wounds, and the like.
Compositions of the present invention may also be involved in the generation
or
regeneration of other tissues, such as organs (including, for example,
pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac)
and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells
comprising
such tissues. Part of the desired effects may be by inhibition or modulation
of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present
invention may
also exhibit angiogenic activity.
A composition of the present invention may also be useful for gut protection
or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in
various tissues, and
conditions resulting from systemic cytokine damage.
A composition of the present invention may also be useful for promoting or
inhibiting differentiation of tissues described above from precursor tissues
or cells; or for
inhibiting the growth of tissues described above.
Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those
described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No.
WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described
in:
Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T.,
eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and
Mertz, J. Invest.
Dermatol 71:382-84 (1978).
4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY
A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for
which assays are
described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting
such activities. A protein may be useful in the treatment of various immune
deficiencies and
disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or
down) growth and proliferation of T and/or B lymphocytes, as well as effecting
the cytolytic
activity of NK cells and other cell populations. These immune deficiencies
may be genetic or
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or
may result from
autoimmune disorders. More specifically, infectious diseases caused by viral,
bacterial,
fungal or other infection may be treatable using a protein of the present
invention, including
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania
spp., malaria
spp. and various fungal infections such as candidiasis. Of course, in this
regard, proteins of
the present invention may also be useful where a boost to the immune system
generally may
be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present
invention
include, for example, connective tissue disease, multiple sclerosis, systemic
lupus
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such a
protein (or
antagonists thereof, including antibodies) of the present invention may also
be useful in
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum
sickness, drug
reactions, food allergies, insect venom allergies, mastocytosis, allergic
rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic
conjunctivitis,
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary
conjunctivitis and
contact allergies), such as asthma (particularly allergic asthma) or other
respiratory
problems. Other conditions, in which immune suppression is desired (including,
for
example, organ transplantation), may also be treatable using a protein (or
antagonists
thereof) of the present invention. The therapeutic effects of the polypeptides
or antagonists
thereof on allergic reactions can be evaluated by in vivo animal models such
as the
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin
sensitization test
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay
(Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune
responses, in a number of ways. Down regulation may be in the form of
inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of
an immune response. The functions of activated T cells may be inhibited by
suppressing T
cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T
cell responses is generally an active, non-antigen-specific, process which
requires continuous
exposure of the T cells to the suppressive agent. Tolerance, which involves
inducing
non-responsiveness or anergy in T cells, is distinguishable from
immunosuppression in that
it is generally antigen-specific and persists after exposure to the tolerizing
agent has ceased.
Operationally, tolerance can be demonstrated by the lack of a T cell response
upon
reexposure to specific antigen in the absence of the tolerizing agent.
Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g.,
preventing high
level lymphokine synthesis by activated T cells, will be useful in situations
of tissue, skin
and organ transplantation and in graft-versus-host disease (GVHD). For
example, blockage
of T cell function should result in reduced tissue destruction in tissue
transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition
as foreign by T cells, followed by an immune reaction that destroys the
transplant. The
administration of a therapeutic composition of the invention may prevent
cytokine synthesis
by immune cells, such as T cells, and thus acts as an immunosuppressant.
Moreover, a lack
of costimulation may also be sufficient to anergize the T cells, thereby
inducing tolerance in
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking
reagents may
avoid the necessity of repeated administration of these blocking reagents. To
achieve
sufficient immunosuppression or tolerance in a subject, it may also be
necessary to block the
function of a combination of B lymphocyte antigens.
The efficacy of particular therapeutic compositions in preventing organ
transplant
rejection or GVHD can be assessed using animal models that are predictive of
efficacy in
humans. Examples of appropriate systems which can be used include allogeneic
cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of
which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in
vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al.,
Proc. Natl. Acad.
Sci USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to
determine the effect of therapeutic compositions of the invention on the
development of that
disease.
Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation
of T cells that are reactive against self tissue and which promote the
production of cytokines
and autoantibodies involved in the pathology of the diseases. Preventing the
activation of
autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents
which block stimulation of T cells can be used to inhibit T cell activation
and prevent
production of autoantibodies or T cell-derived cytokines which may be involved
in the
disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of
autoreactive T cells which could lead to long-term relief from the disease.
The efficacy of
blocking reagents in preventing or alleviating autoimmune disorders can be
determined
using a number of well-characterized animal models of human autoimmune
diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989,
pp.
840-856).
Upregulation of an antigen function (e.g., a B lymphocyte antigen function),
as a
means of up regulating immune responses, may also be useful in therapy.
Upregulation of
immune responses may be in the form of enhancing an existing immune response
or eliciting
an initial immune response. For example, enhancing an immune response may
be useful in
cases of viral infection, including systemic viral diseases such as influenza,
the common
cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected
patient by
removing T cells from the patient, costimulating the T cells in vitro with
viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with
a stimulatory
form of a soluble peptide of the present invention and reintroducing the in
vitro activated T
cells into the patient. Another method of enhancing anti-viral immune
responses would be to
isolate infected cells from a patient, transfect them with a nucleic acid
encoding a protein of
the present invention as described herein such that the cells express all or a
portion of the
protein on their surface, and reintroduce the transfected cells into the
patient. The infected
cells would now be capable of delivering a costimulatory signal to, and
thereby activate, T
cells in vivo.
A polypeptide of the present invention may provide the necessary stimulation
signal
to T cells to induce a T cell mediated immune response against the transfected
tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or
which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be
transfected
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain
truncated portion)
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC
class II
alpha chain protein and an MHC class II beta chain protein to thereby express
MHC class I
or MHC class II proteins on the cell surface. Expression of the appropriate
class I or class II
MHC in conjunction with a peptide having the activity of a B lymphocyte
antigen (e.g.,
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the
transfected tumor
cell. Optionally, a gene encoding an antisense construct which blocks
expression of an MHC
class II associated protein, such as the invariant chain, can also be
cotransfected with a DNA
encoding a peptide having the activity of a B lymphocyte antigen to promote
presentation of
tumor associated antigens and induce tumor specific immunity. Thus, the
induction of a T
cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured
by
the following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function
3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad.
Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et
al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al.,
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et
al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092,
1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent
antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those
described in:
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function:
In vitro
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J.
E. Coligan et al. eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto.
1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without
limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function
3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol.
149:3778-3783,
1992.
Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without
limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al.,
Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of
Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al.,
Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and
Inaba et al.,
Journal of Experimental Medicine 172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that
regulate
lymphocyte homeostasis) include, without limitation, those described in:
Darzynkiewicz et
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk,
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897,
1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and
development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.
4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide
exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the
release of follicle
stimulating hormone (FSH), while activins are characterized by their
ability to stimulate
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the
present
invention, alone or in heterodimers with a member of the inhibin family, may
be useful as a
contraceptive based on the ability of inhibins to decrease fertility in female
mammals and
decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other
inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the
invention, as a homodimer or as a heterodimer with other protein subunits of
the inhibin
group, may be useful as a fertility inducing therapeutic, based upon the
ability of activin
molecules to stimulate FSH release from cells of the anterior pituitary.
See, for example,
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for
advancement
of the onset of fertility in sexually immature mammals, so as to increase the
lifetime
reproductive performance of domestic animals such as, but not limited to,
cows, sheep and
pigs.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods.
Assays for activin/inhibin activity include, without limitation, those
described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782,
1986; Vale et
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage
et al., Proc.
Natl. Acad. Sci. USA 83:3091-3095, 1986.
4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY
A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts,
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A
polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a
desired cell population to a desired site of action. Chemotactic or
chemokinetic compositions
(e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular
advantages in treatment of wounds and other trauma to tissues, as well as in
treatment of
localized infections. For example, attraction of lymphocytes, monocytes or
neutrophils to
tumors or sites of infection may result in improved immune responses against
the tumor or
infecting agent.
A protein or peptide has chemotactic activity for a particular cell population
if it can
stimulate, directly or indirectly, the directed orientation or movement of
such cell
population. Preferably, the protein or peptide has the ability to directly
stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a
population of
cells can be readily determined by employing such protein or peptide in any
known assay for
cell chemotaxis.
Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or
prevent
chemotaxis) consist of assays that measure the ability of a protein to induce
the migration of
cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell
population to another cell population. Suitable assays for movement and
adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by
J. E.
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub.
Greene
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of
alpha and beta
Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995;
Lind et al.
APMIS 103:140-146, 1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber
et al. J. of
Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768,
1994.
4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY
A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such
attributes. Compositions may be useful in treatment of various coagulation
disorders
(including hereditary disorders, such as hemophilias) or to enhance
coagulation and other
hemostatic events in treating wounds resulting from trauma, surgery or other
causes. A
composition of the invention may also be useful for dissolving or inhibiting
formation of
thromboses and for treatment and prevention of conditions resulting therefrom
(such as, for
example, infarction of cardiac and central nervous system vessels (e.g.,
stroke)).
Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation,
those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et
al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins 35:467-474, 1988.
4.10.11 CANCER DIAGNOSIS AND THERAPY
Polypeptides of the invention may be involved in cancer cell generation,
proliferation
or metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more
types of cancer.
For example, the presence or increased expression of a
polynucleotide/polypeptide of the
invention may indicate a hereditary risk of cancer, a precancerous condition,
or an ongoing
malignancy. Conversely, a defect in the gene or absence of the polypeptide
may be
associated with a cancer condition. Identification of single nucleotide
polymorphisms
associated with cancer or a predisposition to cancer may also be useful for
diagnosis or
prognosis.
Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor
growth) and/or prohibiting metastasis by reducing tumor cell motility or
invasiveness.
Therapeutic compositions of the invention may be effective in adult and
pediatric oncology
including in solid phase tumors/malignancies, locally advanced tumors, human
soft tissue
sarcomas, metastatic cancer, including lymphatic metastases, blood cell
malignancies
including multiple myeloma, acute and chronic leukemias, and lymphomas,
head and neck
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers
including
small cell carcinoma and non-small cell cancers, breast cancers including
small cell
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal
cancer,
stomach cancer, colon cancer, colorectal cancer and polyps associated with
colorectal
neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and
prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine
(including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers
including renal cell carcinoma, brain cancers including intrinsic brain
tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell
invasion in the central
nervous system, bone cancers including osteomas, skin cancers including
malignant
melanoma, tumor progression of human skin keratinocytes, squamous cell
carcinoma, basal
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.
Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the
polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can
be
administered in therapeutically effective dosages alone or in combination with
adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and
laser
therapy, and may provide a beneficial effect, e.g. reducing tumor size,
slowing rate of tumor
growth, inhibiting metastasis, or otherwise improving overall clinical
condition, without
necessarily eradicating the cancer.
The composition can also be administered in therapeutically effective amounts
as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of
the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically acceptable carrier for delivery. The use of anti-cancer
cocktails as a cancer
treatment is routine. Anti-cancer drugs that are well known in the art and
can be used as a
treatment in combination with the polypeptide or modulator of the invention
include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin,
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine
HCl
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl,
Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine,
5-Fluorouracil (5-FU),
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a,
Interferon
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine,
Mechlorethamine
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX),
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin,
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine
sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone,
Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.
In addition, therapeutic compositions of the invention may be used for
prophylactic
treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to
developing
cancers. Under these circumstances, it may be beneficial to treat these
individuals with
therapeutically effective doses of the polypeptide of the invention to reduce
the risk of
developing cancers.
In vitro models can be used to determine the effective doses of the
polypeptide of the
invention as a potential cancer treatment. These in vitro models include
proliferation assays
of cultured tumor cells, growth of cultured tumor cells in soft agar (see
Freshney, (1987)
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY
Ch 18
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst.,
52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden
Chamber assays
as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and
angiogenesis
assays such as induction of vascularization of the chick chorioallantoic
membrane or
induction of vascular endothelial cell migration as described in Ribatta et
al., Intl. J. Dev.
Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9
(1999), respectively.
Suitable tumor cell lines are available, e.g. from American Type Tissue
Culture Collection
catalogs.
4.10.12 RECEPTOR/LIGAND ACTIVITY
A polypeptide of the present invention may also demonstrate activity as
receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A
polynucleotide of
the invention can encode a polypeptide exhibiting such characteristics.
Examples of such
receptors and ligands include, without limitation, cytokine receptors and
their ligands,
receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors
involved in cell-cell interactions and their ligands (including without
limitation, cellular
adhesion molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs
involved in antigen presentation, antigen recognition and development of
cellular and
humoral immune responses). Receptors and ligands are also useful for screening
of potential
peptide or small molecule inhibitors of the relevant receptor/ligand
interaction. A protein of
the present invention (including, without limitation, fragments of receptors
and ligands) may
itself be useful as an inhibitor of receptor/ligand interactions.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods:
Suitable assays for receptor-ligand activity include without limitation those
described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-
Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160,
1989;
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell
80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor
for a
ligand(s) thereby transmitting the biological activity of that ligand(s).
Ligands may be
identified through binding assays, affinity chromatography, dihybrid screening
assays,
BIAcore assays, gel overlay assays, or other methods known in the art.
Studies characterizing drugs or proteins as agonists, antagonists, partial
agonists or
partial antagonists require the use of other proteins as competing ligands.
The polypeptides
of the present invention or ligand(s) thereof may be labeled by being coupled
to
radioisotopes, colorimetric molecules or toxin molecules by conventional
methods.
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in
Enzymology Vol. 182
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but
are not
limited to, tritium and carbon-14. Examples of colorimetric molecules
include, but are not
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other
colorimetric
molecules. Examples of toxins include, but are not limited to, ricin.
4.10.13 DRUG SCREENING
This invention is particularly useful for screening chemical compounds by
using the
novel polypeptides or binding fragments thereof in any of a variety of drug
screening
techniques. The polypeptides or fragments employed in such a test may either
be free in
solution, affixed to a solid support, borne on a cell surface or located
intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which
are stably
transformed with recombinant nucleic acids expressing the polypeptide or a
fragment
thereof. Drugs are screened against such transformed cells in competitive
binding assays.
Such cells, either in viable or fixed form, can be used for standard binding
assays. One may
measure, for example, the formation of complexes between polypeptides of the
invention or
fragments and the agent being tested or examine the diminution in complex
formation
between the novel polypeptides and an appropriate cell line, which are well
known in the art.
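By way of non-limiting illustration, the diminution in complex formation caused by a test agent may be expressed as a percent inhibition relative to a control measured without the agent. The following Python sketch assumes a generic binding signal (for example, bound counts or fluorescence units). The function names, the background-subtraction step and the example readings are hypothetical and are not taken from this specification.

```python
def percent_inhibition(signal_with_agent, signal_control, signal_background=0.0):
    """Percent reduction in specific binding caused by a test agent.

    All three arguments are raw assay readings in the same (arbitrary) units.
    Background subtraction is an assumption of this sketch.
    """
    specific_control = signal_control - signal_background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_agent = signal_with_agent - signal_background
    return 100.0 * (1.0 - specific_agent / specific_control)


def rank_agents(readings, signal_control, signal_background=0.0):
    """Sort hypothetical test agents from strongest to weakest inhibition."""
    scored = {
        name: percent_inhibition(value, signal_control, signal_background)
        for name, value in readings.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical readings for two test agents against a control of 1000 units.
    example = {"agent_A": 550.0, "agent_B": 910.0}
    for name, inhibition in rank_agents(example, 1000.0, 100.0):
        print(f"{name}: {inhibition:.1f}% inhibition")
```

An agent that leaves the signal at the control level scores 0%, and one that reduces it to background scores 100%. Any threshold for calling an agent a "hit" would have to be chosen for the particular assay.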
Sources for test compounds that may be screened for ability to bind to or
modulate
(i.e., increase or decrease) the activity of polypeptides of the invention
include (1) inorganic
and organic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic
molecules.
Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or
compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including
bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and
libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from
soil, plant or
marine microorganisms or (2) extraction of the organisms themselves. Natural
product
libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants
thereof. For a review, see Science 282:63-68 (1998).
Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides
or organic compounds and can be readily prepared by traditional automated
synthesis
methods, PCR, cloning or proprietary synthetic methods. Of particular interest
are peptide
and oligonucleotide combinatorial libraries. Still other libraries of interest
include peptide,
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial,
and polypeptide
libraries. For a review of combinatorial chemistry and libraries created
therefrom, see
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of
peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23
(1998); Hruby
et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med
Chem,
4(5):709-15 (1996) (alkylated dipeptides).
Identification of modulators through use of the various libraries described
herein
permits modification of the candidate "hit" (or "lead") to optimize the
capacity of the "hit"
to bind a polypeptide of the invention. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g.,
ricin or
cholera, or with other compounds that are toxic to cells such as
radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by
the specificity of
the binding molecule for a polypeptide of the invention. Alternatively, the
binding
molecules may be complexed with imaging agents for targeting and imaging
purposes.
4.10.14 ASSAY FOR RECEPTOR ACTIVITY
The invention also provides methods to detect specific binding of a
polypeptide e.g. a
ligand or a receptor. The art provides numerous assays particularly useful for
identifying
previously unknown binding partners for receptor polypeptides of the
invention. For
example, expression cloning using mammalian or bacterial cells, or dihybrid
screening
assays can be used to identify polynucleotides encoding binding partners. As
another
example, affinity chromatography with the appropriate immobilized polypeptide
of the
invention can be used to isolate polypeptides that recognize and bind
polypeptides of the
invention. There are a number of different libraries used for the
identification of
compounds, and in particular small molecules, that modulate (i.e., increase
or decrease)
biological activity of a polypeptide of the invention. Ligands for receptor
polypeptides of the
invention can also be identified by adding exogenous ligands, or cocktails of
ligands to two
cell populations that are genetically identical except for the expression of
the receptor of the
invention: one cell population expresses the receptor of the invention whereas
the other does
not. The responses of the two cell populations to the addition of
ligand(s) are then
compared. Alternatively, an expression library can be co-expressed with the
polypeptide of
the invention in cells and assayed for an autocrine response to identify
potential ligand(s). As
still another example, BIAcore assays, gel overlay assays, or other methods
known in the art
can be used to identify binding partner polypeptides, including, (1) organic
and inorganic
chemical libraries, (2) natural product libraries, and (3) combinatorial
libraries comprised of
random peptides, oligonucleotides or organic molecules.
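By way of non-limiting illustration, the comparison of two cell populations that differ only in expression of the receptor may be sketched as a fold-change screen. The following Python sketch assumes that a single numeric response has been measured for each ligand in each population. The function name, the two-fold cutoff and the example values are hypothetical and are not prescribed by this specification.

```python
def receptor_dependent_ligands(with_receptor, without_receptor, min_fold=2.0):
    """Return (ligand, fold) pairs whose response is at least min_fold times
    higher in receptor-expressing cells than in otherwise identical cells
    lacking the receptor, sorted from highest to lowest fold change.

    Ligands with no usable control measurement are skipped.
    """
    hits = []
    for ligand, response in with_receptor.items():
        baseline = without_receptor.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # cannot compute a fold change without a positive control value
        fold = response / baseline
        if fold >= min_fold:
            hits.append((ligand, fold))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical responses (arbitrary units) for three candidate ligands.
    expressing = {"ligand_1": 10.0, "ligand_2": 3.0, "ligand_3": 8.0}
    control = {"ligand_1": 2.0, "ligand_2": 3.0}
    for ligand, fold in receptor_dependent_ligands(expressing, control):
        print(f"{ligand}: {fold:.1f}-fold over receptor-negative cells")
```

A ligand that produces the same response in both populations yields a fold change of 1 and is not reported, consistent with a response that does not depend on the receptor.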
The role of downstream intracellular signaling molecules in the signaling
cascade of
the polypeptide of the invention can be determined. For example, a chimeric
protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to
the
extracellular portion of a protein, whose ligand has been identified, is
produced in a host
cell. The cell is then incubated with the ligand specific for the
extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream
proteins
involved in intracellular signaling can then be assayed for expected
modifications, e.g.,
phosphorylation. Other methods known to those in the art can also be used to
identify
signaling molecules involved in receptor activity.
4.10.15 ANTI-INFLAMMATORY ACTIVITY
Compositions of the present invention may also exhibit anti-inflammatory
activity.
The anti-inflammatory activity may be achieved by providing a stimulus to
cells involved in
the inflammatory response, by inhibiting or promoting cell-cell
interactions (such as, for
example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the
inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or
suppressing production of other factors which more directly inhibit or promote
an
inflammatory response. Compositions with such activities can be used to treat
inflammatory
conditions (including chronic or acute conditions), including without
limitation inflammation
associated with infection (such as septic shock, sepsis or systemic
inflammatory response
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis,
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung
injury, inflammatory bowel disease, Crohn's disease or resulting from over
production of
cytokines such as TNF or IL-1. Compositions of the invention may also be
useful to treat
anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this
invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis,
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus
host disease, inflammatory bowel disease, inflammation associated with
pulmonary disease,
other autoimmune disease or inflammatory disease, or as an antiproliferative agent
such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor
secondary to
intrauterine infections.
4.10.16 LEUKEMIAS
Leukemias and related disorders may be treated or prevented by administration
of a
therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of
the invention. Such leukemias and related disorders include but are not
limited to acute
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia,
chronic
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a
review of such
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co.,
Philadelphia).
4.10.17 NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested for
efficacy of
intervention with compounds that modulate the activity of the polynucleotides
and/or
polypeptides of the invention, and which can be treated upon thus observing an
indication of
therapeutic utility, include but are not limited to nervous system injuries,
and diseases or
disorders which result in either a disconnection of axons, a diminution or
degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a
patient
(including human and non-human mammalian patients) according to the invention
include
but are not limited to the following lesions of either the central (including
spinal cord, brain)
or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or
associated
with surgery, for example, lesions which sever a portion of the nervous
system, or
compression injuries;
(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system
results in neuronal injury or death, including cerebral infarction or
ischemia, or spinal cord
infarction or ischemia;
(iii) infectious lesions, in which a portion of the nervous system is
destroyed or
injured as a result of infection, for example, by an abscess or associated
with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with
Lyme
disease, tuberculosis, syphilis;
(iv) degenerative lesions, in which a portion of the nervous system is
destroyed or
injured as a result of a degenerative process including but not limited to
degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea,
or
amyotrophic lateral sclerosis;
(v) lesions associated with nutritional diseases or disorders, in which a
portion of
the nervous system is destroyed or injured by a nutritional disorder or
disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid
deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease
(primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;
(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus
erythematosus,
carcinoma, or sarcoidosis;
(vii) lesions caused by toxic substances including alcohol, lead, or
particular
neurotoxins; and
(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or
injured by a demyelinating disease including but not limited to multiple
sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various
etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.
Therapeutics which are useful according to the invention for treatment of a
nervous
system disorder may be selected by testing for biological activity in
promoting the survival

CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
68
or differentiation of neurons. For example, and not by way of limitation,
therapeutics which
elicit any of the following effects may be useful according to the invention:
(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in
vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor
neurons; or
(iv) decreased symptoms of neuron dysfunction in vivo.
Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased
sprouting of neurons
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay,
antibody
binding, Northern blot assay, etc., depending on the molecule to be measured;
and motor
neuron dysfunction may be measured by assessing the physical manifestation
of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or
functional disability.
In specific embodiments, motor neuron disorders that may be treated according
to the
invention include but are not limited to disorders such as infarction,
infection, exposure to
toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor
neurons as well as other components of the nervous system, as well as
disorders that
selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited
to progressive spinal muscular atrophy, progressive bulbar palsy, primary
lateral sclerosis,
infantile and juvenile muscular atrophy, progressive bulbar paralysis of
childhood (Fazio-
Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary
Motorsensory
Neuropathy (Charcot-Marie-Tooth Disease).
4.10.18 OTHER ACTIVITIES
A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function
of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and
other parasites;
affecting (suppressing or enhancing) bodily characteristics, including,
without limitation,
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or
organ or body part size or shape (such as, for example, breast augmentation or
diminution,
change in bone form or shape); affecting biorhythms or circadian cycles or
rhythms;
affecting the fertility of male or female subjects; affecting the metabolism,
catabolism,
anabolism, processing, utilization, storage or elimination of dietary fat,
lipid, protein,
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or
component(s);
affecting behavioral characteristics, including, without limitation, appetite,
libido, stress,
cognition (including cognitive disorders), depression (including depressive
disorders) and
violent behaviors; providing analgesic effects or other pain reducing effects;
promoting
differentiation and growth of embryonic stem cells in lineages other than
hematopoietic
lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of
the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative
disorders (such as, for example, psoriasis); immunoglobulin-like activity
(such as, for
example, the ability to bind antigens or complement); and the ability to act
as an antigen in a
vaccine composition to raise an immune response against such protein or
another material or
entity which is cross-reactive with such protein.
4.10.19 IDENTIFICATION OF POLYMORPHISMS
The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this
information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g.,
differential
predisposition or susceptibility to various disease states (such as disorders
involving
inflammation or immune response) or a differential response to drug
administration, and this
genetic information can be used to tailor preventive or therapeutic treatment
appropriately.
For example, the existence of a polymorphism associated with a predisposition
to
inflammation or autoimmune disease makes possible the diagnosis of this
condition in
humans by identifying the presence of the polymorphism.
Polymorphisms can be identified in a variety of ways known in the art which
all
generally involve obtaining a sample from a patient, analyzing DNA from the
sample,
optionally involving isolation or amplification of the DNA, and identifying
the presence of
the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA
may be
subjected to allele-specific oligonucleotide hybridization (in which
appropriate
oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single
base mismatch) or to a single nucleotide extension assay (in which an
oligonucleotide that
hybridizes immediately adjacent to the position of the polymorphism is
extended with one or
more labeled nucleotides). In addition, traditional restriction fragment
length polymorphism
analysis (using restriction enzymes that provide differential digestion of the
genomic DNA
depending on the presence or absence of the polymorphism) may be performed.
Arrays with
nucleotide sequences of the present invention can be used to detect
polymorphisms. The
array can comprise modified nucleotide sequences of the present invention in
order to detect
the nucleotide sequences of the present invention. In the alternative, any one
of the
nucleotide sequences of the present invention can be placed on the array to
detect changes
from those sequences.
Alternatively, a polymorphism resulting in a change in the amino acid
sequence could
also be detected by detecting a corresponding change in amino acid sequence of
the protein,
e.g., by an antibody specific to the variant sequence.
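The restriction fragment length polymorphism analysis mentioned above can be illustrated with a minimal sketch. It is not part of the disclosed method: the function names, the example sequences, and the choice of the EcoRI recognition sequence (GAATTC, cut after the first G) are assumptions made only for illustration. The sketch digests a reference and a sample sequence in silico and reports whether the fragment lengths differ, which is the signal an RFLP assay reads out.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the position within the site where the strand is cut
    (1 for an enzyme that cuts G^AATTC). Only one strand is modeled.
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def differs_by_rflp(reference: str, sample: str, site: str = "GAATTC",
                    cut_offset: int = 1) -> bool:
    """True if the two sequences give different fragment-length patterns."""
    return digest(reference, site, cut_offset) != digest(sample, site, cut_offset)


# Hypothetical sequences: the variant carries a single base change in the site.
reference = "AAAAGAATTCTTTT"
variant = "AAAAGAGTTCTTTT"
print(digest(reference))                    # fragment lengths with the site intact
print(digest(variant))                      # site abolished, one uncut fragment
print(differs_by_rflp(reference, variant))
```

A real assay would also consider both strands, partial digestion, and gel resolution; the sketch only shows why a polymorphism that creates or abolishes a recognition site changes the fragment pattern.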
4.10.20 ARTHRITIS AND INFLAMMATION
The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the
protocol is described
by J. Holoshitz, et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch.
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single
injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in
complete Freund's adjuvant (CFA). The route of injection can vary, but rats
may be injected
at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate
buffered saline (PBS) at a dose of about 1-5 mg/kg. The control consists of
administering
PBS only.
The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately
administering the test compound and subsequent treatment every other day until
day 24. At
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an
overall arthritis
score may be obtained as described by J. Holoshitz above. An analysis of the
data would
reveal that the test compound would have a dramatic effect on the swelling
of the joints as
measured by a decrease of the arthritis score.
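The schedule described above can be laid out as a short sketch. It assumes that day 0 is the day of adjuvant injection and that "every other day" means a two-day interval; the function names and the per-animal score layout are hypothetical and are not taken from the cited protocols. The second function computes, for each scoring day, the mean arthritis score of the PBS control group minus that of the treated group, so a positive value corresponds to a decrease in score under treatment.

```python
# Scoring days after injection, as listed in the protocol above.
SCORING_DAYS = (14, 15, 18, 20, 22, 24)


def treatment_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days on which the test compound is given (assumes day 0 = induction)."""
    return list(range(0, last_day + 1, interval))


def mean_score_reduction(control: dict[int, list[float]],
                         treated: dict[int, list[float]]) -> dict[int, float]:
    """Mean control score minus mean treated score for each scoring day.

    Both arguments map a scoring day to the list of per-animal scores.
    """
    reduction = {}
    for day in SCORING_DAYS:
        control_mean = sum(control[day]) / len(control[day])
        treated_mean = sum(treated[day]) / len(treated[day])
        reduction[day] = control_mean - treated_mean
    return reduction


print(treatment_days())
```

Any statistical comparison between groups would be layered on top of these per-day means and is left out of the sketch.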
4.11 THERAPEUTIC METHODS
The compositions (including polypeptide fragments, analogs, variants and
antibodies
or other binding partners or modulators including antisense polynucleotides)
of the invention
have numerous applications in a variety of therapeutic methods. Examples of
therapeutic
applications include, but are not limited to, those exemplified herein.
4.11.1 EXAMPLE
One embodiment of the invention is the administration of an effective amount
of the
polypeptides or other composition of the invention to individuals affected by
a disease or
disorder that can be modulated by regulating the peptides of the invention.
While the mode
of administration is not particularly important, parenteral administration is
preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The
dosage of the
polypeptides or other composition of the invention will normally be determined
by the
prescribing physician. It is to be expected that the dosage will vary
according to the age,
weight, condition and response of the individual patient. Typically, the
amount of
polypeptide administered per dose will be in the range of about 0.01 µg/kg to
100 mg/kg of
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of
patient body
weight. For parenteral administration, polypeptides of the invention will be
formulated in an
injectable form combined with a pharmaceutically acceptable parenteral
vehicle. Such
vehicles are well known in the art and examples include water, saline,
Ringer's solution,
dextrose solution, and solutions consisting of small amounts of the human
serum albumin.
The vehicle may contain minor amounts of additives that maintain the
isotonicity and
stability of the polypeptide or other active ingredient. The preparation of
such solutions is
within the skill of the art.
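The weight-based dosing arithmetic described above can be sketched as follows. The ranges are read here as 0.01 µg/kg to 100 mg/kg (typical) and 0.1 µg/kg to 10 mg/kg (preferred); the function names and category labels are hypothetical, and the sketch is an arithmetic illustration only, not dosing guidance.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kilogram of body weight.
TYPICAL_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)


def total_dose_ug(dose_ug_per_kg: float, body_weight_kg: float) -> float:
    """Total amount per dose, in micrograms, for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_ug_per_kg * body_weight_kg


def classify_dose(dose_ug_per_kg: float) -> str:
    """Place a per-kilogram dose relative to the ranges stated in the text."""
    low, high = PREFERRED_RANGE_UG_PER_KG
    if low <= dose_ug_per_kg <= high:
        return "preferred"
    low, high = TYPICAL_RANGE_UG_PER_KG
    if low <= dose_ug_per_kg <= high:
        return "typical"
    return "outside stated range"


# Example: 10 µg/kg for a 70 kg patient.
print(total_dose_ug(10, 70))   # 700.0 (micrograms)
print(classify_dose(10))       # preferred
```

Keeping every quantity in a single unit (micrograms per kilogram) avoids mixing the µg and mg endpoints of the stated ranges.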
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION
A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant
sources and
including antibodies and other binding partners of the polypeptides of the
invention) may be
administered to a patient in need, by itself, or in pharmaceutical
compositions where it is
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a
variety of
disorders. Such a composition may optionally contain (in addition to protein
or other active
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers,
solubilizers, and other
materials well known in the art. The term "pharmaceutically acceptable" means
a non-toxic
material that does not interfere with the effectiveness of the biological
activity of the active
ingredient(s). The characteristics of the carrier will depend on the route of
administration.
The pharmaceutical composition of the invention may also contain cytokines,
lymphokines,
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3,
IL-4, IL-5,
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0,
TNF1, TNF2,
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In
further
compositions, proteins of the invention may be combined with other agents
beneficial to the
treatment of the disease or disorder in question. These agents include various
growth factors
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF),
transforming
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well
as cytokines
described herein.
The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement
its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or
other active
ingredient of the invention, or to minimize side effects. Conversely, protein
or other active
ingredient of the present invention may be included in formulations of the
particular clotting
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or anti-
thrombotic
factor, or anti-inflammatory agent to minimize side effects of the clotting
factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF,
corticosteroids,
immunosuppressive agents). A protein of the present invention may be active in
multimers
(e.g., heterodimers or homodimers) or complexes with itself or other proteins.
As a result,
pharmaceutical compositions of the invention may comprise a protein of the
invention in
such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the
invention
including a first protein, a second protein or a therapeutic agent may be
concurrently
administered with the first protein (e.g., at the same time, or at differing
times provided that
therapeutic concentrations of the combination of agents are achieved at the
treatment site).
Techniques for formulation and administration of the compounds of the instant
application
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co.,
Easton, PA,
latest edition. A therapeutically effective dose further refers to that amount
of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing,
prevention or
amelioration of the relevant medical condition, or an increase in rate of
treatment, healing,
prevention or amelioration of such conditions. When applied to an individual
active
ingredient, administered alone, a therapeutically effective dose refers to
that ingredient
alone. When applied to a combination, a therapeutically effective dose refers
to combined
amounts of the active ingredients that result in the therapeutic effect,
whether administered
in combination, serially or simultaneously.
In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the
present invention
is administered to a mammal having a condition to be treated. Protein or other
active
ingredient of the present invention may be administered in accordance with the
method of
the invention either alone or in combination with other therapies such as
treatments
employing cytokines, lymphokines or other hematopoietic factors. When
co-administered
with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other
active ingredient of the present invention may be administered either
simultaneously with
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or
anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician
will decide on the appropriate sequence of administering protein or other
active ingredient of
the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic
factor(s), thrombolytic or anti-thrombotic factors.
4.12.1 ROUTES OF ADMINISTRATION
Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including
intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct
intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections.
Administration of protein
or other active ingredient of the present invention used in the pharmaceutical
composition or
to practice the method of the present invention can be carried out in a variety
of conventional
ways, such as oral ingestion, inhalation, topical application or cutaneous,
subcutaneous,
intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient
is preferred.
Alternately, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic
joints or in
fibrotic tissue, often in a depot or sustained release formulation. In order
to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds
may be administered topically, for example, as eye drops. Furthermore, one may
administer
the drug in a targeted drug delivery system, for example, in a liposome coated
with a specific
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes
will be targeted
to and taken up selectively by the afflicted tissue.
The polypeptides of the invention are administered by any route that delivers
an
effective dosage to the desired site of action. The determination of a
suitable route of
administration and an effective dosage for a particular indication is within
the level of skill
in the art. Preferably for wound treatment, one administers the therapeutic
compound
directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be
extrapolated from these dosages or from similar studies in appropriate animal
models.
Dosages can then be adjusted as necessary by the clinician to provide maximal
therapeutic
benefit.
4.12.2 COMPOSITIONS/FORMULATIONS
Pharmaceutical compositions for use in accordance with the present invention
thus
may be formulated in a conventional manner using one or more physiologically
acceptable
carriers comprising excipients and auxiliaries which facilitate processing of
the active
compounds into preparations which can be used pharmaceutically. These
pharmaceutical
compositions may be manufactured in a manner that is itself known, e.g., by
means of
conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying,
encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon
the route of administration chosen. When a therapeutically effective amount of
protein or
other active ingredient of the present invention is administered orally,
protein or other active
ingredient of the present invention will be in the form of a tablet, capsule,
powder, solution
or elixir. When administered in tablet form, the pharmaceutical composition of
the invention
may additionally contain a solid carrier such as a gelatin or an adjuvant. The
tablet, capsule,
and powder contain from about 5 to 95% protein or other active ingredient of
the present
invention, and preferably from about 25 to 90% protein or other active
ingredient of the
present invention. When administered in liquid form, a liquid carrier such as
water,
petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or
sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical
composition may further contain physiological saline solution, dextrose or
other saccharide
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene
glycol. When
administered in liquid form, the pharmaceutical composition contains from
about 0.5 to 90%
by weight of protein or other active ingredient of the present invention, and
preferably from
about 1 to 50% protein or other active ingredient of the present invention.
When a therapeutically effective amount of protein or other active ingredient
of the
present invention is administered by intravenous, cutaneous or subcutaneous
injection,
protein or other active ingredient of the present invention will be in the
form of a
pyrogen-free, parenterally acceptable aqueous solution. The preparation of
such parenterally
acceptable protein or other active ingredient solutions, having due regard
to pH, isotonicity,
stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition
for intravenous, cutaneous, or subcutaneous injection should contain, in
addition to protein
or other active ingredient of the present invention, an isotonic vehicle such
as Sodium
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and
Sodium Chloride
Injection, Lactated Ringer's Injection, or other vehicle as known in the
art. The
pharmaceutical composition of the present invention may also contain
stabilizers,
preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For
injection, the agents of the invention may be formulated in aqueous solutions,
preferably in
physiologically compatible buffers such as Hanks' solution, Ringer's
solution, or
physiological saline buffer. For transmucosal administration, penetrants
appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are
generally known in
the art.
For oral administration, the compounds can be formulated readily by combining
the
active compounds with pharmaceutically acceptable carriers well known in the
art. Such
carriers enable the compounds of the invention to be formulated as tablets,
pills, dragees,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a
patient to be treated. Pharmaceutical preparations for oral use can be
obtained by combining the active compounds with a solid
excipient, optionally grinding a resulting mixture, and processing the mixture
of granules,
after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable
excipients are, in particular, fillers such as sugars, including lactose,
sucrose, mannitol, or
sorbitol; cellulose preparations such as, for example, maize starch, wheat
starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose,
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl
pyrrolidone, agar, or
alginic acid or a salt thereof such as sodium alginate. Dragee cores are
provided with
suitable coatings. For this purpose, concentrated sugar solutions may be used,
which may
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel,
polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for
identification or to
characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules
made
of gelatin, as well as soft, sealed capsules made of gelatin and a
plasticizer, such as glycerol
or sorbitol. The push-fit capsules can contain the active ingredients in
admixture with filler
such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium
stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene
glycols. In addition, stabilizers may be added. All formulations for oral
administration
should be in dosages suitable for such administration. For buccal
administration, the
compositions may take the form of tablets or lozenges formulated in
conventional manner.
For administration by inhalation, the compounds for use according to the
present
invention are conveniently delivered in the form of an aerosol spray
presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane,
carbon dioxide
or other suitable gas. In the case of a pressurized aerosol the dosage unit
may be determined
by providing a valve to deliver a metered amount. Capsules and cartridges of,
e.g., gelatin
for use in an inhaler or insufflator may be formulated containing a powder mix
of the
compound and a suitable powder base such as lactose or starch. The compounds
may be
formulated for parenteral administration by injection, e.g., by bolus
injection or continuous
infusion. Formulations for injection may be presented in unit dosage form,
e.g., in ampules
or in multi-dose containers, with an added preservative. The compositions may
take such
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and
may contain
formulatory agents such as suspending, stabilizing and/or dispersing agents.
Pharmaceutical formulations for parenteral administration include aqueous
solutions
of the active compounds in water-soluble form. Additionally, suspensions of
the active
compounds may be prepared as appropriate oily injection suspensions. Suitable
lipophilic
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty
acid esters, such
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions
may contain
substances which increase the viscosity of the suspension, such as sodium
carboxymethyl
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain
suitable
stabilizers or agents which increase the solubility of the compounds to allow
for the
preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-
free water, before
use.
The compounds may also be formulated in rectal compositions such as
suppositories
or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or
other glycerides. In addition to the formulations described previously, the
compounds may
also be formulated as a depot preparation. Such long acting formulations may
be
administered by implantation (for example subcutaneously or intramuscularly)
or by
intramuscular injection. Thus, for example, the compounds may be formulated
with suitable
polymeric or hydrophobic materials (for example as an emulsion in an
acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a
sparingly soluble salt.
A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-
solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-
miscible organic
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent
system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant
polysorbate
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute
ethanol. The VPD
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in
water
solution. This co-solvent system dissolves hydrophobic compounds well, and
itself produces
low toxicity upon systemic administration. Naturally, the proportions of a co-
solvent system
may be varied considerably without destroying its solubility and toxicity
characteristics.
Furthermore, the identity of the co-solvent components may be varied: for
example, other
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the
fraction size of
polyethylene glycol may be varied; other biocompatible polymers may replace
polyethylene
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may
substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds
may be employed. Liposomes and emulsions are well known examples of delivery
vehicles
or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also
may be employed, although usually at the cost of greater toxicity.
Additionally, the
compounds may be delivered using a sustained-release system, such as
semipermeable
matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of
sustained-release materials have been established and are well known by those
skilled in the
art. Sustained-release capsules may, depending on their chemical nature,
release the
compounds for a few weeks up to over 100 days. Depending on the chemical
nature and the
biological stability of the therapeutic reagent, additional strategies for
protein or other active
ingredient stabilization may be employed.
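The 1:1 dilution of VPD with 5% dextrose in water described above implies final concentrations that can be worked out with a short sketch. It assumes ideal, additive mixing of volumes (the text does not state this), and the function and variable names are hypothetical.

```python
# Stock compositions from the text, in % w/v.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DILUENT_PERCENT_WV = {"dextrose": 5.0}


def dilute(stock: dict[str, float], diluent: dict[str, float],
           stock_parts: float = 1.0, diluent_parts: float = 1.0) -> dict[str, float]:
    """Final % w/v of each component after mixing, assuming additive volumes."""
    total = stock_parts + diluent_parts
    final = {name: pct * stock_parts / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        final[name] = final.get(name, 0.0) + pct * diluent_parts / total
    return final


for component, percent in dilute(VPD_PERCENT_WV, DILUENT_PERCENT_WV).items():
    print(f"{component}: {percent:.1f}% w/v")
```

Under these assumptions the 1:1 mixture halves every stock concentration: 1.5% benzyl alcohol, 4.0% polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose. Changing `stock_parts` and `diluent_parts` shows how the proportions of the co-solvent system may be varied.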
The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are
not limited to
calcium carbonate, calcium phosphate, various sugars, starches, cellulose
derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active
ingredients of the
invention may be provided as salts with pharmaceutically compatible counter
ions. Such
pharmaceutically acceptable base addition salts are those salts which retain
the biological
effectiveness and properties of the free acids and which are obtained by
reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide,
ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium
acetate,
potassium benzoate, triethanol amine and the like.
The pharmaceutical composition of the invention may be in the form of a
complex of
the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal
to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface
immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor
(TCR)
following presentation of the antigen by MHC proteins. MHC and structurally
related
proteins including those encoded by class I and class II MHC genes on host
cells will serve
to present the peptide antigen(s) to T lymphocytes. The antigen components
could also be
supplied as purified MHC-peptide complexes alone or with co-stimulatory
molecules that
can directly signal T cells. Alternatively antibodies able to bind surface
immunoglobulin
and other molecules on B cells as well as antibodies able to bind the TCR and
other
molecules on T cells can be combined with the pharmaceutical composition of
the invention.
The pharmaceutical composition of the invention may be in the form of a
liposome in
which protein of the present invention is combined, in addition to other
pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in
aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous
solution.
Suitable lipids for liposomal formulation include, without limitation,
monoglycerides,
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like.
Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed,
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of
which are incorporated herein by reference.
The amount of protein or other active ingredient of the present invention in
the
pharmaceutical composition of the present invention will depend upon the
nature and
severity of the condition being treated, and on the nature of prior treatments
which the
patient has undergone. Ultimately, the attending physician will decide the
amount of protein
or other active ingredient of the present invention with which to treat each
individual patient.
Initially, the attending physician will administer low doses of protein or
other active
ingredient of the present invention and observe the patient's response. Larger
doses of
protein or other active ingredient of the present invention may be
administered until the
optimal therapeutic effect is obtained for the patient, and at that point the
dosage is not
increased further. It is contemplated that the various pharmaceutical
compositions used to
practice the method of the present invention should contain about 0.01 µg to
about 100 mg
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to
about 1 mg) of
protein or other active ingredient of the present invention per kg body
weight. For
compositions of the present invention which are useful for bone, cartilage,
tendon or
ligament regeneration, the therapeutic method includes administering the
composition
topically, systemically, or locally as an implant or device. When
administered, the
therapeutic composition for use in this invention is, of course, in a pyrogen-
free,
physiologically acceptable form. Further, the composition may desirably be
encapsulated or
injected in a viscous form for delivery to the site of bone, cartilage or
tissue damage.
Topical administration may be suitable for wound healing and tissue repair.
Therapeutically
useful agents other than a protein or other active ingredient of the invention
which may also
optionally be included in the composition as described above, may
alternatively or
additionally, be administered simultaneously or sequentially with the
composition in the
methods of the invention. Preferably for bone and/or cartilage formation, the
composition
would include a matrix capable of delivering the protein-containing or other
active
ingredient-containing composition to the site of bone and/or cartilage damage,
providing a
structure for the developing bone and cartilage and optimally capable of being
resorbed into
the body. Such matrices may be formed of materials presently in use for other
implanted
medical applications.
The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The
particular
application of the compositions will define the appropriate formulation.
Potential matrices
for the compositions may be biodegradable and chemically defined calcium
sulfate,
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides.
Other potential materials are biodegradable and biologically well-defined,
such as bone or
dermal collagen. Further matrices are comprised of pure proteins or
extracellular matrix
components. Other potential matrices are nonbiodegradable and chemically
defined, such as
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may
be comprised
of combinations of any of the above-mentioned types of material, such as
polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be
altered in
composition, such as in calcium-aluminate-phosphate and processing to alter
pore size,
particle size, particle shape, and biodegradability. Presently preferred is a
50:50 (mole
weight) copolymer of lactic acid and glycolic acid in the form of porous
particles having
diameters ranging from 150 to 800 microns. In some applications, it will be
useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood
clot, to prevent
the protein compositions from disassociating from the matrix.
A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred
being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering
agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol),
polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent
useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation
weight, which
represents the amount necessary to prevent desorption of the protein from the
polymer
matrix and to provide appropriate handling of the composition, yet not so much
that the
progenitor cells are prevented from infiltrating the matrix, thereby providing
the protein the
opportunity to assist the osteogenic activity of the progenitor cells. In
further compositions,
proteins or other active ingredients of the invention may be combined with
other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question.
These agents include various growth factors such as epidermal growth factor
(EGF), platelet-derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β), and
insulin-like growth factor (IGF).
The therapeutic compositions are also presently valuable for veterinary
applications.
Particularly domestic animals and thoroughbred horses, in addition to humans,
are desired
patients for such treatment with proteins or other active ingredients of the
present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be
used in tissue
regeneration will be determined by the attending physician considering various
factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be
formed, the site
of damage, the condition of the damaged tissue, the size of a wound, type of
damaged tissue
(e.g., bone), the patient's age, sex, and diet, the severity of any infection,
time of
administration and other clinical factors. The dosage may vary with the type
of matrix used
in the reconstitution and with inclusion of other proteins in the
pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF-I
(insulin-like growth
factor I), to the final composition, may also affect the dosage. Progress can
be monitored by
periodic assessment of tissue/bone growth and/or repair, for example, X-rays,
histomorphometric determinations and tetracycline labeling.
Polynucleotides of the present invention can also be used for gene therapy.
Such
polynucleotides can be introduced either in vivo or ex vivo into cells for
expression in a
mammalian subject. Polynucleotides of the invention may also be administered
by other
known methods for introduction of nucleic acid into a cell or organism
(including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in
the presence of proteins of the present invention in order to proliferate or
to produce a
desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for
therapeutic purposes.
4.12.3 EFFECTIVE DOSAGE
Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective
amount to achieve
its intended purpose. More specifically, a therapeutically effective amount
means an amount
effective to prevent development of or to alleviate the existing symptoms of
the subject
being treated. Determination of the effective amount is well within the
capability of those
skilled in the art, especially in light of the detailed disclosure provided
herein. For any
compound used in the method of the invention, the therapeutically effective
dose can be

estimated initially from appropriate in vitro assays. For example, a dose can
be formulated in
animal models to achieve a circulating concentration range that can be used to
more
accurately determine useful doses in humans. For example, a dose can be
formulated in
animal models to achieve a circulating concentration range that includes the
IC50 as
determined in cell culture (i.e., the concentration of the test compound
which achieves a
half-maximal inhibition of the protein's biological activity). Such
information can be used
to more accurately determine useful doses in humans.
A therapeutically effective dose refers to that amount of the compound that
results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity
and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical
procedures in
cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50%
of the population) and the ED50 (the dose therapeutically effective in 50% of
the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index
and it can be
expressed as the ratio of LD50 to ED50. Compounds which exhibit high
therapeutic
indices are preferred. The data obtained from these cell culture assays and
animal studies
can be used in formulating a range of dosage for use in humans. The dosage of
such
compounds lies preferably within a range of circulating concentrations that
include the ED50
with little or no toxicity. The dosage may vary within this range depending
upon the dosage
form employed and the route of administration utilized. The exact formulation,
route of
administration and dosage can be chosen by the individual physician in view of
the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of
Therapeutics", Ch.
1 p.1. Dosage amount and interval may be adjusted individually to provide
plasma levels of
the active moiety which are sufficient to maintain the desired effects, or
minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated
from in
vitro data. Dosages necessary to achieve the MEC will depend on individual
characteristics
and route of administration. However, HPLC assays or bioassays can be used to
determine
plasma concentrations.
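
By way of illustration only, the therapeutic index described above (the ratio
of the LD50 to the ED50) can be computed as in the following minimal Python
sketch. The function name and the example dose values are hypothetical and are
not taken from the present disclosure; both doses are assumed to be expressed
in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both arguments must be positive and expressed in the same units
    (for example mg/kg). A larger result indicates a wider margin
    between the therapeutically effective dose and the lethal dose.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("ld50 and ed50 must both be positive")
    return ld50 / ed50


# Hypothetical values chosen only to show the call; not data from the text.
example_index = therapeutic_index(ld50=500.0, ed50=5.0)
print(f"therapeutic index: {example_index:g}")
```

A compound with a higher index under this calculation would, other factors
being equal, be preferred over one with a lower index.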
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for
10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In
cases of local
administration or selective uptake, the effective local concentration of the
drug may not be
related to plasma concentration.
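
As a hedged illustration of the regimen criterion above, the following Python
sketch estimates the fraction of a dosing interval during which the plasma
level remains above the MEC. It assumes a simple one-compartment model with
first-order elimination after a single bolus dose, C(t) = C0 * exp(-k * t);
that model, the function name and all parameter names are assumptions made for
this example and are not part of the present disclosure.

```python
import math


def fraction_above_mec(c0, mec, half_life_h, interval_h):
    """Fraction of the dosing interval with plasma level above the MEC.

    Assumes C(t) = c0 * exp(-k * t) with k = ln(2) / half_life_h, i.e. a
    single bolus dose and first-order elimination (an assumption of this
    sketch). c0 and mec must share the same concentration units.
    """
    if min(c0, mec, half_life_h, interval_h) <= 0:
        raise ValueError("all arguments must be positive")
    if c0 <= mec:
        return 0.0  # the level never exceeds the MEC
    k = math.log(2) / half_life_h       # elimination rate constant (1/h)
    t_above = math.log(c0 / mec) / k    # time at which C(t) falls to the MEC
    return min(1.0, t_above / interval_h)


# Hypothetical inputs: peak of 8 units, MEC of 1 unit, 4 h half-life, 24 h interval.
print(f"{fraction_above_mec(8.0, 1.0, 4.0, 24.0):.0%} of the interval above MEC")
```

Under these assumptions, shortening the dosing interval or raising the peak
level increases the fraction, which can then be compared against the 10-90%,
30-90% and 50-90% ranges stated above.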

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight
daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight
daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at
longer or shorter intervals.
The amount of composition administered will, of course, be dependent on the
subject
being treated, on the subject's age and weight, the severity of the
affliction, the manner of
administration and the judgment of the prescribing physician.
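
Purely as an arithmetic illustration of the exemplary regimen above, the
following Python sketch converts the per-kilogram range into a total daily
amount for a given body weight. The function and parameter names are
hypothetical; the default bounds are the exemplary 0.01 µg/kg and 100 mg/kg
figures stated above, and the result is not a dosing recommendation.

```python
def daily_dose_range_mg(weight_kg, low_ug_per_kg=0.01, high_mg_per_kg=100.0):
    """Return (low, high) total daily amounts in mg for a body weight in kg.

    The lower bound is given in micrograms per kg and the upper bound in
    milligrams per kg, mirroring the mixed units used in the text.
    """
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low_mg = low_ug_per_kg * weight_kg / 1000.0   # convert µg to mg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg


# Hypothetical 70 kg subject, exemplary range and preferred range.
print(daily_dose_range_mg(70.0))
print(daily_dose_range_mg(70.0, low_ug_per_kg=0.1, high_mg_per_kg=25.0))
```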
4.12.4 PACKAGING
The compositions may, if desired, be presented in a pack or dispenser device
which
may contain one or more unit dosage forms containing the active ingredient.
The pack may,
for example, comprise metal or plastic foil, such as a blister pack. The pack
or dispenser
device may be accompanied by instructions for administration. Compositions
comprising a
compound of the invention formulated in a compatible pharmaceutical carrier
may also be
prepared, placed in an appropriate container, and labeled for treatment of an
indicated
condition.
4.13 ANTIBODIES
Also included in the invention are antibodies to proteins, or fragments of
proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin
molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e.,
molecules that
contain an antigen-binding site that specifically binds (immunoreacts with)
an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric,
single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an
antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD,
which differ
from one another by the nature of the heavy chain present in the molecule.
Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in
humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies
includes a
reference to all such classes, subclasses and types of human antibody species.
An isolated related protein of the invention may be intended to serve as an
antigen, or
a portion or fragment thereof, and additionally can be used as an immunogen to
generate
antibodies that immunospecifically bind the antigen, using standard techniques
for

polyclonal and monoclonal antibody preparation. The full-length protein can be
used or,
alternatively, the invention provides antigenic peptide fragments of the
antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid
residues of the
amino acid sequence of the full length protein, such as an amino acid sequence
shown in
SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses
an epitope
thereof such that an antibody raised against the peptide forms a specific
immune complex
with the full length protein or with any fragment that contains the epitope.
Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15
amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid
residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that
are located on
its surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by
the
antigenic peptide is a surface region of the protein, e.g., a hydrophilic
region. A
hydrophobicity analysis of the human related protein sequence will indicate
which regions of
a related protein are particularly hydrophilic and, therefore, are likely to
encode surface
residues useful for targeting antibody production. As a means for targeting
antibody
production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity may be
generated by any method well known in the art, including, for example, the
Kyte Doolittle or
the Hopp Woods methods, either with or without Fourier transformation. See,
e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982,
J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its
entirety.
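
A hydropathy analysis of the kind described above can be sketched, by way of
example only, as a sliding-window average over the published Kyte and
Doolittle hydropathy scale, without Fourier transformation. The function
names, the default window length of 7 residues and the test sequence are
assumptions made for this illustration; the lowest-scoring window is reported
as the most hydrophilic candidate region.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence."""
    seq = sequence.upper()
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    try:
        scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]}") from None
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence.upper()[start:start + window], profile[start]


# Hypothetical sequence used only to exercise the functions.
print(most_hydrophilic_window("MKTLLVAGDEKRSNQPLLIVAF", window=7))
```

Windows with strongly negative mean scores are candidates for surface-exposed,
hydrophilic regions; any such prediction would still need to be confirmed
experimentally.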
Antibodies that are specific for one or more domains within an antigenic
protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
A protein of the invention, or a derivative, fragment, analog, homolog or
ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.
The term "specific for" indicates that the variable regions of the antibodies
of the
invention recognize and bind polypeptides of the invention exclusively (i.e.,
able to
distinguish the polypeptide of the invention from other similar polypeptides
despite sequence
identity, homology, or similarity found in the family of polypeptides), but
may also interact
with other proteins (for example, S. aureus protein A or other antibodies in
ELISA
techniques) through interactions with sequences outside the variable region of
the antibodies,
and in particular, in the constant region of the molecule. Screening assays to
determine

binding specificity of an antibody of the invention are well known and
routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al.
(Eds), Antibodies:
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY
(1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of
the
invention are also contemplated, provided that the antibodies are first and
foremost specific
for, as defined above, full-length polypeptides of the invention. As with
antibodies that are
specific for full length polypeptides of the invention, antibodies of the
invention that
recognize fragments are those which can distinguish polypeptides from the same
family of
polypeptides despite inherent sequence identity, homology, or similarity found
in the family
of proteins.
Antibodies of the invention are useful for, for example, therapeutic purposes
(by
modulating activity of a polypeptide of the invention), diagnostic purposes to
detect or
quantitate a polypeptide of the invention, as well as purification of a
polypeptide of the
invention. Kits comprising an antibody of the invention for any of the
purposes described
herein are also comprehended. In general, a kit of the invention also
includes a control
antigen for which the antibody is immunospecific. The invention further
provides a
hybridoma that produces an antibody according to the invention. Antibodies of
the
invention are useful for detection and/or purification of the polypeptides of
the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing
monoclonal
antibodies binding to the protein may also be useful therapeutics for both
conditions
associated with the protein and also in the treatment of some forms of cancer
where
abnormal expression of the protein is involved. In the case of cancerous cells
or leukemic
cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and
preventing the metastatic spread of the cancerous cells, which may be
mediated by the
protein.
The labeled antibodies of the present invention can be used for in
vitro, in vivo, and
in situ assays to identify cells or tissues in which a fragment of the
polypeptide of interest is
expressed. The antibodies may also be used directly in therapies or other
diagnostics. The
present invention further provides the above-described antibodies
immobilized on a solid
support. Examples of such solid supports include plastics such as
polycarbonate, complex
carbohydrates such as agarose and Sepharose®, acrylic resins such as
polyacrylamide,
and latex beads. Techniques for coupling antibodies to such solid supports are
well known

in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed.,
Blackwell
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et
al., Meth.
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the
present
invention can be used for in vitro, in vivo, and in situ assays as well as
for immuno-affinity
purification of the proteins of the present invention.
Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the
invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for
example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory
Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these
antibodies are
discussed below.
4.13.1 POLYCLONAL ANTIBODIES
For the production of polyclonal antibodies, various suitable host animals
(e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more
injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing.
An appropriate
immunogenic preparation can contain, for example, the naturally occurring
immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be
conjugated
to a second protein known to be immunogenic in the mammal being immunized.
Examples
of such immunogenic proteins include but are not limited to keyhole limpet
hemocyanin,
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The
preparation can
further include an adjuvant. Various adjuvants used to increase the
immunological response
include, but are not limited to, Freund's (complete and incomplete), mineral
gels (e.g.,
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic
polyols,
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in
humans such as
Bacille Calmette-Guerin and Corynebacterium parvum, or similar
immunostimulatory
agents. Additional examples of adjuvants that can be employed include MPL-TDM
adjuvant
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).
The polyclonal antibody molecules directed against the immunogenic protein can
be
isolated from the mammal (e.g., from the blood) and further purified by well
known
techniques, such as affinity chromatography using protein A or protein G,
which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively,
the specific

antigen which is the target of the immunoglobulin sought, or an epitope
thereof, may be
immobilized on a column to purify the immune specific antibody by
immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by
D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA,
Vol. 14, No. 8
(April 17, 2000), pp. 25-28).
4.13.2 MONOCLONAL ANTIBODIES
The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as
used herein, refers to a population of antibody molecules that contain only
one molecular
species of antibody molecule consisting of a unique light chain gene product
and a unique
heavy chain gene product. In particular, the complementarity determining
regions (CDRs)
of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus
contain an antigen-binding site capable of immunoreacting with a particular
epitope of the
antigen characterized by a unique binding affinity for it.
Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma
method, a
mouse, hamster, or other appropriate host animal, is typically immunized with
an
immunizing agent to elicit lymphocytes that produce or are capable of
producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be
immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment
thereof
or a fusion protein thereof. Generally, either peripheral blood lymphocytes
are used if cells
of human origin are desired, or spleen cells or lymph node cells are used if
non-human
mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press,
(1986) pp. 59-
103). Immortalized cell lines are usually transformed mammalian cells,
particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture
medium that
preferably contains one or more substances that inhibit the growth or survival
of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the
hybridomas

typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which
substances prevent the growth of HGPRT-deficient cells.
Preferred immortalized cell lines are those that fuse efficiently, support
stable high
level expression of antibody by the selected antibody-producing cells, and are
sensitive to a
medium such as HAT medium. More preferred immortalized cell lines are murine
myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center,
San Diego, California and the American Type Culture Collection, Manassas,
Virginia.
Human myeloma and mouse-human heteromyeloma cell lines also have been
described for
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001
(1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications,
Marcel
Dekker, Inc., New York, (1987) pp. 51-63).
The culture medium in which the hybridoma cells are cultured can then be
assayed
for the presence of monoclonal antibodies directed against the antigen.
Preferably, the
binding specificity of monoclonal antibodies produced by the hybridoma cells
is determined
by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are
known in
the art. The binding affinity of the monoclonal antibody can, for example, be
determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980).
Preferably,
antibodies having a high degree of specificity and a high binding affinity for
the target
antigen are isolated.
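
For illustration, a classical linear Scatchard fit can be sketched in Python
as follows. This is a minimal sketch of the textbook transformation, in which
bound/free is regressed on bound so that the slope equals -Ka and the
x-intercept equals the total binding-site concentration (Bmax); it is not
asserted to be the procedure of the reference cited above. The function name
and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Ka, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free = -Ka * bound + Ka * Bmax by ordinary least squares.
    Ka is returned in reciprocal units of the free concentration and Bmax
    in the units of the bound concentration.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    if any(f <= 0 for f in free):
        raise ValueError("free concentrations must be positive")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    if slope == 0:
        raise ValueError("zero slope: affinity cannot be estimated")
    ka = -slope
    bmax = intercept / ka  # x-intercept of the Scatchard line
    return ka, bmax


# Synthetic single-site data generated with Ka = 2.0 and Bmax = 10.0.
free_conc = [0.1, 0.5, 1.0, 2.0, 5.0]
bound_conc = [10.0 * 2.0 * f / (1.0 + 2.0 * f) for f in free_conc]
print(scatchard_fit(bound_conc, free_conc))
```

The linear transformation is sensitive to measurement error at low occupancy,
so the sketch is best read as an explanation of the relationship rather than
as a recommended analysis method.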
After the desired hybridoma cells are identified, the clones can be subcloned
by
limiting dilution procedures and grown by standard methods. Suitable culture
media for this
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in
a mammal.
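
As a hedged aside on the limiting dilution step mentioned above, the seeding
density is commonly reasoned about with a Poisson model, which is an
assumption of this example and is not stated in the present disclosure. The
following Python sketch estimates, for a given mean number of cells seeded per
well, the fraction of growth-positive wells expected to have arisen from a
single cell. The function name is hypothetical, and every seeded cell is
assumed to grow.

```python
import math


def clonal_fraction(mean_cells_per_well):
    """Expected fraction of occupied wells that received exactly one cell.

    Assumes the number of cells per well follows a Poisson distribution
    with the given mean, so P(1 cell) = m * exp(-m) and
    P(at least 1 cell) = 1 - exp(-m).
    """
    m = mean_cells_per_well
    if m <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    p_single = m * math.exp(-m)
    p_occupied = -math.expm1(-m)  # numerically stable 1 - exp(-m)
    return p_single / p_occupied


# Compare a few hypothetical seeding densities.
for density in (0.3, 1.0, 3.0):
    print(density, round(clonal_fraction(density), 3))
```

Lower seeding densities raise the expected clonal fraction at the cost of more
empty wells, which is the usual trade-off when choosing a dilution.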
The monoclonal antibodies secreted by the subclones can be isolated or
purified from
the culture medium or ascites fluid by conventional immunoglobulin
purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.
The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of
the invention can be readily isolated and sequenced using conventional
procedures (e.g., by
using oligonucleotide probes that are capable of binding specifically to genes
encoding the
heavy and light chains of murine antibodies). The hybridoma cells of the
invention serve as

a preferred source of such DNA. Once isolated, the DNA can be placed into
expression
vectors, which are then transfected into host cells such as simian COS cells,
Chinese hamster
ovary (CHO) cells, or myeloma cells that do not otherwise produce
immunoglobulin protein,
to obtain the synthesis of monoclonal antibodies in the recombinant host
cells. The DNA
also can be modified, for example, by substituting the coding sequence for
human heavy and
light chain constant domains in place of the homologous murine sequences
(U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to
the
immunoglobulin coding sequence all or part of the coding sequence for a non-
immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be
substituted
for the constant domains of an antibody of the invention, or can be
substituted for the
variable domains of one antigen-combining site of an antibody of the invention
to create a
chimeric bivalent antibody.
4.13.3 HUMANIZED ANTIBODIES
The antibodies directed against the protein antigens of the invention can
further
comprise humanized antibodies or human antibodies. These antibodies are
suitable for
administration to humans without engendering an immune response by the human
against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab,
Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are
principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a
non-human immunoglobulin. Humanization can be performed following the method
of
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et
al., Nature,
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by
substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human
antibody. (See
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of
the human
immunoglobulin are replaced by corresponding non-human residues. Humanized
antibodies
can also comprise residues that are found neither in the recipient antibody
nor in the
imported CDR or framework sequences. In general, the humanized antibody will
comprise
substantially all of at least one, and typically two, variable domains, in
which all or
substantially all of the CDR regions correspond to those of a non-human
immunoglobulin
and all or substantially all of the framework regions are those of a human
immunoglobulin
consensus sequence. The humanized antibody optimally also will comprise at
least a portion

of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct.
Biol., 2, 593-596
(1992)).
4.13.4 HUMAN ANTIBODIES
Fully human antibodies relate to antibody molecules in which essentially the
entire
sequences of both the light chain and the heavy chain, including the CDRs,
arise from
human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies"
herein. Human monoclonal antibodies can be prepared by the trioma technique;
the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72)
and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al.,
1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Human
monoclonal antibodies may be utilized in the practice of the present invention
and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci
USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in
vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96).
In addition, human antibodies can also be produced using additional
techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227,
381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can
be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice
in which the
endogenous immunoglobulin genes have been partially or completely inactivated.
Upon
challenge, human antibody production is observed, which closely resembles that
seen in
humans in all respects, including gene rearrangement, assembly, and antibody
repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807;
5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-
783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature
368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger
(Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol.
13, 65-93
(1995)).
Human antibodies may additionally be produced using transgenic nonhuman
animals
that are modified so as to produce fully human antibodies rather than the
animal's
endogenous antibodies in response to challenge by an antigen. (See PCT
publication
WO 94/02602). The endogenous genes encoding the heavy and light immunoglobulin
chains

in the nonhuman host have been incapacitated, and active loci encoding human
heavy and
light chain immunoglobulins are inserted into the host's genome. The human
genes are
incorporated, for example, using yeast artificial chromosomes containing the
requisite
human DNA segments. An animal which provides all the desired modifications is
then
obtained as progeny by crossbreeding intermediate transgenic animals
containing fewer than
the full complement of the modifications. The preferred embodiment of such a
nonhuman
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT
publications WO
96/33735 and WO 96/34096. This animal produces B cells that secrete fully
human
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of
a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal,
such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding
the
immunoglobulins with human variable regions can be recovered and expressed to
obtain the
antibodies directly, or can be further modified to obtain analogs of
antibodies such as, for
example, single chain Fv molecules.
An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in
U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J
segment genes
from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent
rearrangement of the locus and to prevent formation of a transcript of a
rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector
containing a gene encoding a selectable marker; and producing from the
embryonic stem cell
a transgenic mouse whose somatic and germ cells contain the gene encoding the
selectable
marker.
A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression
vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host
cell in
culture, introducing an expression vector containing a nucleotide sequence
encoding a light
chain into another mammalian host cell, and fusing the two cells to form a
hybrid cell. The
hybrid cell expresses an antibody containing the heavy chain and the light
chain.
In a further improvement on this procedure, a method for identifying a
clinically
relevant epitope on an immunogen, and a correlative method for selecting an
antibody that
binds immunospecifically to the relevant epitope with high affinity, are
disclosed in PCT
publication WO 99/53049.
4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES
According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see
e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of
Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid
and effective
identification of monoclonal Fab fragments with the desired specificity for a
protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the
idiotypes to a protein antigen may be produced by techniques known in the art
including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2
fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain
and a reducing
agent; and (iv) Fv fragments.
4.13.6 BISPECIFIC ANTIBODIES
Bispecific antibodies are monoclonal, preferably human or humanized,
antibodies
that have binding specificities for at least two different antigens. In the
present case, one of
the binding specificities is for an antigenic protein of the invention. The
second binding
target is any other antigen, and advantageously is a cell-surface protein or
receptor or
receptor subunit.
Methods for making bispecific antibodies are known in the art. Traditionally,
the
recombinant production of bispecific antibodies is based on the co-expression
of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of
the random
assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas)
produce a potential mixture of ten different antibody molecules, of which only
one has the
correct bispecific structure. The purification of the correct molecule is
usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
Antibody variable domains with the desired binding specificities (antibody-
antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The
fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising
at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-
chain constant
region (CH1) containing the site necessary for light-chain binding present in
at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if
desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and
are co-
transfected into a suitable host organism. For further details of generating
bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210
(1986).
According to another approach described in WO 96/27011, the interface between
a
pair of antibody molecules can be engineered to maximize the percentage of
heterodimers
that are recovered from recombinant cell culture. The preferred interface
comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one
or more small
amino acid side chains from the interface of the first antibody molecule are
replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of
identical or
similar size to the large side chain(s) are created on the interface of the
second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g.
alanine or
threonine). This provides a mechanism for increasing the yield of the
heterodimer over other
unwanted end-products such as homodimers.
Bispecific antibodies can be prepared as full-length antibodies or antibody
fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from
antibody fragments have been described in the literature. For example,
bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985)
describe a
procedure wherein intact antibodies are proteolytically cleaved to generate
F(ab')2
fragments. These fragments are reduced in the presence of the dithiol
complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB)
derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by
reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced
can be used
as agents for the selective immobilization of enzymes.
Additionally, Fab' fragments can be directly recovered from E. coli and
chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225
(1992)
describe the production of a fully humanized bispecific antibody F(ab')2
molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed
chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody
thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T
cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast
tumor targets.
Various techniques for making and isolating bispecific antibody fragments
directly
from recombinant cell culture have also been described. For example,
bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked
to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers
were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody
homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci.
USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific
antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected
to a
light-chain variable domain (VL) by a linker which is too short to allow
pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one
fragment are
forced to pair with the complementary VL and VH domains of another fragment,
thereby
forming two antigen-binding sites. Another strategy for making bispecific
antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported.
See, Gruber et
al., J. Immunol. 152, 5368 (1994).
Antibodies with more than two valencies are contemplated. For example,
trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least
one of
which originates in the protein antigen of the invention. Alternatively, an
anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a
triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the
particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells
which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm
which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or
TETA.
Another bispecific antibody of interest binds the protein antigen described
herein and further
binds tissue factor (TF).
4.13.7 HETEROCONJUGATE ANTIBODIES
Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies.
Such
antibodies have, for example, been proposed to target immune system cells to
unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO
91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared
in vitro using
known methods in synthetic protein chemistry, including those involving
crosslinking
agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose
include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for
example, in U.S.
Patent No. 4,676,980.
4.13.8 EFFECTOR FUNCTION ENGINEERING
It can be desirable to modify the antibody of the invention with respect to
effector
function, so as to enhance, e.g., the effectiveness of the antibody in
treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby
allowing
interchain disulfide bond formation in this region. The homodimeric antibody
thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity
(ADCC). See Caron et
al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922
(1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared
using
heterobifunctional cross-linkers as described in Wolff et al. Cancer Research,
53, 2560-
2565 (1993). Alternatively, an antibody can be engineered that has dual Fc
regions and can
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et
al.,
Anti-Cancer Drug Design, 3, 219-230 (1989).
4.13.9 IMMUNOCONJUGATES
The invention also pertains to immunoconjugates comprising an antibody
conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof),
or a radioactive
isotope (i.e., a radioconjugate).
Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that
can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A
chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin,
Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of
radioconjugated
antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-
pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium
derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine),
diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-
2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-
isothiocyanatobenzyl-3-
methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating
agent for
conjugation of radionuclide to the antibody. See WO 94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-
receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the
circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin)
that is in turn
conjugated to a cytotoxic agent.
4.14 COMPUTER READABLE SEQUENCES
In one application of this embodiment, a nucleotide sequence of the present
invention
can be recorded on computer readable media. As used herein, "computer readable
media"
refers to any medium which can be read and accessed directly by a computer.
Such media
include, but are not limited to: magnetic storage media, such as floppy discs,
hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A skilled artisan can readily appreciate how
any of the
presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide
sequence of the
present invention. As used herein, "recorded" refers to a process for storing
information on
computer readable medium. A skilled artisan can readily adopt any of the
presently known
methods for recording information on computer readable medium to generate
manufactures
comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for
creating a
computer readable medium having recorded thereon a nucleotide sequence of the
present
invention. The choice of the data storage structure will generally be based on
the means
chosen to access the stored information. In addition, a variety of data
processor programs
and formats can be used to store the nucleotide sequence information of the
present
invention on computer readable medium. The sequence information can be
represented in a
word processing text file, formatted in commercially-available software such
as WordPerfect
and Microsoft Word, or represented in the form of an ASCII file, stored in a
database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any
number of data processor structuring formats (e.g. text file or database) in
order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of
the present invention.
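By way of illustration only, the two storage routes named above (an ASCII text file and a database application) can be sketched as follows. This is a minimal sketch under stated assumptions: the record name and residues are hypothetical placeholders and are not taken from the sequence listing, the FASTA layout is one common ASCII convention among many, and Python's built-in SQLite module merely stands in for a database application such as DB2, Sybase, or Oracle.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical placeholder record; NOT an actual sequence from the sequence listing.
RECORDS = {"SEQ_ID_NO_PLACEHOLDER": "ATGGCCAAGCTTGGATCCTAA"}


def write_fasta(records, path, width=60):
    """Write records to an ASCII FASTA file, wrapping sequence lines at `width`."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")


def read_fasta(path):
    """Read an ASCII FASTA file back into a {name: sequence} dictionary."""
    records, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif line and name is not None:
                records[name] += line
    return records


def store_in_database(records, connection):
    """Store records in a relational table keyed by sequence name."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, residues TEXT)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", list(records.items())
    )
    connection.commit()


def fetch_from_database(connection, name):
    """Return the stored residues for `name`, or None if the name is absent."""
    row = connection.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


# Route 1: ASCII text file, written and read back.
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = Path(tmp) / "sequences.fasta"
    write_fasta(RECORDS, fasta_path)
    round_trip = read_fasta(fasta_path)

# Route 2: database table (in-memory SQLite used as a stand-in).
db = sqlite3.connect(":memory:")
store_in_database(RECORDS, db)
```

Consistent with the paragraph above, the choice between the two routes would generally follow from how the stored information is to be accessed: a flat file suits sequential reading, while a keyed table suits retrieval of individual records.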
By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534
or
a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer
readable form, a
skilled artisan can routinely access the sequence information for a variety of
purposes.
Computer software is publicly available which allows a skilled artisan to
access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol.
Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search
algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be
useful in
producing commercially important proteins such as enzymes used in fermentation
reactions
and in the production of commercially useful metabolites.
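For orientation, the identification of open reading frames in a stored nucleotide sequence can be sketched in a few lines. The sketch below is not BLAST or BLAZE and does not reproduce the Sybase-based system referred to above; it assumes the simplest possible definition of an ORF (an ATG codon followed in the same frame by the first stop codon) and scans all six reading frames. The function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, start, end, orf) tuples. `start` and `end` are 0-based
    offsets on the strand that was scanned, `end` is exclusive, and the ORF
    includes its stop codon. `min_codons` counts codons including the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, pos + 3, strand_seq[start:pos + 3]))
                    start = None  # resume looking for the next start codon
    return orfs


# Hypothetical example sequence containing one short ORF on the forward strand.
example_orfs = find_orfs("CCATGGCCAAATAGGG")
```

A production search would additionally compare each candidate ORF against known sequences, which is the role the similarity-search algorithms cited above play.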
As used herein, "a computer-based system" refers to the hardware means,
software
means, and data storage means used to analyze the nucleotide sequence
information of the
present invention. The minimum hardware means of the computer-based systems of
the
present invention comprises a central processing unit (CPU), input means,
output means, and
data storage means. A skilled artisan can readily appreciate that any one of
the currently
available computer-based systems is suitable for use in the present
invention. As stated
above, the computer-based systems of the present invention comprise a data
storage means
having stored therein a nucleotide sequence of the present invention and the
necessary
hardware means and software means for supporting and implementing a search
means. As
used herein, "data storage means" refers to memory which can store nucleotide
sequence
information of the present invention, or a memory access means which can
access
manufactures having recorded thereon the nucleotide sequence information of
the present
invention.
As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or
target structural
motif with the sequence information stored within the data storage means.
Search means are
used to identify fragments or regions of a known sequence which match a
particular target
sequence or target motif. A variety of known algorithms are disclosed publicly
and a variety
of commercially available software for conducting search means is available and
can be used in the
computer-based systems of the present invention. Examples of such software
include, but
is not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA
(NCBI). A skilled artisan can readily recognize that any one of the
available
available
algorithms or implementing software packages for conducting homology searches
can be
adapted for use in the present computer-based systems. As used herein, a
"target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or
two or more
amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the
less likely a target sequence will be present as a random occurrence in the
database. The
most preferred sequence length of a target sequence is from about 10 to 300
amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well
recognized
that searches for commercially important fragments, such as sequence fragments
involved in
gene expression and protein processing, may be of shorter length.
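The statement that a longer target sequence is less likely to occur at random can be made concrete with a short calculation. The sketch below rests on a simplifying assumption that is not part of the description above: residues in the database are treated as independent and equally frequent, so a specific target of length L is expected to occur about (N - L + 1) / A^L times by chance in a database of N residues over an alphabet of size A (4 for nucleotides, 20 for amino acids). An exact-match search is included as the simplest possible "search means"; the function names are hypothetical.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected chance occurrences of one specific target in a random database.

    Assumes independent, uniformly distributed residues (a simplification).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length


def find_exact_matches(target, database_seq):
    """Return every 0-based start offset of `target`, including overlapping hits."""
    hits = []
    start = database_seq.find(target)
    while start != -1:
        hits.append(start)
        start = database_seq.find(target, start + 1)
    return hits


# A 6-nucleotide target is expected about once by chance in roughly 4 kb of sequence,
# whereas a 30-nucleotide target is expected far less than once in a gigabase.
six_mer = expected_random_matches(6, 4101)
thirty_mer = expected_random_matches(30, 10**9)
```

Under the same assumption, a two-residue amino acid target is expected about once in every 400 positions, which is why very short targets are informative only when chosen for a specific functional reason.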
As used herein, "a target structural motif," or "target motif," refers to any
rationally
selected sequence or combination of sequences in which the sequence(s) are
chosen based on
a three-dimensional configuration which is formed upon the folding of the
target motif.
There are a variety of target motifs known in the art. Protein target motifs
include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target
motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible
expression
elements (protein binding sequences).
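As one illustration of searching for a nucleic acid target motif, a hairpin can be approximated as an inverted repeat: a stem segment followed, after a short loop, by its reverse complement. The sketch below assumes perfect base pairing in the stem and fixed bounds on stem and loop length; these parameters, the function names, and the example sequence are hypothetical and serve only to show the idea.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) for each perfect inverted repeat.

    A hit means seq[i:i+stem] is followed, after `loop_length` residues,
    by the reverse complement of that stem.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        wanted = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the candidate 3' stem arm
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == wanted:
                hits.append((i, loop))
    return hits


# Hypothetical example: GGGC stem, AAAA loop, GCCC complementary arm.
example_hairpins = find_hairpins("GGGCAAAAGCCC")
```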
4.15 TRIPLE HELIX FORMATION
In addition, the fragments of the present invention, as broadly described, can
be used
to control gene expression through triple helix formation or antisense DNA or
RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or
RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40
bases in length and
are designed to be complementary to a region of the gene involved in
transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science
241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself
(antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense
Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation
optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA
hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have
been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide.
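The dependence of antisense design on sequence information can be illustrated with a short sketch. It assumes only Watson-Crick complementarity: a DNA oligonucleotide complementary to a chosen window of the mRNA is the reverse complement of that window. The 20 to 40 base limits follow the preferred length stated above; the function name and the example mRNA are hypothetical, and no claim is made about which window would be effective in practice.

```python
# Map each mRNA base to its complementary DNA base.
DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept either RNA (U) or cDNA-style (T) input by normalising to RNA.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("window runs past the end of the mRNA")
    return target.translate(DNA_COMPLEMENT)[::-1]


# Hypothetical mRNA fragment, used only to exercise the function.
example_oligo = antisense_oligo("AUGGCCAAGCUUGGAUCCUAAGG", 0, 20)
```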
4.16 DIAGNOSTIC ASSAYS AND KITS
The present invention further provides methods to identify the presence or
expression
of one of the ORFs of the present invention, or homolog thereof, in a test
sample, using a
nucleic acid probe or antibodies of the present invention, optionally
conjugated or otherwise
associated with a suitable label.
In general, methods for detecting a polynucleotide of the invention can
comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the
complex, so
that if a complex is detected, a polynucleotide of the invention is detected
in the sample.
Such methods can also comprise contacting a sample under stringent
hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the
invention under
such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is
amplified, a polynucleotide of the invention is detected in the sample.
In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the
complex, so that if
a complex is detected, a polypeptide of the invention is detected in the
sample.
In detail, such methods comprise incubating a test sample with one or more of
the
antibodies or one or more of the nucleic acid probes of the present invention
and assaying
for binding of the nucleic acid probes or antibodies to components within the
test sample.
Conditions for incubating a nucleic acid probe or antibody with a test sample
vary.
Incubation conditions depend on the format employed in the assay, the
detection methods
employed, and the type and nature of the nucleic acid probe or antibody used
in the assay.
One skilled in the art will recognize that any one of the commonly available
hybridization,
amplification or immunological assay formats can readily be adapted to employ
the nucleic
acid probes or antibodies of the present invention. Examples of such assays
can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques,
Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al.,
Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983),
Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory
Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam,
The
Netherlands (1985). The test samples of the present invention include cells,
protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum,
plasma, or
urine. The test sample used in the above-described method will vary based on
the assay
format, nature of the detection method and the tissues, cells or extracts used
as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of
cells are well
known in the art and can readily be adapted in order to obtain a sample
which is
compatible with the system utilized.
In another embodiment of the present invention, kits are provided which
contain the
necessary reagents to carry out the assays of the present invention.
Specifically, the
invention provides a compartment kit to receive, in close confinement, one or
more
containers which comprises: (a) a first container comprising one of the probes
or antibodies
of the present invention; and (b) one or more other containers comprising one
or more of the
following: wash reagents, reagents capable of detecting presence of a bound
probe or
antibody.
In detail, a compartment kit includes any kit in which reagents are contained
in
separate containers. Such containers include small glass containers, plastic
containers or
strips of plastic or paper. Such containers allow one to efficiently transfer
reagents from
one compartment to another compartment such that the samples and reagents are
not
cross-contaminated, and the agents or solutions of each container can be added
in a
quantitative fashion from one compartment to another. Such containers will
include a
container which will accept the test sample, a container which contains the
antibodies used
in the assay, containers which contain wash reagents (such as phosphate
buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect
the bound
antibody or probe. Types of detection reagents include labeled nucleic acid
probes, labeled
secondary antibodies; or in the alternative, if the primary antibody is
labeled, the enzymatic,
or antibody binding reagents which are capable of reacting with the labeled
antibody. One
skilled in the art will readily recognize that the disclosed probes and
antibodies of the present
invention can be readily incorporated into one of the established kit formats
which are well
known in the art.
4.17 MEDICAL IMAGING
The novel polypeptides and binding partners of the invention are useful in
medical
imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the
invention is involved in the immune response, for imaging sites of
inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods
involve
chemical attachment of a labeling or imaging agent, administration of the
labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging
the labeled
polypeptide in vivo at the target site.
4.18 SCREENING ASSAYS
Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which
bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences
set forth
in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the
polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of
(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.
In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound
complex, and
detecting the complex, so that if a polynucleotide/compound complex is
detected, a
compound that binds to a polynucleotide of the invention is identified.
Likewise, in general, therefore, such methods for identifying compounds that
bind to
a polypeptide of the invention can comprise contacting a compound with a
polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex,
and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound
that binds
to a polypeptide of the invention is identified.
Methods for identifying compounds that bind to a polypeptide of the invention
can
also comprise contacting a compound with a polypeptide of the invention in a
cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives
expression
of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a
compound
that binds a polypeptide of the invention is identified.
Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its
activity, relative to
activity observed in the absence of the compound). Alternatively, compounds
identified via
such methods can include compounds which modulate the expression of a
polynucleotide of
the invention (that is, increase or decrease expression relative to expression
levels observed
in the absence of the compound). Compounds, such as compounds identified via
the
methods of the invention, can be tested using standard assays well known to
those of skill in
the art for their ability to modulate activity/expression.
The agents screened in the above assay can be, but are not limited to,
peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents
can be
selected and screened at random or rationally selected or designed using
protein modeling
techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents
and the like are selected at random and are assayed for their ability to bind
to the protein
encoded by the ORF of the present invention. Alternatively, agents may be
rationally
selected or designed. As used herein, an agent is said to be "rationally
selected or designed"
when the agent is chosen based on the configuration of the particular protein.
For example,
one skilled in the art can readily adapt currently available procedures to
generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide
sequence, in
order to generate rationally designed antipeptide peptides, for example see
Hurby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides,
A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al.,
Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as
broadly
described, can be used to control gene expression through binding to one of
the ORFs or
EMFs of the present invention. As described above, such agents can be randomly
screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design
sequence specific or element specific agents, modulating the expression of
either a single
ORF or multiple ORFs which rely on the same EMF for expression control. One
class of
DNA binding agents are agents which contain base residues which hybridize or
form a triple
helix formation by binding to DNA or RNA. Such agents can be based on the
classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl
or polymeric
derivatives which have base attachment capacity.
Agents suitable for use in these methods preferably contain 20 to 40 bases and
are
designed to be complementary to a region of the gene involved in transcription
(triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241,
456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense-
Okano, J.
Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene
Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation
optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide and other DNA binding agents.
Agents which bind to a protein encoded by one of the ORFs of the present
invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by
one of the
ORFs of the present invention can be formulated using known techniques to
generate a
pharmaceutical composition.

CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
104
4.19 USE OF NUCLEIC ACIDS AS PROBES
Another aspect of the subject invention is to provide for polypeptide-specific
nucleic
acid hybridization probes capable of hybridizing with naturally occurring
nucleotide
sequences. The hybridization probes of the subject invention may be derived
from any of
the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the
corresponding
gene is only expressed in a limited number of tissues, a hybridization probe
derived from
any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as
an
indicator of the presence of RNA of a cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in
situ
hybridization. PCR as described in US Patent Nos. 4,683,195 and 4,965,188
provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of
identical
sequences or a degenerate pool of possible sequences for identification of
closely related
genomic sequences.
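As an illustration of what a degenerate pool of possible sequences means in practice, the following Python sketch expands a probe written with IUPAC ambiguity codes into every discrete sequence it represents. The function name expand_degenerate and the example probes are hypothetical; the IUPAC code table is the standard one.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list[str]:
    """List every discrete sequence covered by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical probe: one purine and one pyrimidine position, 4 members.
    print(expand_degenerate("ARY"))
```

The pool size is the product of the number of bases allowed at each position, so each fully degenerate position (N) multiplies the pool by four.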
Other means for producing specific hybridization probes for nucleic acids
include the
cloning of nucleic acid sequences into vectors for the production of mRNA
probes. Such
vectors are known in the art and are commercially available and may be used to
synthesize
RNA probes in vitro by means of the addition of the appropriate RNA
polymerase, such as T7 or
SP6 RNA polymerase, and the appropriate radioactively labeled nucleotides. The
nucleotide
sequences may be used to construct hybridization probes for mapping their
respective
genomic sequences. The nucleotide sequence provided herein may be mapped to a
chromosome or specific regions of a chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ
hybridization, linkage
analysis against known chromosomal markers, hybridization screening with
libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the
like. The
technique of fluorescent in situ hybridization of chromosome spreads has been
described,
among other places, in Verma et al (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other
physical
chromosome mapping techniques may be correlated with additional genetic map
data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal
map and a specific disease (or predisposition to a specific disease) may help
delimit the
region of DNA associated with that genetic disease. The nucleotide sequences
of the subject
invention may be used to detect differences in gene sequences between normal,
carrier or
affected individuals.
4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES
Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared
by, for
example, directly synthesizing the oligonucleotide by chemical means, as is
commonly
practiced using an automated oligonucleotide synthesizer.
Support bound oligonucleotides may be prepared by any of the methods known to
those
of skill in the art using any suitable support such as glass, polystyrene or
Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6), 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey &
Collins, (1989) Mol.
Cell Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller
et al., 1988;
1989); all references being specifically incorporated herein.
Another strategy that may be employed is the use of the strong
biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad.
Sci. USA 91(8),
3072-6, describe the use of biotinylated probes, although these are duplex
probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads
may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is
applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from
various sources,
such as, e.g., Operon Technologies (Alameda, CA).
Nunc Laboratories (Naperville, IL) also sells suitable material that
could be used.
Nunc Laboratories has developed a method, termed CovaLink NH, by which DNA
can be covalently bound to the
microwell surface. CovaLink NH is a polystyrene surface
grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent
coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be
bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of
more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).
The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., (1991)). In this technology, a
phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is
beneficial as
immobilization using only a single covalent bond is preferred. The
phosphoramidate bond joins
the DNA to the CovaLink NH secondary amino groups that are positioned at the
end of spacer
arms covalently grafted onto the polystyrene surface through a 2 nm long
spacer arm. To link
an oligonucleotide to CovaLink NH via a phosphoramidate bond, the
oligonucleotide terminus
must have a 5'-end phosphate group. It is, perhaps, even possible for biotin
to be covalently
bound to CovaLink and then streptavidin used to bind the probes.
More specifically, the linkage method includes dissolving DNA in water
(7.5 ng/µl),
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold
0.1 M 1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final
concentration of
10 mM 1-MeIm7.
The ssDNA solution is then dispensed into CovaLink NH strips (75 µl/well)
standing on ice.
Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The
strips are incubated
for 5 hours at 50°C. After incubation the strips are washed using,
e.g., Nunc-Immuno Wash;
first the wells are washed 3 times, then they are soaked with washing solution
for 5 min., and
finally they are washed 3 times (wherein the washing solution is 0.4 N NaOH,
0.25% SDS
heated to 50°C).
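The quantities in the foregoing linkage method can be checked with simple arithmetic. The Python sketch below is illustrative only; it assumes that the stated 7.5 ng/µl refers to the DNA solution before the 1-methylimidazole stock is added, and that volumes are additive. The function name covalink_well and the key names are hypothetical.

```python
def covalink_well(dna_ng_per_ul=7.5, meim_stock_mM=100.0, meim_final_mM=10.0,
                  dispensed_ul=75.0, edc_molar=0.2, edc_ul=25.0):
    """Estimate per-well quantities for the coupling step described above."""
    # Fraction of the mixed solution that is 1-MeIm stock (C1*V1 = C2*V2).
    stock_fraction = meim_final_mM / meim_stock_mM
    # Adding the stock dilutes the DNA by the remaining fraction.
    dna_after_meim = dna_ng_per_ul * (1.0 - stock_fraction)
    total_ul = dispensed_ul + edc_ul
    return {
        "dna_ng_per_ul_after_meim": dna_after_meim,
        "dna_ng_per_well": dna_after_meim * dispensed_ul,
        "final_volume_ul": total_ul,
        "final_edc_molar": edc_molar * edc_ul / total_ul,
    }


if __name__ == "__main__":
    for name, value in covalink_well().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions each well receives roughly 0.5 µg of DNA in a final volume of 100 µl, with EDC at about 0.05 M; because both solutions contain 10 mM 1-methylimidazole, its concentration is unchanged by the mixing.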
It is contemplated that a further suitable method for use with the present
invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos),
incorporated
herein by reference. This method of preparing an oligonucleotide bound to a
support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is
then synthesized on
the supported nucleoside and protecting groups removed from the synthetic
oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from
the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen
phosphorate.
An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated
photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass
surface, as described
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by
reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al.
(1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan &
Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated
herein.
To link an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991),
requires activation of the nylon surface via alkylation and selective
activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) Proc. Natl. Acad.
Sci., USA 91(11),
5022-6 (incorporated herein by reference). These authors used current
photolithographic
techniques to generate arrays of immobilized oligonucleotide probes (DNA
chips). These
methods, in which light is used to direct the synthesis of oligonucleotide
probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A
matrix of 256
spatially defined oligonucleotide probes may be generated in this manner.
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook
et al. (1989) describe three protocols for the isolation of high molecular
weight DNA from
mammalian cells (p. 9.14-9.23).
DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification
methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of
DNA
samples may be prepared in 2-500 ml of final volume.
The nucleic acids would then be fragmented by any of the methods known to
those of
skill in the art including, for example, using restriction enzymes as
described at 9.24-9.28 of
Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.
Low pressure shearing is also appropriate, as described by Schriefer et al.
(1990)
Nucleic Acids Res. 18(24), 7455-6 (incorporated herein by reference). In this
method, DNA
samples are passed through a small French pressure cell at a variety of low to
intermediate
pressures. A lever device allows controlled application of low to intermediate
pressures to the
cell. The results of these studies indicate that low-pressure shearing is a
useful alternative to
sonic and enzymatic DNA fragmentation methods.
One particularly suitable way for fragmenting DNA is contemplated to be that
using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al.
(1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and
fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun
cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which
alter the
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA
fragments from
the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992)
quantitatively evaluated
the randomness of this fragmentation strategy, using a CviJI** digest of pUC19
that was size
fractionated by a rapid gel filtration method and directly ligated, without
end repair, to a lac Z
minus M13 cloning vector. Sequence analysis of 76 clones showed that CviJI**
restricts
PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence data is
accumulated
at a rate consistent with random fragmentation.
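The two specificities described above can be illustrated with a short Python sketch that locates cleavage positions in a linear sequence. The pattern for normal conditions is PuGCPy; the relaxed pattern covers PuGCPy, PyGCPy and PuGCPu as reported for CviJI**. The function names and test sequences are hypothetical, the sequence is treated as linear (a circular plasmid such as pUC19 would need wrap-around handling), and only one strand is scanned.

```python
import re

# Zero-width lookaheads so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                # PuGCPy
RELAXED = re.compile(r"(?=[AG]GC[ACGT]|[CT]GC[CT])")  # PuGCPy, PuGCPu, PyGCPy


def cut_positions(seq: str, relaxed: bool = False) -> list[int]:
    """Return indices at which the sequence is cut (between the G and the C)."""
    pattern = RELAXED if relaxed else NORMAL
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq: str, relaxed: bool = False) -> list[str]:
    """Split a linear sequence into the blunt-ended fragments left by the cuts."""
    edges = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    demo = "TTAGCTAACGCTTTAGCAAA"  # hypothetical sequence
    print(fragments(demo))
    print(fragments(demo, relaxed=True))
```

Because the relaxed pattern matches three of the four possible NGCN contexts, it cuts far more often than the normal pattern, which is consistent with the quasi-random fragment distribution described above.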
As reported in the literature, advantages of this approach compared to
sonication and
agarose gel fractionation include: smaller amounts of DNA are required
(0.2-0.5 µg instead of
2-5 µg); and fewer steps are involved (no preligation, end repair, chemical
extraction, or
agarose gel electrophoresis and elution are needed).
Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared,
it is important to denature the DNA to give single stranded pieces available
for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at
80-90°C. The solution is
then cooled quickly to 2°C to prevent renaturation of the DNA fragments
before they are
contacted with the chip. Phosphate groups must also be removed from genomic
DNA by
methods known in the art.
4.22 PREPARATION OF DNA ARRAYS
Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the
positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer
about 20 nl of a
DNA solution to a nylon membrane. By offset printing, a density of dots higher
than the density
of the wells is achieved. One to 25 dots may be accommodated in 1 mm²,
depending on the
type of label used. By avoiding spotting in some preselected number of rows
and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the
same genomic
segment of DNA (or the same gene) from different individuals, or may be
different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the
same samples. In
one example, a selected gene segment may be amplified from 64 patients. For
each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing
the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all
samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from
each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may
be a 1 mm
space between subarrays.
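The 64-patient example above can be laid out numerically. The following Python sketch assumes that the 64 patient spots printed by each pin are arranged as an 8 x 8 block, that each dot occupies a 1 mm x 1 mm cell, and that pins are numbered row by row; these conventions, and the name spot_position, are assumptions of the sketch rather than part of the disclosure.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device matching a 96-well plate
SUB_ROWS, SUB_COLS = 8, 8    # assumed 8 x 8 block: 64 spots, one per patient
DOT_PITCH_MM = 1.0           # assumed: each dot occupies a 1 mm x 1 mm cell
GAP_MM = 1.0                 # 1 mm space between neighboring subarrays
SUBARRAY_PITCH_MM = SUB_COLS * DOT_PITCH_MM + GAP_MM  # 9 mm between subarrays


def spot_position(pin: int, patient: int) -> tuple[float, float]:
    """Return (x_mm, y_mm) of the spot made by a given pin for a given patient."""
    if not (0 <= pin < PIN_ROWS * PIN_COLS):
        raise ValueError("pin must be in 0..95")
    if not (0 <= patient < SUB_ROWS * SUB_COLS):
        raise ValueError("patient must be in 0..63")
    pin_row, pin_col = divmod(pin, PIN_COLS)      # which subarray
    sub_row, sub_col = divmod(patient, SUB_COLS)  # offset inside the subarray
    x = pin_col * SUBARRAY_PITCH_MM + sub_col * DOT_PITCH_MM
    y = pin_row * SUBARRAY_PITCH_MM + sub_row * DOT_PITCH_MM
    return (x, y)


if __name__ == "__main__":
    print(spot_position(0, 0), spot_position(95, 63))
```

With these assumptions the subarray pitch equals 9 mm, all 96 x 64 spots occupy distinct positions, and the full pattern fits within an 8 x 12 cm membrane.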
Another approach is to use membranes or plates (available from NUNC,
Naperville,
Illinois) which may be partitioned by physical spacers e.g. a plastic grid
molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom
of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for
imaging by exposure
to flat phosphor-storage screens or x-ray films.
The present invention is illustrated in the following examples. Upon
consideration of
the present disclosure, one of skill in the art will appreciate that many
other embodiments and
variations may be made in the scope of the present invention. Accordingly, it
is intended that
the broader aspects of the present invention not be limited to the disclosure
of the following
examples. The present invention is not to be limited in scope by the
exemplified embodiments
which are intended as illustrations of single aspects of the invention, and
compositions and
methods which are functionally equivalent are within the scope of the
invention. Indeed,
numerous modifications and variations in the practice of the invention are
expected to occur to
those skilled in the art upon consideration of the present preferred
embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention
are those which
appear in the appended claims.
All references cited within the body of the instant specification are hereby
incorporated
by reference in their entirety.
5.0 EXAMPLES
5.1 EXAMPLE 1
Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared
from
various human tissues and in some cases isolated from a genomic library
derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger
sequencing
techniques. The inserts of the library were amplified with PCR using primers
specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were
spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to
obtain signature
sequences. The clones were clustered into groups of similar or identical
sequences.
Representative clones were selected for sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using
a typical
Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377
Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.
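The signature screening and clustering step of Example 1 can be illustrated in silico. In the Python sketch below a clone's signature is the set of short probes found in its sequence, and clones are grouped greedily when their signatures are sufficiently similar. The Jaccard similarity measure, the 0.8 threshold, the greedy grouping, and all names are assumptions chosen for illustration; the actual SBH signature analysis is not specified at this level of detail in the text.

```python
def signature(clone_seq: str, probes: list[str]) -> frozenset:
    """Set of probes (e.g., 7-mers) that occur in the clone sequence."""
    return frozenset(p for p in probes if p in clone_seq)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity of two signatures: shared probes over all probes seen."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(signatures: dict, threshold: float = 0.8) -> list[list[str]]:
    """Greedy grouping; the first member of each group is its representative."""
    clusters: list[list[str]] = []
    for name, sig in signatures.items():
        for members in clusters:
            if jaccard(sig, signatures[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


if __name__ == "__main__":
    probes = ["ACGTACG", "TTTTAAA", "GGGCCCA"]  # hypothetical 7-mer probes
    clones = {                                   # hypothetical clone inserts
        "a": "CCACGTACGCCTTTTAAAGG",
        "b": "CCACGTACGCCTTTTAAAGGA",
        "c": "AAGGGCCCATT",
    }
    sigs = {name: signature(seq, probes) for name, seq in clones.items()}
    print(cluster(sigs))
```

One representative per group (here the first member) would then be selected for sequencing, in the manner described in Example 1.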
5.2 EXAMPLE 2
Assemblage of Novel Contigs
The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were
assembled using an EST sequence as a seed. Then a recursive algorithm was used
to extend the
seed EST into an extended assemblage, by pulling additional sequences from
different
databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and
UniGene, and
exons from public domain genomic sequences predicted by GenScan) that belong
to this
assemblage. The algorithm terminated when there were no additional sequences
from the
above databases that would extend the assemblage. Further, inclusion of
component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with
BLAST
score greater than 300 and percent identity greater than 95%.
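The extension procedure described above may be sketched as follows. The loop is written iteratively rather than recursively, and the database search is represented by a caller-supplied function find_hits, which is purely hypothetical and stands in for a BLASTN search of the current assemblage against the listed databases. Only the two inclusion thresholds (BLAST score greater than 300, percent identity greater than 95%) are taken from the text.

```python
from typing import Callable, Iterable, NamedTuple


class Hit(NamedTuple):
    seq_id: str
    score: float          # BLAST score of the hit against the assemblage
    pct_identity: float   # percent identity of the hit


def passes(hit: Hit, min_score: float = 300.0, min_identity: float = 95.0) -> bool:
    """Inclusion rule from the text: score > 300 and identity > 95%."""
    return hit.score > min_score and hit.pct_identity > min_identity


def extend_assemblage(seed_id: str,
                      find_hits: Callable[[tuple], Iterable[Hit]]) -> list[str]:
    """Grow an assemblage from a seed until no new sequence qualifies."""
    members = [seed_id]
    seen = {seed_id}
    grew = True
    while grew:
        grew = False
        # find_hits is a hypothetical stand-in for a BLASTN database search.
        for hit in find_hits(tuple(members)):
            if hit.seq_id not in seen and passes(hit):
                seen.add(hit.seq_id)
                members.append(hit.seq_id)
                grew = True
    return members


if __name__ == "__main__":
    # Toy hit table standing in for the databases; all values are invented.
    toy = {
        "seed": [Hit("e1", 450, 98.0), Hit("e2", 310, 94.0)],
        "e1": [Hit("e3", 301, 95.5)],
        "e3": [Hit("seed", 900, 100.0)],
    }
    print(extend_assemblage("seed", lambda ms: [h for m in ms for h in toy.get(m, [])]))
```

The loop terminates because each pass either adds at least one previously unseen sequence or stops, mirroring the termination condition stated above.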
Table 8 sets forth the novel predicted polypeptides (including proteins)
encoded by the
novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and
their
corresponding translation start and stop nucleotide locations to each of SEQ
ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted.
Method A refers to
a polypeptide obtained by using a software program called FASTY (available
from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a
comparison of the
translated novel polynucleotide to known polynucleotides (W.R. Pearson,
Methods in
Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a
polypeptide obtained by using a software program called GenScan for
human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing)
that predicts
the polypeptide based on a probabilistic model of gene structure/compositional
properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by
reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary
software program that
translates the novel polynucleotide and its complementary strand into six
possible amino acid
sequences (forward and reverse frames) and chooses the polypeptide with the
longest open
reading frame.
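The Hyseq program used for Method C is proprietary and is not reproduced here. The general technique it is described as using, six-frame translation followed by selection of the longest open reading frame, can nevertheless be sketched in Python. In this sketch an open reading frame is assumed to run from a methionine codon to the next stop codon or to the end of the sequence, and the standard genetic code is assumed; both choices, and all function names, are assumptions of the sketch.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def longest_orf(dna: str) -> str:
    """Longest Met-initiated peptide over the three forward and three reverse frames."""
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCTAAATAG"))  # hypothetical 12-base input
```

A production tool would typically also report the frame and the start and stop nucleotide positions, as Table 8 does.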

5.3 EXAMPLE 3
Novel Nucleic Acids
The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were
assembled
from Hyseq's proprietary EST sequences as described in Example 1 and human
genome
sequences that are available from the public databases
(http://www.ncbi.nlm.nih.gov/).
Exons were predicted from human genome sequences using GenScan
(http://genes.mit.edu/GENSCANinfo.html); HMMgene
(http://www.cbs.dtu.dk/services/HMMgene/hmmgene1_1.html); and GeneMark.hmm
(http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq
proprietary
EST sequences and the predicted exons were assembled based on a BLASTN hit to
the
extending assemblage with BLAST score greater than 300 and percent identity
greater than
95%. Then, the predicted genes were analyzed using Neural Network SignalP V1.1
program
(from Center for Biological Sequence Analysis, The Technical University of
Denmark) for
presence of a signal peptide. These sequences were further analyzed for
absence of a
transmembrane region using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).
Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.
The homologs for polypeptides SEQ ID NO: 1042-2082, which correspond to
nucleotide sequences SEQ ID NO: 1-1041, were obtained by BLASTP version 2.0a1
19MP-WashU
searches against Genpept release 124 using the BLAST algorithm. The results
showing
homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown in Table 2.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/
herein
incorporated by reference), all the polypeptide sequences were examined to
determine
whether they had identifiable signature regions. Scoring matrices of the
eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix
signature found
in the indicated polypeptide sequence, its description, and the results
obtained which include
accession number subtype; raw score; p-value; and the position of signature in
amino acid
sequence.
Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences
were examined for domains with homology to certain peptide domains. Table 4
shows the
name of the Pfam model found, the description, the e-value and the Pfam score
for the
identified model within the sequence. Further description of the Pfam models
can be found
at http://pfam.wustl.edu/.
The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San
Diego,
CA) was used to predict the three-dimensional structure models for the
polypeptides
encoded by SEQ ID NO 1-1041 (i.e. SEQ ID NO: 1042-2082). Models were generated
by
(1) PSI-BLAST, which is a multiple alignment sequence profile-based search
method developed
by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High
Throughput Modeling
(HTM) (Molecular Simulations Inc. (MSI), San Diego, CA), which is an automated
sequence
and structure searching procedure (http://www.msi.com/), and (3) SeqFold™,
which is a fold
which is a fold
recognition method described by Fischer and Eisenberg (J. Mol. Biol. 209, 779-
791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the
invention with
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional
structures
as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier
given to
template structure; "Chain ID", identifier of the subcomponent of the PDB
template
structure; "Compound Information", information of the PDB template structure
and/or its
subcomponents; "PDB Function Annotation" gives function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); start and end amino acid
position of
the protein sequence aligned; PSI-BLAST score, the verify score, the SeqFold
score, and the
Potential(s) of Mean Force (PMF). The verify score is produced by GeneAtlas™
software
(MSI), and is based on the Profile-3D threading program developed in
Dr. David
Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and
Eisenberg, Nature,
356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc. Natl.
Acad. Sci. USA,
95:13597-12502. The verify score produced by GeneAtlas normalizes the verify
score for
proteins with different lengths so that a unified cutoff can be used to select
good models as
follows:
Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence
identity in the
alignment used to build the model, pairwise and surface mean force potentials
(MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best,
represents a good
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best,
represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good
model may
also be determined by one of skill in the art based on all the information in
Table 5 taken in
totality.
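The verify score normalization and the score ranges given above can be expressed as a short Python sketch. The meaning of "high score" is taken here to be a reference score for a protein of the same length, which is an assumption; combining the three criteria with a logical AND is likewise an assumption, since the text also allows a model to be judged on the totality of the information in Table 5. The function names are hypothetical.

```python
def normalized_verify(raw_score: float, high_score: float) -> float:
    """(raw score - 1/2 high score) / (1/2 high score), as given in the text."""
    half = 0.5 * high_score
    return (raw_score - half) / half


def within_stated_ranges(verify: float, pmf: float, seqfold: float) -> bool:
    """True when all three scores fall in the ranges the text calls good or significant."""
    return 0.0 <= verify <= 1.0 and 0.0 <= pmf <= 1.0 and seqfold > 50.0


if __name__ == "__main__":
    # Invented numbers, used only to exercise the functions.
    v = normalized_verify(raw_score=75.0, high_score=100.0)
    print(v, within_stated_ranges(v, pmf=0.9, seqfold=62.0))
```

Under this formula a raw score equal to the high score normalizes to 1, and a raw score equal to half the high score normalizes to 0, which is why a single cutoff can be applied to proteins of different lengths.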
Table 6 shows the position of the signal peptide in each of the polypeptides
and the
maximum score and mean score associated with that signal peptide using Neural
Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The
Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic
signal
peptides and their cleavage sites is also disclosed by Henrik Nielson, Jacob
Engelbrecht,
Soren Brunak, and Gunnar von Heijne in the publication "Identification of
prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein
Engineering, Vol.
10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score
and a mean
S score, as described in the Nielson et al reference, were obtained for the
polypeptide
sequences.
Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal
location.
Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, their corresponding polypeptide sequences SEQ ID NO: 1042-2082, their
corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their
corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and
the US
serial number of the priority application in which the contig sequence was
filed.
Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, the novel polypeptide sequences SEQ ID NO: 1042-2082, and the
corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.

Table 1
Tissue Origin    RNA/Tissue Source    Library Name    SEQ ID NO:

adrenal gland Clontech ADR002 13 23 34 45 77 111
115 122 187

194 210-211 249-250
255 290

320 357-358 362 420
443 451

492 499 551 577 630
698 702

713 718 805 808 819
841-843

845 861 896 899 909
924 937

949 985 1037

adult bladder Invitrogen BLD001 9 87 189 320-321
358 563 768

840 970

adult brain Clontech ABR001 184-186 277 282 352
558 849

871 898 958

adult brain Clontech ABR006 30 45 170 199 210
226 260 292-294

340 357 413 443-444
478

499 551-552 579 582
584-588

632-637 646 654-655
676 683

731-732 755-756 777
813-827

861 872 874 880 883
1002 1012

adult brain Clontech ABR008 15 45 54 61 67 81
87 101 106

108 122-123 143-144
170 181-183

195-209 215 222
245-248

261-270 283-289 292-293
296

306 308-310 327 340
358 370

394-407 409 421 428
440 442

459 477-478 496 531-547
551-552

556 565-566 578-579
606

618 620-621 629-630
651 653-655

664 667-668 707
713-714

729 745 750 753 756
772 779

788 790 793-794 799-800
802

808 812 823 826-827
849-850

859 862 872 883 885
898 917

919 921 930 935-936
947 974

985-986 992 1002
1006 1012

1028 1030 1036 1039

adult brain Clontech ABR011 1012

adult brain GIBCO AB3001 23 57-58 67 85 296
492 499 579

853 898-899 950 1012

adult brain GIBCO ABD003 45 59-62 67 72 82
85-88 156

179-180 182 296 299
355-356

440 458 474 483 499
563 823

840 852 860 885 898
992 999

1012

adult brain Invitrogen ABR014 45 115 238 470 599
653 974-976

adult brain Invitrogen ABR015 45 600 885 1012

adult brain Invitrogen ABR016 599 1012

adult brain Invitrogen ABT004 34 45 54 74 84 118
138-143 170-171

180-181 208 255
277 359

379 428 438 499 501
536 715

731 783 793 799 805
809 824

862 898 912 977 998
1012

adult cervix BioChain CVX001 23 26 48 54 57 67
77 118 121

177 183 238 255 271-272
296

303 311-319 325 352
361-362

411-412 419-420 424
428 440

447 478 541 567 569
599-600

622 699 793 805 813
831 836-837

839 844-845 848
863 872


913 928-929 944 958
965 970

973 1001 1004

adult colon Invitrogen CLN001 250 322-325 429 630
788 970

985

adult heart GIBCO AHR001 28-30 45 61 67 90-94
118 122

150-151 183 193 250-251
279

349-351 369-370 410
419 474

483 485 490 493 552
563 719

773 835-836 853 861
961 976

1030

adult kidney GIBCO AKD001 24 31-34 44-46 48
55 62 67 81

121 144 151 162 176-178
183

251 255 258 277 352
358 369-370

386 408 420 429
483 490

536 546 579 599-600
602 645

698 793 805 874 898
913

adult kidney Invitrogen AKT002 32 53-54 67 85 177
251 260 341

386 408 419-420 431-436
478

490 493 507 561 582
596-599

698 728 788 805 819
837 844-

848 885 898 969 989
1013

adult liver Clontech ALV003 101 121 193 579 638-639
729

890-893 919 1007 1017

adult liver Invitrogen ALV002 75 157 173 183 212-214
236 240

263 292 323 335 386
408 415

495-499 552 577 589
599 727

782 858 869 898-900
924 968

adult lung GIBCO ALG001 67 77 152 369 386
419 443 483

583 732 849 907

adult ovary Invitrogen AOV001 5 26 34 43 45 48 55
61-62 64-67

77 87 101-102 105
115 118 122-129

143 151 155-163
170 174-175

177 181-183 193
251-252

286 292 338 347 353-354
369

381 410 415 420 424
451 458

483 489 497 499 515
536 541

546 552 577 579 595
599-600

604 647 658 661 665
699 744

782-783 800 805-806
814 831

835 839-840 844 853
874 895

898-899 913 924 929
941-942

949 973 977 994 1004
1007 1012

1016 1031 1037

adult placenta Clontech APL001 67 419 688 728 848
930

adult spleen Clontech SPLc01 82 101 187 255 260
358 370 447

483 489 579 586 648
768 835

845 848 853-857 863
885 913

917 962 986

adult spleen GIBCO ASP001 87 105 108 122 158
172 215 299

380 492 499 552 599
622 785

830 840 850 889

adult testis GIBCO ATS001 68-69 106 183 251
301 360 386

520 541 570 753 788
832 840

890 916

bone marrow Clontech BMD001 10-12 16-19 24-26
35 46 48 58

77 85 95-96 98-99
122 156 164


172 187 222 251 385
424 429

458 478 483 489 519
568-569

599 622-623 630-631
696 700

758 765 794 844 914
919 924

944 971 985 992 1001
1017

bone marrow GF BMD002 23 45 81-82 104-105
115 136

144 156 170 172-173
181 183

247 287 292 306 319-320
327

362 370 418 478-483
489 492

536 548-552 565 569-570
572

579 596 599 614-622
630 640-641

643 653 668 691
699 708

715-718 726 743 756
758 772

789 841 889 917 920
947 958

994 1006 1010 1037
1039

cultured preadipocytes Stratagene ADP001 121 255 400 490-494
511 629

689 758 793 835 861
913 944

949 984

endothelial cells Stratagene EDT001 34 45 54 58 67 120-122
144 151-154

183 193 299 385
440 451

458 483 490 499 515
552 563

569 577 579 599 622-623
752

793 800 844-845 898-899
942

944 949

fetal brain Clontech FBR001 139 168 356 599 702
712 831

845 850 872-873 898
921 1037

fetal brain Clontech FBR004 138 168 250 363 873-875
882

fetal brain Clontech FBR006 14 29 45 51 81 87
101 104 118

131 143-144 157 171
177 206

208-209 215 229 238
251 261

273 279 283 291-293
326-332

358 362 370-371 397
400 402

413 419 428 461 472
485 551-560

568-569 579 618
620 629-630

653-657 659-661
663-673

675 700 714 739-742
744-746

766 779 793 809 815
819 822

840 850 859 862 872
875-885

930 958 972 995 1002
1006 1028

1030-1031 1038

fetal brain GIBCO HFB001 13-15 54-57 62 67
70-72 84 121

174 177 180 183 410
417 424

485 518 520 542 552
578-579

599 785 793 805 831-832
840

858 871 883 898-899
977 1012

fetal brain Invitrogen FBT002 7 45 49 144-149 157
180 255 263

356 493 501 600 630
707 748

832 845 858 913 1012

fetal heart Invitrogen FHR001 24 45 81-82 104 114-115
118

121 144 152 181 239
247 288

292 327 362 370 381
419 428

444 453 458 478 486
493 503

569 571 576 582 596
618 640

668 674-688 719-722
731 744

753 762 772 784 794
819 823

836 850 885 914 944
949 957-958

1017


fetal kidney Clontech FKD001 82 107 208 458 483
485 536 758

760 819 836 894 1017

fetal kidney Clontech FKD002 61 101 105 183 189
238 247 263

292 327 340 370 405
416 419

517 569 586 620 648
668 689-

691 731 746-752 763
771-772

787-788 819 840 842
854 861

872 944 958 961 969

fetal kidney Invitrogen FKD007 116

fetal liver Clontech FLV002 410 429 454 692-695
704 781

805 894-895 1017

fetal liver Clontech FLV004 67 107 115 118 151
187 241 255

287 370 466 478 492
518 548

552 569 582 589 630
653 668

696-699 752-757 784
789 805

885 908 985

fetal liver Invitrogen FLV001 45 101 130-137 157
222 240 337

386 428-429 492 552
589 693

727 840

fetal liver-spleen Columbia University FLS001 1-9 18 20-23 27 34
36-38 45 55

67 70 83 89 94 118
122 158 164

172-173 177 183 219
238 240

246 251 292 299 323
335 338

358 369 376 385-386
397 408

416 419 421-422 429
451 456-

460 466 472 478 483
489-490

493 516 536 543 546
551 569-

573 579 586 588-589
593-595

599-603 619 622 668
676 691

699 702 724 731 734
743 787

789 794 800 805 834-835
840

848 853 874 880 885
890-891

899 908 910 923 926-927
930

939-940 944 949 958
973 980

992 999 1004 1007
1009 1013

fetal liver-spleen Columbia University FLS002 3 8 17 22 36-37 46
55 61 63 70

72 85 89-90 94 106
122 148 156

158 165 172 177 181
194 213

215 219 246 251 292
299 304-

307 323-324 338 346
355 366

371 374 380-381 386
392 397

410 417 421 440 455
462-464

466-468 489-490 492-493
507-

521 536 552 565-566
569 571-

576 592 596 599 619
630 650

655 661 688 698-699
712 718

723-729 731 735-737
753 767

783 824 831 834 840
845 871

885 891 894 899 902
906-909

913 923-930 940 943
949 958

973 980 992 999 1003
1007 1017

1032 1040-1041

fetal liver-spleen Columbia University FLS003 23 67 106 150 158
193 338 374

376 411 443 478 493
546 565

569-570 582 589 609-613
630

661 699 724 727-734
767 809

812 834-835 845 880
890 910

929-930 958 973 980
985 1013

fetal lung Clontech FLG001 728 824 1008

fetal lung Clontech FLG004 115 668

fetal lung Invitrogen FLG003 120 183 322 333-336
476 516

691 831 835 850 1012

fetal muscle Invitrogen FMS001 45 338-339 365 369
386 429 431

496-497 789 793 856
970 1008

1019 1033 1035

fetal muscle Invitrogen FMS002 45 115 171 247 327
365 370 405

536 642-652 668 710-711
719

726 758-761 765 836
899 901

907 913 948 965 1037

fetal skin Invitrogen FSK001 29 57 67 74 81 118
152 177 180

193 294 340-342 345
375 397

419 437-443 445-451
454 475

532 541 546 565 598
604 630

650 668 728 742 772
789 793

804-805 823 828-830
837 840

849 899 901 922 958
970 1007

1022 1033

fetal skin Invitrogen FSK002 34 45 77 81 85 115
173 200 279

292-293 360 370 381
419 428-

429 451 466 490 551
569-570

579 600 604 630 647
668 698

700-706 729 731 746
750 758

762-766 768-773 780
794 840

850 859 861 885 901
911 913

957 961 965 973 1038

fibroblast Stratagene LFB001 55 72 143 255 490
502-505 587

599 627 861 863 885
984 1037

induced neuron-cells Stratagene NTD001 30 82 111 124 181
206 356 392

410 417 484-488 578
831-834

898 977 1036 1039

infant brain Columbia University IB2002 18 21 45 66 73-75
100-103 118

152 168-171 177 180
241-242

252 292-295 340 345
366-367

413 438 454 499 501
542 561-

562 578-580 599 668
702 728-

729 745 765 768 772
793 796-

799 823-824 863 874
887 899

948-949 967 975 977
981 983

992 995 1012

infant brain Columbia University IB2003 81 101 113 118 177
180 241 252

293 340 345 367 371
379 381

400 417 499-501 536
562 578

580-581 629-630 702
713 745

796-805 824 831 837
840 845

874 885 967 977 981
985 1012

1030

infant brain Columbia University IBM002 168 358 413-414 913

infant brain Columbia University IBS001 415 417 533 581 886-888
977

leukocyte Clontech LUC003 77 619 889 949

leukocyte GIBCO LUC001 34 36 38-42 50-52
55 67 77 81-

83 85 121 137 144
158 172 183

223 226 251 254 258
291 324

368-374 378 424 429
443 483

492 536 552 564 600
602 732

760 768 782 785 805
838 844-

845 848 850 889 898
905 908

946 973 992

leg 55 72 143
255 490

502-505 587
599

627 861 863
885

984 1037

lung tumor Invitrogen LGT002 55 61 65 77-79 82
102 105 115

156-157 165-167 170
182-183

197 243-244 251 253
296-297

325 370 386 418-419
421-425

478 483 492 499 520
531 533

541 569 577 582 600
788 844-

845 848 874 899 911
913 916-

918 939 944 949 956
970 976

lymph node Clontech ALN001 47 63 104-105 183
483 492 691

894 1017

lymphocytes ATCC LPC001 45 53 77 158 193 251
392 421

455 469-474 483 507
536 546

579 581 618 621 640
765 780-

787 793 838 845 875
924 968

978 999

macrophage Invitrogen HMP001 122 147 157 183 251
255 493

738 898-899 903-905

mammary gland Invitrogen MMG001 45 64 67 83-84 101
113 143 148

152 158 164 177 181-183
189

216-218 253 255 258
263 274

299 336 419 421 423
426-430

440 466 478 490 520
533 536

564 569 579 582 630
646 753

768 782 789 800 835
840 848

850 883 912-913 944
950 958

melanoma from cell line ATCC #CRL-1424 Clontech MEL004 62 158 181 298 362
364 402 419

515 536 896-897 958
973 1004

1008

*Mixture of 16 tissues - mRNA Various Vendors CGd010 353 358 823 942 982
1020

*Mixture of 16 tissues - mRNA Various Vendors CGd011 569 630 944 955 999

*Mixture of 16 tissues - mRNA Various Vendors CGd012 9 38 59 63 80 85 122-123
152

154 177 195 217 232
246 250

296 300 306 323-324
381 427

434 438-439 478 489
499 507

517 538 558 565 571
575 630

657 681 701 736 762
792 800

802 823-824 861 871-872
899

929 941 955 968 974
985-1003

1006 1011-1012 1033

*Mixture of 16 tissues - mRNA Various Vendors CGd013 232 434 748 956-958
992

*Mixture of 16 tissues - mRNA Various Vendors CGd015 18 69 115 324 335
548 551 569

582 600 622 731 819
899 911

944 957-958 1012 1017-1018

*Mixture of 16 tissues - mRNA Various Vendors CGd016 46 172 183 323 371
481 493 565

569 571 596 599 630
654 698

745 762 786 849 907
944 1004-

1013 1037 1039

neuronal cells Stratagene NTU001 7 33 45 107 113 121
150 183 286

385 440 478 483 485
487 489

536 569 582 756 768
772 819

836 944 958 966 1001

pituitary gland Clontech PIT004 158 222 255 345 356
370 379

569 579 819 831 861-862
885

898 922 1017

placenta Clontech PLA003 7 36 61 279 419 478
489 582 586

599 641 647 668 681
707-711

774-779 1001

placenta Invitrogen APL002 57 173 536 728 793
800

prostate Clontech PRT001 26 219-222 229 412
599 665 762

835 837 860 878 951
1031

rectum Invitrogen REC001 9 292 343-346 431
546 714 800

863 918

retinoic acid-induced neuronal cells Stratagene NTR001 112 400 478 569 582
629 756

758 800 819 831 835-836
850

906 944 958

salivary gland Clontech SAL001 58 61 77 118 150 158
294 347-

348 483 492-493 546
752 830

915

skeletal muscle Clontech SKM001 80 118 247 365 483
719 805 812

823

small intestine Clontech SIN001 34 37 45 52 60 93
106 119 121

138 144 177 180 208
223-225

238 247 294 323 335-336
343

362 370 380 386 397
409-411

416 420 440 451 455
478 489

493 536 571 577 579
590 602

604-608 614 622 624-628
655

668 688 700 714 805-812
831

841 872 894 899 914
924 926

929 958 961 965 973
991 998

1017

spinal cord Clontech SPC001 51 164 182-183 190
226-228

255-257 275-277 286
296 299

451 454 542 552 579
591 728

753 770 786 790 831
835 849-

852 898 907 958 1000
1012

stomach Clontech STO001 72 222 232 247 258
366 645

thalamus Clontech THA002 45 49 113 155 164
180 183 191-

192 208 229-232 238
345 417

443 512 551 558 592
630 728

800 823 840 858-860
885 898

976 1012

thymus Clontech THM001 45 141 160 183 258
360 378-379

418 451 460 569 602
619 731

788-790 819 835 845
958 965

1004

thymus Clontech THMc02 47 108 115 121 144
157 173 247

259-260 300 327 340
358 362

375-393 409 453 455
461 478-

479 489 551 565 569-570
579

582 615 630 640 653
668 708

744 752 758 766 790-795
810

819 823 835-836 845
850 853

861 885 911 919 938
958 962

994 1001 1027

thyroid gland Clontech THR001 46 58 67 80 82 144
160 177 183

193-194 233-235 251
255 263

268 278-280 286 299
301-303

324 358 370 386 397
408 410

420 440 474 483 493
506 519-

520 533 594 599-600
602 658

661 719 758 772 785
788 793

830 851 853 864-867
898 904

909 924 929 961 973
991 998

1001 1009

trachea Clontech TRC001 45 154 236 238 281
323 416 571

602 868-869 913

umbilical cord BioChain FUC001 34 45 54 58 67 70
85 152 154

177 180 188 208 251
299 370

409 415 419 434 451-455
483

596 599 647 661 733
742 793

808 839-840 845 849-850
861

888 911 913 992

uterus Clontech UTR001 177 237-239 255 258
417 493

520 567 599 604 646
844 870

874 898 973

young liver GIBCO ALV001 45 419 440 443 490
653 732 753

805 845 898 904

*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult
brain mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal
fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA (Invitrogen),
5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA
(Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8) human adrenal gland
mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human leukemia
lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) human
lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human
thyroid mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human
conceptional umbilical cord mRNA (BioChain).

Table 2
SEQ ID NO:  Accession No.  Species  Description  Score  Identity

1044 AAB32400 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 30 SEQ ID NO:86. 339 100

1044 AAM74711 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35017. 335 100

1044 AAM61909 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34014. 335 100

1045 gi3859599 Arabidopsis thaliana similar to class I chitinases (Pfam: PF00182, E=1.2e-142, N=1) 74 27

1045 gi15292107 Drosophila melanogaster LD38671p 74 33

1045 gi2258324 Fusarium oxysporum f. sp. ciceris yellowing-associated protein 73 32

1046 gi17428204 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 74 32

1046 gi4314432 Homo sapiens similar to phosphatidylinositol (4,5)bisphosphate 5-phosphatase; match to PID:g1399105 71 30

1046 gi|17545909|ref|NP_519311.1 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 74 32

1047 gi9756017 Actinoplanes sp. 50/110 alpha-amylase 69 38

1047 gi|6572499|gb|AAF17291.1| Homo sapiens LHX3 protein 67 26

1047 gi|18572988|ref|XP_029170.2 Homo sapiens LIM homeobox protein 3 67 26

1048 AAY28474 Homo sapiens UYJO Human Capon protein. 721 99

1048 gi2895555 Homo sapiens carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase 721 99

1048 gi2895557 Rattus norvegicus carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase 654 92

1049 gi19713721 Fusobacterium nucleatum subsp. nucleatum ATCC 25586 GTP-binding protein era 66 28

1050 131291 Homo sapiens fumarylacetoacetase (AA 1-349) 175 70

1050 gi182393 Homo sapiens fumarylacetoacetate hydrolase 175 70

1050 gi12803409 Homo sapiens fumarylacetoacetate 175 70

1052 gi4680089 Human immunodeficiency virus type 1 envelope glycoprotein 79 26

1052 gi3868997 Ephydatia fluviatilis EFPDE2 74 20

1052 gi4679590 Human immunodeficiency virus type 1 envelope glycoprotein 74 25

1054 gi3844648 Mycoplasma genitalium glycerol kinase (glpK) 71 28

1054 gi18448155 Ipomoea leaf curl virus AC3 70 27

1054 gi|12044888|ref|NP_072698.1 Mycoplasma genitalium glycerol kinase (glpK) 71 28

1056 AAM56747 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28852. 229 72

1056 AAM67067 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27373. 224 69

1056 AAM54664 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26769. 224 69

1058 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 228 79

1058 gi|21103962|gb|AAM33141.1 Homo sapiens enverin-2 209 77

1058 gi|8272468|gb|AAF74215.1|AF156963_1 Homo sapiens envelope protein 198 75

1059 120380199 Homo sapiens Similar to LOC168246 251 100

1059 gi|8388692|emb|CAB94042.1| Leishmania major probable DNA-binding protein 67 46

1060 gi|21292780|gb|EAA04925.1| Anopheles gambiae str. PEST agCP4203 70 39

1061 gi330862 Equine herpesvirus 1 membrane glycoprotein 179 30

1061 gi17221106 Equine herpesvirus 1 glycoprotein gp2 178 34

1061 AAE03643 Homo sapiens INCY- Human extracellular matrix and cell adhesion molecule-7 (XMAD-7). 175 29

1062 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 334 66

1062 gi|1335205|emb|CAA36480.1| Homo sapiens ORFII 332 66

1063 gi21323402 Corynebacterium glutamicum ATCC 13032 ABC-type transporter, periplasmic component 70 36

1063 gi|19551869|ref|NP_599871.1| Corynebacterium glutamicum COG1464:ABC-type uncharacterized transport systems, periplasmic component 70 36

1063 gi|17551878|ref|NP_499090.1 Caenorhabditis elegans TPR Domain 67 37

1064 gi2308977Aspergilluschitin synthase 66 29

nidulans

1065 gi18076958Yarrowia Optl protein 74 30

lipolytica

1065 gi786145 Walleye envelope polyprotein 73 28
dermal

sarcoma
virus

1065 gi2801522Walleye gPr env 73 28
dermal

sarcoma
virus

1066 gi9294279 Arabidopsis thaliana Ta11-like non-LTR retroelement protein-like; CHP-rich zinc finger protein-like 67 32

1066 gi|20848817|ref|XP_138010.1 Mus musculus similar to HEAT SHOCK COGNATE PROTEIN 80 83 69

1069 AAM77637 Homo SapiensMOLE- Human bone marrow 96 65

expressed probe encoded
protein SEQ

ID NO: 37943.

1069 AAM64901 Homo SapiensMOLE- Human brain expressed96 65
single

exon probe encoded protein
SEQ ID

NO: 37006.

1069 gig 17473741Homo Sapienssimilar to Meningioma-expressed112 56
~

ref~~ antigen 6/11 (MEA6) (MEAL
0623 l)

80.1

1070 gi296288 Homo Sapienshistone H1 77 44

1070 15923857 Artemisia annua squalene synthase 75 35

1070 AAO08837 Homo SapiensHYSE- Human polypeptide 73 39
SEQ ID

NO 22729.

1071 gi21483554 Drosophila melanogaster SD02058p 72 29

1071 gi8515845 Homo sapiens hepatocellular carcinoma associated protein TD26 71 38

1071 gi|21483554|gb|AAM52752.1| Drosophila melanogaster SD02058p 72 29

1072 g15902896Streptomycestype I polyketide synthase74 50
AVES 4

avermitilis

1072 gi~21301752~Anopheles agCP8235 70 34

gb~EAA138gambiae
str.

97.1 PEST

1073 AAV30916 Homo SapiensGEMY Human secreted protein9.9 66

_ AR415 4 cDNA.
aal

1073 ABB89113 Homo SapiensHUMA- Human polypeptide 99 66
SEQ ID

NO 1489.

1073 AAB90679 Homo SapiensGEMY Human AR415 4 protein99 66

sequence SEQ ID 35.

1074 AAG99338 Homo sapiens TAKE Human atypical tachykinin protein fragment SEQ ID NO: 20. 380 92

1074 AAG99336 Homo sapiens TAKE Human atypical tachykinin protein fragment SEQ ID NO: 13. 329 91

1074 AAG99333 Homo sapiens TAKE Human atypical tachykinin protein fragment SEQ ID NO: 3. 324 91

1075 g117945760Drosophila RE33302p 305 29

melanogaster

1075 gi1039447SaccharomycesLpblp 91 25

cerevisiae

1075 AAB64777 Homo SapiensHUMA- Human secreted 78 77
protein

sequence encoded by gene
5 SEQ ID

N0:63.

1076 AAB50261 Homo sapiens CORI- Human breast cancer associated B726P-20 protein. 308 39

1076 AAB50244 Homo sapiens CORI- Human breast cancer associated B726P-79 protein. 308 39

1076 AAB84702 Homo sapiens CORR Amino acid sequence of a human cancer associated antigen. 308 39

1077 12529735 Gorilla gorilla glycophorin B/E precursor 71 31

1077 AAB74724 Homo sapiens INCY- Human membrane associated protein MEMAP-30. 70 31

1077 gi4164424 Schizosaccharomyces pombe similar to yeast cytoskeleton control protein Bni1p 70 24

1078 g118145107Clostridiumprobable transcriptional71 28
regulator

perfringens

1078 gi~9581801~ePlasmodium guanylyl cyclase 69 24

mb~CAC005falciparum

46.1

1078 gi~16805032~Plasmodium Ser/Thr protein kinase 69 26

ref~NP_4730falciparum

61.1

1079 gi~20886321~Mus musculussimilar to olfactory 72 34
receptor, family 5,

ref~XP subfamily V, member 1;
1406 olfactory

_ receptor, family 5, subfamily
14.1 V

member 1

1081 gi9650824 Petroselinum crispum common plant regulatory factor 5 76 28

1081 gi559695 Hydrolagus colliei This CDS feature is included to show the translation of the corresponding C_region. Presently translation qualifiers on C_region features are illegal 74 31

1081 gi476622 Hydrolagus colliei immunoglobulin light chain 74 31

1082 AAM39205 Homo SapiensHYSE- Human polypeptide 363 71
SEQ ID

NO 2350.

1082 AA007159 Homo SapiensHYSE- Human polypeptide 357 76
SEQ ID

NO 21051.

1082 AAM40991 Homo SapiensHYSE- Human polypeptide 343 79
SEQ ID

NO 5922.

1083 gi|17229222|ref|NP_485770.1 Nostoc sp. PCC 7120 similar to HetF protein 72 30

1084 gi17221628 Felis catus T-lymphocyte surface CD2 antigen 76 38

1084 gi18565073 Crimean-Congo hemorrhagic fever virus envelope glycoprotein precursor 74 29

1084 gi|17221628|dbj|BAB78475.1 Felis catus T-lymphocyte surface CD2 antigen 76 38

1085 117430213 Ralstonia solanacearum PUTATIVE HEMAGGLUTININ-RELATED PROTEIN 74 26

1087 gi2323287 multiple sclerosis associated retrovirus polyprotein 618 79

1087 gi|4996596|dbj|BAA78549.1| Human endogenous retrovirus W polyprotein 317 74

1087 gi|9630708|ref|NP_047255.1 Feline leukemia virus gag-pol precursor polyprotein gPr80 293 38

1088 gi15075953 Sinorhizobium meliloti PUTATIVE MOLYBDENUM TRANSPORT SYSTEM PERMEASE ABC TRANSPORTER PROTEIN 70 56

1088 gi2288880 Arthrobacter nicotinovorans transmembrane protein 67 56

1088 gi17298547 Bradyrhizobium japonicum ModB 67 56

1089 AAY95660 Homo sapiens ZYMO Human Zntr2 protein. 231 61

1089 AAU83682 Homo sapiens GETH Human PRO protein, Seq ID No 182. 210 59

1089 AAY99386 Homo sapiens GETH Human PRO1305 (UNQ671) amino acid sequence SEQ ID NO:153. 210 59

1090 gi7688355Solanum Dof zinc finger protein 70 31

tuberosum

1090 gi4389445Drosophila transcription factor 67 32

melanogaster

1090 gi~7688355~eSolanum Dof zinc finger protein 70 31

mb~CAB898tuberosum

31.1

1092 AAG78884Homo SapiensBIOW- Human ribosomal 90 44
protein s5-

17.

1092 AAM91239Homo SapiensHUMA- Human 72 53

immune/haematopoietic
antigen SEQ

ID NO:18832.

1092 AAM95026Homo sapiensHUMA- Human reproductive72 48
system

related antigen SEQ ID
NO: 3684.

1094 gi18676450Homo sa iensFLJ00122 protein 69 38

1094 gi18073428Homo sa iensstabilin-2 69 38

1094 gi~20806091~Homo Sapiensstabilin-2; CD44-like 69 38
precursor FELL

ref~NP_0600

34.8

1095 gi20906397Methanosarcinaconserved protein 76 44

mazei Goel

1095 gi~21299784~Anopheles agCP6531 75 30

gb~EAA119gambiae str.

29.1 PEST
~

1095 gi~17549046~Ralstonia CONSERVED HYPOTHETICAL 73 32

reflNP-5223solanacearumPROTEIN

86.1

1096 AAB58317 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 655. 678 100

1096 gi862600 Drosophila melanogaster male-specific lethal-1 protein 176 25

1096 gi601930 Oryctolagus neurofilament-H 115 24

cuniculus

1097 AAU83109 Homo SapiensZYMO Novel secreted 76 85
protein

Z701935G4P.

1097 gi|20348496|ref|XP_111712.1 Mus musculus similar to RIKEN cDNA 9030605E16 72 57

1098 gi18031887 Mus musculus Fanconi anemia complementation group G 77 29

1098 112002137 Mus musculus Fanconi anemia group G protein 77 29

1098 AAB72381 Homo sapiensLEEM/ Human hairy and 75 28
enhancer of

S lit homolo a amino
acid se uence.

1099 gi8217648 Homo sapiens dJ579F20.1 (high-mobility group (nonhistone chromosomal) protein 1-like 1) 159 70

1099 gi5815432 Gallus gallus high mobility group protein HMG1 154 70

1099 14140289 Gallus gallus high mobility group 1 protein 154 70

1100 ABB11527 Homo sapiens HYSE- Human apolipoprotein B receptor homologue, SEQ ID NO:1897. 84 26

1100 1487347 Homo sapiens breakpoint cluster region protein 81 32

1100 gi144050 Bordetella pertussis filamentous hemagglutinin 78 30

1102 AAM68946 Homo SapiensMOLE- Human bone marrow327 81

expressed probe encoded
protein SEQ

ID NO: 29252.

1102 AAM79768 Homo SapiensHYSE- Human protein 324 80
SEQ ID NO

3414.

1102 AAM78784 Homo SapiensHYSE- Human protein 324 80
SEQ ID NO

1446.

1103 AAZ11186 Homo SapiensSAGA Gene encoding transmembrane143 68

_ domain containing protein
aal clone

HP02239.

1103 AAD31079_Homo SapiensINCY- Human cornichon 143 68
protein

aal (CORN) cDNA.

1103 AAA88439_Homo SapiensGETH Antitumour PR0181 143 68
cDNA

aal clone DNA23330-1390.

1104 ABB07527 Homo sapiensINCY- Human drug metabolizing562 100

enzyme (DME) (ID: 5643401CD1).

1104 ABB07515 Homo SapiensINCY- Human drug metabolizing562 100

enzyme (DME) ID: 8097779CD1).

1104 113161409Mus musculusfamily 4 cytochrome 431 76
P450

1107 g113542874Mus musculusSimilar to CGI-67 protein677 64

1107 AAU81978 Homo sapiens INCY- Human secreted protein SECP4. 665 65

1107 AAU77137 Homo SapiensMILL- Human alpha/beta 665 65
hydrolase

38618 polypeptide.

1108 113620885Homo Sapiensmitochondrial ribosomal323 100
protein S6

1108 113620887Mus musculusmitochondrial ribosomal284 82
protein S6

1108 g119713140FusobacteriumFusobacterium outer 79 28
membrane protein

nucleatum family
subsp.

nucleatum

ATCC 25586

1109 gi18378673 Homo sapiens PATE 607 89

1109 gi5305193 Rattus norvegicus sperm protein 10 108 30

1109 gi969103 Mus musculusmSP-10 107 27

1110 12462979 Bos taurus Tenascin-X 119 34

1110 gi3413958 Homo sapiens LDL receptor related protein 105 110 27

1110 gi13938519 Homo sapiens low density lipoprotein receptor-related protein 3 110 27

1111 gi17981053 Mus musculus transcription factor NFAT5 82 32

1111 gi15425825 Mus musculus tonicity-responsive enhancer binding protein 82 32

1111 gi6911148 Mus musculus transcription factor NFAT5 isoform b 82 32

1112 gi6634473 Metarhizium anisopliae var. anisopliae adenylate cyclase, ACY 73 30

1113 AAU19759 Homo sapiens HUMA- Human novel extracellular matrix protein, Seq ID No 409. 900 70

1113 gi3171934 Mus musculus neuronal-STOP protein 886 52

1113 gi2769587 Mus musculus STOP protein 885 52

1114 gi18652188 Oenococcus oeni OppF 72 41

1115 gi9119 Drosophila sp. fos-related antigen 69 37

1115 gi7769652 Drosophila melanogaster Fos-related antigen 69 37

1115 gi17862946 Drosophila melanogaster SD04477p 69 37

1116 121212948 Mus musculus peroxisomal protein (PeP) 243 83

1116 12347114 Mus musculus CC chemokine receptor-5 72 28

1116 12431976 Mus musculus CCR5 72 28

1117 gi|20825251|ref|XP_131998.1| Mus musculus similar to RE1-silencing transcription factor; neuron restrictive silencer factor; repressor binding to the X2 box 77 40

1117 gi|15597871|ref|NP_251365.1 Pseudomonas aeruginosa probable type II secretion system protein 69 41

1118 gi|3860513|emb|CAA13574.1| Mus famulus reverse transcriptase 303 82

1118 gi|3860536|emb|CAA13577.1| Mus saxicola reverse transcriptase 303 81

1118 gi|3860510|emb|CAA13573.1 Mus dunni reverse transcriptase 298 63

1119 AA004758 Homo SapiensHYSE- Human polypeptide234 59
SEQ ID

NO 18650.

1119 AAM69569 Homo sapiensMOLE- Human bone marrow220 63

expressed probe encoded
protein SEQ

ID NO: 29875.

1119 AAM67717 Homo SapiensMOLE- Human bone marrow219 49

expressed probe encoded
protein SEQ

ID NO: 28023.

1120 gi21107877 Xanthomonas axonopodis pv. citri str. 306 cytochrome C 78 27

1120 gi15292331 Drosophila melanogaster LD47230p 77 42

1120 115072444 Avian paramyxovirus 6 phosphoprotein 72 38

1121 AAB44126 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1571. 150 83

1121 gi550015 Homo sapiens ribosomal protein L21 150 83

1121 gi619788 Homo sapiens L21 ribosomal protein 150 83

1122 AAU74448 Homo sapiens OULU- Human protein sequence of lysyl hydroxylase 1 (LH1). 125 100

1122 1190074 Homo sapiens lysyl hydroxylase 125 100

1122 gi5817297 Homo sapiens lysyl hydroxylase 1 125 100

1123 gi21281601 Caenorhabditis elegans C. elegans PQN-44 protein (corresponding sequence F55A12.9c) 78 34

1123 gi14578225 Caenorhabditis elegans C. elegans PQN-44 protein (corresponding sequence F55A12.9b) 76 38

1123 gi2088669 Caenorhabditis elegans C. elegans PQN-44 protein (corresponding sequence F55A12.9a) 76 38

1125 AAU17301 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 866. 344 88

1125 AAE11776 Homo sapiens INCY- Human kinase (PKIN)-10 protein. 344 88

1125 AAU17304 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 869. 340 86

1126 AAM41712 Homo sapiensHYSE- Human polypeptide 152 96
SEQ ID

NO 6643.

1126 AAM39926 Homo SapiensHYSE- Human polypeptide 152 96
SEQ ID

NO 3071.

1126 AAM79067 Homo SapiensHYSE- Human protein SEQ 152 96
ID NO

1729.

1127 AAE02938 Homo sapiens MILL- Human adenylate cyclase 25678. 252 98

1127 AAB02006 Homo sapiens TEXA Adenylyl cyclase type II-C2 C2 alpha domain. 252 98

1127 gi202752 Rattus norvegicus adenylyl cyclase type II 252 98

1128 AAA94860_aa1 Homo sapiens TEXA Human caspase activator Smac coding sequence. 96 100

1128 AAU78447 Homo sapiens UYJE- Inhibitor of apoptosis (IAP) protein Smac. 96 100

1128 AAB26210 Homo sapiens TEXA Human caspase activator Smac. 96 100

1129 gi3874765 Caenorhabditis elegans Similarity to Drosophila acetylcholine receptor protein (SW:ACH1_DROME), contains similarity to Pfam domain: PF00065 (Neurotransmitter-gated ion-channel), Score=296.9, E-value=5e-86, N=3 97 30

1129 gi6681597 Yaba monkey tumor virus similar to vaccinia G8R 72 28

1129 gi|17548199|ref|NP_509932.1| Caenorhabditis elegans acetylcholine receptor 97 30

1130 gi|17564116|ref|NP_506484.1 Caenorhabditis elegans tyrosine-protein kinase 73 29

1131 113925613 Homo sapiens insulinoma-associated protein IA-6 88 27

1131 gi158485 Drosophila melanogaster son of sevenless protein 85 24

1131 gi728778205-Feb-1998symbol=Sos; 85 24

synonym=BG:DS00941.4;

match=method:"sim4",
score:"1000.0",

desc:"GenBank::M83931:Drosophila

melanogaster son of sevenless
(Sos)

mRNA, complete cds. CDS:346..5133;

PID:g158485.", species:"Drosophila

melanogaster' ;

match=method: "BLASTX",

version:"2.Oa19MP-WashU
[Build

so12.5-ultra 01:47:30

1132 gi9696 Mytilus edulis polyphenolic adhesive protein 75 25

1134 gi13562016 Plectreurys tristis fibroin 2 72 29

1134 gi1129074 Bacillus subtilis beta-N-acetylglucosaminidase 69 28

1134 gi2636104 Bacillus subtilis N-acetylglucosaminidase (major autolysin (CWBP90) 69 28

1135 AAB58870 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 578. 72 80

1135 111595476 Homo sapiens RPB11b1beta protein 72 80

1135 AAB44840 Homo sapiens HUMA- Human secreted protein encoded by gene 11. 69 45

1137 gi206985 Rattus norvegicus troponin I 70 46

1137 gi16945895 Takifugu rubripes SUN-like 1 70 31

1137 gi|8394466|ref|NP_058881.1 Rattus norvegicus troponin I, skeletal, fast 2 70 46

1140 AA004998 Homo SapiensHYSE- Human polypeptide 277 96
SEQ ID

NO 18890.

1140 gi19917538 Methanosarcina acetivorans str. C2A [Methanosarcina acetivorans C2A] mttA/Hcf106 protein 80 28

1140 14959705 Mus musculus fibulin-2 76 28

1141 gi10141010 Vesicular exanthema of swine virus non-structural polyprotein 91 31

1141 gi6566147 Drosophila melanogaster large Forked protein 85 30

1141 gi2317953 murid herpesvirus 4 glycoprotein 150 79 28

1142 AAB54067 Homo sapiens HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:519. 218 56

1142 gi1710365 Mus musculus noggin 89 29

1142 gi21105761 Equus caballus noggin 89 29

1143 gi|21295753|gb|EAA07898.1| Anopheles gambiae str. PEST agCP1560 69 26

1144 gi505094 Homo sapiens similar to an actin bundling protein, dematin. 127 35

1144 gi2337952 Homo sapiens actin-binding double-zinc-finger protein 122 36

1144 gi21304227 Oryza sativa ovule development aintegumenta-like protein BNM3 76 29

1145 gi|21298336|gb|EAA10481.1| Anopheles gambiae str. PEST agCP2121 68 37

1146 AAW22049 Homo sapiens INCY- Interferon gamma inducing factor-2 (IGIF-2) alternate transcript variant. 221 100

1146 AAV05368_aa1 Homo sapiens SCHE cDNA encoding human interleukin-1-gamma. 167 84

1146 AAH78060_aa1 Homo sapiens STRD Nucleotide sequence of human interleukin 18 (IL-18). 167 84

1147 AAY57937 Homo SapiensINCY- Human transmembrane123 100
protein

HTMPN-61.

1147 gi~20345904~Mus musculussimilar to delta-like 105 86
homolog

ref~XP_1098 (Drosophila)

23.1

1148 gi19069293 Encephalitozoon cuniculi similarity to ADP/ATP CARRIER PROTEIN 75 32

1148 gi8978336 Arabidopsis thaliana contains similarity to CHP-rich zinc finger protein~gene_id:K23F3.4 74 26

1148 gi19716318 Aspergillus flavus antigenic cell wall protein MP1 74 32

1149 gi5456699 Emericella nidulans ATP-binding cassette multidrug transport protein ATRC 70 35

1149 gi|20898840|ref|XP_139387.1| Mus musculus similar to HSPC038 protein 69 31

1150 gi3883128 Arabidopsis thaliana arabinogalactan-protein 96 32

1150 gi17429208 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 92 26

1150 gi4063766 Emericella nidulans chitinase 91 27

1151 gi13561058Homo SapiensdJ1108D11.1 (novel protein107 31
similar to

C. elegans T22C1.7 )

1151 gi21105299Mytilus precollagen-NG 105 26

alloprovincialis

1151 gi14164347Oncorhynchuscollagen al(I) 96 28

mykiss

1152 gi18479434 Mus musculus olfactory receptor MOR188-1 76 33

1152 gi2653915 Oran virus glycoprotein G1 and G2 precursor; envelope glycoprotein precursor 72 46

1152 gi18479436 Mus musculus olfactory receptor MOR188-2 72 33

1153 gi3403167 Homo sapiens GBAS 161 86

1153 112804791 Homo sapiens glioblastoma amplified sequence 161 86

1153 AAB57149 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1727. 134 81

1154 gi17742234 Agrobacterium tumefaciens str. C58 (U. Washington) histidase 87 35

1154 gi15159496 Agrobacterium tumefaciens str. C58 (Cereon) AGR_L_1400GMp 87 35

1154 gi158521 Drosophila melanogaster seven-up protein type 2 80 32

1155 gi|10441551|gb|AAG17099.1|AF189115_1 Cryptotermes domesticus cytochrome b 65 28

1156 AA012089Homo SapiensHYSE- Human polypeptide 475 98
SEQ ID

NO 25981.

1156 gi20147787 Xenopus laevis nuclear receptor corepressor 74 25

1156 gi19881705 Oryza sativa Putative transposable element 72 32

1157 19963851 Homo sapiens HT019 80 34

1157 AAB93530 Homo sapiens HELI- Human protein sequence SEQ ID NO:12884. 77 34

1157 11040970 Homo sapiens fus-like protein 77 42

1158 19795254 Sepia officinalis GABA-A receptor beta subunit 71 27

1158 gi15026157 Clostridium acetobutylicum amidase, germination specific (cwlC/cwlD B.subtilis ortholog) 68 34

1158 gi|9795254|gb|AAF97816.1 Sepia officinalis GABA-A receptor beta subunit 71 27

1159 AAB93423Homo sapiensHELI- Human protein sequence336 100
SEQ

ID NO:12641.

1159 g113097768Homo SapiensSimilar to RIKEN cDNA 336 100
2900073H19

ene

1159 g120071708Mus musculusRIKEN cDNA 2900073H19 334 96
gene

1160 AAM72558 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 32864. 274 100

1160 AAM59959 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32064. 274 100

1161 AAB07704 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 139 36

1161 gi8272464 Homo sapiens gag 139 36

1161 gi|5726238|gb|AAD48375.1|AF123881_1 multiple sclerosis associated retrovirus element gag polyprotein 131 35

1162 AAU25448 Homo sapiens INCY- Human mddt protein from clone LG:1083264.1:2000MAY19. 346 79

1162 AAU11265 Homo sapiens BODE- Human zinc finger protein 51. 319 65

1162 AAB95637 Homo sapiens HELI- Human protein sequence SEQ ID NO:18371. 314 67

1163 gi14189950 Homo sapiens connexin 58 536 84

1163 gi9957542 Homo sapiens connexin 59 536 84

1163 gi10946367 Danio rerio connexin 55.5 485 81

1164 gi755700 Bombyx mori sericin 1B 76 27

1164 gi19569861 Dictyostelium discoideum RTOA protein (Ratio-A). 76 28

1164 gi10580635 Halobacterium sp. NRC-1 Vng1087c 76 25

1165 gi19915386 Methanosarcina acetivorans str. C2A WD-domain containing protein [Methanosarcina acetivorans str. C2A] 89 28

1165 gi5639663 Homo sapiens WD repeat protein WDR3 83 28

1165 gi11544739 Homo sapiens dJ776P7.2 (WD repeat domain 3) 83 28

1166 AAM69338 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29644. 72 31

1166 AAM56953 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29058. 72 31

1166 gi20197507 Arabidopsis thaliana expressed protein 67 39

1167 gi5802812 Homo sapiens Gag protein 83 30

1167 gi7160650 Bordetella bronchiseptica pertactin (P.68) 79 31

1167 gi13173444 Bordetella bronchiseptica pertactin 79 31

1168 gi1495029 Danio rerio protein kinase CK2 alpha' 84 24

1168 gi643443 Penicillium chrysogenum PHOG 82 32

1168 gi|18858419|ref|NP_571315.1| Danio rerio casein kinase 2 alpha 2 84 24

1169 gi206716 Rattus norvegicus salivary proline-rich protein 90 31

1169 gi15029903 Mus musculus Similar to proline-rich protein BstNI subfamily 2 89 36

1169 gi53182 Mus musculus proline rich protein 81 34

1170 gi|17553370|ref|NP_498318.1| Caenorhabditis elegans F40H6.5.p 78 33

1170 gi|15215731|gb|AAK91411.1| Arabidopsis thaliana AT4g36780/C7A10_580 73 30

1171 gi340446 Homo sapiens zinc finger protein 7 (ZFP7) 218 61

1171 AAB43928 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1373. 216 58

1171 AAB21040 Homo sapiens INCY- Human nucleic acid-binding protein, NuABP-44. 213 48

1172 AAE04368 Homo sapiens INCY- Human kinase (PKIN)-9. 120 85

1172 AAM79153 Homo sapiens HYSE- Human protein SEQ ID NO 1815. 120 85

1172 AAE10614 Homo sapiens CURA- Human novel STE20-like protein, NOV-3d. 120 85

1173 gi218572 Pan troglodytes prot GOR 74 29

1173 gi243898 Pan GOR 74 29

1173 gi1666473 Mus musculus NOV protein 71 50

1174 gi5901830 Drosophila melanogaster BcDNA.GH07910 74 31

1174 AAM80237 Homo sapiens HYSE- Human protein SEQ ID NO 3883. 71 38

1174 ABB11528 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:1898. 71 38

1175 gi|12054759|emb|CAC20748.1| Podospora anserina catalase A 65 33

1176 AAM93289 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 2777. 145 100

1176 gi17431512 Ralstonia solanacearum PUTATIVE OUTER MEMBRANE CHANNEL LIPOPROTEIN TRANSMEMBRANE 71 26

1176 gi15823991 Streptomyces avermitilis modular polyketide synthase 70 51

1177 AAM41939 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6870. 84 61

1177 gi870751 Homo sapiens N-acetylgalactosamine 6-sulfate sulfatase (GALNS) 84 61

1177 gi618426 Homo sapiens N-acetylgalactosamine 6-sulphatase 84 61

1178 gi435855 Mus sp. CREB-binding protein; CBP 89 22

1178 AAW40058 Homo sapiens USSH Cellular transcriptional factor CBP. 87 22

1178 gi17944308 Drosophila melanogaster RE12101p 86 26

1179 AAM25814 Homo sapiens HYSE- Human protein sequence SEQ ID NO:1329. 73 93

1179 AAM25290 Homo sapiens HYSE- Human protein sequence SEQ ID NO:805. 73 93

1179 AAM79441 Homo sapiens HYSE- Human protein SEQ ID NO 3087. 73 93

1180 AAB88388 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0131. 719 97

1180 gi20810493 Homo sapiens Similar to RIKEN cDNA 2810417M05 gene 716 96

1180 AAD30543_aa1 Homo sapiens MILL- Human B7RP-2 DNA. 83 38

1181 ABB14686 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3343. 190 97

1181 gi14329731 Secale cereale high molecular weight glutenin subunit x 88 27

1181 gi14329761 Triticum aestivum high molecular weight glutenin subunit x 84 26

1182 gi11692645 Mus musculus aspartyl beta-hydroxylase 74 28

1182 gi11878112 Mus musculus aspartyl beta-hydroxylase 6.6 kb transcript 74 28

1182 gi11878110 Mus musculus aspartyl beta-hydroxylase 4.5 kb transcript 74 28

1183 gi15485622 Homo sapiens Q9H4T4 like 80 25

1183 gi19714949 Fusobacterium nucleatum subsp. nucleatum ATCC 25586 TonB protein 78 32

1183 gi7717375 Homo sapiens human CHD2-52 down syndrome cell adhesion molecule 71 23

1184 AAU83667 Homo sapiens GETH Human PRO protein, Seq ID No 152. 388 100

1184 AAG89161 Homo sapiens GEST Human secreted protein, SEQ ID NO: 281. 388 100

1184 AAY99348 Homo sapiens GETH Human PRO1194 (UNQ607) amino acid sequence SEQ ID NO:29. 388 100

1185 AAB93506 Homo sapiens HELI- Human protein sequence SEQ ID NO:12830. 543 100

1185 AAB87570 Homo sapiens GETH Human PRO1268. 426 95

1185 AAY78808 Homo sapiens PROT- Hydrophobic domain containing protein clone HP10537 protein sequence. 426 95

1187 gi15823978 Streptomyces avermitilis modular polyketide synthase 75 41

1187 AAB66657 Homo sapiens HSCR- Human elastin protein without signal peptide. 71 39

1187 AAY69137 Homo sapiens UNSY Amino acid sequence of a human tropoelastin derivative. 71 39

1188 gi6907090 Oryza sativa (japonica cultivar-group) Similar to Oryza sativa root-specific RCc3 mRNA. (L27208) 76 30

1188 AAY36063 Homo sapiens GEST Extended human secreted protein sequence, SEQ ID NO. 448. 74 26

1188 AAY35971 Homo sapiens GEST Extended human secreted protein sequence, SEQ ID NO. 220. 73 26

1189 gi9827989 Leishmania major possible CG12797 protein 72 36

1189 gi|13625467|gb|AAK35068.1| Leishmania donovani LACK protective antigen 68 27

1190 gi17027071 Xiphocentron sp. UMSP000029372-Costa Rica elongation factor-1 alpha 107 27

1190 gi310665 Strongylocentrotus purpuratus Nf Y-A subunit 88 24

1190 gi21743 Triticum aestivum high molecular weight glutenin subunit 1Ax1 86 23

1191 gi16878287 Homo sapiens Similar to C-terminal modulator protein 167 96

1191 gi15866714 Homo sapiens C-terminal modulator protein 167 96

1191 AAO06984 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20876. 132 83

1192 AAD05496_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 5 cDNA clone HHBCS39, SEQ ID NO:15. 859 100

1192 AAE01707 Homo sapiens HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:119. 859 100

1192 AAE01676 Homo sapiens HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:88. 859 100

1193 gi18650588 Homo sapiens retinoic acid early transcript 1 1312 99

1193 AAB15540 Homo sapiens INCY- Human immune system molecule from Incyte clone 3402252. 1283 97

1193 ABB84887 Homo sapiens GETH Human PRO791 protein sequence SEQ ID NO:142. 1234 94

1195 gi1196427 Homo sapiens a 2 protein 248 50

1195 gi1780975 Human endogenous retrovirus K gag protein 248 50

1195 gi1556397 Human endogenous retrovirus K gag 248 50

1196 gi556256 Leishmania donovani G protein alpha subunit 72 22

1197 AAY07237 Homo sapiens ISTF Wild type monocyte chemotactic protein 2. 121 100

1197 AAY05300 Homo sapiens ISTF C-C chemokine, MCP2. 121 100

1197 AAW42072 Homo sapiens INCY- Human MC proprotein. 121 100

1198 ABB57423 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 69. 187 79

1198 ABB57394 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 40. 187 79

1198 AAY59757 Homo sapiens META- Human normal ovarian tissue derived protein 34. 187 79

1199 AAY72603 Homo sapiens INCY- Human Electron Transfer Protein, ETRN-1. 155 100

1199 AAB88465 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0259. 155 100

1199 AAE03926 Homo sapiens HUMA- Human gene 29 encoded secreted protein HTADC63, SEQ ID NO:89. 155 100

1200 gi6458884 Deinococcus radiodurans chorismate mutase/prephenate dehydratase 73 42

1201 gi20803920 Mesorhizobium loti HYPOTHETICAL PROTEIN 68 32

1201 gi|17545158|ref|NP_518560.1| Ralstonia solanacearum PUTATIVE LIPASE/ESTERASE PROTEIN 66 31

1202 AAM67586 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27892. 69 30

1202 AAM55191 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27296. 69 30

1202 gi849219 Saccharomyces cerevisiae Pro1p: Glutamate 5-kinase (Swiss Prot. accession number P32264) 69 33

1203 gi18676554 Homo sapiens FLJ00174 protein 269 84

1203 gi|20913341|ref|XP_126763.1| Mus musculus similar to FLJ00174 protein 125 81

1203 gi|20850247|ref|XP_136664.1| Mus musculus similar to proline-rich protein 121 33

1204 AAM68056 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28362. 140 84

1204 AAM55676 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27781. 140 84

1205 gi541624 Drosophila virilis pdm2 71 39

1205 gi9955855 Aspergillus oryzae RNA polymerase II largest subunit 69 38

1205 gi662296 Rattus norvegicus MIBP1 68 32

1206 ABB50703 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:651. 260 94

1206 AAW88802 Homo sapiens HUMA- Polypeptide fragment encoded by gene 52. 260 94

1206 ABB50706 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:654. 143 96

1207 AAM79588 Homo sapiens HYSE- Human protein SEQ ID NO 3234. 72 41

1207 AAM78604 Homo sapiens HYSE- Human protein SEQ ID NO 1266. 72 41

1207 AAB58944 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 652. 72 41

1208 AAE03429 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 112. 575 64

1208 gi19110438 Homo sapiens polycystin-1L1 575 64

1208 AAE03463 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 146. 185 97

1209 gi6760015 Homo sapiens brain protein 1114 85

1209 gi1747306 Mus musculus SDR2 151 31

1209 gi20381292 Mus musculus stromal cell derived factor receptor 2 151 31

1211 gi14043211 Homo sapiens Similar to RIKEN cDNA 4931428F04 gene 460 89

1211 gi190508 Homo sapiens salivary proline-rich protein precursor 113 28

1211 gi12862320 Homo sapiens WDC146 102 28

1212 AAO14407 Homo sapiens FARB Human 11 beta-hydroxysteroid dehydrogenase 1-like enzyme. 291 63

1212 AAM79592 Homo sapiens HYSE- Human protein SEQ ID NO 3238. 217 45

1212 gi4581319 Homo sapiens dJ28O10.3 (HSD11B1 (hydroxysteroid (11-beta) dehydrogenase 1) 217 45

1213 AAR06514 Homo sapiens STRI Natural human Platelet Factor-4var1 encoded by EcoRI fragment. 238 64

1213 gi292390 Homo sapiens platelet factor 4 238 64

1213 AAZ28361_aa1 Homo sapiens SMIK Platelet factor-4 (PF-4) nucleotide sequence. 200 56

1214 AAD12580_aa1 Homo sapiens SAGA Human protein having hydrophobic domain encoding cDNA clone HP10753. 162 82

1214 AAD08193_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 3 cDNA clone HNTAC64, SEQ ID NO:13. 162 82

1214 AAD05544_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 12 cDNA clone HNTAC64, SEQ ID NO:63. 162 82

1215 gi21429094 Drosophila melanogaster LD38004p 354 49

1215 gi15292155 Drosophila melanogaster LD40717p 354 49

1215 AAG75596 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6360. 294 50

1216 gi7248894 Xenopus laevis Arg protein-tyrosine kinase 84 35

1216 gi402191 Mus musculus HNF-3beta 80 26

1216 gi404764 Mus musculus fork head related protein 80 26

1218 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 559 74

1218 AAO03505 Homo sapiens HYSE- Human polypeptide SEQ ID NO 17397. 502 81

1218 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 467 66

1220 AAO01188 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15080. 248 86

1220 AAY73334 Homo sapiens INCY- HT1ZM clone 180506179 35
protein

sequence.

1220 gi20249 Oryza sativa gt-2 77 32

1221 gi4519619 Haliotis discus collagen pro alpha-chain 90 28

1221 gi7380690 Neisseria meningitidis Z2491 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37

1221 gi7225645 Neisseria meningitidis MC58 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37

1222 ABA05334_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 coding sequence. 2154 99

1222 AAM47905 Homo sapiens MILL- Human fucosyltransferase family member 32132. 2154 99

1222 ABA05333_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 encoding cDNA. 2154 99

1223 AAY21852 Homo sapiens INCY- Human signal peptide-containing protein (SIGP) (clone ID 2652271). 150 100

1223 AAY48563 Homo sapiens META- Human breast tumour-associated protein 24. 150 100

1223 AAW75103 Homo sapiens HUMA- Human secreted protein encoded by gene 47 clone HMCBP63. 150 100

1224 AAM67078 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27384. 517 99

1224 AAM54676 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26781. 517 99

1224 gi17467358 Sus scrofa MIF2 suppressor 184 80

1225 gi9454237 Cochliobolus sativus DNA binding protein MAT-1 73 30

1225 gi21428792 Drosophila melanogaster GH03582p 72 38

1225 gi6633838 Arabidopsis thaliana F2K11.15 70 31

1226 gi21430124 Drosophila melanogaster HL01222p 76 28

1226 AAM77437 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37743. 72 33

1226 AAM64659 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36764. 72 33

1227 AAM50715 Homo sapiens MILL- Human TRP-like calcium channel-5 (TLCC-5). 243 83

1227 gi|20874183|ref|XP_131003.1| Mus musculus similar to hornerin 80 29

1227 gi|17864717|gb|AAK15791.1| Mus musculus hornerin 80 29

1229 gi4019247 Ateline herpesvirus 3 thymidine kinase 71 46

1229 gi2760368 Drosophila melanogaster Shar pei/DRhoGEF2 70 26

1229 gi17862944 Drosophila melanogaster SD04476p 70 26

1230 gi4559296 Mus musculus silencing mediator of retinoic acid and thyroid hormone receptor extended isoform 80 30

1230 gi18181872 Mus musculus GATA-2 protein 78 41

1230 gi18033511 Rattus norvegicus transcription factor GATA-2 78 41

1231 gi13365501 Cyprinus carpio integrin beta2-chain 75 27

1231 gi3322933 Treponema pallidum DNA ligase (lig) 73 32

1231 gi|13365501|dbj|BAB39130.1| Cyprinus carpio integrin beta2-chain 75 27

1232 AAM79791 Homo sapiens HYSE- Human protein SEQ ID NO 3437. 78 35

1232 AAM78807 Homo sapiens HYSE- Human protein SEQ ID NO 1469. 78 35

1232 AAB19338 Homo sapiens INCY- Amino acid sequence of a human fibrous protein (FIBR). 78 35

1233 AAU21459 Homo sapiens HUMA- Human novel foetal antigen, SEQ ID NO 1703. 87 26

1233 gi15081227 Arabidopsis thaliana glycine-rich protein GRP20 75 37

1233 gi2645433 Homo sapiens CHD3 74 30

1234 AAU83676 Homo sapiens GETH Human PRO protein, Seq ID No 170. 178 97

1234 ABB84911 Homo sapiens GETH Human PRO1244 protein sequence SEQ ID NO:190. 178 97

1234 AAB62403 Homo sapiens CURA- Human MBSP7 polypeptide (clone 3499605.0.64). 178 97

1235 ABB10348 Homo sapiens HUMA- Human cDNA SEQ ID NO: 656. 409 61

1235 AAU18012 Homo sapiens HUMA- Human immunoglobulin polypeptide SEQ ID No 157. 178 83

1235 ABB89226 Homo sapiens HUMA- Human polypeptide SEQ ID NO 1602. 78 82

1236 gi10566951 Rattus norvegicus s-gicerin/MUC18 85 45

1236 gi10566949 Rattus norvegicus l-gicerin/MUC18 85 45

1236 AAB90798 Homo sapiens NOJI/ Human shear stress-response protein SEQ ID NO: 96. 84 42

1238 gi21464300 Drosophila melanogaster GH20068p 95 36

1238 gi3868879 Xenopus laevis Zic-related-2 88 35

1238 gi1841756 Mus musculus GATA-5 cardiac transcription factor 87 52

1239 gi17946266 Drosophila melanogaster RE61793p 96 40

1239 gi15636898 Gallus gallus formin binding protein 11-related protein 91 27

1239 gi780454 African swine fever virus pB407L 88 30

1240 AAE05302 Homo sapiens MILL- Human TANGO 457 protein. 1331 100

1240 AAE05303 Homo sapiens MILL- Human mature TANGO 457 protein. 1207 100

1240 AAE05305 Homo sapiens MILL- Human TANGO 457 protein cytoplasmic domain. 1201 100

1241 gi5640111 Lycopersicon esculentum RAD23 protein 84 25

1241 gi17131739 Nostoc sp. PCC 7120 polyketide synthase type I 76 33

1241 gi|5640111|emb|CAB51544.1| Lycopersicon esculentum RAD23 protein 84 25

1242 AAG03496 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7577. 67 39

1242 gi|13876270|gb|AAK26055.1| Mus musculus protocadherin alpha 8 66 35

1243 AAE16665 Homo sapiens MILL- Human calcium channel family member, 21784 protein. 196 87

1243 AAB62248 Homo sapiens WARN Human calcium channel alpha2delta subunit. 196 87

1243 AAY92320 Homo sapiens WARN Human alpha-2-delta-C calcium channel subunit polypeptide. 196 87

1244 gi|4102990|gb|AAD01637.1| Aspergillus nidulans DNA polymerase epsilon homolog 70 30

1245 gi5917666 Zea mays extensin-like protein 94 26

1245 gi19481644 shrimp white spot syndrome virus WSSV052 89 36

1245 gi17016928 shrimp white spot syndrome virus wsv001 89 36

1246 AAO12623 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26515. 169 69

1246 AAO12822 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26714. 153 75

1246 AAO02255 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16147. 123 65

1247 gi1653353 Synechocystis sp. PCC 6803 nodulation protein 75 28

1247 gi4468626 Mus musculus TEF-5 74 26

1247 gi17430764 Ralstonia solanacearum SKWP PROTEIN 5 74 23

1248 gi15139973 Sinorhizobium meliloti CONSERVED HYPOTHETICAL PROTEIN 77 47

1249 gi7191078 Leishmania major L712.2 99 29

1249 gi17384256 Homo sapiens mucin 5 85 31

1249 gi5821153 Homo sapiens RNA binding protein 83 33

1250 AAY36495 Homo sapiens HUMA- Fragment of human secreted protein encoded by gene 27. 124 86

1250 AAO12122 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26014. 123 91

1250 AAB95063 Homo sapiens HELI- Human protein sequence SEQ ID NO:16901. 121 90

1252 gi|15839838|ref|NP_334875.1| Mycobacterium tuberculosis CDC1551 membrane protein, MmpL family 68 27

1254 AAG00399 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4480. 328 100

1254 gi21428466 Drosophila melanogaster LD22609p 85 24

1254 gi19914274 Methanosarcina acetivorans str. C2A sensory transduction histidine kinase [Methanosarcina 85 26

1256 gi14161094 Choloepus didactylus von Willebrand Factor 80 24

1256 gi14161092 Cyclopes didactylus von Willebrand Factor 78 23

1256 gi13872552 Acomys cahirinus von Willebrand Factor 77 23

1258 gi7008025 Callithrix jacchus prochymosin 715 64

1258 gi11990126 Camelus dromedarius chymosin 634 57

1258 gi491952 synthetic construct preprochymosin 618 56

1259 gi|21402709|ref|NP_658694.1| Bacillus anthracis A2012 AMP-binding, AMP-binding enzyme [Bacillus anthracis 72 34

1260 gi|4505431|ref|NP_002510.1| Homo sapiens nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33

1260 gi|15309894|ref|XP_040846.2| Homo sapiens similar to nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33

1260 gi|1304114|dbj|BAA11861.1| Homo sapiens NPAT 64 33

1261 gi4519535 Homo sapiens Leukotriene B4 omega-hydroxylase 133 49

1261 gi1857022 Homo sapiens leukotriene B4 omega-hydroxylase 133 49

1261 gi18266446 Homo sapiens cytochrome P450, subfamily IVF, polypeptide 2 133 49

1262 gi13363530 Escherichia coli O157:H7 cell division protein HflB/FtsH protease 79 26

1262 gi746401 Escherichia coli ATP-binding protein 79 26

1262 gi146028 Escherichia coli ftsH 79 26

1263 AAW67859 Homo sapiens HUMA- Human secreted protein encoded by gene 53 clone HBMCL41. 283 100

1264 gi11066248 Helix lucorum presenilin 85 21

1264 gi|19115422|ref|NP_594510.1| Schizosaccharomyces pombe ribonuclease II RNB family protein; dis3-like 69 30

1264 gi|14720912|ref|XP_038204.1| Homo sapiens similar to Matrin 3 69 32

1265 gi5757703 Mus musculus syntrophin-associated serine-threonine protein kinase 82 38

1265 gi4996035 Human herpesvirus 6 69.8% identical to U47 gene of strain U1102 of HHV-6 76 42

1265 gi330951 Gallid herpesvirus 1 ICP4 76 36

1266 gi|17511177|ref|NP_493324.1| Caenorhabditis elegans ZK1053.3.p 75 40

1266 gi|17538077|ref|NP_495159.1| Caenorhabditis elegans ZK1248.2.p 69 34

1267 gi915540 Ovis aries pregnancy-specific antigen 85 25

1267 gi6179989 Capra hircus pregnancy-associated glycoprotein-2 84 25

1267 gi9798658 Rhinolophus ferrumequinum pepsinogen A 80 23

1268 gi|15789526|ref|NP_279350.1| Halobacterium sp. NRC-1 serine proteinase; HtrA 69 30

1269 gi9988674 Influenza A virus (A/Swine/Wisconsin/14094/99(H3N2)) hemagglutinin protein 70 24

1269 gi6552676 Influenza A virus (A/Bangkok/1/97(H3N2)) hemagglutinin 70 25

1269 gi6552638 Influenza A virus (A/Trinidad/51/96(H3N2)) hemagglutinin 70 24

1270 gi3378527 Zea mays anther specific protein 87 41

1270 AAW15787 Homo sapiens PENN- Human metastasis suppressor KISS-1. 85 28

1270 gi21410770 Homo sapiens Similar to RIKEN cDNA 1500005K14 gene 84 46

1271 gi1335527 Human poliovirus 1 reading frame VP3 75 38

1271 gi61253 Human poliovirus 1 polyprotein 75 38

1271 gi|17453412|ref|XP_063132.1| Homo sapiens similar to 60S ribosomal protein L7A (Surfeit locus protein 3) 76 40

1272 AAU87081 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-11. 69 43

1272 AAU87077 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3d. 69 43

1272 AAU87076 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3c. 69 43

1273 AAA09121_aa1 Homo sapiens CURA- Clone 2355875 cDNA (update), encodes syncollin homologue. 720 100

1273 AAY92233 Homo sapiens CURA- Clone 2355875f - syncollin homologue. 720 100

1273 AAB54267 Homo sapiens HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:719. 715 100

1274 gi15559064 Mus musculus SNAG1 198 59

1274 AAU17435 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 1000. 131 62

1274 AAW99023 Homo sapiens MOUN 1762 peptide sequence. 131 62

1275 gi|6753732|ref|NP_034243.1| Mus musculus epidermal growth factor 65 30

1275 gi|50801|emb|CAA24115.1| Mus musculus polyprotein 65 30

1275 gi|20341089|ref|XP_109385.1| Mus musculus epidermal growth factor 65 30

1276 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 447 78

1276 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 424 74

1276 AAO07159 Homo sapiens HYSE- Human polypeptide SEQ ID NO 21051. 401 75

1277 gi13905120 Mus musculus RIKEN cDNA 0610013I17 gene 134 35

1277 gi13936283 Mus musculus TRH3 134 35

1277 AAB92625 Homo sapiens HELI- Human protein sequence SEQ ID NO:10921. 127 35

1279 AAM66940 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27246. 362 85

1279 AAM54534 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26639. 362 85

1279 gi|208153|gb|AAA73184.1| synthetic construct crystal toxin 79 40

1280 AAE05187 Homo sapiens INCY- Human drug metabolising enzyme (DME-18) protein. 484 100

1280 AAU12266 Homo sapiens GETH Human PRO5780 polypeptide sequence. 484 100

1280 AAY91631 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 24 SEQ ID NO:304. 484 100

1281 AAH46856_aa1 Homo sapiens HUMA- Human serine/threonine phosphatase encoding cDNA (clone ID HLD0020). 238 100

1281 AAG77801 Homo sapiens HUMA- Human HLD0020 serine/threonine phosphatase protein sequence. 238 100

1281 AAB85476 Homo sapiens HUMA- Human serine/threonine phosphatase (clone ID HLD0020). 238 100

1282 gi|14762786|ref|XP_047871.1| Homo sapiens GS2 gene 70 30

1283 gi3860165 Arabidopsis thaliana disease resistance protein RPP1-WsB 69 38

1283 AAO09033 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22925. 68 38

1283 gi6967115 Arabidopsis thaliana disease resistance protein homolog 68 38

1285 gi1055252 Rattus norvegicus pheromone receptor VN5 78 32

1285 gi2746733 Drosophila virilis circadian clock protein 73 26

1285 gi2641617 Drosophila virilis TIM 73 26

1286 gi6013135 Rattus norvegicus coxsackie-adenovirus-receptor homolog 86 67

1286 AAV50429_aa1 Homo sapiens UYNY Human coxsackievirus and Ad2 and Ad5 receptor (HCAR) cDNA. 83 75

1286 AAV28845_aa1 Homo sapiens DAND Human coxsackievirus and adenovirus receptor encoding DNA. 83 75

1287 AAU83224 Homo sapiens ZYMO Novel secreted protein Z930757G12P. 642 100

1287 AAY70692 Homo sapiens DAND Human soluble attractin-2. 84 54

1287 AAY70691 Homo sapiens DAND Human membrane attractin-2. 84 54

1288 AAW70326 Homo sapiens GEMY Secreted protein DU123_1. 1655 99

1288 ABB12473 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 312. 547 72

1288 gi5689736 Homo sapiens Myopodin protein 475 100

1289 gi4103543 Tomato chlorosis virus heat shock protein 70 73 29

1289 gi12247413 Cristatella mucedo cytochrome b 72 30

1289 gi|4103543|gb|AAD01790.1| Tomato chlorosis virus heat shock protein 70 73 29

1291 AAB94128 Homo sapiens HELI- Human protein sequence SEQ ID NO:14383. 520 98

1291 AAY85576 Homo sapiens JANC Hs-UNC-53/1 fragment/GFP fusion insert of plasmid pGI3150. 520 98

1291 AAY85564 Homo sapiens JANC Human homologue of UNC-53 (Hs-UNC-53/1) sequence. 520 98

1292 AAY01413 Homo sapiens HUMA- Secreted protein encoded by gene 31 clone HHBAG64. 207 97

1292 AAY05324 Homo sapiens GEMY Human secreted protein 1j167_5. 207 97

1292 gi15157864 Agrobacterium tumefaciens str. C58 (Cereon) AGR_C_4816p 71 34

1294 AAB12146 Homo sapiens PROT- Hydrophobic domain protein from clone HP10672 isolated from Thymus cells. 219 100

1295 gi|17228767|ref|NP_485315.1| Nostoc sp. PCC 7120 probable glycogen phosphorylase 78 34

1295 gi|10835203|ref|NP_001127.1| Homo sapiens advanced glycosylation end product-specific receptor 65 58

1295 gi|190846|gb|AAA03574.1| Homo sapiens receptor for advanced glycosylation end products 65 58

1296 gi17511816 Homo sapiens Similar to RIKEN cDNA 1110032O22 gene 1268 99

1296 AAB88440 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0222. 688 100

1296 gi7211438 Homo sapiens golgin-67 94 30

1298 gi18314436 Homo sapiens Similar to RIKEN cDNA 4921511O04 gene 481 79

1298 gi1872546 Mus musculus NIK 86 25

1298 gi5533305 Homo sapiens somatostatin receptor interacting protein splice variant a 85 29

1299 gi1334643 Xenopus laevis APEG precursor protein 105 27

1299 gi17428053 Ralstonia solanacearum PROBABLE RIBONUCLEASE E (RNASE E) PROTEIN 100 32

1299 gi6690017 Herpesvirus papio NTR 96 25

1300 AAB87346 Homo sapiens HUMA- Human gene 5 encoded secreted protein HDPIE85, SEQ ID NO:87. 586 74

1300 AAB44298 Homo sapiens GETH Human PRO706 (UNQ370) protein sequence SEQ ID NO:385. 586 74

1300 AAY41742 Homo sapiens GETH Human PRO706 protein sequence. 586 74

1301 gi218572 Pan troglodytes prot GOR 1344 62

1301 gi243898 Pan GOR 1040 68

1301 gi17862570 Drosophila melanogaster LD38414p 486 45

1302 gi13276598 Homo sapiens dJ614O4.7 (Novel protein) 260 28

1302 gi13397804 Homo sapiens dJ616B8.3 (novel gene) 230 30

1302 AAB56641 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1219. 226 30

1303 gi603989 Drosophila melanogaster salivary gland glue protein 149 23

1303 gi13324584 Borrelia burgdorferi LMP1 129 17

1303 gi161956 Trypanosoma cruzi surface antigen 128 13

1304 gi13569248 Human immunodeficiency virus type 1 gag protein 81 34

1304 gi4324832 Human immunodeficiency virus type 1 gag-pol polyprotein 80 29

1304 gi11691875 Mus musculus ADP-ribosylation factor 1 GTPase activating protein 79 22

1305 AAO06469 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20361. 191 100

1305 gi3608368 Xenopus laevis origin recognition complex associated protein p81 69 30

1305 ABB15196 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3853. 68 36

1306 AAE03657 Homo sapiens INCY- Human extracellular matrix and cell adhesion molecule-21 (XMAD-21). 109 27

1306 ABB11890 Homo sapiens HYSE- Human protocadherin Flamingo 1 homologue, SEQ ID NO:2260. 109 27

1306 gi3449298 Homo sapiens MEGF2 109 27

1308 gi9294050 Arabidopsis thaliana protein kinase-like protein 84 32

1308 gi15983765 Arabidopsis thaliana AT3g24550/MOB24_8 84 32

1308 gi13877617 Arabidopsis thaliana protein kinase-like protein 84 32

1309 AAU00375 Homo sapiens BERN/ Human stem cell growth factor receptor. 127 54

1309 AAE07145 Homo sapiens SALK Human Kit/stem cell factor receptor kinase insert region. 127 54

1309 gi3236223 Equus caballus tyrosine kinase receptor homolog 127 50

1310 gi21449343 Actinosynnema pretiosum subsp. auranticum polyketide synthase 77 46

1310 gi21114513 Xanthomonas campestris pv. campestris str. ATCC 33913 transcriptional regulator 75 36

1310 gi13364364 Escherichia coli O157:H7 acetylglutamate kinase 73 36

1311 gi20146220 Oryza sativa (japonica cultivar-group) similar to splicing factor/activator protein 110 33

1311 gi206712 Rattus norvegicus salivary proline-rich protein 104 27

1311 AAY84592 Homo sapiens UNIW Amino acid sequence of a human artemin polypeptide. 103 34

1312 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 530 69

1312 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 249 66

1312 gi|13194728|gb|AAK15526.1|AF329451_1 Gallus gallus pol-like protein ENS-3 115 21

1313 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 147 58

1313 gi1339910 Homo sapiens DOCK180 protein 147 58

1313 gi1504002 Homo sapiens similar to a human major CRK-binding protein DOCK180. 111 43

1314 gi12007418 Mus musculus B3 olfactory receptor 76 38

1314 gi18480290 Mus musculus olfactory receptor MOR260-3 76 38

1314 gi12007432 Mus musculus B3 olfactory receptor 76 38

1315 gi483581 Mus musculus Notch 3 82 26

1315 gi18159668 Pyrobaculum aerophilum paREP2b 81 29

1315 gi4584086 Spermatozopsis similis p210 protein 79 25

1316 AAM71305 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31611. 422 98

1316 AAM58790 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30895. 422 98

1316 gi149490 Lactococcus lactis sucrose-6-phosphate hydrolase 72 31

1317 gi1620040 Paramecium bursaria Chlorella virus 1 Asp-rich 72 28

1317 gi3721615 Cyprinus carpio MEF2C 71 25

1317 gi|9631936|ref|NP_048725.1| Paramecium bursaria Chlorella virus 1 Asp-rich 72 28

1318 gi~21291797~Anopheles agCP3974 74 35

gb~EAA039gambiae str.

42.1 PEST
~

1319 g121306283Chlamydomonasiron transporter Ftrl 74 30

reinhardtii

1319 AAB60461Homo sapiens1NCY- Human cell cycle 73 33
and

proliferation protein
CCYPR-9, SEQ

ID N0:9.

1319 g16013155Homo Sapiensp35s ' 73 33

1320 g19717245Mus musculuscytoplasmic dynein heavy430 94
chain

1320 g1402528Rattus cytoplasmic dynein heavy430 94
chain

norvegicus

1320 g1294543Rattus dynein heavy chain 430 94

norvegicus

1323 gig 17221411Burkholderiakdo transferase 70 34
~

emb~CADl2cepacia

639.1
~

1324 g11698601Cricetulus beta-1,6-N- 440 38

griseus acetylglucosaminyltransferase

1324 g1349091Rattus N-acetylglucosaminyltransferase438 43
V

norvegicus

1324 118997007Mus musculusN-acetylglucosaminyltransferase438 43
V

1325 AAM70545 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30851. 115 47
1325 AAM58098 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30203. 115 47
1325 AAM72994 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33300. 111 28
1326 gi12724969 Lactococcus lactis subsp. lactis phenolic acid decarboxylase 77 46
1327 AAB53097 Homo sapiens GETH Human angiogenesis-associated protein PRO1246, SEQ ID NO:167. 372 63
1327 AAU12416 Homo sapiens GETH Human PRO1246 polypeptide sequence. 372 63
1327 AAY99377 Homo sapiens GETH Human PRO1246 (UNQ630) amino acid sequence SEQ ID NO:132. 372 63
1328 gi6014505 Hepatitis GB virus B polyprotein 76 43
1328 gi765145 Hepatitis GB virus B polypeptide 68 41
1328 gi|20544059|ref|XP_086220.4| Homo sapiens similar to U4/U6-associated RNA splicing factor 294 100
1329 AAV42689_aa1 Homo sapiens SIBI- DNA encoding human calcium channel alpha-2 subunit. 158 91
1329 AAQ84667_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2c. 158 91
1329 AAQ84664_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2b. 158 91
1330 gi19923 Nicotiana tabacum pistil extensin like protein, partial CDS 71 38
1330 gi|144429|gb|AAA56792.1| Cellulomonas fimi beta-1,4-xylanase 67 30
1331 gi2388676 Mytilus edulis precollagen P 85 35
1331 gi17862044 Drosophila melanogaster LD06016p 75 30
1331 gi13879780 Mycobacterium tuberculosis CDC1551 PE_PGRS family protein 74 30
1333 AAO00015 Homo sapiens HYSE- Human polypeptide SEQ ID NO 13907. 442 61
1333 AAB82479 Homo sapiens ZYMO Human RING finger protein Zapop2. 81 31
1333 gi20975274 Homo sapiens skeletrophin 81 31
1334 ABB11819 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2189. 367 82
1334 AAW80398 Homo sapiens GEMY A secreted protein encoded by clone cw1543_3. 130 67
1334 gi5081693 Samanea saman pulvinus inward-rectifying channel SPICK2 70 34

1335 ABB89969 Homo sapiens HUMA- Human polypeptide SEQ ID NO 2345. 142 96

1335 AAB38385 Homo sapiens HUMA- Human secreted protein encoded by gene 18 clone HTLEJ24. 142 96
1335 AAB38338 Homo sapiens HUMA- Human secreted protein encoded by gene 18 clone HTLFE57. 142 96
1336 gi|14590195|ref|NP_142260.1| Pyrococcus horikoshii asparaginyl-tRNA synthetase 70 37
1337 gi3879419 Caenorhabditis elegans contains similarity to Pfam domain: PF00102 (Protein-tyrosine phosphatase), Score=51.6, E-value=1.8e-14, N=1 69 29
1337 gi|17563828|ref|NP_505965.1| Caenorhabditis elegans protein tyrosine phosphatase 69 29
1338 gi|2072960|gb|AAC51268.1| Homo sapiens p40 138 33
1338 gi|4185940|emb|CAA76880.1| Human endogenous retrovirus K env protein 124 75
1338 gi|757872|emb|CAA57723.1| Human endogenous retrovirus env 124 75
1340 gi1491979 Molluscum contagiosum virus subtype 1 MC036R 78 33
1340 gi|9628968|ref|NP_043987.1| Molluscum contagiosum virus MC036R 78 33
1341 gi18676514 Homo sapiens FLJ00154 protein 1560 100
1341 AAB84252 Homo sapiens HUMA- Amino acid sequence of a human cytokine receptor-like protein. 572 63
1341 AAB84251 Homo sapiens HUMA- Human cytokine receptor-like protein fragment. 572 63
1342 AAY27757 Homo sapiens HUMA- Human secreted protein encoded by gene No. 47. 152 71
1342 AAB27551 Homo sapiens MYRI- Human tumour suppressor BRG1 encoded by cDNA mutated at base 1705. 77 32
1342 AAB27550 Homo sapiens MYRI- Human tumour suppressor BRG1 protein from cell lines DU145 and NCI-H1300. 77 32
1344 gi21464394 Drosophila melanogaster RE18651p 78 26
1344 AAM39065 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2210. 77 21
1344 gi338290 Homo sapiens son3 protein 77 21
1345 gi2202 Canis sp. Clox 135 37

1345 gi3879551 Caenorhabditis elegans contains similarity to Pfam domain: PF01391 (Collagen triple helix repeat (20 copies)), Score=56.4, E-value=2e-13, N=2; PF01484 (Nematode cuticle collagen N-terminal domain), Score=87.2, E-value=1.1e-22, N=1 125 33

1345 gi158695 Drosophila melanogaster tropomyosin isoform 33 (9C) 118 30
1346 gi7862077 Giardia intestinalis 3-hydroxy-3-methylglutaryl-coenzyme A reductase 90 26
1346 gi1098615 Mycoplasma pneumoniae adhesin-related 30 kDa protein 87 23
1346 gi20380058 Homo sapiens Similar to PRAM-1 protein 84 28
1347 gi13905302 Mus musculus Similar to ATPase, class II, type 9A 736 85
1347 gi17862322 Drosophila melanogaster LD22119p 633 72
1347 AAM25271 Homo sapiens HYSE- Human protein sequence SEQ ID NO:786. 572 100
1348 gi456319 Bacteriophage FC1 74kDa protein 75 33
1348 gi1524115 Lycopersicon esculentum subtilisin-like endoprotease 73 28
1348 gi4200334 Lycopersicon esculentum P69A protein 73 28
1349 gi21391988 Drosophila melanogaster HL08052p 78 31
1349 gi20148339 Arabidopsis thaliana cyclin delta-3 77 25
1349 gi|17647607|ref|NP_523423.1| Drosophila melanogaster maroon-like; bronzy; section 5 78 31
1351 gi18676524 Homo sapiens FLJ00159 protein 164 52
1351 gi21392066 Drosophila melanogaster RE04357p 139 34
1351 AAB92637 Homo sapiens HELI- Human protein sequence SEQ ID NO:10953. 81 43
1352 gi19071965 Aspergillus oryzae chitin synthase 79 28
1352 gi17945592 Drosophila melanogaster RE26660p 78 41
1352 gi16184663 Drosophila melanogaster LD28370p 74 22
1353 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 307 65
1353 gi|1335205|emb|CAA36480.1| Homo sapiens ORFII 305 65
1354 gi1388166 Drosophila melanogaster Bowel 80 32
1354 gi15553187 Scyliorhinus canicula homeodomain protein Otx1 79 22
1354 AAY85573 Homo sapiens JANC Hs-UNC-53/3 fragment/GFP fusion insert of plasmid pGI3303. 78 26
1358 gi|21288288|gb|EAA00609.1| Anopheles gambiae str. PEST agCP9766 71 30

1358 gi|17465558|ref|XP_069888.1| Homo sapiens similar to mucin 68 36

1359 gi|21302892|gb|EAA15037.1| Anopheles gambiae str. PEST agCP5020 70 31
1361 gi15080686 Lentinula edodes CDC5 79 26
1361 gi495516 Plasmodium vivax circumsporozoite protein 77 31
1361 gi21070569 Dictyostelium discoideum VSAE2 (FRAGMENT). 3/101 76 31
1362 gi8953400 Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase-like protein 73 23
1362 gi|15239030|ref|NP_196699.1| Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase - like protein 73 23
1363 gi2444430 Xenopus laevis deacetylase 327 81
1363 gi602098 Xenopus laevis yeast RPD3 homologue 324 80
1363 AAB49954 Homo sapiens METH- Human histone deacetylase HDAC-1. 323 80
1364 AAM69686 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29992. 418 55
1364 AAM57281 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29386. 418 55
1364 gi|1780971|emb|CAA71416.1| Human endogenous retrovirus K gag protein 172 37
1365 gi437084 Gallus gallus vitamin D3 hydroxylase associated protein 510 41
1365 gi2149156 Homo sapiens fatty acid amide hydrolase 477 38
1365 AAW57783 Homo sapiens SCRI Human fatty acid amide hydrolase. 468 38
1366 gi3510695 Homo sapiens DNA polymerase theta 77 21
1366 gi309132 Mus musculus calnexin 72 22
1366 gi15214567 Mus musculus Similar to calnexin 72 22
1367 gi|17508849|ref|NP_491426.1| Caenorhabditis elegans helicase 73 40
1368 gi5457567 Pyrococcus abyssi Na+/H+ antiporter (napA-1) 76 33
1368 gi8247211 Candida albicans She9 protein 69 31
1368 gi|14590079|ref|NP_142143.1| Pyrococcus horikoshii Na(+)/H(+) antiporter 76 30
1369 gi17644260 Homo sapiens bB206I21.1 (ATPase, Class VI, type 11C). 305 98
1369 AAO14200 Homo sapiens INCY- Human transporter and ion channel TRICH-17. 166 50
1369 gi5080816 Arabidopsis thaliana Putative ATPase 166 49
1370 gi|18573281|ref|XP_095933.1| Homo sapiens similar to 40S ribosomal protein S3A 70 38

1372 gi6683562 Mus musculus heparan sulfate 6-sulfotransferase 3 886 91
1372 gi6683558 Mus musculus heparan sulfate 6-sulfotransferase 2 265 72
1372 ABL39900_aa1 Homo sapiens SEGK Human HS6ST2v encoding cDNA SEQ ID NO:1. 262 71
1373 gi|20882231|ref|XP_139203.1| Mus musculus similar to LIM domain only 7 76 24
1373 gi|20302988|gb|AAM18948.1|AF498989_1 Medicago sativa nodule-specific glycine-rich protein 3 72 26
1373 gi|9965267|gb|AAG10008.1| infectious hypodermal and hematopoietic necrosis virus non-structural protein 2 72 24
1374 gi3355835 Rhizobium etli RBSK 78 32
1374 gi7453560 Polyangium cellulosum epoD 73 28
1374 gi1749684 Schizosaccharomyces pombe similar to Saccharomyces cerevisiae porphobilinogen deaminase, SWISS-PROT Accession Number P28789 72 28
1375 gi16973455 Danio rerio beta-3-galactosyltransferase 1050 63
1375 AAB24035 Homo sapiens GETH Human PRO4397 protein sequence SEQ ID NO:42. 725 46
1375 AAB88404 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0159. 709 43
1376 gi7668 Drosophila melanogaster bsg25D protein 73 33
1376 gi20177037 Drosophila melanogaster LD21844p 73 33
1376 gi1353669 Caenorhabditis elegans UNC-24 69 43
1379 AAS16182_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA. 245 67
1379 AAU10534 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) polypeptide. 245 67
1379 AAS16825_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA coding sequence. 245 67
1380 AAY36290 Homo sapiens HUMA- Human secreted protein encoded by gene 67. 177 74
1380 gi16551305 Tatianyx arnacites DNA-directed RNA polymerase beta' subunit 2 71 38
1380 gi3411013 Candida albicans protein mannosyltransferase 1 68 35
1381 AAM80132 Homo sapiens HYSE- Human protein SEQ ID NO 3778. 173 66
1381 gi4731867 Dictyostelium discoideum sterol glucosyltransferase 107 30
1381 AAB74726 Homo sapiens INCY- Human membrane associated protein MEMAP-32. 89 41
1382 AAB62100 Homo sapiens WIST- Human bridging integrator-2 (Bin2) protein. 78 27
1382 gi6527168 Homo sapiens breast cancer associated protein BRAP1 78 27
1382 gi5852834 Homo sapiens bridging integrator-2 78 27

1383 gi7670050 Xenopus laevis type I collagen alpha 1 92 27
1383 AAO01606 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15498. 85 29
1383 gi17738485 Agrobacterium tumefaciens str. C58 (U. Washington) biopolymer transport protein 85 28
1384 gi20451261 Caenorhabditis elegans C. elegans GCY-17 protein (corresponding sequence W03F11.2) 71 26
1384 gi2665714 Agrobacterium tumefaciens moaC 71 29
1384 gi|20864452|ref|XP_150076.1| Mus musculus RIKEN cDNA 2410018E23 130 59
1385 AAY94938 Homo sapiens GEMY Human secreted protein clone ye78_1 protein sequence SEQ ID NO:82. 103 25
1385 gi12831176 Agelaius phoeniceus gamma filamin protein 96 29
1385 AAU81998 Homo sapiens INCY- Human secreted protein SECP24. 87 27
1386 gi10440468 Homo sapiens FLJ00070 protein 102 41
1386 gi11136912 Danio rerio RPTP-alpha protein 94 32
1386 gi20377083 Homo sapiens p78 92 36
1387 AAM40810 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5741. 190 59
1387 AAM39024 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2169. 190 59
1387 gi15080474 Homo sapiens Similar to RIKEN cDNA 1700023O11 gene 190 59
1388 gi12802591 Bovine herpesvirus 4 tegument protein 82 30
1388 gi950226 Saccharomyces cerevisiae Trf4p 73 26
1388 gi|13095641|ref|NP_076556.1| Bovine herpesvirus 4 tegument protein 82 30
1389 AAI67224_aa1 Homo sapiens CORI- BS11S cDNA sequence. 363 100
1389 AAF85500_aa1 Homo sapiens EOSB- Nucleotide sequence of a human breast cancer protein designated BCH1. 363 100
1389 AAA54120_aa1 Homo sapiens EOSB- Breast cancer protein BCH1 coding sequence. 363 100
1390 gi184653 Homo sapiens IFN-alpha responsive transcription factor 74 30
1390 gi|2580453|gb|AAB82336.1| Xenopus laevis Xbap 68 47
1391 AAB88456 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0246. 85 52
1391 AAB62392 Homo sapiens LEXI- Human LDL receptor family protein (LDLP). 85 52

1392 ABB12009 Homo sapiens HYSE- Human RAMP1 homologue, SEQ ID NO:2379. 90 100

1392 gi3171910 Homo sapiens RAMP1 90 100
1392 gi12653551 Homo sapiens receptor (calcitonin) activity modifying protein 1 90 100
1394 gi4467343 Drosophila melanogaster EG:140G11.1 70 27
1394 gi6018879 Drosophila melanogaster BACN4L24.d 70 27
1394 gi157993 Drosophila melanogaster developmental protein 70 27
1395 gi4928919 Arabidopsis thaliana zinc finger protein 2 86 26
1395 gi2702272 Arabidopsis thaliana expressed protein 86 26
1396 AAM25276 Homo sapiens HYSE- Human protein sequence SEQ ID NO:791. 729 93
1396 AAE14340 Homo sapiens INCY- Human protease PRTS-5 protein. 528 33
1396 AAB47561 Homo sapiens INCY- Protease PRTS-3. 528 33
1397 gi18369843 Infectious salmon anemia virus P6 89 40
1397 gi4092530 Infectious salmon anemia virus NS1 protein 87 39
1397 gi14009648 Infectious salmon anemia virus NS1 87 39
1398 AAW63707 Homo sapiens UYOR- Human hSK2 protein. 331 91
1398 gi1575663 Rattus norvegicus calcium-activated potassium channel rSK2 331 91
1398 gi15082148 Homo sapiens small-conductance calcium-activated potassium channel 331 91
1399 AAB01381 Homo sapiens INCY- Neuron-associated protein. 1653 68
1399 gi18157547 Mus musculus pecanex-like 3 1620 66
1399 gi6650377 Mus musculus pecanex 1 1277 51
1400 gi|20887681|ref|XP_140575.1| Mus musculus similar to melastatin 1 468 91
1400 gi|3243075|gb|AAC80000.1| Homo sapiens melastatin 1 355 75
1400 gi|20552333|ref|XP_007662.9| Homo sapiens similar to melastatin 1 355 75
1401 AAU15955 Homo sapiens HUMA- Human novel secreted protein, Seq ID 908. 931 92
1401 gi3978441 Homo sapiens PITSLRE protein kinase alpha SV9 isoform 95 24
1401 gi1517914 Homo sapiens monocytic leukaemia zinc finger protein 91 28
1402 gi1289326 Mus musculus ROR-alpha 1 84 25

1402 gi530878 Chlamydomonas eugametos amino acid feature: N-glycosylation sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 .. 74, 107 .. 109, 128 .. 130, 132 .. 134, 158 .. 160, 163 .. 165; amino acid feature: Rod protein domain, aa 169 .. 340; amino acid feature: globular protein domain, aa 32 .. 168 79 32

1402 gi220763 Rattus norvegicus HES-3 factor 79 52
1403 gi|20479430|ref|XP_114955.1| Homo sapiens similar to olfactory receptor MOR231-1 71 32
1403 gi|20480897|ref|XP_115014.1| Homo sapiens similar to olfactory receptor MOR234-3 71 32
1404 AAA88548_aa1 Homo sapiens SMIK Human CASB616 cDNA. 89 100
1404 AAB19591 Homo sapiens SMIK Human CASB616. 89 100
1404 gi1100110 Homo sapiens protein-tyrosine kinase 89 100
1405 gi4206753 Oryctolagus cuniculus homeodomain-containing protein 74 24
1405 gi13445253 Mus musculus orphan Gpr37-like protein 1 72 33
1405 gi3080552 Mus musculus Hoxa-9 71 50
1406 AAM50585 Homo sapiens NISB Benign prostatic hyperplasia associated protein JT460914. 325 100
1406 gi18031947 Homo sapiens SOCS box protein ASB-5 325 100
1406 AAU20593 Homo sapiens HUMA- Human secreted protein, Seq ID No 585. 316 100
1407 AAU83222 Homo sapiens ZYMO Novel secreted protein Z930005G2P. 895 97
1407 AAY02712 Homo sapiens HUMA- Human secreted protein encoded by gene 63 clone HBJFV28. 91 56
1407 AAO00641 Homo sapiens HYSE- Human polypeptide SEQ ID NO 14533. 86 64
1408 ABB17944 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6601. 81 53
1408 AAM77906 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 38212. 72 40
1408 AAM65199 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37304. 72 40
1409 gi5230847 Vitreoscilla sp. C1 glutamine synthetase homolog 68 33
1409 gi8515736 Drosophila melanogaster highwire 67 35
1409 gi3138797 Sulfolobus shibatae Ssh7b 65 48
1410 AAW23309 Homo sapiens EIJI- Human Werner's syndrome WS-2 protein. 151 96
1410 gi1913785 Homo sapiens Rep-8 151 96
1410 gi18089098 Homo sapiens reproduction 8 151 96
1411 gi|21297468|gb|EAA09613.1| Anopheles gambiae str. PEST agCP15537 166 56

1411 gi|20983200|ref|XP_135812.1| Mus musculus RIKEN cDNA 1810030O07 73 24

1412 gi532572 Hordeum vulgare lipoxygenase 1 82 28
1412 gi945419 Mus musculus hepatoma derived growth factor (HDGF) 77 35
1412 gi17932895 stork hepatitis B virus preC/core antigen 77 26
1413 gi2370143 Homo sapiens immunoglobulin-like domain-containing 1 169 42
1413 gi2645890 Homo sapiens IGSF1 169 42
1413 AAB40232 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 46 SEQ ID NO:142. 162 40
1414 gi21204314 Staphylococcus aureus subsp. aureus MW2 proline-tRNA ligase 78 32
1414 gi14247033 Staphylococcus aureus subsp. aureus Mu50 proline-tRNA ligase 78 32
1414 gi13701063 Staphylococcus aureus subsp. aureus N315 proline-tRNA ligase 78 32
1415 gi9948469 Pseudomonas aeruginosa probable non-ribosomal peptide synthetase 78 31
1415 AAE19251 Homo sapiens BIOI- SOS1 protein sequence from PS462. 75 23
1415 AAU84311 Homo sapiens BAAI/ Protein ABCB2 differentially expressed in breast cancer tissue. 74 30
1416 gi18676710 Homo sapiens FLJ00254 protein 623 75
1416 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 583 69
1416 gi|18676710|dbj|BAB85007.1| Homo sapiens FLJ00254 protein 623 75
1417 AAR85785 Homo sapiens UYNY Human GRB-10. 77 32
1417 gi841210 Mus musculus growth factor receptor binding protein Grb10 77 32
1417 AAM90963 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:18556. 74 32
1419 AAM79990 Homo sapiens HYSE- Human protein SEQ ID NO 3636. 82 100
1419 AAM79006 Homo sapiens HYSE- Human protein SEQ ID NO 1668. 82 100
1419 AAR28494 Homo sapiens XIAM/ Sequence encoded by the CAMPATH-1 antigen cDNA. 82 100
1420 AAU01383 Homo sapiens MILL- Human TANGO 499 form 2, variant 1 amino acid sequence. 828 73
1420 AAU01382 Homo sapiens MILL- Human TANGO 499 form 2, variant 4 amino acid sequence. 828 73
1420 AAU01380 Homo sapiens MILL- Human TANGO 499 form 2, amino acid sequence. 828 73

1421 gi19069609 Encephalitozoon cuniculi PROTEASOME REGULATORY SUBUNIT YTA6 OF THE AAA FAMILY OF ATPASES 76 26

1422 AAM66177 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26483. 199 72
1422 AAM53791 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25896. 199 72
1422 AAM68472 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28778. 176 81
1423 gi1800227 Oryza sativa Bowman-Birk proteinase inhibitor 74 34
1423 gi10141005 San Miguel sea lion virus non-structural polyprotein 74 26
1423 gi|17490177|ref|XP_062300.1| Homo sapiens similar to RING finger protein 18 (Testis-specific ring-finger protein) 76 28
1424 gi461336 Pyrenomonas salina hsp70 75 29
1424 gi13880037 Mycobacterium tuberculosis CDC1551 membrane protein, MmpL family 75 24
1424 gi1449306 Mycobacterium tuberculosis H37Rv mmpL2 75 24
1425 gi15600 Enterobacteria phage T7 gene 7.3, host range 79 30
1425 gi16198065 Drosophila melanogaster LD28477p 77 30
1425 gi11870012 Drosophila melanogaster xnp/atr-x DNA helicase 77 30
1426 gi16185397 Drosophila melanogaster LD39815p 204 44
1426 gi2244793 Arabidopsis thaliana disease resistance N like protein 86 30
1426 AAU84280 Homo sapiens BGHM Human endometrial cancer related protein, HERC1. 77 26
1427 AAY36302 Homo sapiens HUMA- Human secreted protein encoded by gene 79. 183 79
1427 AAB88359 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0087. 178 80
1427 AAM41635 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6566. 178 80
1428 AAU82008 Homo sapiens INCY- Human secreted protein SECP34. 114 64
1428 AAB32391 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 21 SEQ ID NO:77. 114 64
1428 AAY08306 Homo sapiens FIBR- Human collagen IX alpha-3 chain protein. 74 45
1429 gi2792523 Ralstonia solanacearum alternative RNA sigma factor RpoS 69 30

1429 gi17428221 Ralstonia solanacearum RNA POLYMERASE SIGMA S (SIGMA-38) FACTOR TRANSCRIPTION REGULATOR PROTEIN 69 33

1429 gi|5032313|ref|NP_004014.1| Homo sapiens dystrophin Dp140bc isoform; Dystrophin (muscular dystrophy, Duchenne and Becker types) 73 26
1433 gi9954445 Rattus norvegicus TEMO 171 62
1433 gi14030260 maize rayado fino virus polyprotein 79 32
1433 AAB95656 Homo sapiens HELI- Human protein sequence SEQ ID NO:18419. 77 36
1434 AAR04212 Homo sapiens CALB- Human 32K alveolar surfactant protein. 391 43
1434 AAP60661 Homo sapiens KUSH/ Genomic sequence of human alveolar surfactant protein (hASP) encoded by genomic DNA. 386 43
1434 AAB58135 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 473. 366 42
1435 gi17224904 Mus musculus immunoglobulin superfamily member 9 180 48
1435 gi20988778 Homo sapiens Similar to immunoglobulin superfamily, member 9 173 53
1435 gi14149050 Drosophila melanogaster turtle protein, isoform 4 114 36
1436 gi1465855 Caenorhabditis elegans C. elegans PQN-57 protein (corresponding sequence R09F10.7) 85 23
1436 gi1465856 Caenorhabditis elegans C. elegans PQN-56 protein (corresponding sequence R09F10.2) 85 23
1436 gi17864717 Mus musculus hornerin 83 26
1437 gi|21292574|gb|EAA04719.1| Anopheles gambiae str. PEST agCP3449 66 33
1438 ABB10160 Homo sapiens HUMA- Human cDNA SEQ ID NO: 468. 166 62
1438 gi9657279 Vibrio cholerae aspartokinase II/homoserine dehydrogenase, methionine-sensitive 71 28
1439 gi4582571 Gallus gallus Hyperion protein, 419 kD isoform 75 24
1439 gi13165 Oenothera biennis ATPase alpha-subunit (aa 1-511) 72 26
1439 gi903838 Oenothera berteriana F-1-ATPase alpha subunit 72 26
1440 gi4558758 Homo sapiens testis-specific chromodomain Y-like protein 233 62
1440 gi4558762 Mus musculus testis-specific chromodomain Y-like protein 231 36
1440 gi3342716 Homo sapiens testis-specific ChromoDomain Y isoform 1 195 36
1441 gi155627 Acanthamoeba castellanii myosin I heavy chain 118 42
1441 gi13093370 Mycobacterium leprae initiation factor IF-2 116 33
1441 AAY20289 Homo sapiens UYRO- Human apolipoprotein E mutant protein fragment 5. 114 39
1442 gi2253707 Mus musculus Daxx 84 36
1442 gi1934970 Plasmodium falciparum AARP1 protein 79 65

1442 gi4050098 Mus musculus Fas-binding protein 78 34
1443 gi2425111 Dictyostelium discoideum ZipA 90 26
1443 AAY06119 Homo sapiens HARD Human CIITA interacting protein 104 (CIP104). 88 26
1443 gi5420387 Leishmania major proteophosphoglycan 86 21
1444 gi893355 Acinetobacter baumannii L-2,4-diaminobutyrate decarboxylase 77 26
1445 ABB55744 Homo sapiens FECH/ Human polypeptide SEQ ID NO 94. 135 47
1445 AAU39035 Homo sapiens GEMY Human secreted protein nh328_5. 135 47
1445 AAY28679 Homo sapiens GEMY Human nh328_5 secreted protein. 135 47
1446 gi19744390 Homo sapiens retinoic acid inducible in neuroblastoma cells RAINB1d 247 54
1446 gi19744388 Homo sapiens retinoic acid inducible in neuroblastoma cells RAINB1 247 54
1446 AAY85565 Homo sapiens JANC Human homologue of UNC-53 (Hs-UNC-53/2) sequence. 240 52
1447 AAU19716 Homo sapiens HUMA- Human novel extracellular matrix protein, Seq ID No 366. 71 31
1447 gi18025476 cercopithicine herpesvirus 15 BPLF1 71 38
1447 AAS14575_aa1 Homo sapiens MILL- Human cDNA encoding G protein-coupled receptor, GPCR, 52872. 69 62
1448 gi14027507 Mesorhizobium loti salicylate hydroxylase 69 31
1449 AAG64798 Homo sapiens SREH- Human peptide methionine sulphoxide reductase (hPMSR). 192 71
1449 AAB81893 Homo sapiens SEQU- Human genomic database related protein SEQ ID NO: 38. 192 71
1449 AAM42046 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6977. 192 71
1450 gi18249657 Mus musculus NC8 1063 80
1450 gi406748 Mus musculus zinc finger protein 250 37
1450 AAB43498 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:943. 249 37
1451 ABB89331 Homo sapiens HUMA- Human polypeptide SEQ ID NO 1707. 732 88
1451 gi13421927 Caulobacter crescentus CB15 MaoC family protein 273 42
1451 gi19338616 Methylobacterium extorquens R-specific enoyl-CoA hydratase 261 44
1452 gi|20908171|ref|XP_139715.1| Mus musculus similar to NADPH oxidase 3; NADPH oxidase catalytic subunit-like 3 68 30
1452 gi|17533619|ref|NP_495516.1| Caenorhabditis elegans F32A5.8.p 67 42

1453 gi|15614051|ref|NP_242354.1| Bacillus halodurans sodium-dependent phosphate transporter 65 34

1454 gi|17551878|ref|NP_499090.1| Caenorhabditis elegans TPR Domain 76 29
1455 AAM40727 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5658. 191 56
1455 AAM38941 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2086. 191 56
1455 gi19702127 Homo sapiens P-Rex1 protein 191 56
1456 ABB05666 Homo sapiens GEHU- Human nucleic acid management protein clone amy2 l 1n4. 496 91
1456 AAE03372 Homo sapiens HUMA- Human gene 18 encoded secreted protein fragment, SEQ ID NO:152. 496 91
1456 AAE03371 Homo sapiens HUMA- Human gene 18 encoded secreted protein fragment, SEQ ID NO:150. 496 91
1457 AAM66940 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27246. 290 77
1457 AAM54534 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26639. 290 77
1457 AAM64410 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36515. 287 77
1458 AAB53445 Homo sapiens HUMA- Human colon cancer antigen protein sequence SEQ ID NO:985. 335 100
1458 AAY30055 Homo sapiens ARIA- Amino acid sequence of a FK506-binding protein (FKBP). 165 91
1458 AAQ52277_aa1 Homo sapiens VERT- FK506 binding protein (FKBP12A) cDNA. 159 100
1460 AAU20255 Homo sapiens HUMA- Human novel endocrine antigen, SEQ ID No 312. 104 76
1460 ABB17663 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6320. 94 77
1460 AAO02331 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16223. 88 61
1461 AAM65951 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26257. 97 57
1461 AAM53568 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25673. 97 57
1461 AAU83199 Homo sapiens ZYMO Novel secreted protein Z891639G1P. 96 38
1463 gi5565687 Homo sapiens topoisomerase-related function protein 514 75
1463 gi5139669 Homo sapiens LAK-1 468 75
1463 gi21430468 Drosophila melanogaster LP06848p 332 51
1464 AAY91421 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:142. 109 35

1464 AAY91396 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:117. 109 35

1464 AAY91352 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 7 SEQ ID NO:73. 109 35
1465 AAU15978 Homo sapiens HUMA- Human novel secreted protein, Seq ID 931. 575 100
1465 AAU15958 Homo sapiens HUMA- Human novel secreted protein, Seq ID 911. 575 100
1465 gi16041675 Homo sapiens joined to JAZF1 575 100
1466 AAO01502 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15394. 173 66
1466 gi|10947038|ref|NP_065209.1| Homo sapiens ankyrin 1, isoform 1; ankyrin-1, erythrocytic; ankyrin-R 74 28
1466 gi|10947036|ref|NP_065208.1| Homo sapiens ankyrin 1, isoform 4; ankyrin-1, erythrocytic; ankyrin-R 74 28
1467 gi19354550 Mus musculus similar to src homology three (SH3) and cysteine rich domain 842 91
1467 AAU17352 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 917. 361 98
1467 gi1799566 Mus musculus stac 302 44
1468 gi13506771 Mus musculus structural protein FBF1 767 74
1468 gi7549210 Babesia bigemina 200 kDa antigen p200 213 29
1468 gi1747 Oryctolagus cuniculus trichohyalin 191 30
1469 gi11345048 Homo sapiens SCAN domain-containing protein 2 86 32
1469 gi11320940 Homo sapiens SCAND2 86 32
1469 gi14210722 Tupaia herpesvirus t41 86 30
1470 AAY88278 Homo sapiens MILL- Human TANGO 188 protein. 1442 100
1470 gi14336711 Homo sapiens similar to C. Elegans protein F17C8.5 1442 100
1470 AAA39947_aa1 Homo sapiens MILL- Human TANGO 188 cDNA. 1438 99
1471 AAE10204 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 69. 71 44
1471 AAA23458_aa1 Homo sapiens ALPH- cDNA encoding human secreted protein vp15_1, SEQ ID NO:71. 67 46
1471 AAB80228 Homo sapiens GETH Human PRO269 protein. 67 46
1472 AAB88433 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0210. 136 86
1472 AAB95155 Homo sapiens HELI- Human protein sequence SEQ ID NO:17188. 136 86
1472 AAE01745 Homo sapiens HUMA- Human gene 2 encoded secreted protein HOGCS52 variant, SEQ ID NO:160. 136 86
1473 gi9294201 Arabidopsis thaliana disease resistance protein 70 24
1474 AAE19157 Homo sapiens THOR/ Human kinase polypeptide (PKIN-15). 631 98

1474 AAM79131 Homo sapiens HYSE- Human protein SEQ ID NO 1793. 494 72

1474 AAW 19920Homo sapiensREGC Human I~sr' (kinase494 72
suppressor

of Ras).

1475 AAD 12609_Homo SapiensSAGA Human protein having657 73

aal hydrophobic domain encoding
cDNA

clone HP03974.

1475 AA014199Homo Sapiens1NCY- Human transporter 657 73
and ion

channel TRICH-16.

1475 AAE06614Homo SapiensSAGA Human protein having657 73

hydrophobic domain, HP03974.

1476 113905246Mus musculusRIKEN cDNA 2410024K20 71 34
gene

1476 gi~17505208~Mus musculusCD2 antigen (cytoplasmic71 34
tail) binding

ref~NP'0816 protein 2; 1500011B02Rik

29.1
~

1477 g1806491Rarius guanylylcyclase 140 65

norvegicus

1477 g12648066Canis familiarisguanylate cyclase E 118 55

1477 g12623074Bos taurus rod outer segment guanylate116 55
cyclase

precursor

1478 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 585 73

1478 gi18676710 Homo sapiens FLJ00254 protein 408 69

1478 AAO04042 Homo sapiens HYSE- Human polypeptide SEQ ID NO 17934. 392 75

1479 AAU05396 Homo sapiens GEHO Human titin (connectin) protein sequence. 208 29

1479 gi1212992 Homo sapiens Protein sequence and annotation available soon via Swiss-Prot; available at present via e-mail from LABEIT@EMBL-Heidelberg.DE 208 29

1479 gi17066105 Homo sapiens Titin 208 29

1480 AAV44685_aa1 Homo sapiens TEXA Osteoclast inhibitor protein, OIP-1, coding sequence. 94 41

1480 AAB35287 Homo sapiens UROG- Human stem cell antigen-2. 94 41

1480 AAY99709 Homo sapiens REGC Human stem cell antigen-2, hSCA-2. 94 41

1481 AAB57094 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1672. 122 100

1481 gi32672 Homo sapiens interferon alpha/beta receptor 122 100

1481 AAQ49625_aa1 Homo sapiens EUBI- Human interferon receptor extracellular domain coding sequence. 118 96

1482 AAD17516_aa1 Homo sapiens SENO- Human taste receptor, hT1R1 cDNA coding sequence. 890 94

1482 ABB77319 Homo sapiens INCY- Human G-protein coupled receptor SEQ ID NO 3. 890 94

1482 AAE10372 Homo sapiens SENO- Human taste receptor, hT1R1 protein. 890 94

1483 gi18376312 Neurospora crassa related to SSD1 protein 109 39

1483 gi2645173 Schizosaccharomyces pombe sts5+ 99 42

1483 gi2459997 Candida albicans protein phosphatase Ssd1 homolog 99 40

1484 gi|18569064|ref|XP_095378.1| Homo sapiens similar to 40S RIBOSOMAL PROTEIN S3A (V-FOS TRANSFORMATION EFFECTOR PROTEIN 319 96

1484 gi|20539276|ref|XP_095220.2| Homo sapiens similar to olfactory receptor MOR145-2 259 94

1484 gi|21295882|gb|EAA08027.1| Anopheles gambiae str. PEST agCP1347 68 32

1485 ABB11761 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2131. 197 36

1485 gi930259 Woolly monkey sarcoma virus reverse transcriptase (476 AA) 148 33

1485 gi18076262 porcine endogenous retrovirus Pol protein 147 38

1486 AAM74887 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35193. 172 100

1486 AAM62085 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34190. 172 100

1486 gi152661 Plasmid SB24.2 neomycin resistance protein 75 26

1487 gi12653493 Homo sapiens Similar to brain acid-soluble protein 1 75 34

1487 gi17428832 Ralstonia solanacearum PROBABLE AVRBS3-LIKE PROTEIN 75 33

1487 gi7329672 Arabidopsis thaliana phosphatidate cytidylyltransferase-like protein 72 46

1488 AAU74754 Homo sapiens INCY- Human protease PRTS-14 protein sequence. 2042 83

1488 AAU74752 Homo sapiens INCY- Human protease PRTS-12 protein sequence. 476 39

1488 gi11935122 Mus musculus a ilin 431 40

1489 gi|17543712|ref|NP_499976.1| Caenorhabditis elegans Y55F3C.8.p 72 32

1489 gi|20344600|ref|XP_109579.1| Mus musculus RIKEN cDNA 4933431K05 70 30

1489 gi|11692798|gb|AAG40002.1|AF320125_1 Xenopus laevis ataxia telangiectasia and Rad3-related protein 69 26

1490 AAB95817 Homo sapiens HELI- Human protein sequence SEQ ID NO:18817. 256 63

1490 ABB06369 Homo sapiens BODE- Human neurogenesis related protein 12 SEQ ID NO:2. 173 64

1490 AAB44394 Homo sapiens HUMA- Gene 10 encoded human secreted protein fragment as BLASTX query sequence. 83 66

1491 gi438795 Mus musculus serotonin 1A receptor 73 26

1491 gi1066326 Mus musculus serotonin 1A receptor 72 26

1491 gi|438795|gb|AAA16850.1| Mus musculus serotonin 1A receptor 73 26

1492 gi16198083 Drosophila melanogaster LD29875p 87 33

1492 gi2327063 Pneumocystis carinii f. sp. carinii protease 1 75 34

1492 gi20420 Prunus dulcis extensin 75 34

1493 AAG67087 Homo sapiens SHAN- Human ATP-dependent serine protein hydrolase 13. 106 67

1493 AAM76636 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36942. 103 68

1493 AAM63822 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35927. 103 68

1494 AAY31225 Homo sapiens AVET Human RNA helicase p135 protein. 73 38

1494 gi3123906 Homo sapiens pre-mRNA splicing factor 73 38

1494 gi13278975 Homo sapiens pre-mRNA splicing factor similar to S. cerevisiae Prp16 73 38

1495 gi|17568307|ref|NP_509837.1| Caenorhabditis elegans collagen 74 35

1496 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 410 81

1496 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 301 77

1496 gi|6753924|ref|NP_034374.1| Mus musculus Friend virus susceptibility 1 127 37

1497 gi20901968 Caenorhabditis elegans C. elegans RPL-36 protein (corresponding sequence F37C12.4) 71 34

1497 gi|17554754|ref|NP_498573.1| Caenorhabditis elegans Ribosomal protein YL39 71 34

1498 gi5305335 Mycobacterium tuberculosis proline-rich mucin homolog 102 27

1498 gi330130 human herpesvirus 1 latency associated transcript (LAT) ORF-2 97 37

1498 AAU83682 Homo sapiens GETH Human PRO protein, Seq ID No 182. 94 30

1499 AAY57937 Homo sapiens INCY- Human transmembrane protein HTMPN-61. 199 81

1499 AAY36295 Homo sapiens HUMA- Human secreted protein encoded by gene 72. 151 100

1499 AAG75708 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6472. 141 92

1500 gi21428712 Drosophila melanogaster SD05267p 165 54

1500 gi20975274 Homo sapiens skeletrophin 114 40

1500 gi19773434 Mus musculus skeletrophin 99 52

1501 ABB17830 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6487. 82 37

1501 AAO12929 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26821. 73 43

1502 gi8778340 Arabidopsis thaliana F15O4.13 77 39

1503 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 144 33

1503 gi1339910 Homo sapiens DOCK180 protein 144 33

1503 gi13195147 Mus musculus HCH 129 25

1505 AAM70790 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31096. 77 53

1505 AAM58316 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30421. 77 53

1505 gi|21302711|gb|EAA14856.1| Anopheles gambiae str. PEST agCP4916 77 30

1506 AAU75102 Homo sapiens MYRI- Heat shock protein 8 (HspB). 592 79

1506 AAB82535 Homo sapiens UYCO- Human heat shock protein Hsc70. 592 79

1506 AAE12987 Homo sapiens SRIV/ Human Hsp70 family homologue, Hsc70. 592 79

1507 ABL53627_aa1 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) cDNA. 213 92

1507 ABB75677 Homo sapiens GENO- Breast protein-eukaryotic conserved gene 1 (BSTP-ECG1) protein. 213 92

1507 AAY99421 Homo sapiens GETH Human PRO1433 (UNQ738) amino acid sequence SEQ ID NO:292. 213 92

1508 AAW15565 Homo sapiens UYJO Human intracellular tyrosine kinase Tnk1-alpha. 79 29

1508 gi233062 Gallus gallus src downstream region 78 33

1508 gi18376366 Neurospora crassa related to ribosomal protein S15 precursor (mitochondrial) 72 30

1509 gi|21297482|gb|EAA09627.1| Anopheles gambiae str. PEST agCP15541 68 36

1510 AAM41631 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6562. 127 37

1510 AAM39845 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2990. 127 37

1510 AAM79502 Homo sapiens HYSE- Human protein SEQ ID NO 3148. 127 37

1511 gi21217669 Mus musculus myosin IIIA 70 28

1511 gi|21302393|gb|EAA14538.1| Anopheles gambiae str. PEST agCP8799 71 36

1511 gi|20822589|ref|XP_140854.1| Mus musculus similar to myosin IIIA 70 28

1512 gi6911049 Babesia bovis p9.6.2-like variant erythrocyte surface antigen-1a 82 28

1512 gi6911045 Babesia bovis p9.6.2 variant erythrocyte surface antigen-1a 82 28

1512 gi6911047 Babesia bovis p8.4.1 variant erythrocyte surface antigen-1a 81 28

1513 gi10174843 Bacillus halodurans maltose transport system (permease) 77 25

1513 gi56312 Rattus norvegicus Gephyrin 76 31

1513 gi4325371 Arabidopsis thaliana contains similarity to Medicago truncatula N7 protein (GB:Y17613) 74 28

1514 AAY14196 Homo sapiens TAKEI T cell receptor zeta chain protein sequence. 95 100

1514 gi623042 Homo sapiens T-cell receptor zeta chain 95 100

1514 gi4960202 Sus scrofa CD3 zeta chain 95 100

1515 ABB07508 Homo sapiens INCY- Human aminoacyl tRNA synthetase (ATRS) polypeptide (ID: 7474756CD1). 726 100

1515 AAB43670 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1115. 604 82

1515 gi1464742 Homo sapiens threonyl-tRNA synthetase 604 82

1516 gi21109348 Xanthomonas axonopodis pv. citri str. 306 cytochrome B561 77 29

1516 gi21114046 Xanthomonas campestris pv. campestris str. ATCC 33913 cytochrome B561 76 28

1516 gi|21243760|ref|NP_643342.1| Xanthomonas axonopodis pv. citri str. 306 cytochrome B561 77 29

1517 ABB11450 Homo sapiens HYSE- Human neurotoxin homologue, SEQ ID NO:1820. 119 33

1517 gi8809770 Mus musculus Ly-6I.1 94 30

1517 gi8809768 Mus musculus lymphocyte antigen LY6I precursor 94 30

1519 gi|59977|emb|CAA78662.1| Human endogenous retrovirus tripartite fusion transcript PLA2L 171 67

1519 gi|17826947|dbj|BAB79287.1| Pseudomonas sp. ND137 beta-1,4-xylanase 73 34

1519 gi|21232680|ref|NP_638597.1| Xanthomonas campestris pv. campestris str. ATCC 33913 ribonuclease PH 72 30

1520 AAM78023 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 38329. 190 100

1520 AAM65326 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37431. 190 100

1520 gi13447468 Emericella nidulans FH1/FH2 protein homolog 121 49

1522 AAG81417 Homo sapiens ZYMO Human AFP protein sequence SEQ ID NO:352. 287 100

1523 AAY90349 Homo sapiens SMII~ Human fatty acid synthase (FAS) protein sequence. 158 85

1523 AAB43871 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1316. 158 85

1523 gi915392 Homo sapiens fatty acid synthase 158 85

1525 AAG03819 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7900. 93 100

1525 gi1311466 Homo sapiens 24-kDa subunit of Complex I 93 100

1525 gi188852 Homo sapiens NADH-ubiquinone reductase 93 100

1526 AAD02855_aa1 Homo sapiens SUKA Human platelet membrane glycoprotein VI (GPVI) cDNA. 73 31

1526 AAB49403 Homo sapiens MERE Human glycoprotein VI mature protein. 73 31

1526 AAB61257 Homo sapiens MILL- Mature human TANGO 268 protein. 73 31

1527 gi17864896 Mus musculus protocadherin 18 precursor 81 31

1527 gi15980222 Yersinia pestis aconitate hydratase 1 79 30

1527 gi12248353 Fasciola hepatica NADH dehydrogenase subunit 5 75 56

1528 gi2440214 Trypanosoma brucei brucei invariant surface glycoprotein 100 83 28

1528 gi10567463 Rhizobium rhizogenes probable viral gene 78 22

1529 gi2231279 Porcine reproductive and respiratory syndrome virus envelope protein 66 31

1530 gi|199851|gb|AAA39757.1| Mus musculus pol protein 257 42

1530 gi|1498648|gb|AAB06450.1| Mus musculus Gag-Pol polyprotein 257 42

1530 gi|331995|gb|AAB03091.1| AKV murine leukemia virus gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV) 257 42

1533 gi435698 Homo sapiens CD44SP 136 100

1533 AAV63461_aa1 Homo sapiens GEHO Human CD44 antigen cDNA. 130 100

1533 AAT14724_aa1 Homo sapiens GEHO Human haematopoietic CD44 cDNA clone CD44.5. 130 100

1534 gi2622165 Methanothermobacter thermautotrophicus str. Delta H acetyltransferase 71 29

1534 gi|15679078|ref|NP_276195.1| Methanothermobacter thermautotrophicus acetyltransferase 71 29

1535 gi7777 Drosophila melanogaster protein H 73 28

1535 gi457146 Plasmodium yoelii rhoptry protein 73 38

1535 gi13195258 Plasmodium yoelii yoelii 235 kDa rhoptry protein 73 38

1536 ABB09740 Homo sapiens BODE- Amino acid sequence of human protein phosphatase 11.66. 132 43

1536 gi|20830386|ref|XP_145642.1| Mus musculus similar to importin alpha 1b 72 35

1537 gi14039907 Rattus norvegicus cytochrome P450 monooxygenase CYP2T1 353 39

1537 gi2920650 Mus musculus cytochrome P450 CYP2B19 275 44

1537 gi2353336 Capra hircus cytochrome P450 271 31

1538 AAU83175 Homo sapiens ZYMO Novel secreted protein Z874015G4P. 282 100

1538 gi6714803 Streptomyces coelicolor A3(2) integral membrane protein. 77 26

1539 gi12963397 Prunus x yedoensis ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit 74 32

1539 gi466436 Saccharomyces cerevisiae BOI1 69 31

1539 gi5833897 Besleria affinis ribulose 1,5-bisphosphate carboxylase large subunit 69 31

1542 AAY32193 Homo sapiens INCY- Human receptor molecule (REC) encoded by Incyte clone 044150. 73 26

1542 gi7576677 Helicobacter pylori IceA1 72 44

1542 gi|20841498|ref|XP_131541.1| Mus musculus similar to MUF1 protein 73 26

1546 gi14581448 Homo sapiens FSHD Region Gene 2 protein 73 42

1546 gi15982852 Arabidopsis thaliana AT5g66850/MUD21_11 71 34

1546 gi|14581448|gb|AAK21977.1| Homo sapiens FSHD Region Gene 2 protein 73 42

1547 gi18676660 Homo sapiens FLJ00229 protein 192 92

1547 AAU21409 Homo sapiens HUMA- Human novel foetal antigen, SEQ ID NO 1653. 179 100

1547 AAM42128 Homo sapiens HYSE- Human polypeptide SEQ ID NO 7059. 114 53

1548 AAG64494 Homo sapiens SHAN- Human natriuretic peptide receptor 18. 539 100

1548 gi18676710 Homo sapiens FLJ00254 protein 268 77

1548 AAB28764 Homo sapiens HUMA- Sequence homologous to protein fragment encoded by gene 21. 249 72

1549 AAB67055 Homo sapiens INCY- Human immune response molecule (IMUN) protein SEQ ID NO: 9. 606 82

1549 AAO01862 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15754. 404 72

1549 gi|6753924|ref|NP_034374.1| Mus musculus Friend virus susceptibility 1 213 36

1550 gi190129 Homo sapiens 70kDa peroxisomal membrane protein 92 100

1550 gi825711 Homo sapiens 70kD peroxisomal integral membrane protein 92 100

1550 gi220862 Rattus norvegicus PMP70 89 94

1551 AAM69543 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29849. 228 100

1551 AAM57148 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29253. 228 100

1551 AAB93944 Homo sapiens HELI- Human protein sequence SEQ ID NO:13960. 94 57

1552 gi4884924 Rangiferine herpesvirus 1 glycoprotein C 75 34

1552 gi|18556240|ref|XP_067628.2| Homo sapiens similar to Salivary glue protein SGS-3 precursor 78 30

1552 gi|4884924|gb|AAD31876.1| Rangiferine herpesvirus 1 glycoprotein C 75 34

1553 gi|2193870|dbj|BAA20419.1| Mus musculus reverse transcriptase 176 35

1553 gi|2731767|gb|AAC53542.1| Mus musculus endonuclease/reverse transcriptase 176 35

1554 ABB08776 Homo sapiens BODE- Human neuregulin 55 SEQ ID NO 2. 75 29

1554 AAM92816 Homo sapiens HUMA- Human digestive system antigen SEQ ID NO: 2165. 71 29

1554 gi|6322838|ref|NP_012911.1| Saccharomyces cerevisiae Protein required for cell viability; Ykl014cp 70 27

1555 gi7528184 Drosophila melanogaster bicoid-interacting protein BIN3 78 28

1555 gi15292595 Drosophila melanogaster SD09926p 78 28

1555 gi4514620 Mus musculus Ror2 71 24

1557 ABA91504_aa1 Homo sapiens EYEE- Human epidermal growth factor receptor precursor cDNA. 144 93

1557 AAF85332_aa1 Homo sapiens NOVS Nucleotide sequence of wild type EGFR1. 144 93

1557 AAM50768 Homo sapiens EPEE- Human epidermal growth factor receptor precursor. 144 93

1558 AAB99950 Homo sapiens SHAN- Human alkylated-DNA-protein cysteine methyltransferase 14. 221 100

1558 AAU16267 Homo sapiens HUMA- Human novel secreted protein, Seq ID 1220. 221 100

1558 ABB11507 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:1877. 183 97

1559 gi14599730 Sachea correae maturase 71 28

1559 gi14599648 Blepharandra heteropetala maturase 71 30

1559 gi14599673 Galphimia gracilis maturase 70 28

1560 gi2323287 multiple sclerosis associated retrovirus polyprotein 340 83

1560 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 260 70

1560 gi|21103962|gb|AAM33141.1| Homo sapiens enverin-2 248 84

1561 AAB94698 Homo sapiens HELI- Human protein sequence SEQ ID NO:15680. 107 95

1561 AAU18480 Homo sapiens HUMA- Human endocrine polypeptide SEQ ID No 435. 107 95

1561 ABB10288 Homo sapiens HUMA- Human cDNA SEQ ID NO: 596. 107 95

1562 gi969078 Drosophila melanogaster S-adenosylhomocysteine hydrolase 73 26

1562 gi21064553 Drosophila melanogaster RE58316p 73 26

1562 AAM41205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6136. 72 30

1563 gi1778844 Dictyostelium discoideum LimA 71 34

1563 gi|20985456|ref|XP_142111.1| Mus musculus similar to actin beta chain - human 75 36

1563 gi|1778844|gb|AAB40929.1| Dictyostelium discoideum LimA 71 34

1564 gi|9507757|ref|NP_061423.1| Plasmid F resolvase 507 91

1564 gi|148589|gb|AAA24900.1| Plasmid F Protein D 507 91

1564 gi|10955295|ref|NP_052636.1| Escherichia coli resolvase 501 90

1565 gi7649370 Arabidopsis thaliana guanine nucleotide-exchange-like protein 77 38

1565 gi1674160 Mycoplasma pneumoniae involved in cytadherence, see: MPN142 71 35

1565 gi|15229258|ref|NP_189916.1| Arabidopsis thaliana guanine nucleotide-exchange - like protein 77 38

1566 gi1799600 SwissProt Accession Number P31458 similar to 1051 99

1566 gi13814506 Sulfolobus solfataricus Mandelate racemase /muconate lactonizing enzyme related protein (MR/MLE) 286 35

1566 gi10640034 Thermoplasma acidophilum starvation-sensing protein rspA related protein 270 35

1567 gi13359972 Escherichia coli O157:H7 acridine efflux pump 573 98

1567 gi1773144 Escherichia coli probable transmembrane protein AcrE 573 98

1567 gi532311 Escherichia coli 114 kDa protein 573 98

1569 gi8918871 YccA of plasmid ColIb-P9] [Plasmid F 96 pct identical to gp:AB021078_30 288 98

1569 gi|17136976|ref|NP_477026.1| Drosophila melanogaster repo-P1; Antibody RK2 71 33

1569 gi|6502544|gb|AAF14351.1|AF110198_1 Glomus intraradices homeobox protein HB1 70 31

1570 gi13363792 Escherichia coli O157:H7 zinc-transporting ATPase 410 87

1570 gi466605 Escherichia coli No definition line found 410 87

1570 gi12518128 Escherichia coli O157:H7 EDL933 zinc-transporting ATPase 410 87

1571 AAU83186 Homo sapiens ZYMO Novel secreted protein Z887014G7P. 1006 100

1571 gi7248459 Zea mays arabinogalactan protein 85 29

1571 gi3513742 Arabidopsis thaliana contains similarity to Zea mays embryogenesis transmembrane protein (GB:X97570) 82 35

1572 gi12597465 Caenorhabditis elegans CED-1 72 44

1572 gi19571666 Caenorhabditis elegans similar to EGF-like domain 72 44

1572 gi4883938 Drosophila melanogaster laminin alpha1,2 67 31

1573 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 106 38

1574 gi1478205 Mus musculus PNG protein 75 41

1574 AAM40148 Homo sapiens HYSE- Human polypeptide SEQ ID NO 3293. 69 56

1574 AAM79341 Homo sapiens HYSE- Human protein SEQ ID NO 2987. 69 35

1576 gi|20882651|ref|XP_123303.1| Mus musculus ATPase, class 2, member b 234 91

1576 gi|7656918|ref|NP_056620.1| Mus musculus ATPase, class 2, member b; ATPase 9B, class II; ATPase 9B, p type 234 91

1577 gi18143418 Alteromonas sp. O-7 chitinase A 77 39

1577 gi15426105 Leishmania major probable surface antigen protein 75 24

1578 gi19702241 Homo sapiens rabconnectin 439 93

1578 gi7452946 Homo sapiens X-like 1 protein 132 41

1578 gi1279384 Drosophila melanogaster X 109 29

1580 AAE20337 Homo sapiens HUMA- Human B7-H11 protein mature extracellular domain. 122 23

1580 AAE20336 Homo sapiens HUMA- Human B7-H11 protein extracellular domain. 122 23

1580 gi2062702 Homo sapiens butyrophilin 122 23

1581 AAE18640 Homo sapiens INCY- Human G-protein coupled receptor (GCREC-1). 70 35

1581 gi18369751 Oryza sativa ethylene responsive protein 70 50

1581 gi15217292 Oryza sativa] [Oryza sativa (japonica cultivar-group) Putative AP2 domain containing protein 70 50

1583 gi6468047 Homo sapiens Kruppel-like factor 85 73

1583 gi5916096 Homo sapiens Kruppel-like factor LKLF 85 73

1583 gi4583418 Homo sapiens Kruppel-like zinc finger transcription factor 85 73

1585 gi2570021 Homo sapiens paired box containing transcription factor 77 37

1585 gi3115988 Homo sapiens dJ394P2-1.1 (PAX-7) 77 37

1585 gi2570015 Homo sapiens alternative 77 37

1586 gi7861533 Rattus norvegicus retina specific protein PAL 72 43

1586 gi20977028 Xenopus laevis mitotic phosphoprotein 39 72 34

1586 AAB58458 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 796. 68 39

1587 gi5901864 Drosophila melanogaster BcDNA.LD27873 81 24

1587 gi15458514 Streptococcus pneumoniae R6 Pneumococcal histidine triad protein D precursor 78 27

1587 gi5042400 Homo sapiens NFI-X3=transcription factor AA 75 30

1592 gi4210501 Homo sapiens BC85722_1 253 61

1592 gi14794910 Homo sapiens capicua protein 253 61

1592 gi14794914 Mus musculus capicua protein 253 61

1593 gi|8131854|gb|AAF73108.1|AF147956_1 Trypanosoma cruzi antigen JL8 69 34

1595 gi18892729 Pyrococcus furiosus DSM 3638 3-hydroxyisobutyrate dehydrogenase 70 27

1595 gi|20847046|ref|XP_136621.1| Mus musculus similar to Transcription factor BTF3 (RNA polymerase B transcription factor 3) 70 28

1595 gi|18977088|ref|NP_578445.1| Pyrococcus furiosus DSM 3638 3-hydroxyisobutyrate dehydrogenase 70 27

1597 AAU83621 Homo sapiens GETH Human PRO protein, Seq ID No 60. 151 42

1597 AAO05826 Homo sapiens HYSE- Human polypeptide SEQ ID NO 19718. 146 83

1597 AAM41346 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6277. 102 46

1598 AAM79503 Homo sapiens HYSE- Human protein SEQ ID NO 3149. 80 35

1598 AAM78519 Homo sapiens HYSE- Human protein SEQ ID NO 1181. 80 35

1598 gi18676526 Homo sapiens FLJ00160 protein 80 35

1599 gi2149640 Arabidopsis thaliana Argonaute protein 72 33

1599 gi15027491 respiratory syncytial virus glycoprotein 71 32

1599 gi|15221177|ref|NP_175274.1| Arabidopsis thaliana leaf development protein Argonaute 72 33

1601 gi17130010 Nostoc sp. PCC 7120 WD-40 repeat protein 136 28

1601 gi1653631 Synechocystis sp. PCC 6803 beta transducin-like protein 131 26

1601 gi17135261 Nostoc sp. PCC 7120 WD-40 repeat protein 115 27

1602 gi1103853 Rattus norvegicus rHAP1-A 89 33

1602 gi1103851 Rattus norvegicus huntingtin associated protein 89 33

1602 gi14579673 Takifugu rubripes pericentriolar material 1 protein 87 30

1603 gi537446 Arabidopsis thaliana AtHSP101 75 31

1603 gi12324908 Arabidopsis thaliana heat shock protein 101; 13093-16240 75 31

1603 gi6715468 Arabidopsis thaliana heat shock protein 101 75 31

1604 gi2190531 Vibrio cholerae methyl accepting chemotaxis protein 71 26

1604 gi9657614 Vibrio cholerae hemolysin secretion protein HylB 71 26

1604 gi9655306 Vibrio cholerae heat shock protein E 70 35

1605 gi3912936 Geobacillus stearothermophilus ornithine carbamoyltransferase 68 31

1606 gi8797 Drosophila melanogaster CYS3HIS finger protein 678 51

1606 gi15291975 Drosophila melanogaster LD33756p 617 65

1606 gi6967181 Homo sapiens c399E4.1 (similar to D.melanogaster unkempt protein.) 549 75

1607 gi|21301783|gb|EAA13928.1| Anopheles gambiae str. PEST agCP8730 72 35

1607 gi|21361276|ref|NP_006075.2| Homo sapiens interferon-stimulated transcription factor 3, gamma (48kD); interferon-stimulated gene factor 3, gamma subunit (48 kD) 68 29

1609 gi2661094 Spinacia oleracea cold acclimation protein 76 32

1612 gi|1780975|emb|CAA71418.1| Human endogenous retrovirus K gag protein 312 34

1612 gi|5802810|gb|AAD51791.1| Homo sapiens Gag-Pro-Pol protein 309 34

1612 gi|887448|emb|CAA51306.1| Human endogenous retrovirus gag 309 34

1613 AAO13889 Homo sapiens HYSE- Human polypeptide SEQ ID NO 27781. 73 42

1614 gi11065727 Homo sapiens dJ493F7.1 (similar to murine BET3) 347 100

1614 gi2791806 Mus musculus beta 253 69

1614 gi13277654 Mus musculus Bet3 homolog (S. cerevisiae) 253 69

1615 gi1122901 Saccharomyces cerevisiae MSP8 77 20

1615 gi825546 Saccharomyces cerevisiae Cat8p 77 20

1615 gi17978563 Xenopus laevis Sp1-like zinc-finger protein XSPR-1 75 40

1616 AAY02536 Homo sapiens ICOS- Human ICAM-6 protein sequence. 458 98

1616 gi12248907 Homo sapiens TCAM-1 458 98

1616 gi4579740 Rattus norvegicus testicular cell adhesion molecule 1 (TCAM1) 366 76

1617 AAM67067 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27373. 271 64

1617 AAM54664 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26769. 271 64

1617 AAM56747 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28852. 229 69

1618 gi5802814 Homo sapiens Gag-Pro-Pol-Env protein 532 52

1618 gi1780973 Human endogenous retrovirus K pol protein 531 52

1618 gi5802821 Homo sapiens Gag-Pro-Pol protein 531 52

1619 gi2769587 Mus musculus STOP protein 662 86

1619 gi1370291 Rattus norvegicus STOP protein 662 92

1619 gi3287265 Rattus norvegicus E-STOP protein 662 92

1620 AAM65980 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26286. 266 100

1620 AAM53601 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25706. 266 100

1620 gi|20270271|ref|NP_620082.1| Mus musculus RIKEN cDNA 1190017O12 198 80

1621 gi11862941 Mus musculus DDM36E 74 33

1621 gi11862939 Mus musculus DDM36 74 33

1621 gi7650186 Mus musculus neighbor of Punc e11 protein 73 33

1622 gi3157464 Thermus sp. A4 integral membrane protein 74 38

1623 gi|59977|emb|CAA78662.1| Human endogenous retrovirus tripartite fusion transcript PLA2L 129 82

1623 gi|20161147|dbj|BAB90075.1| Oryza sativa (japonica cultivar-group) VsaA -like protein 88 32

1623 gi|17864474|ref|NP_524833.1| Drosophila melanogaster domino 87 41

1626 AAO00498 Homo sapiens HYSE- Human polypeptide SEQ ID NO 14390. 99 43

1627 gi14041733 Xenorhabdus nematophila XptA2 protein 70 23

1627 gi|15641593|ref|NP_231225.1| Vibrio cholerae catalase 69 23

1628 gi19888204 Methanopyrus kandleri AV19 Site-specific DNA methylase 80 27

1628 gi6358691 Simian immunodeficiency virus Pol protein 78 32

1628 gi|20094956|ref|NP_614803.1| Methanopyrus kandleri AV19 Site-specific DNA methylase 80 27

1629 AAB07704 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 594 67

1629 gi8272464 Homo sapiens gag 594 67

1629 AAB07703 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 590 66

1630 gi32498 Homo sapiens precursor (AA -23 to 476) 145 100

1630 gi339595 Homo sapiens triglyceride lipase precursor 145 100

1630 gi386859 Homo sapiens hepatic lipase 145 100

1631 gi8777465 Rattus norvegicus cytoplasmic dynein heavy chain 703 77

1631 gi17019507 Tripneustes gratilla dynein heavy chain isotype 1B 505 53

1631 AAB93815 Homo sapiens HELI- Human protein sequence SEQ ID NO:13606. 457 71

1632 AAM68837 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29143. 122 48

1632 AAM56460 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28565. 122 48

1632 gi17861826 Drosophila melanogaster GM01964p 90 51

1633 gi|21300783|gb|EAA12928.1| Anopheles gambiae str. PEST ebiP1105 77 33

1633 gi|19880523|gb|AAM00372.1|AF368053_1 Bactrocera dorsalis vitellogenin 1 precursor 68 27

1633 gi|21070999|ref|NP_065911.1| Homo sapiens stromal interaction molecule 2 precursor 68 39

1637 gi2323287 multiple sclerosis associated retrovirus polyprotein 289 91

1637 gi|21103962|gb|AAM33141.1| Homo sapiens enverin-2 261 82

1637 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 259 82

1638 AAR58809 Homo sapiens UYNY Human RPTP-gamma. 86 26

1638 gi292411 Homo sapiens receptor-type protein tyrosine phosphatase gamma 86 26

1638 gi1263069 Homo sapiens receptor tyrosine phosphatase gamma 86 26

1639 gi9857054 Leishmania major possible CG7055 protein 74 27

1639 gi|20853034|ref|XP_125962.1| Mus musculus expressed sequence AI447519 73 35

1639 gi|7008003|dbj|BAA90874.1| Mus musculus transcription factor MAZR 73 35

1640 AAG03810 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7891. 220 95

1640 gi186800 Homo sapiens ribosomal protein L12 220 95

1640 gi57680 Rattus rattus ribosomal protein L12 220 95

1641 AAB44286 Homo sapiens GETH Human PRO1072 (UNQ529) protein sequence SEQ ID NO:303. 1709 100

1641 AAY41730 Homo sapiens GETH Human PRO1072 protein sequence. 1709 100

1641 gi14602625 Homo sapiens PAN2 protein 1709 100

1642 gi20147241 Arabidopsis thaliana AT5g09850/MYH9_6 74 32

1642 gi14329782 Homo sapiens dJ1121G12.3 (Novel gene) 72 28

1642 gi|16648730|gb|AAL25557.1| Arabidopsis thaliana AT5g09850/MYH9_6 74 32

1643 gi2952340 Rattus norvegicus insulin receptor substrate 2 89 31

1643 gi2653351 Bovine herpesvirus type 1.1 product of latency-related gene 83 30

1643 gi4511969 Homo sapiens insulin receptor substrate-2 82 26

1644 gi9964099 Chlamydia trachomatis inclusion membrane protein 73 35

1644 gi19171028 Encephalitozoon cuniculi ATP DEPENDENT DNA BINDING HELICASE (RAD3/XPD SUBFAMILY OF HELICASES) 67 29

1644 gi|9964095|gb|AAG09821.1|AF279362_1 Chlamydia trachomatis inclusion membrane protein 73 35

1646 gi|10863995|ref|NP_067011.1| Homo sapiens clones 23667 and 23775 zinc finger protein 67 42

1647 gi1196425 Homo sapiens envelope protein 93 39

1647 gi200296 Mus musculus perlecan 85 26

1647 gi8131894 Homo sapiens mitofilin 84 27

1648 gi1573040 Haemophilus influenzae Rd aspartokinase I / homoserine dehydrogenase I (thrA) 73 36

1648 gi8778726 Arabidopsis thaliana T25N20.14 73 31

1648 gi|16272063|ref|NP_438262.1| Haemophilus influenzae Rd aspartokinase I / homoserine dehydrogenase I (thrA) 73 36

1649 gi295642 Saccharomyces cerevisiae phospholipase C 79 36

1649 gi7548846 Saccharomyces cerevisiae delta class phosphoinositide-specific phospholipase C homolog 77 36

1649 gi161104 Schistosoma mansoni engrailed-like homeodomain protein 74 35

1651 gi|13129464|gb|AAK13122.1|AC080019_14 Oryza sativa] [Oryza sativa (japonica cultivar-group) Polyprotein 66 40

1652 AAG81446 Homo sapiens ZYMO Human AFP protein sequence SEQ ID NO:410. 249 100

1652 gi18032212 Homo sapiens histone acetyltransferase MOZ2 89 34

1652 AAR34936 Homo sapiens UYJO CENP-B. 77 35

1653 gi20145484 Bos taurus SCO-spondin 71 29

1655 AAM86382 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:13975. 129 55

1655 ABB03887 Homo sapiens HUMA- Human musculoskeletal system related polypeptide SEQ ID NO 1834. 118 62

1655 AAM75964 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36270. 85 56

1659 gi38035 Homo sapiens p25 protein 110 45
1659 gi330915 Equine herpesvirus 1 IR4 protein 99 28
1659 gi156606 Chironomus tentans SpId 84 30
1660 gi9654641 Vibrio cholerae 3-deoxy-D-manno-octulosonic-acid transferase 84 23
1660 gi|20835446|ref|XP_144409.1| Mus musculus similar to STARP antigen 73 25
1660 gi|15596880|ref|NP_250374.1 Pseudomonas aeruginosa probable sugar aldolase 72 26
1661 gi4062318 Escherichia coli Heat-responsive regulatory protein 79 36
1661 gi976025 Escherichia coli HrsA 79 36
1661 gi1786951 Escherichia coli K12 protein modification enzyme, induction of ompC 79 36

1662 AAM68588 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28894. 155 100

1662 AAM56212 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28317. 155 100

1662 gi3845169 Plasmodium falciparum 3D7 phosphatase (acid phosphatase family) 66 52
1663 AAG89215 Homo sapiens GEST Human secreted protein, SEQ ID NO: 335. 218 100
1663 gi20070921 Mus musculus RIKEN cDNA 2410008M22 gene 130 55
1663 AAR77602 Homo sapiens FORSI Human circulating cytokine CC-1 C-terminal fragment. 92 44

1664 AAE18212 Homo sapiens CURA- Human MOL4 protein. 75 47
1664 AAM00966 Homo sapiens HYSE- Human bone marrow protein, SEQ ID NO: 442. 72 35
1665 AAB92828 Homo sapiens HELI- Human protein sequence SEQ ID NO:11365. 74 93
1665 AAG63852 Homo sapiens INCY- Amino acid sequence of human GTPase activating protein GTPAP2. 74 93
1665 AAG63851 Homo sapiens INCY- Amino acid sequence of human GTPase activating protein GTPAP1. 74 93
1666 AAM72897 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33203. 135 65
1666 AAM60268 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32373. 135 65
1666 gi4007097 Homo sapiens dJ1118D24.2 (60S Ribosomal Protein L10 LIKE) 135 65

1667 gi212267 Gallus gallus cartilage link protein 917 49
1667 gi2010 Sus scrofa link protein precursor (AA -15 to 339) 913 51
1667 gi459439 Equus caballus link protein 910 51
1668 gi10443237 Mus musculus splicing factor 3a, subunit 2 276 36
1668 gi396743 Podocoryne carnea Pod-EPPT 276 30
1668 gi294131 Plasmodium falciparum circumsporozoite protein 266 22

1669 AAM49641 Homo sapiens BOEH Human tumour-associated antigen B345 protein SEQ ID NO 4. 132 65
1669 AAU12252 Homo sapiens GETH Human PRO5773 polypeptide sequence. 132 65
1669 AAY91592 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 6 SEQ ID NO:265. 132 65
1670 gi4835383 Homo sapiens alias DLC1 226 47
1670 gi4704343 Homo sapiens alias DLC1; candidate tumor suppressor gene 226 47
1670 gi155627 Acanthamoeba castellanii myosin I heavy chain 118 42

1671 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 237 88
1671 gi6002932 Streptomyces fradiae glycosyltransferase 67 35
1671 gi|9634613|ref|NP_038150.1| Human papillomavirus type 69 L1 65 39
1672 gi13938013 Homo sapiens Similar to RIKEN cDNA 2610509G12 gene 333 66

1672 gi2388970 Schizosaccharomyces pombe tat-binding homolog 7, AAA ATPase family protein 235 41
1672 gi6850321 Arabidopsis thaliana Contains similarity to YTA7 ATPase gene from Saccharomyces cerevisiae gb|X81072, and contains Bromodomain PF|00439, AAA PF|00004, and Sigma-54 PF|00158 transcription factor domains. 214 40

1673 gi11066113 Drosophila melanogaster Misexpression suppressor of ras 4 71 29
1673 gi|20829387|ref|XP_129540.1 Mus musculus RIKEN cDNA 4930455F23 77 27
1673 gi|17647635|ref|NP_523775.1 Drosophila melanogaster Misexpression suppressor of ras 4 71 29
1674 gi|20535935|ref|XP_115787.1 Homo sapiens similar to splicing coactivator subunit SRm300; RNA binding protein; AT-rich element binding factor 75 37
1674 gi|17544226|ref|NP_500151.1 Caenorhabditis elegans Y76B12C.4.p 72 34
1674 gi|17559826|ref|NP_505799.1 Caenorhabditis elegans sepB domain 70 26

1675 gi5708067 Oryctolagus cuniculus hyperpolarization activated cation channel 99 27
1675 gi402558 Canis familiaris mucin 98 27
1675 gi10636484 Homo sapiens polyglutamine-containing protein 96 26
1676 AAM95365 Homo sapiens HUMA- Human reproductive system related antigen SEQ ID NO: 4023. 73 26
1676 AAB56709 Homo sapiens ROSEI Human prostate cancer antigen protein sequence SEQ ID NO:1287. 72 34
1676 gi1881288 Bacillus subtilis FUNCTION UNKNOWN, SIMILAR PRODUCT IN E.COLI, H. INFLUENZAE AND NEISSERIA MENINGITIDIS. 71 30

1677 gi|15892512|ref|NP_360226.1 [Rickettsia conorii] phosphatidate cytidylyltransferase [EC:2.7.7.41] 65 34

1679 gi14231 Saccharomyces cerevisiae NADH dehydrogenase (ubiquinone) 75 31
1679 gi805022 Saccharomyces cerevisiae Ndi1p 73 31
1679 gi1353352 Chlamydomonas reinhardtii alanine aminotransferase 70 27
1680 gi1805421 Bacillus subtilis surfactin production 77 36
1680 gi396482 Bacillus subtilis srfA2 77 36
1680 gi516360 Bacillus subtilis surfactin synthetase 77 36
1681 AAG64494 Homo sapiens SHAN- Human natriuretic peptide receptor 18. 156 80
1681 AAE16275 Homo sapiens INCY- Human kinase PKIN-21 protein. 154 73

1681 AAM40599 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5530. 154 73

1682 gi2323287 multiple sclerosis associated retrovirus polyprotein 1646 75
1682 gi|2351212|dbj|BAA22064.1| Friend murine leukemia virus gag-pol polyprotein (precursor protein) 807 40
1682 gi|9626961|ref|NP_057933.1 Murine leukemia virus Pr180 802 40
1683 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 457 53
1683 gi3033415 Gibbon ape leukemia virus gag polyprotein 353 38
1683 gi|6524623|gb|AAF15097.1| Phascolarctos cinereus gag protein 343 38

1684 gi19110438 Homo sapiens polycystin-1L1 712 98
1684 gi6361629 Periplaneta americana vitellogenin 81 25
1684 gi3115393 Rana pipiens guanylate cyclase inhibitory protein 80 35
1686 AAY91542 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 92 SEQ ID NO:215. 212 84
1686 gi1279841 Bos taurus glycine transporter 72 36
1686 gi19879917 Oryza sativa acid phosphatase 70 35
1687 gi12056568 Homo sapiens MSTP063 212 88
1687 gi13539684 Homo sapiens zinc finger protein 291 212 88
1687 gi|12056568|gb|AAG47945.1|AF119814_1 Homo sapiens MSTP063 212 88

1689 gi5689766 Homo sapiens zinc finger 2.2 222 91
1689 AAU16267 Homo sapiens HUMA- Human novel secreted protein, Seq ID 1220. 178 58
1689 AAB99950 Homo sapiens SHAN- Human alkylated-DNA-protein cysteine methyltransferase 14. 177 60
1690 gi3328880 Chlamydia trachomatis Protein Export 73 29
1690 gi2832232 Brucella melitensis biovar Abortus flagellin; FliC 67 29
1690 gi17984285 Brucella melitensis FLAGELLIN 67 29
1692 gi4927443 Haemophilus influenzae hemoglobin/hemoglobin-haptoglobin binding protein 93 80
1692 gi4204775 Haemophilus influenzae hemoglobin and hemoglobin-haptoglobin binding protein 93 80
1692 gi3647226 Haemophilus influenzae hemoglobin binding protein 93 80
1694 AAW95631 Homo sapiens GEMY Homo sapiens secreted protein gene clone hj968 2. 102 100
1694 gi13162186 Homo sapiens calsyntenin-3 protein 102 100

1695 AAO04205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 18097. 81 37
1695 gi160180 Plasmodium cynomolgi circumsporozoite antigen 81 29
1695 gi495522 Plasmodium simiovale circumsporozoite protein 80 30
1696 AAM80223 Homo sapiens HYSE- Human protein SEQ ID NO 3869. 252 66
1696 AAM79239 Homo sapiens HYSE- Human protein SEQ ID NO 1901. 252 66
1696 gi3688394 Homo sapiens triple LIM domain protein 252 66
1697 gi19887715 Methanopyrus kandleri AV19 Predicted membrane protein 74 28
1698 AAM93184 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 2552. 269 87
1698 gi18044066 Mus musculus RIKEN cDNA 5033406L14 gene 226 76
1698 AAB95302 Homo sapiens HELI- Human protein sequence SEQ ID NO:17538. 194 78
1699 ABB17279 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 5936. 110 56
1699 AAO13013 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26905. 101 71
1699 gi|7650258|gb|AAF65960.1|AF207770_1 Hepatitis C virus polyprotein 74 28

1700 gi12697585 Arabidopsis thaliana 4-(cytidine 5'-phospho)-2-C-methyl-D-erithritol kinase 69 40
1701 gi16740569 Homo sapiens Similar to thymus expressed gene 3 84 27
1701 gi17940760 Mus musculus cask-interacting protein 2 79 26
1701 gi17940758 Homo sapiens cask-interacting protein 1 77 26
1702 gi17385401 Homo sapiens TPIP alpha lipid phosphatase 234 62
1702 AAU75783 Homo sapiens INCY- Human protein phosphatase 1 (PP1) protein sequence. 208 57
1702 AAG67638 Homo sapiens HELI- Amino acid sequence of a human protein. 202 56
1703 AAO07887 Homo sapiens HYSE- Human polypeptide SEQ ID NO 21779. 246 85
1703 AAO08651 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22543. 239 83
1703 AAO08732 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22624. 221 80
1704 AAB94588 Homo sapiens HELI- Human protein sequence SEQ ID NO:15392. 82 52
1704 gi3288914 Mus musculus aortic carboxypeptidase-like protein ACLP 82 24
1704 AAM93437 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3074. 81 32
1706 AAM86104 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:13697. 179 100
1706 gi10039425 Equus caballus ALR protein 120 40
1706 gi20502826 Eimeria maxima cGMP-dependent protein kinase 115 35

1707 AAM70251 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30557. 115 78

1707 AAM57834 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29939. 115 78
1707 gi15450860 Arabidopsis thaliana serine/threonine-protein kinase Mak (male germ cell-associated kinase)-like protein 71 56
1708 gi1620403 Homo sapiens SF1-Bo isoform 82 41
1708 gi19072991 Hypocrea virens class III chitinase precursor 82 40
1708 gi18765873 Hypocrea virens class III chitinase 82 40
1709 AAM52240 Homo sapiens INCY- Human MFAP4 SEQ ID NO 3. 1384 100
1709 gi790817 Homo sapiens microfibril-associated glycoprotein 4 1384 100
1709 AAM52239 Homo sapiens INCY- Human MAG4V SEQ ID NO 1. 1374 100
1710 gi16769882 Drosophila melanogaster SD07884p 67 27
1710 gi|17545505|ref|NP_518907.1 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 66 41

1711 AAU82954 Homo sapiens ANAD- Human homologue of MPT1 protein target for antifungal compound. 111 27
1711 gi2058326 Homo sapiens subunit of RNA polymerase II transcription factor TFIID 111 27
1711 gi13559031 Homo sapiens bA11M20.1 (TATA box binding protein (TBP)-associated factor, RNA polymerase II, C1, 130kD) 108 26
1712 AAB65626 Homo sapiens SUGE- Novel protein kinase, SEQ ID NO: 152. 209 82
1712 AAM25283 Homo sapiens HYSE- Human protein sequence SEQ ID NO:798. 209 82
1712 AAU17269 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 834. 176 67
1713 gi18256065 Mus musculus Similar to ATPase, class II, type 9A 127 67
1713 AAM76495 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36801. 123 70
1713 AAM63681 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35786. 123 70

1714 gi8096269 Nicotiana tabacum KED 149 28
1714 gi1752736 Saccharomyces cerevisiae gene required for phosphorylation of oligosaccharides/ has high homology with YJR061w 148 30
1714 gi2292986 Rattus norvegicus cyclic nucleotide-gated channel beta subunit 141 28
1715 AAM72995 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33301. 158 47
1715 AAM60359 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32464. 158 47

1715 gi|13539605|emb|CAC35733.1| Paramecium tetraurelia cyclophilin-RNA interacting protein 144 45

1716 AAM71015 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31321. 251 64
1716 AAM58517 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30622. 251 64
1716 AAU19766 Homo sapiens HUMA- Human novel extracellular matrix protein, Seq ID No 416. 161 44
1718 gi1420924 Zea mays IN1 75 27
1718 gi|14521970|ref|NP_127447.1 Pyrococcus abyssi O-sialoglycoprotein endopeptidase 73 35
1719 gi20513851 Hordeum vulgare BPM 74 35
1719 gi21039126 Cryptosporidium parvum 60 kDa glycoprotein 74 26
1719 gi207158 Rattus norvegicus big tau 73 36
1720 gi18181943 Caenorhabditis elegans heparan sulfate GlcNAc transferase-I/II 67 34
1720 gi2058699 Caenorhabditis elegans multiple exostoses homolog 2 67 34
1720 gi|17554740|ref|NP_499368.1| Caenorhabditis elegans MULTIPLE EXOSTOSES HOMOLOG 2 67 34

1721 AAM69150 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29456. 200 38
1721 AAM56769 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28874. 200 38
1721 gi4185947 Human endogenous retrovirus K pol protein 196 38
1722 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 615 60
1722 gi18676710 Homo sapiens FLJ00254 protein 592 60
1722 gi|20469453|ref|XP_114040.1 Homo sapiens similar to FLJ00254 protein 283 50
1723 gi13881755 Mycobacterium tuberculosis CDC1551 cation efflux system protein 74 30
1724 AAG78866 Homo sapiens SHAN- Human zinc finger protein 15. 141 68
1724 ABB17928 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6585. 99 53
1724 gi|21295712|gb|EAA07857.1| Anopheles gambiae str. PEST agCP1631 75 26
1725 gi21104340 Homo sapiens obscurin 1586 83
1725 gi7024535 Gallus gallus structural muscle protein titin 207 24
1725 gi1513030 Gallus gallus connectin/titin 207 24
1727 AAE19162 Homo sapiens THOR/ Human kinase polypeptide (PKIN-20). 1096 99

1727 gi2736151 Rattus norvegicus myotonic dystrophy kinase-related Cdc42-binding kinase 902 78
1727 gi1695873 Homo sapiens ser-thr protein kinase PK428 896 77
1728 AAY99411 Homo sapiens GETH Human PRO1487 (UNQ756) amino acid sequence SEQ ID NO:260. 862 67
1728 gi15617453 Homo sapiens chondroitin synthase 862 67
1728 AAE15959 Homo sapiens EUMO- Human 4589624/92-303 protein, member of Fringe and Brainiac family. 761 79
1729 gi|15804980|ref|NP_290960.1 Escherichia coli O157:H7 EDL933 Uncharacterized conserved protein 71 33
1731 gi14268490 Musca domestica hunchback 82 33
1731 AAM93401 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3002. 76 27
1731 gi2076606 Musca domestica hunchback zinc finger protein 73 30
1732 AAY91949 Homo sapiens INCY- Human cytoskeleton associated protein 4 (CYSKP-4). 1047 57
1732 ABB90754 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240. 1043 57
1732 gi619577 Gallus gallus cardiac muscle tensin 1043 56
1733 gi3090889 Homo sapiens synapsin IIIa 70 38
1733 gi6572355 Homo sapiens cE86D10.1 (synapsin III) 70 38
1733 gi|19924105|ref|NP_003481.2 Homo sapiens synapsin III, isoform IIIa 70 38

1734 AAB85144 Homo sapiens HUMA- Human NKCR polypeptide (clone ID HMSOM53). 1506 93
1734 gi4973126 Mus musculus castaneus high affinity immunoglobulin gamma Fc receptor I 490 39
1734 gi4973124 Mus musculus high affinity immunoglobulin gamma Fc receptor I 489 39
1735 gi|15597595|ref|NP_251089.1| Pseudomonas aeruginosa pyoverdine synthetase D 69 30
1736 gi14488302 Oryza sativa Putative transposon protein 81 24
1736 gi3851516 Phytophthora infestans cyst germination specific acidic repeat protein precursor 72 33
1736 gi|14488302|gb|AAK63883.1|AC074105_12 Oryza sativa Putative transposon protein 81 24
1737 AAB85357 Homo sapiens INCY- Human phosphatase (PP) (clone ID 3402521CD1). 1591 100
1737 gi21205864 Homo sapiens T-cell activation protein phosphatase 2C; TA-PP2C 1591 100
1737 gi21464366 Drosophila melanogaster RE06653p 758 52
1738 gi7271811 Drosophila melanogaster GTPase activating protein 292 38
1738 AAM76430 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36736. 246 100

1738 AAM63615 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35720. 246 100

1739 ABB50365 Homo sapiens HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:313. 272 87
1739 AAW88598 Homo sapiens HUMA- Secreted protein encoded by gene 65 clone HFVHY45. 272 87
1739 ABB50764 Homo sapiens HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:716. 143 92
1740 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 1210 58
1740 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 274 80
1740 gi|385615|gb|AAB26708.1| Mus sp. fibulin gene homolog 248 75
1741 ABB90748 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 228. 2116 97
1741 gi15987493 Homo sapiens tumor endothelial marker 6 2116 97
1741 ABB90754 Homo sapiens UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240. 530 37
1742 ABB11753 Homo sapiens HYSE- Human NOV/plexin-A1 homologue, SEQ ID NO:2123. 291 90
1742 gi1665757 Mus musculus plexin 1 291 90
1742 gi6010217 Homo sapiens NOV/plexin-A1 protein 291 90
1743 AAM79514 Homo sapiens HYSE- Human protein SEQ ID NO 3160. 149 90
1743 AAM78530 Homo sapiens HYSE- Human protein SEQ ID NO 1192. 149 90
1743 gi1244510 Homo sapiens p311 protein 149 90
1744 AAG93324 Homo sapiens NISC- Human protein HP10370. 83 41
1744 gi21064771 Drosophila melanogaster RH61467p 83 46
1744 gi18676554 Homo sapiens FLJ00174 protein 77 41

1745 gi4128039 Homo sapiens TL132 protein 81 29
1745 gi17983118 Brucella melitensis METAL DEPENDENT HYDROLASE 74 23
1745 AAU75578 Homo sapiens UYNA- Human ubiquitin specific protease 10 (USP10). 71 31
1746 gi15074154 Sinorhizobium meliloti PUTATIVE FATTY ACID/PHOSPHOLIPID SYNTHESIS PROTEIN 76 25
1746 gi1869833 human herpesvirus 2 myristylated tegument protein 75 27
1746 gi20516045 Thermoanaerobacter tengcongensis Chemotaxis response regulator CheB, consists of CheY-like receiver domain and a methylesterase (demethylase) domain 69 20
1747 gi18025496 cercopithicine herpesvirus 15 EBNA-1 124 37
1747 gi5821153 Homo sapiens RNA binding protein 123 29
1747 gi6649242 Homo sapiens splicing coactivator subunit SRm300 123 29

1748 gi|4321764|gb|AAD15819.1| Mus musculus MAP kinase kinase 7 alpha 2 65 30

1748 gi|20859704|ref|XP_133986.1 Mus musculus mitogen activated protein kinase kinase 7 65 30
1748 gi|4321768|gb|AAD15821.1| Mus musculus MAP kinase kinase 7 beta 2 65 30
1749 AAB50964 Homo sapiens GETH Human PRO1313 protein. 439 89
1749 AAB47290 Homo sapiens GETH PRO1313 polypeptide. 439 89
1749 AAB24431 Homo sapiens GETH Human PRO1313 protein sequence SEQ ID NO:216. 439 89
1750 AAU00502 Homo sapiens MILL- Human TANGO 437 protein. 115 91
1750 gi20384654 Homo sapiens two-pore calcium channel protein 2 115 91
1750 AAM91059 Homo sapiens HUMA- Human immune/haematopoietic antigen SEQ ID NO:18652. 93 64
1751 gi10440494 Homo sapiens FLJ00092 protein 252 97
1751 AAM40956 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5887. 80 30
1751 gi|10440494|dbj|BAB15780.1 Homo sapiens FLJ00092 protein 252 97
1752 gi15980036 Yersinia pestis 2-dehydro-3-deoxyphosphooctonate aldolase 77 46
1752 gi11322261 Diceros bicornis alpha adrenergic receptor 2B 74 26
1752 gi20516240 Thermoanaerobacter tengcongensis methylaspartate mutase 73 25
1753 gi19684014 Homo sapiens similar to brain-specific angiogenesis inhibitor 3 (H. sapiens) 1387 99
1753 AAB88367 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0101. 1380 99
1753 gi1469936 Mus musculus FGF-binding protein 158 29

1754 AAB01397 Homo sapiens INCY- Neuron-associated protein. 435 92
1754 gi21218140 Homo sapiens rab effector MYRIP 435 92
1754 gi21320161 Mus musculus exophilin 8 378 77
1755 AAM74815 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35121. 253 75
1755 AAM62013 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34118. 253 75
1755 AAM70390 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30696. 228 62
1756 gi6460201 Deinococcus radiodurans phenylacetic acid degradation protein PaaA 85 27
1756 gi3309543 Takifugu rubripes MLL 79 34
1756 AAT10059_aa1 Homo sapiens USSH erbB-3 cDNA clone E3-16. 74 31
1757 gi18676406 Homo sapiens FLJ00021 protein 70 36
1758 gi13423395 Caulobacter crescentus CB15 NADH dehydrogenase I, M subunit 78 37

1758 gi|17506337|ref|NP_491390.1| Caenorhabditis elegans D1007.15.p 82 24
1758 gi|16126181|ref|NP_420745.1 Caulobacter crescentus CB15 NADH dehydrogenase I, M subunit 78 37
1759 gi19881193 chimpanzee cytomegalovirus transcriptional transactivator TRS1 83 29
1759 gi19881161 chimpanzee cytomegalovirus transcriptional transactivator IRS1 83 29
1759 gi556297 Mus musculus alpha-1 type IV collagen 81 33
1760 gi18033185 Danio rerio UNC45-related protein 702 79
1760 AAG77802 Homo sapiens HUMA- Human HOGEN50 serine/threonine phosphatase protein sequence. 603 65
1760 AAM40290 Homo sapiens HYSE- Human polypeptide SEQ ID NO 3435. 603 65
1761 gi6634123 Drosophila melanogaster SoxNeuro 70 24
1762 gi|14245700|dbj|BAB56142.1 Giardia intestinalis kinesin-like protein 4 69 26
1762 gi|165011|gb|AAA31246.1| Oryctolagus cuniculus eucaryotic release factor (eRF) 69 24
1762 gi|15559188|emb|CAC03424.2 Homo sapiens dJ45P21.3 (butyrophilin, subfamily 3, member A1) 69 26

1763 AAM93661 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 3536. 186 80
1763 AAM64398 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36503. 154 76
1763 gi|20556958|ref|XP_061562.5 Homo sapiens similar to PAM COOH-terminal interactor protein 1 73 43
1764 AAU17223 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 788. 211 87
1765 gi1334546 Podospora anserina Dod COI 113 grp IB protein 71 37
1765 gi5679307 Mus musculus ROR gamma t 70 27
1765 gi4186077 Mus musculus ROR gamma T protein 70 27
1766 gi17864081 Mus musculus PPAR gamma coactivator-1beta protein 74 26
1766 gi44795 Methanococcus voltae polyferredoxin 71 28
1766 gi14279670 Lycopersicon esculentum verticillium wilt disease resistance protein 71 31
1768 AAE06588 Homo sapiens SAGA Human protein having hydrophobic domain, HP10778. 165 100
1768 AAM40979 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5910. 165 100
1768 AAB24542 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 27 SEQ ID NO:168. 73 30

1769 gi6174840 Achromobacter xylosoxidans subsp. xylosoxidans low-specificity D-threonine aldolase 78 33
1769 gi16769806 Drosophila melanogaster SD02660p 75 23
1769 gi1098473 Rattus norvegicus insulin-like growth factor binding protein 73 31
1770 AAP94684 Homo sapiens CHIL Amino acid sequence encoded by part of human mannose binding protein (hMBP) genomic DNA. 79 56
1770 gi|15790548|ref|NP_280372.1| Halobacterium sp. NRC-1 cobyric acid synthase; CbiP 69 36
1770 gi|11467609|ref|NP_050661.1| Guillardia theta Clp protease ATP binding subunit 69 27
1772 gi5532460 Shigella flexneri ShiF 66 32
1773 gi11544663 Arabidopsis thaliana PTPKIS1 75 42
1773 gi11595504 Arabidopsis thaliana PTPKIS1 protein 75 42
1773 gi18389331 Mus musculus 2',5'-oligoadenylate synthetase-like 10 73 42
1774 AAM06519 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 250. 414 90
1774 gi|18552248|ref|XP_092510.1 Homo sapiens similar to latent transforming growth factor beta binding protein 1; latent TGF beta binding protein 69 37

1775 gi4884924 Rangiferine herpesvirus 1 glycoprotein C 67 60
1775 AAB94152 Homo sapiens HELI- Human protein sequence SEQ ID NO:14435. 65 34
1775 AAB93253 Homo sapiens HELI- Human protein sequence SEQ ID NO:12271. 65 34
1776 gi13424176 Caulobacter crescentus CB15 N-carbamyl-L-amino acid amidohydrolase 89 24
1776 gi514267 Homo sapiens proto-oncogene tyrosine-protein kinase 86 29
1776 gi28237 Homo sapiens 150 protein (AA 1-1130) 84 28
1777 gi63370 Gallus gallus dystrophin (AA 1 - 3660) 68 31
1777 gi|3046783|emb|CAA68033.1| Scyliorhinus canicula dystrophin 67 29
1777 gi|2342682|gb|AAB70406.1| Arabidopsis thaliana Contains similarity to Rattus AMP-activated protein kinase (gb|X95577). 67 31
1778 AAE16176 Homo sapiens INCY- Human G-protein coupled receptor 7 (GCREC-7) protein. 1419 100
1778 AAE18021 Homo sapiens CURA- Human G-protein coupled receptor-8a (GPCR-8a) protein. 1419 100
1778 AAG72411 Homo sapiens VEDA Human OR-like polypeptide query sequence, SEQ ID NO: 2092. 1419 100
1779 AAM76040 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36346. 93 48

1779 AAM63227 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35332. 93 48
1779 gi12620576 Bradyrhizobium japonicum ID342 87 24
1780 gi2459833 Rattus norvegicus Maxp1 81 31
1780 AAB65650 Homo sapiens SUGE- Novel protein kinase, SEQ ID NO: 177. 80 35
1780 AAM39805 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2950. 80 36
1781 gi4877963 Mus musculus NF-kappaB inducing kinase 69 39
1781 gi15077865 Mus musculus bullous pemphigoid antigen 1-b 67 35
1781 gi15077863 Mus musculus bullous pemphigoid antigen 1-a 67 35
1782 gi4138265 Nicotiana tabacum Avr9 elicitor response protein 76 27
1782 gi12725153 Lactococcus lactis subsp. lactis 50S ribosomal protein L3 75 32
1782 AAB21008 Homo sapiens INCY- Human nucleic acid-binding protein, NuABP-12. 73 32
1783 gi3947714 Streptococcus agalactiae initiation factor IF2 86 20
1783 gi9558387 Streptococcus agalactiae initiation factor 2 86 20
1783 gi9558369 Streptococcus agalactiae initiation Factor 2 86 20

1786 gi435855 Mus sp. CREB-binding protein; CBP 75 22
1786 gi2911464 Leishmania tarentolae sodium stibogluconate resistance protein 75 34
1786 gi19547887 Mus musculus CREB-binding protein 75 22
1787 gi3747099 Mus musculus C1q-related factor 616 61
1787 gi14278927 Mus musculus gliacolin 615 64
1787 gi10566471 Mus musculus Gliacolin 615 64
1788 gi|21291197|gb|EAA03342.1| Anopheles gambiae str. PEST agCP7579 71 20
1788 gi|20803964|emb|CAD31541.1 Mesorhizobium loti HYPOTHETICAL PROTEIN 69 43
1789 AAM41125 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6056. 320 80
1789 AAM39339 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2484. 320 80
1789 AAM79857 Homo sapiens HYSE- Human protein SEQ ID NO 3503. 320 80
1790 gi1143585 Paracentrotus lividus 2 alpha fibrillar collagen 69 23
1791 gi9837427 Lytechinus variegatus embryonic blastocoelar extracellular matrix protein precursor 116 34
1791 gi14089698 Mycoplasma pulmonis OLIGOPEPTIDE ABC TRANSPORTER PERMEASE PROTEIN 71 23

1791 gi6572111 Bartonella quintana riboflavin synthase alpha chain 69 29

1792 gi|4506023|ref|NP_002710.1 Homo sapiens protein phosphatase 2, regulatory subunit B (B56), gamma isoform 68 39
1793 AAM71170 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31476. 180 82
1793 AAM58664 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30769. 180 82
1793 AAM65679 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37784. 168 71
1794 AAG00072 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4153. 125 80
1794 AAW34618 Homo sapiens IMUT- Human C3 protein mutant DV-7N. 125 80
1794 AAW34617 Homo sapiens IMUT- Human C3 protein mutant DV-6. 125 80
1795 AAY05069 Homo sapiens SMIK Human PIGR-2 protein sequence. 1055 85
1795 gi396170 Homo sapiens CMRF-35 antigen 406 45
1795 gi18490143 Homo sapiens CMRF35 leukocyte immunoglobulin-like receptor 406 45

1796 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 421 41
1796 gi|13940448|gb|AAK50381.1|U43202_2 Murine leukemia virus pol precursor protein 421 41
1796 gi|331995|gb|AAB03091.1 AKV murine leukemia virus gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV) 421 41
1797 gi21411325 Homo sapiens Similar to LOC205103 260 73
1797 gi|4835878|gb|AAD30280.1|AF134838_1 Homo sapiens endocytic receptor Endo180 77 31
1797 gi|16076075|emb|CAC94295.1 Leishmania donovani donovani trypanothione reductase 70 30
1798 gi927721 Saccharomyces cerevisiae Sip1p: SNF1 protein kinase substrate; YDR422C; CAI: 0.13 72 34
1798 gi172604 Saccharomyces cerevisiae protein kinase 72 34
1798 gi|6320630|ref|NP_010710.1 Saccharomyces cerevisiae SNF1 protein kinase substrate; Sip1p 72 34
1799 gi|20839768|ref|XP_130311.1 Mus musculus similar to GDP-fucose transporter 1 71 29

1801 gi|17461642|ref|XP_066249.1| Homo sapiens similar to Ig kappa chain 78 23

1801 gi|6325342|ref|NP_015410.1 Saccharomyces cerevisiae Protein required for cell viability; Ypr085cp 76 22
1801 gi|9635081|ref|NP_057809.1| Gallid herpesvirus 2 UL47 74 26
1802 AAB94148 Homo sapiens HELI- Human protein sequence SEQ ID NO:14427. 250 56
1802 AAG64564 Homo sapiens SHAN- Human zinc-finger protein 60. 250 56
1802 AAM79356 Homo sapiens HYSE- Human protein SEQ ID NO 3002. 250 56
1803 AAW81754 Homo sapiens BOEF Human Fanconi anaemia-associated gene II protein. 631 85
1803 gi2407911 Homo sapiens differentially expressed in Fanconi anemia 555 74
1803 gi6013073 Mus musculus HemT-3 protein 89 24
1805 gi14189735 Homo sapiens ATP-binding cassette transporter family A member 12 1508 90
1805 gi1943947 Bos taurus ABC transporter 404 31
1805 AAZ94734_aa1 Homo sapiens FARB Human ATP binding cassette ABCA1 (ABC1) cDNA. 395 33
1806 AAU12234 Homo sapiens GETH Human PRO4350 polypeptide sequence. 859 100
1806 AAA96344_aa1 Homo sapiens GETH cDNA encoding a novel polypeptide designated PRO4357. 498 48
1806 AAU12445 Homo sapiens GETH Human PRO4357 polypeptide sequence. 498 48
1807 gi190396 Homo sapiens profilaggrin 76 29
1808 AAB88367 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0101. 74 30
1808 gi19684014 Homo sapiens similar to brain-specific angiogenesis inhibitor 3 (H. sapiens) 74 30
1808 gi|18576362|ref|XP_084481.1 Homo sapiens similar to fibroblast growth factor binding protein 1 74 30

1809 gi530876 Chlamydomonas reinhardtii amino acid feature: Rod protein domain, aa 266 .. 468; amino acid feature: globular protein domain, aa 32 .. 265 126 35
1809 gi6578849 Myxococcus xanthus FrgA 126 29
1809 gi2429362 Santalum album proline rich protein 122 27
1810 gi17428288 Ralstonia solanacearum PROBABLE CATION-TRANSPORTING ATPASE LIPOPROTEIN TRANSMEMBRANE 75 28
1810 gi21483422 Drosophila melanogaster LD34142p 71 29
1810 ABB90042 Homo sapiens HUMA- Human polypeptide SEQ ID NO 2418. 70 32
1811 gi|20915248|ref|XP_145160.1 Mus musculus similar to Collagen alpha 1(VI) chain precursor 148 74

1812 gi2104558 Rattus norvegicus CCA3 1150 90

1812 AAB64963 Homo SapiensROSE/ Human secreted 172 37
protein

sequence encoded by gene
24 SEQ ID

NO:141.

1812 gi12963869Mus musculusgene trap ankyrin repeat172 37
containing

rotein

1813 AAB65201 Homo SapiensGETH Human PR01009 (UNQ493)208 100

rotein se uence SEQ ID
N0:194.

1813 AAY66678 Homo SapiensGETH Membrane-bound protein208 100

PR01009.

1813 AAB24068 Homo SapiensGETH Human PR01009 protein208 100

se uence SEQ ID N0:36.

1815 AAG89314 Homo SapiensGEST Hurnan secreted 191 100
protein, SEQ ID

NO: 434.

1815 gi6460052Deinococcusdipeptidyl peptidase 66 60
IV-related protein

radiodurans

1816 gi1052594Drosophila trithorax protein trxI 75 26

melanogaster

1816 gi1052593Drosophila trithorax protein trxII 75 26

melanogaster

1816 gi158818 Drosophila zinc-binding protein 75 26

melanogaster

1817 AAB49765 Homo SapiensHELI- Human proliferation229 94

differentiation factor
amino acid

sequence.

1817 AAB88393 Homo SapiensHELI- Human membrane 229 94
or secretory

protein clone PSEC0137.

1817 gi18446895Drosophila AT05866p 73 25

melanogaster

1818 gi6573212Giardia variant-specific surface73 32
protein H7-1

intestinalis

1818 gi159143 Giardia variant-specific surface73 32
protein H7

intestinalis

1818 gi15144254Micrurus neurotoxin homologue 72 32
8

corallinus

1819 gi161857 Tetrahymenasurface antigen 69 35

thermophila

1821 gi913964 Carcinoscorpiusfactor C 80 26

rotundicauda

1821 gi217397 Tachypleus limulus factor C precursor80 26

tridentatus

1821 gi18542425Tachypleus factor C precursor 80 26

tridentatus

1822 19309473 Mus musculus DNMT1 associated protein-1 74 37

1822 gi1666895 Homo sapiens CHL1 protein 74 23

1822 g116923930Mus musculusMAT1-mediated transcriptional74 37

repressor

1823 g19058659Canis familiarisskeletal muscle chloride73 34
channel ClC-1

1823 g1433182 Drosophila receptor protein tyrosine72 26
phosphatase

melanogaster

1823 g120429105Paracoccus decaprenyl diphosphate 72 27
synthase

zeaxanthinifaciens

1824 gi13374178 Mus musculus TAFII140 protein 612 88

1824 gi17861888Drosophila GM10839p 246 49

melanogaster

1824 gi6634096 Drosophila BIP2 protein 242 48

melanogaster

1825 gi16605480 Homo sapiens G6b-C protein 1159 100

1825 gi16605484 Homo sapiens G6b-E protein 1009 90

1825 gi5304877 Homo sapiens immunoglobulin receptor 1003 83

1826 AAB94636 Homo SapiensHELI- Human protein sequence105 37
SEQ

ID NO:15515.

1826 AAU15903 Homo sapiens HUMA- Human novel secreted 105 37
protein,

Seq ID 856.

1826 gi21430928Drosophila SD27341p 93 39

melanogaster

1827 AAR33270 Homo SapiensWIST- T cell receptor 329 92
alpha chain

clone alpha1.3.

1827 gi1806100 Homo sapiens T cell receptor alpha 329 92
chain

1827 gi2358032Homo SapiensTCRAV8S3 329 92

1828 gi20513851Hordeum BPM 73 45

vulgare

1828 AAO01897 Homo sapiens HYSE- Human polypeptide 70 35
SEQ ID

NO 15789.

1828 AAE16477 Homo sapiens OSTE- Human collagen 69 31
alpha1 (II)

protein.

1829 AAG66837 Homo SapiensSHAN- Human ATP-dependent356 100
serine

proteinase 31.

1829 AAG66838 Homo SapiensSHAN- Human ATP-dependent89 100
serine

proteinase 31 N-terminal
peptide.

1829 gi5881591Gallus gallushomeodomain protein 77 38

1830 AAB94294 Homo SapiensHELI- Human protein sequence951 99
SEQ

ID NO:14745.

1830 gi10504968 Drosophila rho guanine nucleotide 180 22
exchange factor 4

melanogaster

1830 gi16197921 Drosophila LD03170p 180 22

melanogaster

1831 ABB12353 Homo sapiens HYSE- Human bone marrow 199 30
expressed

protein SEQ ID NO: 107.

1831 120452161 Canis familiaris retinitis pigmentosa GTPase 143 24
regulator

1831 gi2062609 Xenopus middle molecular weight 140 24
laevis neurofilament

protein NF-M(1)

1832 AAB29778 Homo SapiensRHOD- Human MSF-derived 148 18

tribonectin.

1832 gi142161 Anaplasma surface antigen Amf105 141 25

marginale

1832 gi4808177Drosophila largest subunit of the 141 20
RNA polymerase

subobscura II complex

1833 AAM66321 Homo SapiensMOLE- Human bone marrow 424 51

expressed probe encoded
protein SEQ

ID NO: 26627.

1833 AAM53933 Homo SapiensMOLE- Human brain expressed424 51
single

exon probe encoded protein
SEQ ID

NO: 26038.

1833 gi|6723273|dbj|BAA89659.1 Baboon endogenous virus strain M7 gag-pol precursor polyprotein 357 47

1834 AAM88756 Homo SapiensHUMA- Human 208 100

immune/haematopoietic
antigen SEQ

ID NO:16349.

1834 gi20417 Persea americanacellulase 77 34

1834 gi153337 Streptomyceskanamycin-apramycin resistance69 26

tenebrariusmethylase

1837 AAY02893 Homo sapiens HUMA- Fragment of human 76 41
secreted

protein encoded by gene
92.

1837 AAY99429 Homo sapiens GETH Human PRO1563 (UNQ769) 73 35

amino acid sequence SEQ
ID NO:317.

1837 gi6634084Drosophila malate dehydrogenase 73 39
(NADP-

melanogasterdependent oxaloacetate

decarboxylating), malic
enzyme

1838 gi2865602 Saccharopolyspora sp. SapI M2 methyltransferase 77 37

1838 gi3089358Rattus MARRLC2A 75 33

norvegicus

1838 gi|2865602|gb|AAC97182.1| Saccharopolyspora sp. SapI M2 methyltransferase 77 37

1839 AAM69149 Homo SapiensMOLE- Human bone marrow 154 96

expressed probe encoded
protein SEQ

ID NO: 29455.

1839 AAM56768 Homo SapiensMOLE- Human brain expressed154 96
single

exon probe encoded protein
SEQ ID

NO: 28873.

1839 AAW96209 Homo SapiensSMIK Amyloid precursor 102 78
protein

(APP) C-terminal fragment.

1840 gi9946563Pseudomonasprobable type II secretion81 36
system

aeruginosa protein

1840 gi21108565Xanthomonaspseudouridylate synthase75 35

axonopodis
pv.

citri str.
306

1840 ABB04714 Homo sapiensSHAN- Human PP1744 protein74 31
SEQ

ID NO:23.

1841 gi1491949Molluscum MC006L 85 30

contagiosum

virus subtype 1

1841 AAM42085 Homo SapiensHYSE- Human polypeptide 81 27
SEQ ID

NO 7016.

1841 AAM40299 Homo SapiensHYSE- Human polypeptide 81 27
SEQ ID

NO 3444.

1842 120381413Homo sapiensSimilar to LOC160680 216 44

1842 gi13592175 Leishmania ppg3 144 24

major

1842 gi5420387 Leishmania proteophosphoglycan 140 23

major

1843 AAB87181 Homo SapiensMILL- Human secreted 278 42
protein

MANGO 349 E41D variant,
SEQ ID

NO:231.

1843 AAB87128 Homo sapiens MILL- Human secreted 278 42
protein

MANGO 349, SEQ ID NO:130.

1843 AAB87179 Homo SapiensMILL- Human secreted 276 41
protein

MANGO 349 I21K variant,
SEQ ID

NO:227.

1844 AAE14341 Homo sapiensINCY- Human protease 886 93
PRTS-6

protein.

1844 gi16768276Drosophila GH27809p 290 41

melanogaster

1844 gi2655204Mus musculusubiquitin-specific protease258 35

1846 AAY88300 Homo SapiensMILL- Human TANGO 187-3 1334 90
protein.

1846 gi13097780Homo SapiensSimilar to RIKEN cDNA 1326 90
2810037014

gene

1846 AAY88296 Homo sapiens MILL- Human TANGO 187-2/3 1312 87

protein.

1847 AAG74984 Homo SapiensHUMA- Human colon cancer75 32
antigen

protein SEQ ID NO:5748.

1847 gi17352449Rattus ErbB3/Her3 precursor 74 38

norvegicus

1847 gi|20860870|ref|XP_125664.1| Mus musculus similar to H4(D10S170) protein 75 32

1848 gi3123530 Fowlpox I3L, orthologue of vaccinia 75 27
virus I3L

1848 gi5902659Drosophila ring canal protein 70 27

melanogaster

1848 gi|18110218|ref|NP_476589.2 Drosophila melanogaster kel-P2 70 27

1849 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 614 78

1849 AAM65715 Homo SapiensMOLE- Human bone marrow 548 73

expressed probe encoded
protein SEQ

ID NO: 26021.

1849 AAM53338 Homo SapiensMOLE- Human brain expressed548 73
single

exon probe encoded protein
SEQ ID

NO: 25443.

1850 gi10999071LophognathusNADH dehydrogenase subunit74 23
2

longirostris

1850 gi18537243 Human immunodeficiency virus type 1 envelope glycoprotein 74 29

1850 gi|10999071|gb|AAG00622.2|AF128462_2 Lophognathus longirostris NADH dehydrogenase subunit 2 74 23

1851 gi|17448210|ref|XP_068503.1 Homo sapiens similar to 60 kDa heat shock protein, mitochondrial precursor (Hsp60) (60 kDa chaperonin) (CPN60) (Heat shock protein 60) (HSP-60) (Mitochondrial matrix protein P1) (P60 lymphocyte protein) (HuCHA60) 72 28

1852 gi1164937SaccharomycesYOR3160w 74 31

cerevisiae

1852 gi3176662 Arabidopsis thaliana Similar to mannosyl-oligosaccharide glucosidase gb|X87237 from Homo sapiens. 73 31

1852 gi13398928Arabidopsisalpha-glucosidase 1 73 31

thaliana

1853 gi|20889364|ref|XP_138429.1| Mus musculus similar to hepatitis A virus cellular receptor 1; T cell immunoglobin domain and mucin domain protein 1 76 36

1853 gi|21288202|gb|EAA00523.1| Anopheles gambiae str. PEST agCP9342 71 32

1854 AAB88481 Homo SapiensHELI- Human membrane 776 99
or secretory

protein clone PSEC0251.

1854 AAE03835 Homo sapiens HUMA- Human gene 18 776 99
encoded

secreted protein HFKHW50,
SEQ ID

NO: 81.

1854 AAE03863 Homo sapiens HUMA- Human gene 18 716 97
encoded

secreted protein HFKHW50,
SEQ ID

NO:109.

1855 gi1663748Chlamydomonasdynein heavy chain 7 82 29

reinhardtii

1855 gi1663744Chlamydomonasdynein heavy chain 5 80 28

reinhardtii

1855 gi1663738Chlamydomonasdynein heavy chain 2 80 27

reinhardtii

1856 gi18032120Gallus gallusshal-like voltage-gated 75 23
potassium

channel

1856 gi1408569Haemophilusadhesion and penetration71 28
protein

influenzae

1856 gi|18032120|gb|AAL56633.1|AF075160_1 Gallus gallus shal-like voltage-gated potassium channel 75 23

1857 AAM67180 Homo SapiensMOLE- Human bone marrow 129 44

expressed probe encoded
protein SEQ

ID NO: 27486.

1857 AAM54795 Homo sapiensMOLE- Human brain expressed129 44
single

exon probe encoded protein
SEQ ID

NO: 26900.

1857 gi|21040255|ref|NP_631907.1| Homo sapiens splicing factor, arginine/serine-rich 12 109 29

1858 gi21392190Drosophila RE74758p 71 39

melanogaster

1858 gi9954108TrypanosomaRNA binding protein RGGm68 40

cruzi

1858 gi20302994Medicago nodule-specific glycine-rich66 32
protein 1C

truncatula

1859 gi|20536244|ref|XP_060505.4 Homo sapiens similar to autoantigen La 72 30

1860 gi|17541362|ref|NP_502409.1 Caenorhabditis elegans K08E7.S.p 103 29

1860 gi|17446900|ref|XP_065833.1| Homo sapiens similar to DNA-directed RNA polymerase (EC 2.7.7.6) II largest chain - Mastigamoeba invertens (fragment) 100 34

1860 gi|9628166|ref|NP_042752.1 African swine fever virus CD2 homolog 98 30

1861 AAY70691 Homo sapiens DAND Human membrane attractin-2. 162 40

1861 AAY70690 Homo SapiensDAND Human membrane attractin-1.162 40

1861 gi12275390Rattus membrane attractin 162 40

norvegicus

1862 gi10039425Equus caballusALR protein 81 28

1862 gi13529521Mus musculusSimilar to elastin microfibril80 32
interface

located protein

1862 AAM40414 Homo SapiensHYSE- Human polypeptide 79 39
SEQ ID

NO 3559.

1863 gi|16588389|gb|AAL26787.1|AF304442_1 Homo sapiens B lymphocyte activation-related protein BC-1514 247 52

1863 gi|20479028|ref|XP_113729.1 Homo sapiens similar to B lymphocyte activation-related protein BC-1514 117 68

1863 gi|21301715|gb|EAA13860.1| Anopheles gambiae str. PEST agCP8366 85 41

1864 AAU15851 Homo SapiensHUMA- Human novel secreted1275 78
protein,

Seq ID 804.

1864 AAU16312 Homo sapiensHUMA- Human novel secreted1123 76
protein,

Seq ID 1265.

1864 AAG02054 Homo SapiensGEST Human secreted protein,308 91
SEQ ID

NO: 6135.

1865 AAB94953 Homo SapiensHELI- Human protein sequence86 29
SEQ

ID NO:16485.

1865 13746787 Homo SapiensSYT interacting protein 86 29
SIP

1865 g115022507Homo sapienscoactivator activator 86 29

1866 g117133332Nostoc Sp. preprotein translocase 68 43
PCC Sect subunit

7120

1866 gi|13489110|ref|NP_068773.1 Homo sapiens gap junction protein, alpha 3, 46kD (connexin 46) 66 40

1867 g1706930 Rattus cyclic GMP stimulated 191 95

norvegicus phosphodiesterase

1867 AAV54762_aa1 Homo sapiens UNIW Human cGS-PDE cDNA DNA sequence. 137 100

1867 AAV36157_aa1 Homo sapiens UNIW Human cyclic-GMP-nucleotide phosphodiesterase cDNA. 137 100

1868 AAB95695 Homo SapiensHELI- Human protein sequence112 27
SEQ

ID NO:18516.

1868 AAY91447 Homo SapiensHUMA- Human secreted 112 27
protein

sequence encoded by gene
48 SEQ ID

NO:168.

1868 AAY91393 Homo SapiensHUMA- Human secreted 112 27
protein

sequence encoded by gene
48 SEQ ID

NO:114.

1870 AAU07886 Homo SapiensWHED Polypeptide sequence1454 94
for

human hspGlS.

1870 gi13603891 Homo sapiens MOV10-like 1 1454 94

1870 113603857 Mus musculus MOV10-like 1 954 77

1871 AAM96652 Homo SapiensHUMA- Human reproductive484 96
system

related antigen SEQ ID
NO: 5310.

1871 gi18676652 Homo sapiens FLJ00225 protein 433 95

1871 gi21386760Berneuxia maturase R 70 32

thibeoca

1872 AAQ90304_aa1 Homo sapiens NISR Human thyroid peroxidase gene. 73 29

1872 AAW48781 Homo sapiens RSRR- Thyroid peroxidase. 73 29

1872 AAR75689 Homo sapiens NISR Human thyroid peroxidase. 73 29

1873 AAG03774 Homo SapiensGEST Human secreted protein,228 90
SEQ ID

NO: 7855.

1873 1338288 Homo sapiens preprosomatostatin I 228 90

1873 gi342299 Macaca preprosomatostatin 228 90

fascicularis

1875 AAR30418 Homo sapiens DAND Nearly complete p107 protein. 76 30

1875 g1347378 Homo Sapiens107 76 30

1875 g1157871 Drosophila P glycoprotein 76 24

melanogaster

1876 ABB17955 Homo sapiens HUMA- Human nervous system 186 40
related

polypeptide SEQ ID NO
6612.

1876 AAS17764_aa1 Homo sapiens GENA- Human Genomic DNA for CRYBB1. 167 39

1876 AAO02331 Homo sapiens HYSE- Human polypeptide 165 42
SEQ ID

NO 16223.

1877 gi|59977|emb|CAA78662.1 Human endogenous retrovirus tripartite fusion transcript PLA2L 224 76

1878 ABB84943 Homo sapiens GETH Human PRO1556 protein 1056 93

sequence SEQ ID NO:254.

1878 AAB31670 Homo SapiensPROT- Amino acid sequence1056 93
of a

human protein having
a hydrophobic

domain.

1878 AAB47295 Homo sapiens GETH PRO1556 polypeptide. 1056 93

1879 ABB15861 Homo SapiensHUMA- Human nervous system73 36
related

polypeptide SEQ ID NO
4518.

1880 AAU83117 Homo sapiensZYMO Novel secreted protein66 54

Z799543G2P.

1880 g112723186Lactococcusouter membrane lipoprotein66 26
precursor

lactis subsp.

lactis

1881 1609624 Vibrio choleraeE SC 73 29

1882 gi12667456 Rattus synaptotagmin VIId 86 32

norvegicus

1882 g112667454Rattus synaptotagmin VIII 85 33

norvegicus

1882 g1334072 PseudorabiesORF-3 protein 83 35

virus

1883 g11747 Oryctolagustrichohyalin 119 29

cuniculus

1883 gi2072290 Xenopus laevis XL-INCENP 100 27

1883 gi12584554 Human coxsackievirus B3 polyprotein 96 25

1884 gi|15601413|ref|NP_233044.1| Vibrio cholerae sucrose-6-phosphate dehydrogenase 65 55

1885 gi16878287 Homo sapiens Similar to C-terminal modulator protein 74 35

1885 gi15866714 Homo sapiens C-terminal modulator protein 74 35

1885 AAO06984 Homo sapiens HYSE- Human polypeptide 70 60
SEQ ID

NO 20876.

1887 AAW25939 Homo SapiensCNRS T-cell receptor 601 99
V-beta-5.1

peptide fragment.

1887 gi36973 Homo sapiens T-cell receptor beta-chain 601 99

1887 gi1552498 Homo sapiens V_segment translation product 600 100

1888 gi18874468Homo Sapienspartitioning-defective 198 73
3-like protein

splice variant c

1888 gi16903870Homo sapienspartitioning-defective 198 73
3-like protein

splice variant b

1888 gi16903868Homo Sapienspartitioning-defective 198 73
3-like protein

splice variant a

1889 gi21489377 Homo sapiens MAPA protein 1620 99

1889 gi21489330Bos taurus MAPA protein 833 56

1889 gi21489379Mus musculusMAPA protein 630 48

1890 AAY10874 Homo SapiensHUMA- Amino acid sequence503 100
of a

human secreted protein.

1890 gi17429674Ralstonia PROBABLE LIPOPROTEIN 73 44

solanacearum

1891 gi15723141 Homo sapiens c349E10.1.1 (novel protein, isoform 1) 180 46

1891 AAB59006 Homo SapiensHUMA- Breast and ovarian174 47
cancer

associated antigen protein
sequence

SEQ ID 714.

1891 gi19353342 Mus musculus RIKEN cDNA 9530058802 162 47
gene

1892 AAM86086 Homo SapiensHUMA- Human 95 53

immune/haematopoietic
antigen SEQ

ID NO:13679.

1892 AAO05973 Homo sapiens HYSE- Human polypeptide 94 82
SEQ ID

NO 19865.

1892 AAO09418 Homo sapiens HYSE- Human polypeptide 91 70
SEQ ID

NO 23310.

1893 gi8778607ArabidopsisFSM15.23 71 25

thaliana

1894 AAM65951 Homo SapiensMOLE- Human bone marrow 69 38

expressed probe encoded
protein SEQ

ID NO: 26257.

1894 AAM53568 Homo sapiensMOLE- Human brain expressed69 38
single

exon probe encoded protein
SEQ ID

NO: 25673.

1894 gi|20832567|ref|XP_133524.1| Mus musculus similar to Heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) (D10S102) 163 76

1895 AAM66299 Homo sapiensMOLE- Human bone marrow 440 83

expressed probe encoded
protein SEQ

ID NO: 26605.

1895 AAM53913 Homo SapiensMOLE- Human brain expressed440 83
single

exon probe encoded protein
SEQ ID

NO: 26018.

1895 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 270 45

1896 gi4883988Bartonella cell division protein 68 28
FtsZ

clarridgeiae

1897 AAO13209 Homo sapiens HYSE- Human polypeptide 142 54
SEQ ID

NO 27101.

1897 AAM66708 Homo sapiensMOLE- Human bone marrow 124 46

expressed probe encoded
protein SEQ

ID NO: 27014.

1897 AAM54310 Homo SapiensMOLE- Human brain expressed124 46
single

exon probe encoded protein
SEQ ID

NO: 26415.

1898 gi2565268Drosophila pore-forming protein 75 27
MIP family

virilis

1898 gi7453547Homo Sapiensglioma tumor suppressor 75 31
candidate

region protein 1

1898 gi3218331Metarhiziumnitrogen response regulator74 26

anisopliae

1899 19656609 Vibrio choleraechemotaxis protein CheA 73 32

1899 gi|20908537|ref|XP_127414.1 Mus musculus RIKEN cDNA 1700001L19 443 80

1899 gi|15642063|ref|NP_231695.1 Vibrio cholerae chemotaxis protein CheA 73 32

1900 gi|18586105|ref|XP_091400.1| Homo sapiens similar to sca1 203 84

1900 gi|20888279|ref|XP_146508.1 Mus musculus similar to spinocerebellar ataxia type 1 199 82

1901 gi338033 Homo sapiens serum protein 90 32

1901 g14808221Homo SapiensdJ1177I5.2 (serum constituent90 32
protein

MSE55)

1901 g14098993Mus musculuspolyhomeotic 2 88 30

1902 AAB19933 Homo sapiens INCY- Human oxidoreductase 250 100
OXRD-

8.

1902 gi19713043 Fusobacterium iron/zinc/copper-binding 73 22
protein

nucleatum
subsp.

nucleatum

ATCC 25586

1902 gi|20342079|ref|XP_110614.1 Mus musculus RIKEN cDNA 1700003E16 77 25

1903 g1342279 Macaca opiomelanocortin 231 49

nemestrina

1903 128342 Homo sapiens proopiomelanocortin 230 49

1903 gi190183 Homo sapiens opiomelanocortin 230 49

1904 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 180 53

1905 g15360984Homo SapiensdJ228HI3.1 (similar to 152 72
Ribosomal

protein L21 e)

1905 AAB44126 Homo SapiensHUMA- Human cancer associated150 83

protein sequence SEQ
ID NO:1571.

1905 gi550015 Homo sapiens ribosomal protein L21 150 83

1906 gi2654610 Pseudomonas aeruginosa arginine/ornithine succinyltransferase AI subunit 79 25

1906 gi17226812Botryotiniahistidine kinase 72 33

fuckeliana

1906 gi16904238Botryotiniatwo-component osmosensing72 33
histidine

fuckeliana kinase BOS1

1908 gi330359 Human nuclear antigen precursor91 37

herpesvirus
4

1908 gi1632793Human EBNA3C (EBNA 4B) latent 91 37
protein

herpesvirus
4

1908 11184677 Candida albicans hyphal wall protein 1 90 38

1909 g113177635Rattus phospholipase C beta-3 72 26

norvegicus

1909 gi1150880 Mus musculus phospholipase C beta3 71 26

1909 g117105044Simian 10.1 kDa 71 31

adenovirus
25

1910 gi9857054 Leishmania major possible CG7055 protein 71 47

1910 gi1617560 Leishmania major LCFACASS; L5701.2 67 33

1910 gi|9857054|emb|CAC04011.1 Leishmania major possible CG7055 protein 71 47

1911 AAY87278 Homo SapiensINCY- Human signal peptide501 82

containing protein HSPP-55
SEQ ID

NO:55.

1911 AAB18912 Homo sapiens GETH A novel polypeptide 501 82
designated

PR01889.

1911 AAU27659 Homo sapiens ZYMO Human protein AFP513481. 416 77

1912 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 434 80

1912 gi|18676710|dbj|BAB85007.1 Homo sapiens FLJ00254 protein 270 64

1913 g15713196Caenorhabditisliprin-alpha homolog 479 38
SYD-2

elegans

1913 1930343 Homo SapiensLAR-interacting protein 467 39
1b

1913 g1930341 Homo SapiensLAR-interacting protein 467 39
1a

1914 g16651021Mus musculussemaphorin cytoplasmic 274 63
domain-

associated protein 3B

1914 g16651019Mus musculussemaphorin cytoplasmic 274 63
domain-

associated protein 3A

1914 AAM25720 Homo SapiensHYSE- Human protein sequence266 61
SEQ

ID NO:1235.

1915 g1902214 Zea mays RNA polymerase beta' 72 24
subunit-2

1915 g112482 Zea mays RNA polymerase beta-2 72 24
subunit (AA

1-1527)

1915 gi|11467184|ref|NP_043017.1 Zea mays RNA polymerase beta' subunit-2 72 24

1916 gi1655432 Mus musculus plexin 2 1135 58

1916 AAM93435 Homo SapiensHELI- Human polypeptide,1132 57
SEQ ID

NO: 3070.

1916 gi961515 Xenopus laevis plexin 1126 54

1917 g115559064Mus musculusSNAG1 86 38

1917 gi|20863586|ref|XP_141581.1 Mus musculus similar to dJ551D2.5 (novel protein) 88 30

1917 gi|18644890|ref|NP_570614.1 Mus musculus sorting nexin associated golgi protein 1 86 38

1918 g119528383Drosophila RE04404p 67 32

melanogaster

1919 AAM77461Homo SapiensMOLE- Human bone marrow 189 79

expressed probe encoded
protein SEQ

ID NO: 37767.

1919 AAM64684Homo sapiensMOLE- Human brain expressed189 79
single

exon probe encoded protein
SEQ ID

NO: 36789.

1919 gi|17477135|ref|XP_063415.1 Homo sapiens similar to embryonal stem cell specific gene 1 263 75

1920 gi2623757 Rattus neurabin 172 97

norvegicus

1920 12827450 Gallus gallus KS5 protein 154 88

1920 113991829 Xenopus laevis neurabin 145 83

1923 g15532302Heterocapsa PSII CP47 apoprotein 75 29

triquetra

1923 g11881335Bacillus SIMILAR TO YQFU, YXKD, 68 38
subtilis YITB

OF B. SUBTILIS.

1923 gi|5532302|gb|AAD44701.1| Heterocapsa triquetra PSII CP47 apoprotein 75 29

1924 g16855429Leishmania possible mucin 1 precursor77 33

major

1924 g15832816Caenorhabditiscontains similarity to 74 34
Pfam domain:

elegans PF01694 (Rhomboid family),

Score=61.7, E-value=5.1e-15,
N=1

1924 AAB51976Homo SapiensHUMA- Human secreted 72 38
protein

sequence encoded by gene
48 SEQ ID

NO:108.

1925 AAB51635Homo SapiensROSE/ Human secreted 205 31
protein

sequence encoded by gene
16 SEQ ID

NO:75.

1925 AAB47128 Homo sapiens INCY- CDIFF-6, Incyte 199 34
ID No.

2009435CD1.

1925 ABB55766Homo SapiensFECH/ Human polypeptide 197 38
SEQ ID

NO 138.

1926 AAG89279Homo SapiensGEST Human secreted protein,330 44
SEQ ID

NO: 399.

1926 AAB70690Homo SapiensSREN- Human hDPP protein319 44
sequence

SEQ ID NO:7.

1926 gi13182757 Homo sapiens HTPAP 319 44

1927 g113177290Ectocarpus EsV-1-8 69 36

siliculosus
virus

1928 g118700171Arabidopsis AT5g20480/F7C8 70 86 39

thaliana

1928 g1915207Sus scrofa gastric mucin 83 29

1928 gi532113Caenorhabditishomeotic region most 79 27
like

elegans HMPB_DROME: homeotic

proboscipedia protein

1929 ABB12295 Homo sapiens HYSE- Human secreted 135 59
protein

homologue, SEQ ID NO:2665.

1929 AAG04080Homo SapiensGEST Human secreted 78 38
protein, SEQ ID

NO: 8161.

1929 gi9279807Drosophila cortactin 77 27

melanogaster

1930 AAV81204_aa1 Homo sapiens GEHO Human CD7 cDNA. 872 73

1930 AAB36657Homo SapiensIMMV Human CD7 protein 872 73
sequence

SEQ ID NO:2.

1930 AAU02438Homo SapiensGEHO Human lymphocyte 872 73
cell surface

antigen CD7 polypeptide.

1931 gi2636248Bacillus similar to transaldolase73 29
subtilis (pentose

phosphate)

1931 gi|21398633|ref|NP_654618.1 Bacillus anthracis A2012 Transaldolase, Transaldolase [Bacillus 74 29

1931 gi|16080764|ref|NP_391592.1 Bacillus subtilis similar to transaldolase (pentose phosphate) 73 29

1932 AAB43545Homo SapiensHUMA- Human cancer associated73 46

protein sequence SEQ
ID NO:990.

1932 AAM40234Homo SapiensHYSE- Human polypeptide71 26
SEQ ID

NO 3379.

1934 gi3129962Gallus gallusB locus Lectin like 82 30
Natural Killer cell

surface protein

1934 AAB93791Homo SapiensHELI- Human protein 77 38
sequence SEQ

ID NO:13545.

1934 gi2541864Drosophila DAD polypeptide 77 32

melanogaster

1935 gi|4959869|gb|AAD34536.1| Murine leukemia virus polymerase 335 52

1935 gi|6524624|gb|AAF15098.1| Phascolarctos cinereus pol protein 331 52

1935 gi|9630313|ref|NP_056790.1 Gibbon ape leukemia virus pol polyprotein 328 52

1936 gi6562332Arabidopsis diaminopimelate decarboxylase86 30

thaliana

1936 gi7573355Arabidopsis diaminopimelate decarboxylase-like86 30

thaliana protein

1936 gi15146250Arabidopsis ATSg11880/F14F18 50 86 30

thaliana

1939 AAU07442 Homo sapiens GETH Human Wnt1 Upregulated 300 100

protein 2 (WUP2).

1939 AAU07441 Homo sapiens GETH Human Wnt1 Upregulated 300 100

protein 1 (WUP1).

1939 AAB56802 Homo sapiens ROSE/ Human prostate 300 100
cancer antigen

protein sequence SEQ
ID NO:1380.

1940 15802814 Homo sapiens Gag-Pro-Pol-Env protein 587 57

1940 gi4185939 Human endogenous retrovirus K pol protein 586 57

1940 15802821 Homo sapiens Gag-Pro-Pol protein 586 57

1941 AAU83088 Homo sapiensZYMO Novel secreted protein586 100

Z2812G3P.

1941 AAB20275 Homo sapiens SCHE Human interleukin DNAX 80. 535 76

1941 AAB20277 Homo SapiensSCHE Human interleukin 529 76
DNAX 80

variant.

1942 AAM06866 Homo SapiensHYSE- Human foetal protein,994 100
SEQ ID

NO: 1074.

1942 gi17426446 Homo sapiens bA351K23.5 (novel protein) 933 54

1942 115099951Mus musculusdiacylglycerol acyltransferase915 55
2

1943 AAM06596 Homo sapiensHYSE- Human foetal protein,406 98
SEQ ID

NO: 327.

1943 gi|15640499|ref|NP_230126.1| Vibrio cholerae S-adenosylmethionine synthase 67 51

1945 AAG75561 Homo SapiensHUMA- Human colon cancer327 100
antigen

protein SEQ ID NO:6325.

1945 gi16416764 Homo sapiens FKSG16 327 100

1945 g113905212Mus musculusRIKEN cDNA 1200006F02 261 79
gene

1946 g1288174 Mus musculusOct2b 97 85

1946 g153490 Mus musculusOct2.5 transcription 97 85
factor

1946 gi9937478 Drosophila thyroid hormone receptor-associated 72 39

melanogaster protein TRAP 170

1947 AAM66980 Homo SapiensMOLE- Human bone marrow 170 69

expressed probe encoded
protein SEQ

ID NO: 27286.

1947 AAM54574 Homo sapiens MOLE- Human brain expressed 170 69
single

exon probe encoded protein
SEQ ID

NO: 26679.

1947 AAM75189 Homo SapiensMOLE- Human bone marrow 159 86

expressed probe encoded
protein SEQ

ID NO: 35495.

1948 AAY10874 Homo SapiensHUMA- Amino acid sequence100 100
of a

human secreted protein.

1949 AAA27155_aa1 Homo sapiens GENE- Human P2 DNA. 100 100

1949 AAY94475 Homo SapiensGENE- Predicted translation100 100
product of

human P2 splice isoform,
P2-B.

1949 AAY94474 Homo SapiensGENE- Human P2 protein. 100 100

1950 19502082 Homo sapienstubby super-family protein80 40

1950 19502080 Mus musculustubby super-family protein77 41

1950 18118432 Oryza sativa beta-expansin 73 35

1951 gi4808994 walleye epidermal hyperplasia virus type 1 envelope polyprotein 69 46

1951 gi|15642893|ref|NP_227934.1 Thermotoga maritima ribonucleotide reductase, B12-dependent 66 46

1952 AAB80264 Homo sapiens GETH Human PRO332 protein. 577 61

1952 AAB33425 Homo sapiens GETH Human PRO332 protein 577 61

UNQ293 SEQ ID NO:57.

1952 AAY13396 Homo sapiens GETH Amino acid sequence 577 61
of protein

PRO332.

1953 gi16648392 Drosophila LD39243p 449 61

melanogaster

1953 AAG73684 Homo SapiensHUMA- Human colon cancer371 55
antigen

protein SEQ ID NO:4448.

1953 AAY48312 Homo SapiensMETA- Human prostate 371 55
cancer-

associated protein 9.

1954 AAU84348 Homo sapiens BARK/ Protein MMP2 differentially 2068 94

expressed in breast cancer
tissue.

1954 ABB90738 Homo sapiens UYJO Human Tumour Endothelial 2068 94

Marker polypeptide SEQ
ID NO 208.

1954 AAB84607 Homo sapiens PFIZ Amino acid sequence 2068 94
of matrix

metalloproteinase gelatinase
A.

1955 gi16769680Drosophila LD46678p 245 35

melanogaster

1955 AAM66797 Homo SapiensMOLE- Human bone marrow 148 80

expressed probe encoded
protein SEQ

ID NO: 27103.

1955 AAM54396 Homo SapiensMOLE- Human brain expressed148 80
single

exon probe encoded protein
SEQ ID

NO: 26501.

1957 AAB80242 Homo sapiens GETH Human PRO236 protein. 648 97

1957 AAM93378 Homo sapiens HELI- Human polypeptide, 648 97
SEQ ID

NO: 2955.

1957 AAB12157 Homo sapiens PROT- Hydrophobic domain 648 97
protein

from clone HP03165 isolated
from KB

cells.

1958 AAM41696 Homo SapiensHYSE- Human polypeptide 234 47
SEQ ID

NO 6627.

1958 AAU17119 Homo SapiensHUMA- Novel signal transduction229 46

pathway protein, Seq
ID 684.

1958 gi16741621Homo SapiensSimilar to RAB37, member228 47
of RAS

oncogene family

1959 gi18025526 cercopithecine herpesvirus 15 LF3 140 30

1959 gi3153821Mus musculusplenty-of prolines-101; 137 25
POP101; SH3-

philo-protein

1959 gi39255 Actinomycessialidase 129 28

viscosus

1960 ABB12366 Homo sapiens HYSE- Human bone marrow 400 90
expressed

protein SEQ ID NO: 120.

1960 AAO12936 Homo sapiens HYSE- Human polypeptide 115 95
SEQ ID

NO 26828.

1960 AAM84898 Homo SapiensHUMA- Human 113 82

immune/haematopoietic
antigen SEQ

ID NO:12491.

1961 gi19110438 Homo sapiens polycystin-1L1 190 94

1961 gi3115393 Rana pipiens guanylate cyclase inhibitory protein 80 35

1961 gi3462887 Rattus alpha-fodrin 68 31

norvegicus

1962 AAU83130 Homo sapiens ZYMO Novel secreted protein Z835892G6P. 1076 100

1962 11890354 Brassica napus L-ascorbate peroxidase 80 33

1962 gi7529611 Leishmania major hypothetical protein L787.06 79 31

1963 AAG78679 Homo sapiens BODE- Human thrombotic protein 46. 467 86

1963 AAY87347 Homo SapiensINCY- Human signal peptide467 86

containing protein HSPP-124
SEQ ID

NO:124.

1963 AAB01431 Homo sapiens MILL- Human TANGO 224 (form 2). 467 86

1964 g13413504Rattus Bassoon 81 26

norvegicus

1964 gi330452 human herpesvirus 5 DNA polymerase 79 28

1964 AAV69717_aa1 Homo sapiens LUDW- Tumour rejection antigen precursor MAGE-C1 cDNA. 73 33

1965 gi|2323287|gb|AAB66528.1| multiple sclerosis associated retrovirus polyprotein 286 64

1965 gi|2351212|dbj|BAA22064.1| Friend murine leukemia virus gag-pol polyprotein (precursor protein) 179 47

1965 gi|9629516|ref|NP_044738.1 Rauscher murine leukemia virus Pol 179 47

1966 gi|2323287|gb|AAB66528.1| multiple sclerosis associated retrovirus polyprotein 476 65

1966 gi|2281588|gb|AAB64160.1| synthetic construct Pol 323 51

1966 gi|9626961|ref|NP_057933.1 Murine leukemia virus Pr180 323 51

1967 12065210 Mus musculus Pro-Pol-dUTPase polyprotein 518 73

1967 AAM65715 Homo SapiensMOLE- Human bone marrow 464 69

expressed probe encoded
protein SEQ

ID NO: 26021.

1967 AAM53338 Homo SapiensMOLE- Human brain expressed464 69
single

exon probe encoded protein
SEQ ID

NO: 25443.

1968 AAG78149 Homo SapiensBODE- Human polypeptide-388 82

cytochrome b5-13.

1968 gi3150438 Human endogenous retrovirus K pol-env 345 55

1968 gi1469243 Human endogenous retrovirus K pol/env 345 55

1969 gi21113108 Xanthomonas campestris pv. campestris str. ATCC 33913 TonB-dependent receptor 78 31

1969 gi476274 Homo SapiensR kappa B 77 23

1969 gi4206769Acanthamoebamyosin I heavy chain 76 27
kinase

castellanii

1970 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 244 77

1970 gi|8272468|gb|AAF74215.1|AF156963_1 Homo sapiens envelope protein 219 81

1970 gi|21103962|gb|AAM33141.1 Homo sapiens enverin-2 219 77

1971 AAU83621 Homo SapiensGETH Human PRO protein, 320 100
Seq ID No

60.

1971 AAO05826 Homo sapiens HYSE- Human polypeptide 295 93
SEQ ID

NO 19718.

1971 AAM39560 Homo SapiensHYSE- Human polypeptide 194 56
SEQ ID

NO 2705.

1972 gi6456112 Mus musculus F-box protein FBX15 128 44

1972 gi21428946 Drosophila melanogaster GH22104p 74 31

1972 gi|6456112|gb|AAF09139.1| Mus musculus F-box protein FBX15 128 44

1973 gi148270 Escherichia coli lambda-integrase 550 94

1973 gi1790244 Escherichia coli K12 site-specific recombinase, acts on cer sequence of ColE1, effects chromosome segregation at cell division 550 94

1973 gi13364217 Escherichia coli O157:H7 site-specific recombinase XerC 544 92

1974 gi1805552 Escherichia coli FORMATE HYDROGENLYASE TRANSCRIPTIONAL ACTIVATOR. 887 88

1974 gi1616960 Escherichia coli HyfR 887 88

1974 gi7920396 Salmonella typhimurium formate hydrogenlyase activator protein 522 54

1975 gi409795 Escherichia coli No definition line found 1175 99

1975 gi15074592 Sinorhizobium meliloti HYPOTHETICAL TRANSMEMBRANE PROTEIN 378 33

1975 gi17740718 Agrobacterium tumefaciens str. C58 (U. Washington) Na+/Pi-cotransporter 372 34

1976 AAB82047 Homo sapiens IGAK- Human mast cell surface antigen. 163 23

1976 gi12654783 Homo sapiens Similar to loss of heterozygosity, 11, chromosomal region 2, gene A 163 23

1976 AAZ45690_aa1 Homo sapiens REGC cDNA sequence encoding the human minor vault protein 193. 108 25

1977 ABB56523 Homo sapiens MERI Human NMDA receptor subunit SEQ ID NO 44. 73 28

1977 AAW87504 Homo sapiens SIBI- Human N-methyl-D-aspartate receptor subunit encoded by clone NMDA24. 73 28

1978 AAG00471 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4552. 285 93

1978 gi298489 Papio hamadryas SP-10 133 34

1978 gi452582 Vulpes vulpes fox sperm acrosomal protein FSA-Acr.1 132 34

1979 AAB87128 Homo sapiens MILL- Human secreted protein TANGO 349, SEQ ID NO:130. 490 86

1979 AAB87179 Homo sapiens MILL- Human secreted protein TANGO 349 I21K variant, SEQ ID NO:227. 488 85

1979 AAB87181 Homo sapiens MILL- Human secreted protein TANGO 349 E41D variant, SEQ ID NO:231. 487 85

1982 AAM75035 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35341. 109 67

1982 AAM62231 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34336. 109 67

1982 gi11967423 Mus musculus vomeronasal receptor V1RC5 105 76

1983 AAG89276 Homo sapiens GEST Human secreted protein, SEQ ID NO: 396. 224 46

1983 AAB56565 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1143. 99 40

1983 AAY44987 Homo sapiens INCY- Human epidermal protein-4. 78 28

1984 AAB95089 Homo sapiens HELI- Human protein sequence SEQ ID NO:17025. 498 97

1984 AAM06608 Homo sapiens HYSE- Human foetal protein, SEQ ID NO: 339. 495 96

1984 gi497890 unidentified nitrogen-fixing bacteria alpha subunit of dinitrogenase reductase (Fe protein) 73 24

1985 gi|17455728|ref|XP_063594.1| Homo sapiens similar to Zinc-finger protein ubi-d4 (Requiem) (Apoptosis response zinc finger protein) 71 37

1986 gi21428886 Drosophila melanogaster GH12469p 69 34

1987 gi7767529 Bos taurus cyclophilin I 364 75

1987 gi8699209 Canis familiaris cyclophilin A 361 88

1987 gi11641132 Sus scrofa cyclophilin 361 88

1988 gi15073168 Sinorhizobium meliloti PROBABLE TRANSLATION INITIATION FACTOR IF-2 PROTEIN 81 37

1988 gi1181352 Paramecium bursaria Chlorella virus 1 Pro-rich protein; PIPG (8X) 78 25

1988 gi1493242 Feline herpesvirus 1 Feline herpesvirus type 1 immediate early protein 77 20

1989 AAM65707 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26013. 134 66

1989 AAM53330 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25435. 134 66

1989 gi|20475216|ref|XP_114802.1| Homo sapiens similar to synapsin I 228 59

1990 AAM71181 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31487. 110 64

1990 AAM58674 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30779. 110 64

1990 gi21323636 Corynebacterium glutamicum ATCC 13032 Sulfate permease and related transporters (MFS superfamily) 75 26

1991 gi1932813 Xenopus laevis dsRNA adenosine deaminase 96 34

1991 AAE10203 Homo sapiens HYSE- Human bone marrow derived contig protein, SEQ ID NO: 68. 83 25

1991 gi3242649 Rana catesbeiana alpha 1 type I collagen 80 30

1992 gi1181423 Paramecium bursaria Chlorella virus 1 PBCV-1 chitinase 71 41

1992 gi|21300897|gb|EAA13042.1| Anopheles gambiae str. PEST agCP14405 72 37

1992 gi|9631828|ref|NP_048613.1 Paramecium bursaria Chlorella virus 1 PBCV-1 chitinase 71 41

1994 gi8248755 Plasmodium falciparum 3D7 protein phosphatase 72 25

1994 gi4104348 Campylobacter rectus S-layer-RTX protein 70 38

1994 gi|8248755|emb|CAB62878.2 Plasmodium falciparum 3D7 protein phosphatase 72 25

1995 gi21324402 Corynebacterium glutamicum ATCC 13032 Uncharacterized ATPase related to the helicase subunit of the Holliday junction resolvase 73 38

1995 gi|19552845|ref|NP_600847.1 Corynebacterium glutamicum COG2256:Uncharacterized ATPase related to the helicase subunit of the Holliday junction resolvase 73 38

1995 gi|17533213|ref|NP_495777.1| Caenorhabditis elegans F14ES.S.p 73 30

1996 gi1871223 Rickettsia typhi crystalline surface layer protein 92 30

1996 gi6969926 Rickettsia aeschlimannii OmpB 79 25

1996 gi14670347 Rickettsia felis OmpB 78 25

1997 gi|20548733|ref|XP_055641.2 Homo sapiens similar to gag protein 256 58

1997 gi|9739120|gb|AAF97916.1 Bovine leukemia virus gag 186 34

1997 gi|9626226|ref|NP_056897.1 Bovine leukemia virus Pr44 185 34

1998 AAM79834 Homo sapiens HYSE- Human protein SEQ ID NO 3480. 279 71

1998 AAM78850 Homo sapiens HYSE- Human protein SEQ ID NO 1512. 279 71

1998 AAM79204 Homo sapiens HYSE- Human protein SEQ ID NO 1866. 272 71

1999 AAM73176 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33482. 168 48

1999 AAM60521 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32626. 168 48

1999 gi|13929148|ref|NP_113997.1| Rattus norvegicus cyclic nucleotide-gated channel beta subunit 1 163 47

2000 gi1869859 human herpesvirus 2 very large tegument protein 73 30

2000 gi7380253 Neisseria meningitidis Z2491 2-keto-4-hydroxyglutarate aldolase 70 37

2000 gi7226633 Neisseria meningitidis MC58 4-hydroxy-2-oxoglutarate aldolase/2-dehydro-3-deoxyphosphogluconate aldolase 70 37

2001 gi17016969 Mus musculus NUANCE 138 36

2001 gi6273778 Homo sapiens trabeculin-alpha 137 33

2001 gi1675222 Mus musculus ACF7 neural isoform 1 136 42

2002 AAM39256 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2401. 81 29

2002 gi840789 Homo sapiens binding regulatory factor 81 29

2002 gi17028337 Homo sapiens regulatory factor X, 5 (influences HLA class II expression) 81 29

2003 gi2252814 Mus musculus FOG 172 64

2003 AAR58815 Homo sapiens USSH Human c-myc far upstream element (FUSE) binding protein (FBP) variant from HL60 clone 3-1. 103 42

2003 gi3598974 Rattus norvegicus protein tyrosine phosphatase TD14 103 26

2004 gi11994696 Arabidopsis thaliana contains similarity to DNA repair protein gene_id:K7M2.11 77 28

2004 gi7209527 Mus musculus testis-specific gene 73 24

2004 gi|17451912|ref|XP_071083.1 Homo sapiens similar to DNA-binding protein B 234 97

2005 AAE12023 Homo sapiens INCY- Human G-protein coupled receptor, GCREC-2. 173 100

2005 AAG65832 Homo sapiens FARB Human G protein-coupled receptor (GPCR). 173 100

2005 AAG68126 Homo sapiens FARB Human 7TM-GPCR protein sequence SEQ ID NO:6. 105 78

2006 gi20068811 Homo sapiens Rab-coupling protein 130 43

2006 gi15822596 Homo sapiens nRip11 104 45

2006 gi13377897 Homo sapiens Rab11 interacting protein Rip11a 83 40

2007 gi|17539708|ref|NP_501489.1 Caenorhabditis elegans FO8B4.S.p 78 42

2008 AAE10350 Homo sapiens PFIZ Human ADAMTS-J1.4 variant protein. 504 97

2008 AAE10349 Homo sapiens PFIZ Human ADAMTS-J1.3 variant protein. 504 97

2008 AAE10347 Homo sapiens PFIZ Human ADAMTS-J1.1 variant protein. 504 97

2009 AAV31720_aa1 Homo sapiens MOUN Nucleotide sequence of the PUR-alpha gene. 87 29

2009 AAT99264_aa1 Homo sapiens MOUN Human PUR-alpha gene. 87 29

2009 AAQ44800_aa1 Homo sapiens MOUN Encodes single-stranded DNA binding (PUR) protein. 87 29

2010 gi170444 Lycopersicon esculentum extensin (class II) 123 27

2010 gi4662641 Arabidopsis thaliana expressed protein 116 30

2010 gi188864 Homo sapiens mucin 115 28

2011 AAY93650 Homo sapiens HUMA- Amino acid sequence of a human prostacyclin-stimulating factor-2. 1677 100

2011 AAS15723_aa1 Homo sapiens CURA- DNA encoding insulin-like growth factor family related protein, NOV3. 1673 99

2011 AAE17599 Homo sapiens INCY- Human extracellular messenger (XMES)-1 protein. 1673 99

2012 gi10440434 Homo sapiens FLJ00052 protein 336 69

2012 gi20502870 Mus musculus SDS3 333 68

2012 gi21430678 Drosophila melanogaster RE74901p 170 36

2013 AAH77293_aa1 Homo sapiens MILL- Human ion channel protein IC32391 cDNA coding region. 214 93

2013 AAE13278 Homo sapiens INCY- Human transporters and ion channels (TRICH)-5. 214 93

2013 AAG77969 Homo sapiens MILL- Human ion channel protein IC32391. 214 93

2014 gi4894768 Xenopus laevis ephrin-B2 precursor 78 30

2015 AAU77498 Homo sapiens INCY- Human lipid metabolism enzyme, LMM-6. 1291 100

2015 ABB08205 Homo sapiens INCY- Human lipid metabolism enzyme-5 (LME-5). 1122 100

2015 ABB07493 Homo sapiens INCY- Human lipid metabolism molecule (LMM) polypeptide (ID: 2965233CD1). 864 75

2016 gi|14769015|ref|XP_041569.1| Homo sapiens fibrillin3 68 36

2017 gi2313786 Helicobacter pylori 26695 chorismate synthase (aroC) 78 33

2017 gi4155160 Helicobacter pylori J99 CHORISMATE SYNTHASE 72 32

2017 gi|15645287|ref|NP_207457.1 Helicobacter pylori 26695 chorismate synthase (aroC) 78 33

2018 gi15485622 Homo sapiens Q9H4T4 like 1068 100

2018 ABB14744 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3401. 694 98

2018 AAB95100 Homo sapiens HELI- Human protein sequence SEQ ID NO:17064. 101 24

2019 gi8050556 Gorilla gorilla carboxyl-ester lipase 223 42

2019 AAU09894 Homo sapiens MONS Bile Salt Stimulated Lipase (BSSL). 217 39

2019 ABB04676 Homo sapiens MONS Human milk bile salt-stimulated lipase (BSSL) protein SEQ ID NO:2. 217 39

2020 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 515 74

2020 gi|385615|gb|AAB26708.1| Mus sp. fibulin gene homolog 300 75

2020 gi|13194728|gb|AAK15526.1|AF329451_1 Gallus gallus pol-like protein ENS-3 170 33

2021 AAM66980 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27286. 170 75

2021 AAM54574 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26679. 170 75

2021 AAM75189 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35495. 159 86

2022 AAD29146_aa1 Homo sapiens ZYMO Human Zcyto21 consensus cDNA. 649 83

2022 AAU83208 Homo sapiens ZYMO Novel secreted protein Z908463G2P. 649 83

2022 AAE18311 Homo sapiens ZYMO Human Zcyto21 consensus protein. 649 83

2024 gi14336750 Homo sapiens Ce protein similar to Dm Cys3His finger protein 84 34

2024 AAB50363 Homo sapiens UYSL- Human SRCAP. 83 34

2024 AAB95541 Homo sapiens HELI- Human protein sequence SEQ ID NO:18149. 83 34

2025 gi18676682 Homo sapiens FLJ00240 protein 470 45

2025 gi14701866 Dictyostelium discoideum carmil 221 29

2025 gi1881738 Acanthamoeba castellanii myosin-I binding protein Acan125 219 29

2026 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 212 78

2027 AAU83147 Homo sapiens ZYMO Novel secreted protein Z846363G2P. 1153 100

2027 gi|21287755|gb|EAA00076.1| Anopheles gambiae str. PEST ebiP4780 205 51

2027 gi|17552028|ref|NP_498407.1| Caenorhabditis elegans COSD11.8.p 91 38

2028 gi1510143 Homo sapiens similar to C.elegans protein encoded in cosmid T20D3 (Z68220). 323 57

2028 gi3879942 Caenorhabditis elegans T20D3.11 124 27

2028 gi5869818 Globodera pallida NADH-ubiquinone oxidoreductase subunit 6 82 27

2029 AAE13288 Homo sapiens INCY- Human transporters and ion channels (TRICH)-15. 75 31

2029 gi3252893 Thermotoga neapolitana ABC transporter 74 37

2029 gi|18403965|ref|NP_565826.1 Arabidopsis thaliana expressed protein 70 29

2030 AAB97908 Homo sapiens SHAN- Human GTP-binding protein 17 SEQ ID NO:2. 79 27

2030 AAM42129 Homo sapiens HYSE- Human polypeptide SEQ ID NO 7060. 79 27

2030 gi9971156 Mus musculus GTP-binding like protein 2 79 27

2031 gi|20864803|ref|XP_130800.1| Mus musculus RIKEN cDNA 4930503K02 89 25

2031 gi|21262152|emb|CAD32690.1 Oryza sativa SMC4 protein 77 28

2031 gi|1507705|gb|AAB06568.1| Borrelia burgdorferi outer surface protein 74 33

2032 AAG65898 Homo sapiens SMIK Amino acid sequence of GSK gene Id 18525. 481 100

2032 AAU83670 Homo sapiens GETH Human PRO protein, Seq ID No 158. 471 97

2032 ABB84896 Homo sapiens GETH Human PRO1309 protein sequence SEQ ID NO:160. 471 97

2034 gi6723273 Baboon endogenous virus strain M7 gag-pol precursor polyprotein 687 43

2034 gi18448744 Moloney murine leukemia virus Pr180 gag-pro-pol polyprotein 685 42

2034 gi2801471 Moloney murine leukemia virus Pr180 682 42

2035 gi|17554696|ref|NP_497670.1 Caenorhabditis elegans R148.7.p 68 32

2035 gi|16127996|ref|NP_414543.1| Escherichia coli K12 aspartokinase I, homoserine dehydrogenase I 68 43

2035 gi|19548975|gb|AAL90885.1|AF487900_1 Escherichia coli aspartokinase I-homoserine dehydrogenase I 68 43

2036 gi13424459 Caulobacter crescentus CB15 methyl-accepting chemotaxis protein McpI 72 32

2036 gi|16877133|gb|AAH16838.1|AAH16838 Homo sapiens carboxypeptidase, vitellogenic-like 69 30

2037 AAB67055 Homo sapiens INCY- Human immune response molecule (IMUN) protein SEQ ID NO: 9. 532 75

2037 AA001862 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15754. 403 67

2037 gi|6753924|ref|NP_034374.1 Mus musculus Friend virus susceptibility 1 240 39

2039 AAB38447 Homo sapiens HUMA- Fragment of human secreted protein encoded by gene 20 clone HLTFBY 15. 80 27

2039 gi11527799 Mus musculus GTP-binding protein like 1 73 30

2039 gi695237 Equine herpesvirus 2 tegument protein 73 33

2040 gi|20544038|ref|XP_089612.4 Homo sapiens similar to PER-HEXAMER REPEAT PROTEIN 5 68 41

2042 AAM77922 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 38228. 642 85

2042 AAM65219 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37324. 642 85

2042 gi|6723273|dbj|BAA89659.1 Baboon endogenous virus strain M7 gag-pol precursor polyprotein 139 26

2043 gi48507 Wolinella succinogenes formate dehydrogenase 80 27

2043 gi12381857 Danio rerio c-Maf 78 42

2043 gi|18594822|ref|XP_092995.1 Homo sapiens zinc finger protein 21 (KOX 14) 306 100

2044 gi3132272 Sus scrofa WT1 homologue 99 47

2044 AAG78446 Homo sapiens MASI Predicted WT1 Wilm's tumour polypeptide of humans. 96 45

2044 AAG62154 Homo sapiens CORI- Human WT1/PSA fusion protein SEQ ID NO: 357. 96 45

2046 gi21483222 Drosophila melanogaster AT16994p 86 33

2046 gi21111736 Xanthomonas campestris pv. campestris str. ATCC 33913 cell division protein 79 30

2046 gi12653493 Homo sapiens Similar to brain acid-soluble protein 1 79 36

2047 ABB12490 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 329. 200 83

2047 gi|20837783|ref|XP_145921.1 Mus musculus similar to 40S ribosomal protein S11 73 35

2047 gi|6002932|gb|AAF00209.1|AF164960_5 Streptomyces fradiae glycosyl transferase 71 35

2048 AAB59012 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 720. 103 32

2048 gi2429362 Santalum album proline rich protein 99 31

2048 gi17945382 Drosophila melanogaster RE17165p 98 25

2051 gi15625542 Hepatitis B virus S antigen 71 31

2051 gi|4884886|gb|AAD31857.1|AF134140_1 Hepatitis B virus surface antigen 68 30

2052 AAB28764 Homo sapiens HUMA- Sequence homologous to protein fragment encoded by gene 21. 693 78

2052 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 693 78

2052 AAB73606 Homo sapiens SHAN- Human dUTP pyrophosphatase 26. 668 77

2053 gi9945983 Pseudomonas aeruginosa transcriptional regulator PcaQ 83 34

2053 gi13874427 Homo sapiens cerebral protein-5 76 35

2053 gi12803205 Homo sapiens CAAX box 1 76 35

2054 gi21307831 Aplysia californica CREB-binding protein 76 26

2054 gi16755887 Drosophila melanogaster guanine nucleotide exchange factor 76 26

2054 gi|21307831|gb|AAL54859.1| Aplysia californica CREB-binding protein 76 26

2055 gi16588389 Homo sapiens B lymphocyte activation-related protein BC-1514 437 71

2055 AAB92981 Homo sapiens HELI- Human protein sequence SEQ ID NO:11698. 407 68

2055 AAM48325 Homo SapiensSHAN- Human urine receptor398 74
21.23.

2056 gi~2072969~gHomo Sapiensp40 134 47

b~AACS
127

4.1~

2056 gi~7959889~gHomo SapiensPR02221 123 43

b~AAF71115

.1 CAF
11672

1 95

2056 gi~2072974~gHomo Sapiensp40 122 44

b~AACS
127

7.1

2057 gi19171178 Homo sapiens metalloprotease disintegrin 16 with thrombospondin type I motif 518 98

2057 gi19171150 Homo sapiens ADAMTS18 protein 168 35

2057 AAM39212 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2357. 128 76

2058 gi|4959869|gb|AAD34536.1 Murine leukemia virus polymerase 336 50

2058 gi|9630313|ref|NP_056790.1 Gibbon ape leukemia virus pol polyprotein 331 46

2058 gi|6723273|dbj|BAA89659.1| Baboon endogenous virus strain M7 gag-pol precursor polyprotein 329 49

2059 gi|20546404|ref|XP_116466.1 Homo sapiens similar to nuclear receptor coactivator 4; RET-activating gene ELE1 179 91

2060 gi|6731237|gb|AAF27177.1|AF182317_1 Homo sapiens myoferlin 112 79

2060 gi|798799|gb|AAC37713.1| Mus musculus immunoglobulin heavy chain 72 55

2060 gi|20819487|ref|XP_145357.1 Mus musculus similar to LYRIC 72 27

2061 gi415738 Euglena gracilis PSII D1-polypeptide 75 27

2061 gi11491 Euglena gracilis 32 kd protein 75 27

2061 gi11488 Euglena gracilis 32-Kda thylakoid membrane protein 75 27

2062 gi21360549 Arabidopsis thaliana AT3g01480/F4P13_3 79 29

2062 gi3337366 Arabidopsis thaliana nodulin-like protein 68 36

2063 gi7959778 Homo sapiens PRO1546 121 42

2063 AAG02639 Homo sapiens GEST Human secreted protein, SEQ ID NO: 6720. 119 53

2063 AAG02753 Homo sapiens GEST Human secreted protein, SEQ ID NO: 6834. 110 45

2064 gi15077406 Antheraea yamamai fibroin 109 30

2064 AAB82806 Homo sapiens BOST- Human low density lipoprotein binding protein 2 (LBP-2). 92 24

2064 AA001059 Homo sapiens HYSE- Human polypeptide SEQ ID NO 14951. 90 30

2065 gi200964 Mus musculus serine 2 ultra high sulfur protein 80 30

2065 gi200962 Mus musculus serine 1 ultra high sulfur protein 80 30

2065 AAM99918 Homo sapiens HUMA- Human polypeptide SEQ ID NO 34. 75 28

2066 gi544724 Cavia cholecystokinin A receptor; CCK-A receptor 69 29

2066 gi2541920 Rattus norvegicus cholecystokinin type-A receptor 69 29

2066 gi2114152 Mus musculus cholecystokinin type-A receptor 69 29

2067 gi2828586 Pongo pygmaeus BRCA1 73 22

2068 AAM40813 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5744. 75 29

2068 AAM39027 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2172. 75 29

2068 AAY25768 Homo sapiens HUMA- Human secreted protein encoded from gene 58. 75 29

2070 gi1334150 Mus musculus unidentified reading frame (first ATG at pos. 210) 169 28

2070 gi557822 Saccharomyces cerevisiae mal5, sta1, len: 1367, CAI: 0.3, AMYH_YEAST P08640 GLUCOAMYLASE S1 (EC 3.2.1.3) 133 20

2070 gi1304387 Saccharomyces cerevisiae var. diastaticus glucoamylase 133 20

2071 gi17983056 Brucella melitensis BETA-HEXOSAMINIDASE A 88 29

2071 gi1573917 Haemophilus influenzae Rd multidrug resistance protein A (emrA) 81 33

2071 gi17982813 Brucella melitensis NITROGEN REGULATION PROTEIN NTRB 80 26

2073 gi|17532255|ref|NP_496431.1 Caenorhabditis elegans ankyrin and proline rich domains 67 29

2074 gi19919730 Homo sapiens BTEBS 704 97

2074 gi13195441 Homo sapiens BTE-binding protein 4 478 64

2074 gi14549656 Mus musculus dopamine receptor regulating factor 452 76

2076 AAE17482 Homo sapiens ZYMO Human leucine-rich repeat-7 (ZLRR7) protein. 1326 100

2076 AAU83190 Homo sapiens ZYMO Novel secreted protein Z887300G2P. 1326 100

2076 ABB11242 Homo sapiens HYSE- Human SLIT-2 homologue, SEQ ID NO:1612. 568 99

2077 gi18893729 Pyrococcus furiosus DSM 3638 protease iv 74 34

2077 AAB94745 Homo sapiens HELI- Human protein sequence SEQ ID NO:15792. 71 34

2077 gi16413096 Listeria innocua lin0656 68 35

2078 gi60675 Beet ringspot virus polyprotein 75 37

2078 gi|14743288|ref|XP_047191.1 Homo sapiens similar to Alu subfamily J sequence contamination warning entry 92 58

2078 gi|20260801|ref|NP_620113.1 Beet ringspot virus polyprotein 75 37

2079 gi3834629 Mus musculus diaphanous-related formin; p134 mDia2 208 67

2079 AAG74400 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:5164. 71 36

2079 gi3171906 Homo sapiens DIA-156 protein 71 36

2080 gi17298315 Homo sapiens candidate tumor suppressor protein 125 100

2080 gi7861733 Homo sapiens low density lipoprotein receptor related protein-deleted in tumor 125 100

2080 gi8926243 Mus musculus low density lipoprotein receptor related protein LRP1B/LRP-DIT 90 63

2081 gi4574224 Fundulus heteroclitus multidrug resistance transporter homolog 343 55

2081 gi16304396 Pseudopleuronectes americanus multidrug resistance transporter-like protein 340 52

2081 gi3355757 Gallus gallus ABC transporter protein 328 53

2082 gi7532975 bacteriophage phi-8 P10 67 27

CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
219
Table 3
SEQ ID NO:  Database entry ID  Description  *Results

1059 BL00349 CTF/NF-I proteins. BL00349H 15.70 9.710e-09 8-45

1061 DM00215PROLINE-RICH PROTEIN DM00215 19.43 6.143e-10
3. 29-61

DM00215 19.43 8.322e-09
40-72

1062 DM01354 kw TRANSCRIPTASE REVERSE II ORF2. DM01354U 12.24 6.092e-12 80-99

1063 PR00944 COPPER ION BINDING PROTEIN SIGNATURE PR00944E 9.18 7.132e-09 33-46

1076 PD00078REPEAT PROTEIN ANK PD00078B 13.14 9.217e-09
23-35

NUCLEAR ANKYR.

1089 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 8.754e-10
16-25

SIGNATURE

1089 PR00456RIBOSOMAL PROTEIN P2 PR00456E 3.06 9.658e-09
16-30

SIGNATURE

1089 PR00341PRION PROTEIN SIGNATUREPR00341E 3.32 9.898e-09
24-43

1099 PR00886 HIGH MOBILITY GROUP (HMG1/HMG2) PROTEIN SIGNATURE PR00886C 11.84 1.141e-12 28-46

1107 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.077e-09
PI 51-65

SIGNATURE

1118 BL00472Small cytokines BL00472A 7.45 5.655e-09
1-12

(intercrine/chemokine)
C-C

subfamily signatur.

1118 PR00655AUXIN BINDING PROTEIN PR00655E 8.06 9.000e-09
88-103

SIGNATURE

1119 BL00970Nuclear transition proteinBL00970C 14.80 8.183e-12
2 proteins. 99-136

1119 BL00826 MARCKS family proteins. BL00826B 12.51 4.279e-09 92-143

1119 BL00348p53 tumor antigen proteins.BL00348F 23.19 5.881e-10
93-135

BL00348F 23.19 6.857e-09
91-133

1119 PD01457RIBOSOMAL PROTEIN 40S PD01457A 16.51 8.216e-09
ZINC- 73-117

FINGER METAL.

1119 BL00752XPA protein. BL00752B 19.17 7.866e-09
100-143

BL00752B 19.17 8.979e-09
63-106

1119 DM01269303 kw ACTIVATING RAN DM01269A 23.35 9.446e-09
109-136

GTPASE ISOZYME.

1124 DM01813EGG-LAYING HORMONE. DM01813A 15.31 5.215e-09
15-42

1127 BL00452Guanylate cyclases proteins.BL00452A 17.52 1.170e-09
6-27

1131 BL00113 Adenylate kinase proteins. BL00113B 20.49 9.897e-09 157-200

1162 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.000e-35
ZINC- 24-62

FINGER METAL-BINDING
NU.

1163 BL00407Connexins proteins. BL00407B 14.23 9.775e-30
21-51

BL00407C 14.61 2.500e-24
52-79

1163 PR00206CONNEXIN SIGNATURE PR00206B 13.75 1.957e-24
33-55

PR00206A 11.35 6.559e-23
2-26

PR00206C 15.16 7.469e-20
58-78

1171 PD01066PROTEIN ZINC FINGER PD01066 19.43 8.500e-28
ZINC- 35-73

FINGER METAL-BINDING
NU.

1177 DM018031 HERPESVIRUS DM01803C 7.00 7.240e-09
46-55

GLYCOPROTEIN H.

1190 PR00774GUANYLIN PRECURSOR PR00774A 6.49 8.579e-10
69-81

SIGNATURE

1195 PD02059 CORE POLYPROTEIN PROTEIN GAG CONTAINS: P. PD02059C 21.58 8.031e-09 100-140

1197 BL00472Small cytokines BL00472A 7.45 8.000e-14
1-12

(intercrine/chemokine)
C-C

subfamily signatur.

1213 PR00437 SMALL CXC CYTOKINE FAMILY SIGNATURE PR00437C 14.85 1.310e-16 33-51

1213 BL00471Small cytokines BL00471 23.92 7.960e-10
6-53

(intercrine/chemokine)
C-x-C

subfamily signat.

1216 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 5.208e-09
183-192

SIGNATURE

1222 PF00852Fucosyl transferase. PF00852F 15.97 1.409e-15
195-231

1224 BL00299 Ubiquitin domain proteins. BL00299 28.84 6.301e-11 47-98

1230 PR00540MUSCARINIC M3 RECEPTOR PR00540A 10.24 7.174e-09
134-153

SIGNATURE

1240 BL00290 Immunoglobulins and major histocompatibility complex proteins. BL00290A 20.89 7.480e-10 160-182

BL00290B 13.17 2.875e-09 226-243

1258 PR00792 PEPSIN (A1) ASPARTIC PROTEASE FAMILY SIGNATURE PR00792A 11.54 5.500e-18 80-100

1258 BL00141 Eukaryotic and viral aspartyl proteases proteins. BL00141A 12.10 4.789e-15 87-102

BL00141B 12.14 2.929e-10 228-239

1300 BL00616 Histidine acid phosphatases phosphohistidine proteins. BL00616A 11.86 1.000e-09 136-143

1301 DM014176 kw INDUCING XPMC2 DM01417C 12.93 9.325e-12
361-372

MUSHROOM SPAC22G7.04. DM01417D 11.08 9.820e-12
400-415

1302 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 6.067e-11
324-338

SIGNATURE

1311 BL00926 Lysyl oxidase copper-binding region proteins. BL00926B 13.84 7.453e-09 84-121

1320 PR00830 ENDOPEPTIDASE LA (LON) SERINE PROTEASE (S16) SIGNATURE PR00830A 8.41 3.712e-09 29-48

1325 BL00048Protamine P1 proteins. BL00048 6.39 4.671e-10
58-84

BL00048 6.39 4.908e-10
60-86

BL00048 6.39 2.913e-09
59-85

BL00048 6.39 5.950e-09
57-83

1345 PF00424REV protein (anti-repressionPF00424A 14.34 2.436e-09
184-215

transactivator protein).

1345 BL00048Protamine P1 proteins. BL00048 6.39 4.553e-10
178-204

BL00048 6.39 6.513e-09
179-205

1353 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 2.857e-15
II 82-101

ORF2.

1363 PF00850Histone deacetylase PF00850B 10.13 5.154e-14
family. 95-109

PF00850C 14.55 9.063e-11
132-148

1389 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.423e-09
PI 50-64

SIGNATURE

1389 PD00306 PROTEIN GLYCOPROTEIN PRECURSOR RE. PD00306B 5.57 7.000e-09 59-69

1396 BL00427 Disintegrins proteins. BL00427 13.93 7.698e-17 260-314

1396 PR00289 DISINTEGRIN SIGNATURE PR00289A 13.62 5.667e-14 274-293

1416 BL00419 Photosystem I psaA and psaB proteins. BL00419B 22.23 9.489e-09 18-51

1434 PF00075RNase H. PF00075I 16.21 7.375e-11
167-173

1440 BL00598Chromo domain proteins.BL00598 14.45 1.500e-15
112-133

1440 PR00504 CHROMODOMAIN SIGNATURE PR00504B 9.12 5.200e-13 106-120

PR00504C 11.19 6.510e-09 121-133

1450 PF00622 Domain in SPla and the RYanodine Receptor. PF00622B 21.00 2.227e-09 93-114

1451 PD02935FATTY ACID PD02935C 16.62 4.375e-16
59-86

OXIDOREDUCTASE BIOSYNT.

1467 BL00479 Phorbol esters / diacylglycerol binding domain proteins. BL00479A 19.86 3.000e-11 130-152

BL00479B 12.57 3.340e-10 156-171

1468 PF00992 Troponin. PF00992A 16.67 5.563e-10 139-173

1468 BL00795Involucrin proteins. BL00795C 17.06 3.600e-09
193-237

1468 PR00042FOS TRANSFORMING PROTEINPR00042D 8.97 7.554e-09
141-162

SIGNATURE

1474 BL00107Protein kinases ATP-bindingBL00107A 18.39 9.308e-12
region 62-92

proteins.

1474 PR00109TYROSINE KINASE CATALYTICPR00109B 12.27 1.563e-09
62-80

DOMAIN SIGNATURE

1474 BL00239Receptor tyrosine kinaseBL00239C 18.75 4.205e-09
class II 49-71

proteins.

1475 BL00456 Sodium:solute symporter family proteins. BL00456C 24.55 4.886e-28 15-69

1480 BL00983 Ly-6 / u-PAR domain proteins. BL00983C 12.69 1.346e-09 36-51

1482 BL00979 G-protein coupled receptors family 3 proteins. BL00979A 19.66 9.633e-12 74-121

1502 PD02561DETHIOBIOTIN SYNTHETASEPD02561B 12.71 9.308e-09
176-182

SYNTHASE.

1506 BL00297Heat shock hsp70 proteinsBL00297H 15.46 9.625e-23
family 302-355

proteins. BL00297D 11.95 6.063e-21
166-205

BL00297E 18.56 6.077e-21
226-269

BL00297C 9.51 9.667e-15
105-156

1506 PR0030170 KD HEAT SHOCK PROTEINPR00301I 12.76 3.208e-11
320-336

SIGNATURE

1513 PR00130DNASE I SIGNATURE PR00130E 14.66 5.046e-09
237-266

1515 DM012423 THREONINE--TRNA LIGASE.DM01242A 20.32 5.286e-20
163-206

1517 BL00983 Ly-6 / u-PAR domain proteins. BL00983B 8.19 5.935e-10 40-49

1520 BL00415 Synapsins proteins. BL00415P 2.37 3.914e-10 138-173

1520 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 3.746e-09
124-138

SIGNATURE PR00049D 0.00 1.000e-08
123-137

1530 PF00075RNase H. PF00075F 12.87 5.500e-10
127-137

1537 PR00463E-CLASS P450 GROUP I PR00463F 17.63 5.219e-13
288-306

SIGNATURE PR00463A 11.40 8.714e-12
52-71

PR00463B 17.50 5.041e-10
76-97

1537 PR00385P450 SUPERFAMILY PR00385C 16.94 6.318e-09
289-300

SIGNATURE

1538 PR00709AVIDIN SIGNATURE PR00709A 4.60 5.585e-09
19-37

1553 DM01354kw TRANSCRIPTASE REVERSEDM01354Y 10.69 6.423e-16
II 113-152

ORF2.

1558 PD01066PROTEIN ZINC FINGER PD01066 19.43 6.400e-25
ZINC- 70-108

FINGER METAL-BINDING
NU.

1564 PF00589Phage integrase family.PF00589B 16.17 1.621e-11
158-171

PF00589C 14.62 9.609e-10
183-194

1566 BL00908Mandelate racemase / BL00908B 37.71 6.455e-13
muconate 191-245

lactonizing enzyme family
signa.

1567 PR00702ACRIFLAVIN RESISTANCE PR00702A 14.92 2.421e-25
8-32

PROTEIN FAMILY SIGNATUREPR00702B 12.77 9.690e-18
36-54

1570 BL01047Heavy-metal-associated BL01047A 13.50 5.125e-17
domain 75-97

proteins.

1575 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 9.429e-15
II 80-99

ORF2.

1606 PF00642Zinc finger C-x8-C-x5-C-x3-HPF00642 11.59 2.575e-11
type 197-207

(and similar).

1610 DM01354 kw TRANSCRIPTASE REVERSE II ORF2. DM01354I 15.55 7.702e-34 348-388

DM01354G 11.57 3.625e-32 277-307

DM01354H 18.00 2.528e-23 308-347

DM01354F 14.56 4.088e-11 241-276

1616 PD02929 ADHESION GLYCOPROTEIN PRECURSORI. PD02929A 28.27 2.263e-25 32-85

1627 PR00121 SODIUM/POTASSIUM-TRANSPORTING ATPASE SIGNATURE PR00121A 6.71 1.000e-08 15-29

1630 PR00824 HEPATIC LIPASE SIGNATUREPR00824A 7.81 7.214e-22
6-24

1640 BL00359 Ribosomal protein L11 BL00359C 22.18 1.155e-11
proteins. 93-126

1641 PR00080 ALCOHOL DEHYDROGENASE PR00080A 9.32 8.839e-10
134-145

SUPERFAMILY SIGNATURE

1641 PR00081 GLUCOSE/RIBITOL PR00081A 10.53 2.000e-12
45-62

DEHYDROGENASE FAMILY PR00081E 17.54 1.783e-10
238-255

SIGNATURE PR00081B 10.38 2.227e-09
134-145

1641 BL00061 Short-chain dehydrogenases/reductases family proteins. BL00061A 9.41 9.053e-10 134-144

BL00061B 25.79 6.860e-09 197-234

1666 BL01257 Ribosomal protein L10e proteins. BL01257D 18.80 2.973e-15 59-98

1667 BL01241 Link domain proteins. BL01241 35.81 8.579e-37
180-232

BL01241 35.81 7.835e-14
289-341

1667 BL00086 Cytochrome P450 cysteine heme-iron ligand proteins. BL00086 20.87 3.377e-09 283-314

1668 PR00671 INHIBIN BETA B CHAIN PR00671A 8.36 8.088e-09
4-22

SIGNATURE

1672 BL00674 AAA-protein family BL00674E 15.24 5.680e-15
proteins. 31-50

1682 PF00075 RNase H. PF00075A 14.44 4.400e-13
73-89

PF00075C 11.58 8.442e-09
152-163

1689 PD01066 PROTEIN ZINC FINGER PD01066 19.43 6.471 e-27
ZINC- 268-306

FINGER METAL-BINDING
NU.

1689 PR00788 NITROPHOR1N SIGNATURE PR00788A 9.79 6.108e-09
3-15

1692 BL00299 Ubiquitin domain proteins.BL00299 28.84 4.759e-10
32-83

1697 PR00423 CELL DIVISION PROTEIN PR00423E 7.36 4.038e-09
FTSZ 20-41

SIGNATURE

1706 BL00795 Involucrin proteins. BL00795C 17.06 5.395e-10
185-229

1709 BL00514 Fibrinogen beta and BL00514C 17.41 3.618e-25
gamma chains 68-104

C-terminal domain proteins.BL00514H 14.95 6.745e-16
230-254

BL00514G 15.98 6.566e-14
198-227

BL00514E 14.28 8.286e-14
128-144

BL00514D 15.35 2.915e-12
109-121

1714 PF00878 Cation-independent PF00878T 17.51 3.818e-09
mannose-6- 41-67

hos hate receptor re
eat roteins.

1715 PF01140 Matrix rotein (MA), PF01140D 15.54 4.872e-09
15. 123-157

1715 PF00992 Troponin. PF00992A 16.67 6.451e-10
109-143

PF00992A 16.67 3.724e-09
98-132

PF00992A 16.67 6.684e-09
96-130

1718 PD02474 SYNTHASE SMALL SUBUNITPD02474B 21.08 7.940e-10
92-130

ACETOLACT.

1725 BL00412 Neuromodulin (GAP-43) BL00412B 10.60 1.000e-10
proteins. 46-82

1725 PR00215 NEUROMODULIN SIGNATUREPR00215C 13.98 6.116e-10
54-74

1725 DM01688 2 POLY-IG RECEPTOR. DM01688G 16.45 3.160e-09
119-150

DM01688I 14.97 6.885e-09
107-154

1725 PD02870 RECEPTOR INTERLEUKIN-1PD02870B 18.83 8.564e-09
303-335

PRECURSOR.

1727 BL00107 Protein kinases ATP-bindingBL00107A 18.39 7.750e-21
region 185-215

proteins.

1727 PR00109 TYROSINE KINASE CATALYTICPR00109B 12.27 7.176e-12
185-203

DOMAIN SIGNATURE


1727	BL00239	Receptor tyrosine kinase class II proteins.	BL00239B 25.15 4.387e-09 119-166
1728	BL00415	Synapsins proteins.	BL00415Q 2.23 8.115e-09 52-87
1734	PD01270	RECEPTOR FC IMMUNOGLOBULIN AFFIN.	PD01270B 22.18 5.567e-18 75-111; PD01270C 19.54 1.167e-17 118-146; PD01270A 17.22 4.960e-14 21-60; PD01270D 24.66 4.284e-09 152-187
1736	PD02346	PHOTOSYSTEM II PROTEIN PRECURSOR PHOTOSYNTHESIS.	PD02346A 9.24 8.851e-09 6-17
1741	BL00415	Synapsins proteins.	BL00415Q 2.23 6.777e-09 317-352
1744	BL00479	Phorbol esters / diacylglycerol binding domain proteins.	BL00479B 12.57 1.000e-08 33-48
1750	PR00763	COAGULIN SIGNATURE	PR00763B 8.39 6.457e-09 41-60
1754	PR00276	INSULIN A CHAIN SIGNATURE	PR00276A 11.84 7.840e-09 46-55
1755	PR00042	FOS TRANSFORMING PROTEIN SIGNATURE	PR00042D 8.97 2.565e-09 164-185
1755	PF00922	Vesiculovirus phosphoprotein.	PF00922A 19.17 5.759e-09 99-132
1778	PR00245	OLFACTORY RECEPTOR SIGNATURE	PR00245A 18.03 9.836e-14 59-80; PR00245C 7.84 1.540e-13 237-252; PR00245B 10.38 2.125e-13 176-190
1778	BL00237	G-protein coupled receptors proteins.	BL00237A 27.68 1.474e-12 90-129
1778	PR00534	MELANOCORTIN RECEPTOR FAMILY SIGNATURE	PR00534A 11.49 4.729e-09 51-63
1778	PR00237	RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE	PR00237A 11.48 3.613e-09 26-50; PR00237C 15.69 7.525e-09 104-126
1787	PR00007	COMPLEMENT C1Q DOMAIN SIGNATURE	PR00007B 14.16 5.114e-15 146-165; PR00007A 19.33 7.052e-10 119-145
1787	PR00524	CHOLECYSTOKININ TYPE A RECEPTOR SIGNATURE	PR00524F 5.36 4.351e-09 70-83
1787	DM00250	kw ANNEXIN ANTIGEN PROLINE TUMOR.	DM00250B 13.84 6.595e-09 82-105
1787	BL00415	Synapsins proteins.	BL00415N 4.29 7.372e-09 62-105
1787	BL01113	C1q domain proteins.	BL01113B 18.26 3.786e-23 125-160; BL01113A 17.99 7.968e-15 73-99; BL01113A 17.99 5.091e-14 70-96; BL01113A 17.99 5.295e-11 64-90; BL01113A 17.99 8.568e-11 79-105; BL01113A 17.99 8.977e-11 67-93; BL01113A 17.99 4.635e-09 82-108; BL01113A 17.99 6.192e-09 76-102; BL01113A 17.99 7.750e-09 61-87
1787	BL00420	Speract receptor repeat proteins domain proteins.	BL00420A 20.42 8.691e-11 73-101; BL00420A 20.42 9.673e-11 70-98; BL00420A 20.42 2.180e-10 55-83; BL00420A 20.42 8.062e-09 52-80
1789	DM01930	2 kw FINGER SMCX SMCY YDR096W.	DM01930E 15.41 2.964e-33 45-89
1795	DM01688	2 POLY-IG RECEPTOR.	DM01688I 14.97 7.480e-10 107-154; DM01688J 14.69 4.455e-09 60-96
1796	PF00075	RNase H.	PF00075J 15.78 4.115e-13 115-132
1802	PD00066	PROTEIN ZINC-FINGER METAL-BINDI.	PD00066 13.92 4.130e-11 86-98
1802	BL00028	Zinc finger, C2H2 type, domain proteins.	BL00028 16.07 1.600e-10 110-126; BL00028 16.07 6.100e-10 70-86
1802	PR00048	C2H2-TYPE ZINC FINGER SIGNATURE	PR00048B 6.02 9.438e-10 83-92


1812	PD00078	REPEAT PROTEIN ANK NUCLEAR ANKYR.	PD00078B 13.14 4.130e-09 157-169
1824	PF00628	PHD-finger.	PF00628 15.84 5.500e-13 78-92
1833	PF00075	RNase H.	PF00075B 12.56 4.732e-10 156-166
1833	PR00939	C2HC-TYPE ZINC-FINGER SIGNATURE	PR00939A 8.95 3.045e-09 137-146
1842	PR00833	POLLEN ALLERGEN POA PI SIGNATURE	PR00833H 2.30 3.192e-09 244-258
1844	BL00972	Ubiquitin carboxyl-terminal hydrolases family 2 proteins.	BL00972D 22.55 3.348e-11 168-192
1857	PF00424	REV protein (anti-repression transactivator protein).	PF00424A 14.34 8.085e-09 71-102
1860	PR00221	CAULIMOVIRUS COAT PROTEIN SIGNATURE	PR00221H 12.82 2.410e-09 184-197
1864	BL01282	BIR repeat proteins.	BL01282B 30.49 1.136e-10 214-252
1866	BL00155	Cutinase, serine proteins.	BL00155D 26.87 5.337e-09 19-67
1895	PF00075	RNase H.	PF00075F 12.87 7.353e-10 93-103
1911	BL00983	Ly-6 / u-PAR domain proteins.	BL00983C 12.69 6.365e-09 101-116
1911	BL00272	Snake toxins proteins.	BL00272C 8.27 1.000e-08 105-116
1925	PR00308	TYPE I ANTIFREEZE PROTEIN SIGNATURE	PR00308A 5.90 6.795e-11 64-78; PR00308C 3.83 2.385e-10 67-76
1925	PR00456	RIBOSOMAL PROTEIN P2 SIGNATURE	PR00456E 3.06 9.438e-10 57-71
1925	PR00833	POLLEN ALLERGEN POA PI SIGNATURE	PR00833H 2.30 6.654e-09 59-73
1930	DM00179	kw KINASE ALPHA ADHESION T-CELL.	DM00179 13.97 5.263e-10 107-116
1935	PF00075	RNase H.	PF00075J 15.78 2.309e-12 81-98
1940	PF00075	RNase H.	PF00075F 12.87 3.864e-09 74-84
1952	PR00019	LEUCINE-RICH REPEAT SIGNATURE	PR00019B 11.36 3.250e-10 184-197; PR00019A 11.19 5.667e-09 187-200
1954	BL00546	Matrixins cysteine switch.	BL00546A 19.62 8.105e-30 77-106
1954	BL00023	Type II fibronectin collagen-binding domain proteins.	BL00023 24.31 4.682e-35 340-376; BL00023 24.31 2.969e-28 282-318; BL00023 24.31 9.526e-24 224-260
1954	PR00138	MATRIXIN SIGNATURE	PR00138B 15.82 5.500e-18 144-159; PR00138A 15.14 8.773e-16 97-110
1954	BL00024	Hemopexin domain proteins.	BL00024B 21.53 9.591e-33 118-151; BL00024A 11.49 2.800e-13 97-107; BL00024C 22.98 7.796e-11 164-212
1954	PR00013	FIBRONECTIN TYPE II REPEAT SIGNATURE	PR00013C 12.29 1.000e-20 372-387; PR00013C 12.29 3.571e-15 314-329; PR00013C 12.29 7.800e-14 256-271; PR00013A 12.26 5.500e-13 344-353; PR00013B 14.75 1.237e-11 355-367; PR00013B 14.75 4.000e-09 297-309; PR00013A 12.26 5.333e-09 286-295; PR00013A 12.26 7.833e-09 228-237
1957	BL01182	Glycosyl hydrolases family 35 proteins.	BL01182A 21.39 3.357e-34 77-119
1957	PR00742	GLYCOSYL HYDROLASE FAMILY 35 SIGNATURE	PR00742B 15.52 2.653e-14 78-96; PR00742A 13.75 6.914e-10 57-74
1958	PR00449	TRANSFORMING PROTEIN P21 RAS SIGNATURE	PR00449A 13.20 8.200e-15 214-235
1964	PR00727	BACTERIAL LEADER PEPTIDASE 1 (S26) FAMILY SIGNATURE	PR00727A 12.93 7.000e-09 9-25

1965	PF00075	RNase H.	PF00075D 10.71 7.188e-09 71-81
1966	PF00075	RNase H.	PF00075C 11.58 9.786e-11 110-121; PF00075B 12.56 1.878e-10 78-88
1968	DM00892	3 RETROVIRAL PROTEINASE.	DM00892C 23.55 4.082e-11 314-347
1970	PF00075	RNase H.	PF00075J 15.78 8.571e-10 335-352
1973	PF00589	Phage integrase family.	PF00589B 16.17 1.450e-14 101-114
1974	BL00675	Sigma-54 interaction domain proteins ATP-binding region A proteins.	BL00675B 24.07 1.000e-24 118-172; BL00675C 13.51 6.400e-24 183-210; BL00675D 12.03 1.750e-09 245-254
1987	PR00153	CYCLOPHILIN PEPTIDYL-PROLYL CIS-TRANS ISOMERASE SIGNATURE	PR00153B 11.57 1.500e-17 52-64; PR00153A 12.98 4.255e-10 23-38
1987	BL00170	Cyclophilin-type peptidyl-prolyl cis-trans isomerase signatur.	BL00170B 20.97 6.250e-33 47-86; BL00170A 17.08 2.309e-09 17-43
1998	PD01066	PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU.	PD01066 19.43 7.750e-37 27-65; PD01066 19.43 8.863e-11 68-106
1999	PF00992	Troponin.	PF00992A 16.67 3.487e-09 108-142
1999	BL00224	Clathrin light chain proteins.	BL00224B 16.94 7.055e-09 96-148
1999	BL00422	Granins proteins.	BL00422C 16.18 8.059e-09 117-144
2001	BL00019	Actinin-type actin-binding domain proteins.	BL00019B 13.34 7.158e-14 261-283
2001	DM01354	kw TRANSCRIPTASE REVERSE II ORF2.	DM01354U 12.24 3.500e-13 345-364
2008	PD01719	PRECURSOR GLYCOPROTEIN SIGNAL RE.	PD01719A 12.89 3.483e-16 63-90
2011	BL00282	Kazal serine protease inhibitors family proteins.	BL00282 16.88 6.577e-10 127-149
2011	BL00222	Insulin-like growth factor binding proteins.	BL00222B 11.09 6.940e-10 74-89
2011	BL00621	Tissue factor proteins.	BL00621A 8.69 6.473e-09 5-22
2012	PD02563	PROTEIN NONSTRUCTURAL C VP18.	PD02563C 13.51 9.634e-10 74-128
2013	PR00124	ATP SYNTHASE C SUBUNIT SIGNATURE	PR00124A 8.81 5.655e-09 58-77
2013	PR00783	MAJOR INTRINSIC PROTEIN FAMILY SIGNATURE	PR00783C 13.54 8.981e-09 48-67
2034	PF00075	RNase H.	PF00075F 12.87 6.523e-09 183-193
2037	BL00326	Tropomyosins proteins.	BL00326D 8.76 9.327e-09 115-155
2048	PR00671	INHIBIN BETA B CHAIN SIGNATURE	PR00671B 4.29 8.767e-10 138-157
2052	PD02455	ELEMENT TRANSPOSABLE INSERTION PROTEIN TRANSPOSITION DNA.	PD02455C 29.23 5.230e-09 225-276
2058	PF00075	RNase H.	PF00075J 15.78 9.000e-10 81-98
2074	PD00066	PROTEIN ZINC-FINGER METAL-BINDI.	PD00066 13.92 4.000e-13 62-74
2074	PR00048	C2H2-TYPE ZINC FINGER SIGNATURE	PR00048B 6.02 4.462e-11 59-68; PR00048B 6.02 1.000e-10 89-98; PR00048A 10.52 9.609e-10 101-114
2074	BL00028	Zinc finger, C2H2 type, domain proteins.	BL00028 16.07 9.100e-13 104-120; BL00028 16.07 1.000e-08 46-62
2076	PR00019	LEUCINE-RICH REPEAT SIGNATURE	PR00019A 11.19 1.900e-11 106-119

* Results include, in order: Accession No., subtype, e-value, and amino acid position of the signature in the corresponding polypeptide.
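
The field order stated in the footnote can be illustrated with a short parsing sketch. It is a hypothetical helper and not part of the application. It assumes that each Results entry has the form seen in the rows of Table 3, for example "PF00589B 16.17 1.621e-11 158-171": a two-letter, five-digit accession number with an optional one-letter subtype, one further numeric field that the footnote does not name, the e-value, and the amino acid position range. The names SignatureHit, parse_hit and second_field are illustrative assumptions.

```python
import re
from typing import NamedTuple


class SignatureHit(NamedTuple):
    accession: str       # e.g. "PF00589"
    subtype: str         # e.g. "B"; empty when no subtype letter is printed
    second_field: float  # value printed before the e-value; not named in the footnote
    e_value: float
    start: int           # first amino acid position of the signature
    end: int             # last amino acid position of the signature


# Assumed layout: <accession><optional subtype> <number> <e-value> <start>-<end>
_HIT_RE = re.compile(
    r"([A-Z]{2}\d{5})([A-Z]?)\s+"    # accession number and optional subtype letter
    r"(-?\d+(?:\.\d+)?)\s+"          # unnamed numeric field
    r"(\d+(?:\.\d+)?e[-+]?\d+)\s+"   # e-value in exponent notation
    r"(\d+)-(\d+)"                   # amino acid position range
)


def parse_hit(text: str) -> SignatureHit:
    """Split one Results entry of Table 3 into its fields."""
    match = _HIT_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized Results entry: {text!r}")
    accession, subtype, value, e_value, start, end = match.groups()
    return SignatureHit(
        accession, subtype, float(value), float(e_value), int(start), int(end)
    )


if __name__ == "__main__":
    hit = parse_hit("PF00589B 16.17 1.621e-11 158-171")
    print(hit.accession, hit.subtype, hit.e_value, hit.start, hit.end)
```

Entries that print no subtype letter, such as "BL01241 35.81 8.579e-37 180-232", yield an empty subtype under the same assumptions.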

Table 4
SEQ ID NO:	Pfam Model	Description	E-value	Score	No. of Pfam Domains	Position of the Domain
1050	FAA_hydrolase	Fumarylacetoacetate (FAA) hydrolase fam	0.64	-89.1	1	22-143
1066	rubredoxin	Rubredoxin	7.2	-11.1	1	4-37
1076	ank	Ankyrin repeat	0.01	22.5	1	25-57
1076	sodfe_C	Iron/manganese superoxide dismutases, C-term	3.9	-67.9	1	38-124
1076	DUF232	Putative transcriptional regulator	8.1	-29.1	1	134-254
1099	HMG_box	HMG (high mobility group) box	8	-22.4	1	17-61
1109	UPAR_LY6	u-PAR/Ly-6 domain	0.21	-6.2	1	34-112
1110	ldl_recept_a	Low-density lipoprotein receptor domain	8.8e-07	36.0	1	196-240
1110	CUB	CUB domain	0.38	-27.8	1	52-161
1118	rvt	Reverse transcriptase	0.95	-46.1	1	38-207
1125	adenylatekinase	Adenylate kinase	0.00037	-77.6	1	13-103
1162	KRAB	KRAB box	1.1e-23	92.1	1	22-62
1163	connexin	Connexin	3.1e-23	90.6	1	1-130
1171	KRAB	KRAB box	6.6e-22	86.2	1	33-73
1193	MHC_I	Class I Histocompatibility antigen, domains	2e-06	1.1	1	29-205
1209	DOMON	DOMON domain	1.9e-12	54.8	1	102-215
1213	IL8	Small cytokines (intecrine/chemokine), inter	0.59	-7.8	1	18-55
1218	cys_rich_FGFR	Cysteine rich repeat	4.4	-11.0	1	28-76
1222	Glyco_transf_10	Glycosyltransferase family 10	6.6e-06	-54.1	1	1-322
1240	ig	Immunoglobulin domain	1.6e-06	35.1	2	41-124:156-230
1258	asp	Eukaryotic aspartyl protease	8e-06	-110.8	1	19-241
1280	DOMON	DOMON domain	8.9	-16.6	1	35-117
1288	PDZ	PDZ domain (Also known as DHR or GLGF)	1.1	0.4	1	7-73
1301	Exonuclease	Exonuclease	3.4e-33	123.7	1	322-479
1311	Gemini_mov	Geminivirus putative movement protein	5.7	-40.5	1	15-79
1341	fn3	Fibronectin type III domain	6.6e-36	132.7	2	109-200:212-301
1345	Collagen	Collagen triple helix repeat (20 copies)	7.3	-65.8	1	185-243
1365	Amidase	Amidase	0.017	-178.9	1	68-276
1375	Galactosyl_T	Galactosyltransferase	7.1e-44	159.2	1	113-309
1375	Glyco_transf_25	Glycosyltransferase family 25	3	-77.1	1	146-293
1381	GRAM	GRAM domain	6.6e-14	59.6	1	65-116
1396	Pep_M12B_propep	Reprolysin family propeptide	1.4e-27	105.1	1	75-191
1396	disintegrin	Disintegrin	2.6e-10	47.7	1	243-318
1398	SK_channel	Calcium-activated SK potassium channel	1.8e-06	34.9	1	1-57
1413	ig	Immunoglobulin domain	5.4	9.1	1	29-88
1416	dUTPase	dUTPase	0.00044	9.6	1	111-237
1420	Folate_rec	Folate receptor family	1.7	-111.2	1	14-175
1434	lectin_c	Lectin C-type domain	1.5e-05	28.0	1	233-319
1440	chromo	'chromo' (CHRromatin Organization Modifier)	4.6e-11	50.2	1	92-133
1449	PMSR	Peptide methionine sulfoxide reductase	0.0089	-65.8	1	4-79
1450	SPRY	SPRY domain	9e-26	99.0	1	109-240


1451	MaoC_dehydratas	MaoC like domain	2.1e-15	64.6	1	31-152
1463	NTP_transf_2	Nucleotidyltransferase domain	2.6e-12	54.3	1	121-234
1467	DAG_PE-bind	Phorbol esters/diacylglycerol binding dom	8.7e-05	27.4	1	130-180
1467	DC1	DC1 domain	0.66	11.2	1	141-172
1470	jmjC	jmjC domain	0.46	-18.2	1	166-262
1474	pkinase	Protein kinase domain	0.0019	-85.7	1	2-187
1475	SSF	Sodium:solute symporter family	0.13	-177.1	1	1-311
1478	dUTPase	dUTPase	7.6	-37.5	1	2-98
1479	fn3	Fibronectin type III domain	1.1e-19	78.9	1	14-100
1485	rnaseH	RNase H	0.36	-28.0	1	59-175
1488	NTR	NTR/C345C module	0.044	-6.1	1	293-398
1506	HSP70	Hsp70 protein	1.6e-13	38.3	1	61-424
1517	UPAR_LY6	u-PAR/Ly-6 domain	0.33	-8.2	1	44-106
1530	rnaseH	RNase H	0.011	-11.7	1	64-155
1537	p450	Cytochrome P450	2.1	-176.6	1	31-316
1537	DNA_ligase_OB	NAD-dependent DNA ligase OB-fold domain	9.2	-42.9	1	200-256
1558	KRAB	KRAB box	1.8e-18	74.8	1	68-108
1564	Phage_integrase	Phage integrase family	1.2e-09	45.5	1	39-204
1566	MR_MLE	Mandelate racemase / muconate lactonizing en	0.00079	-24.5	1	153-352
1570	HMA	Heavy-metal-associated domain	6.6e-13	56.3	1	71-131
1580	ig	Immunoglobulin domain	0.99	15.2	1	23-131
1601	WD40	WD domain, G-beta repeat	2e-08	41.5	3	39-75:83-118:126-162
1606	zf-CCCH	Zinc finger C-x8-C-x5-C-x3-H type	0.094	19.3	3	105-129:141-173:183-209
1612	zf-CCHC	Zinc knuckle	2.1e-05	31.4	2	167-184:202-219
1618	rnaseH	RNase H	6.3e-14	59.7	1	24-144
1618	Integrase_Zn	Integrase Zinc binding domain	3.8e-07	37.2	1	146-185
1618	DUF224	Domain of unknown function (DUF224)	9.3	-7.0	1	104-186
1641	adh_short	short chain dehydrogenase	4.6e-32	119.9	1	42-309
1667	Xlink	Extracellular link domain	2.9e-83	290.0	2	162-267:273-364
1667	ig	Immunoglobulin domain	0.0015	25.2	1	61-145
1682	rvt	Reverse transcriptase	3.1e-31	117.2	1	56-238
1683	Gag_p30	Gag P30 core shell protein	2.9e-33	124.0	1	8-197
1689	KRAB	KRAB box	4.9e-22	86.6	1	266-306
1692	ubiquitin	Ubiquitin family	0.00061	26.5	1	17-91
1709	fibrinogen_C	Fibrinogen beta and gamma chains, C-term	7.9e-85	295.2	1	37-255
1713	HOK_GEF	Hok/gef family	2.4	-7.8	1	7-54
1716	Gag_p30	Gag P30 core shell protein	0.0036	-49.7	1	64-229
1721	rnaseH	RNase H	0.011	-11.7	1	207-350
1722	dUTPase	dUTPase	0.37	-22.9	1	93-217


1725	ig	Immunoglobulin domain	4.2e-13	57.0	2	80-141:259-320
1725	IQ	IQ calmodulin-binding motif	4.3e-05	30.4	1	49-69
1727	pkinase	Protein kinase domain	3e-21	84.0	1	71-267
1728	Fringe	Fringe-like	5.9	-112.6	1	165-370
1734	ig	Immunoglobulin domain	0.014	22.0	1	117-170
1737	PP2C	Protein phosphatase 2C	0.0067	-50.5	1	37-273
1738	SH3	SH3 domain	1.7e-05	31.7	1	102-159
1740	rnaseH	RNase H	0.0042	-7.3	1	126-270
1744	DAG_PE-bind	Phorbol esters/diacylglycerol binding dom	2.9	-11.1	1	26-55
1744	PHD	PHD-finger	3.3	-14.7	1	9-61
1760	GARS_N	Phosphoribosylglycinamide synthetase, N	8.2	-62.0	1	35-95
1760	Armadillo_seg	Armadillo/beta-catenin-like repeat	9.1	8.7	2	44-84:131-171
1778	7tm_1	7 transmembrane receptor (rhodopsin family)	1e-12	55.7	1	41-276
1778	YCF9	YCF9	3.1	-18.5	1	203-258
1787	C1q	C1q domain	1e-05	13.2	1	111-230
1787	Collagen	Collagen triple helix repeat (20 copies)	0.0043	-3.0	1	50-107
1789	jmjC	jmjC domain	0.00078	12.0	1	52-241
1795	ig	Immunoglobulin domain	0.0037	23.9	1	64-141
1796	rve	Integrase core domain	2.6e-28	107.5	1	20-174
1802	zf-C2H2	Zinc finger, C2H2 type	6e-15	63.1	2	68-90:108-130
1806	Filamin	Filamin/ABP280 repeat	0.00054	18.6	1	26-131
1812	ank	Ankyrin repeat	3.6e-23	90.4	3	159-191:205-237:244-276
1824	PHD	PHD-finger	1.1e-12	55.6	1	62-110
1826	PAP_assoc	PAP/25A associated domain	1.5e-06	35.2	1	101-155
1827	ig	Immunoglobulin domain	1.6	13.4	1	29-102
1830	RhoGEF	RhoGEF domain	3.3e-06	24.0	1	110-280
1830	PH	PH domain	2.8	6.7	1	356-451
1833	zf-CCHC	Zinc knuckle	2.1e-06	34.7	1	137-154
1833	rvt	Reverse transcriptase	7.7e-06	25.9	1	84-277
1844	UCH-2	Ubiquitin carboxyl-terminal hydrolase family	0.15	-8.5	1	165-238
1846	Armadillo_seg	Armadillo/beta-catenin-like repeat	0.28	17.7	2	50-91:92-132
1860	zf-CCHC	Zinc knuckle	3.2e-05	30.8	1	179-196
1864	zf-C3HC4	Zinc finger, C3HC4 type (RING finger)	0.0022	23.3	1	218-256
1887	ig	Immunoglobulin domain	4e-08	40.4	1	35-112
1889	LRR	Leucine Rich Repeat	0.051	20.1	1	62-85
1895	rnaseH	RNase H	3.4e-06	25.8	1	47-177
1899	Brevenin	Brevenin/esculentin/gaegurin/rugosin family	7.5	-2.9	1	1-51
1911	UPAR_LY6	u-PAR/Ly-6 domain	1.3e-06	35.4	1	44-117


1911	toxin	Snake toxin	3	-19.5	1	66-117
1911	Activin_recp	Activin types I and II receptor domain	9.5	-14.0	1	30-118
1912		Retroviral aspartyl protease	7	-26.3	1	42-142
1913	SAM	SAM domain (Sterile alpha motif)	3.9e-13	57.1	2	105-170:183-247
1916	Sema	Sema domain	1.4e-14	54.6	1	51-434
1926	PAP2	PAP2 superfamily	2.9e-07	37.6	1	48-142
1930	ig	Immunoglobulin domain	2.7e-07	37.6	1	41-116
1935	rve	Integrase core domain	2.5e-13	57.7	1	1-138
1940	rnaseH	RNase H	1.1e-26	102.0	1	24-153
1940	Integrase_Zn	Integrase Zinc binding domain	4.7e-12	53.5	1	155-194
1952	LRRNT	Leucine rich repeat N-terminal domain	0.0027	24.4	1	67-95
1953	UQ_con	Ubiquitin-conjugating enzyme	2.8e-08	40.9	1	78-219
1954	Peptidase_M10	Matrixin	6.7e-86	298.8	1	53-212
1954	fn2	Fibronectin type II domain	1e-79	278.2	3	231-272:289-330:347-388
1958	ras	Ras family	1.9	-132.0	1	215-284
1963	tsp_1	Thrombospondin type 1 domain	0.083	8.0	1	20-63
1966	rvt	Reverse transcriptase	1.5e-05	21.9	1	2-196
1968	G-patch	G-patch domain	0.3	6.0	1	307-352
1968		Retroviral aspartyl protease	1.4	-19.9	1	274-385
1970	rve	Integrase core domain	0.78	-16.8	1	265-395
1973	Phage_integrase	Phage integrase family	5.7e-08	39.9	1	1-153
1974	Sigma54_activat	Sigma-54 interaction domain	3.1e-37	137.2	1	63-253
1975	Na_Pi_cotrans	Na+/Pi-cotransporter	0.0085	-99.2	1	1-146
1975	signal	His Kinase A (phosphoacceptor) domain	7	-7.7	1	85-147
1978	UPAR_LY6	u-PAR/Ly-6 domain	1.8	-16.0	1	21-96
1978	Zn_clus	Fungal Zn(2)-Cys(6) binuclear cluster domain	5.1	-5.7	1	21-60
1987	pro_isomerase	Cyclophilin type peptidyl-prolyl cis-tr	1.2e-18	75.4	1	4-171
1997	zf-CCHC	Zinc knuckle	1.9e-05	31.5	2	181-198:204-220
1997	TFIID-31	Transcription initiation factor IID, 31kD su	7.9	-63.3	1	75-187
1997	Gag_p12	Gag polyprotein, inner coat protein p12	8.9	-9.5	1	155-229
1998	KRAB	KRAB box	2e-23	91.2	1	27-65
2001	CH	Calponin homology (CH) domain	0.019	10.8	1	230-330
2001	SAM	SAM domain (Sterile alpha motif)	0.9	6.5	1	248-311
2008	tsp_1	Thrombospondin type 1 domain	0.013	15.1	1	64-98
2011	ig	Immunoglobulin domain	1.7e-05	31.7	1	186-255
2011	kazal	Kazal-type serine protease inhibitor domain	0.00028	27.6	1	121-168
2011	IGFBP	Insulin-like growth factor binding protein	0.17	2.5	1	53-113
2011	zf-UBR1	Putative zinc finger in N-recognin	8.3	-24.0	1	54-112
2015	PH	PH domain	0.0002	28.1	1	174-281
2015	efhand	EF hand	0.00031	27.5	1	339-367
2018	RPEL	RPEL repeat	1.3	11.8	1	25-50
2034	rnaseH	RNase H	4e-27	103.6	1	122-267

2038	granulin	Granulin	7.7	-17.8	1	62-91
2052	rve	Integrase core domain	2.6e-24	94.2	1	160-314
2057	Pep_M12B_propep	Reprolysin family propeptide	0.44	-29.3	1	179-263
2058	rve	Integrase core domain	8.7e-14	59.2	1	1-140
2074	zf-C2H2	Zinc finger, C2H2 type	5.5e-22	86.5	3	42-66:72-96:102-124
2074	zf-BED	BED zinc finger	0.94	1.8	1	91-129
2074	TP1	Nuclear transition protein	7.5	2.2	1	21-76
2076	LRR	Leucine Rich Repeat	3.2e-20	80.6	5	57-80:81-104:105-128:129-152:153-176
2076	LRRNT	Leucine rich repeat N-terminal domain	0.00013	28.8	1	27-55
2076	LRRCT	Leucine rich repeat C-terminal domain	0.047	18.0	1	186-234
















CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
247
JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 253
NOTE: For additional volumes, please contact the Canadian Patent Office.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History, should be consulted.

Title	Date
Forecasted Issue Date	Unavailable
(86) PCT Filing Date	2002-08-09
(87) PCT Publication Date	2003-10-02
(85) National Entry	2004-02-09
Dead Application	2008-08-11

Abandonment History

Abandonment Date	Reason	Reinstatement Date
2007-08-09	FAILURE TO REQUEST EXAMINATION
2008-08-11	FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type	Anniversary Year	Due Date	Amount Paid	Paid Date
Registration of a document - section 124			$100.00	2004-02-09
Application Fee			$400.00	2004-02-09
Maintenance Fee - Application - New Act	2	2004-08-09	$100.00	2004-06-17
Registration of a document - section 124			$100.00	2005-03-11
Registration of a document - section 124			$100.00	2005-03-11
Registration of a document - section 124			$100.00	2005-03-11
Maintenance Fee - Application - New Act	3	2005-08-09	$100.00	2005-06-15
Maintenance Fee - Application - New Act	4	2006-08-09	$100.00	2006-06-14
Maintenance Fee - Application - New Act	5	2007-08-09	$200.00	2007-06-19

Owners on Record

Note: Records show the ownership history in alphabetical order.

Current Owners on Record
NUVELO, INC.

Past Owners on Record
HYSEQ, INC.
MA, YUNQING
TANG, Y. TOM
WANG, ZHIWEI
WENG, GEZHI
YANG, YONGHONG

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

List of published and non-published patent-specific documents on the CPD.

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Abstract	2004-02-09	1	51
Claims	2004-02-09	4	123
Description	2004-02-09	315	13,841
Description	2004-02-09	255	15,218
Cover Page	2004-05-17	1	26
Assignment	2004-02-09	2	93
PCT	2004-02-09	2	93
Prosecution-Amendment	2004-02-09	1	18
Correspondence	2004-05-19	1	24
Prosecution-Amendment	2004-04-15	1	40
Assignment	2005-03-11	13	582
Correspondence	2005-04-07	1	13

Biological Sequence Listings

Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.

BSL Files

File Name	Received On	Size (bytes)
#50155-4.TXT	2004-04-15	5,957,163
#50155-4.PEP	2004-04-15	483,729
#50155-4.SEQ	2004-04-15	1,399,375
