Patent 2456955 Summary

(12) Patent Application: (11) CA 2456955
(54) English Title: NOVEL NUCLEIC ACIDS AND SECRETED POLYPEPTIDES
(54) French Title: NOUVEAUX ACIDES NUCLEIQUES ET POLYPEPTIDES SECRETES
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/12 (2006.01)
  • A61K 38/17 (2006.01)
  • C07K 14/435 (2006.01)
  • C07K 14/47 (2006.01)
  • C07K 16/18 (2006.01)
  • C12N 15/63 (2006.01)
  • C12P 21/02 (2006.01)
  • C12Q 1/68 (2006.01)
  • G01N 33/53 (2006.01)
(72) Inventors :
  • TANG, Y. TOM (United States of America)
  • YANG, YONGHONG (United States of America)
  • WANG, ZHIWEI (United States of America)
  • WENG, GEZHI (United States of America)
  • MA, YUNQING (United States of America)
(73) Owners :
  • NUVELO, INC. (United States of America)
(71) Applicants :
  • NUVELO, INC. (United States of America)
(74) Agent: SMART & BIGGAR
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2002-08-09
(87) Open to Public Inspection: 2003-10-02
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2002/025485
(87) International Publication Number: WO2003/080795
(85) National Entry: 2004-02-09

(30) Application Priority Data:
Application No. Country/Territory Date
60/311,261 United States of America 2001-08-09

Abstracts

English Abstract




The present invention provides novel nucleic acids, novel polypeptide
sequences encoded by these nucleic acids and uses thereof.


French Abstract

L'invention porte sur de nouveaux acides nucléiques, de nouvelles séquences de polypeptide codées par ces acides nucléiques et sur leurs utilisations correspondantes

Claims

Note: Claims are shown in the official language in which they were submitted.




WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1041.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein
said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent
hybridization conditions.

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein
said polynucleotide has greater than about 99% sequence identity with the
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1
operatively associated with a regulatory sequence that modulates expression of the
polynucleotide in the host cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of:
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and
(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-1041.



11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a complex with the
polynucleotide of claim 1 for a period sufficient to form the complex; and
b) detecting the complex, so that if a complex is detected, the polynucleotide of
claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising:
a) contacting the sample under stringent hybridization conditions with nucleic acid
primers that anneal to the polynucleotide of claim 1 under such conditions;
b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and
c) detecting said product and thereby the polynucleotide of claim 1 in the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the
method further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:
a) contacting the sample with a compound that binds to and forms a complex with the
polypeptide under conditions and for a period sufficient to form the complex; and
b) detecting formation of the complex, so that if a complex formation is detected, the
polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:
a) contacting the compound with the polypeptide of claim 10 under conditions
sufficient to form a polypeptide/compound complex; and
b) detecting the complex, so that if the polypeptide/compound complex is detected, a
compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:
a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the complex
drives expression of a reporter gene sequence in the cell; and
b) detecting the complex by detecting reporter gene sequence expression, so that if the
polypeptide/compound complex is detected, a compound that binds to the polypeptide of
claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising,
a) culturing a host cell comprising a polynucleotide sequence selected from the group
consisting of any of the polynucleotides from SEQ ID NO: 1-1041, under conditions
sufficient to express the polypeptide in said cell; and
b) isolating the polypeptide from the cell culture or cells of step (a).

20. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of any one of the polypeptides SEQ ID NO: 1042-2082.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array.

22. A collection of polynucleotides, wherein the collection comprising of at least one of
SEQ ID NO: 1-1041.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array.

24. The collection of claim 23, wherein the array detects full-matches to any one of the
polynucleotides in the collection.




25. The collection of claim 23, wherein the array detects mismatches to any one of the
polynucleotides in the collection.

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.

Description

Note: Descriptions are shown in the official language in which they were submitted.





JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 253
NOTE: For additional volumes, please contact the Canadian Patent Office


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
NOVEL NUCLEIC ACIDS AND SECRETED
POLYPEPTIDES
1. CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation-in-part application of U.S. Application
Serial No.
09/552,317 filed April 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries",
Attorney Docket No. 784CIP, which in turn is a continuation-in-part
application of U.S.
Application Serial No. 09/488,725 filed January 21, 2000 entitled "Novel
Contigs Obtained
from Various Libraries", Attorney Docket No. 784; U.S. Application Serial No.
09/491,404
filed January 25, 2000 entitled "Novel Contigs Obtained from Various
Libraries", Attorney
Docket No. 785; U.S. Application Serial No. 09/560,875 filed April 27, 2000
entitled "Novel
Contigs Obtained from Various Libraries", Attorney Docket No. 787CIP, which in
turn is a
continuation-in-part application of U.S. Application Serial No. 09/496,914
filed February 03,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 787;
U.S. Application Serial No. 09/577,409 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 788CIP, which in turn
is a
continuation-in-part application of U.S. Application Serial No. 09/515,126
filed February 28,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 788;
U.S. Application Serial No. 09/574,454 filed May 19, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 789CIP which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/519,705
filed March 07,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 789;
U.S. Application Serial No. 09/649,167 filed August 23, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 790CIP, which in turn is
a
continuation-in-part application of U.S. Application Serial No. 09/540,217
filed March 31,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 790;
U.S. Application Serial No. 09/770,160 filed January 26, 2001 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 791CIP, which is in turn
a
continuation-in-part application of U.S. Application Serial No. 09/552,929
filed April 18,
2000 entitled "Novel Contigs Obtained from Various Libraries", Attorney Docket
No. 791;
and U.S. Application Serial No. 09/577,408 filed May 18, 2000 entitled "Novel
Contigs
Obtained from Various Libraries", Attorney Docket No. 792; all of which are
incorporated
herein by reference in their entirety.


2. BACKGROUND OF THE INVENTION
2.1 TECHNICAL FIELD
The present invention provides novel polynucleotides and proteins encoded by
such
polynucleotides, along with uses for these polynucleotides and proteins, for
example in
therapeutic, diagnostic and research methods.
2.2 BACKGROUND
Technology aimed at the discovery of protein factors (including e.g.,
cytokines, such
as lymphokines, interferons, circulating soluble factors, chemokines, and
interleukins) has
matured rapidly over the past decade. The now routine hybridization cloning
and expression
cloning techniques clone novel polynucleotides "directly" in the sense that
they rely on
information directly related to the discovered protein (i.e., partial
DNA/amino acid sequence
of the protein in the case of hybridization cloning; activity of the protein
in the case of
expression cloning). More recent "indirect" cloning techniques such as signal
sequence
cloning, which isolates DNA sequences based on the presence of a now well-
recognized
secretory leader sequence motif, as well as various PCR-based or low
stringency
hybridization-based cloning techniques, have advanced the state of the art by
making
available large numbers of DNA/amino acid sequences for proteins that are
known to have
biological activity, for example, by virtue of their secreted nature in the
case of leader
sequence cloning, by virtue of their cell or tissue source in the case of PCR-
based
techniques, or by virtue of structural similarity to other genes of known
biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications
in,
for example, diagnostics, forensics, gene mapping, identification of mutations
responsible
for genetic disorders or other traits, to assess biodiversity, and to produce
many other types
of data and products dependent on DNA and amino acid sequences.
3. SUMMARY OF THE INVENTION
The compositions of the present invention include novel isolated polypeptides,
novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules,
cloned genes or degenerate variants thereof, especially naturally occurring
variants such as
allelic variants, antisense polynucleotide molecules, and antibodies that specifically
recognize one or more epitopes present on such polypeptides, as well as hybridomas
producing such antibodies.
The compositions of the present invention additionally include vectors,
including
expression vectors, containing the polynucleotides of the invention, cells
genetically engineered
to contain such polynucleotides and cells genetically engineered to express
such
polynucleotides.
The present invention relates to a collection or library of at least one novel
nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more
public
databases. The invention relates also to the proteins encoded by such
polynucleotides, along
with therapeutic, diagnostic and research utilities for these polynucleotides
and proteins. These
nucleic acid sequences are designated as SEQ ID NO: 1-1041, or 2083-2534 and
are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence
Listing, A is adenine; C
is cytosine; G is guanine; T is thymine; and N is any of the four bases or
unknown. In the
amino acids provided in the Sequence Listing, * corresponds to the stop codon.
The nucleic acid sequences of the present invention also include, nucleic acid
sequences
that hybridize to the complement of SEQ ID NO: 1-1041, or 2083-2534 under
stringent
hybridization conditions; nucleic acid sequences which are allelic variants or
species
homologues of any of the nucleic acid sequences recited above, or nucleic acid
sequences that
encode a peptide comprising a specific domain or truncation of the peptides
encoded by SEQ
ID NO: 1-1041, or 2083-2534. A polynucleotide comprising a nucleotide sequence
having at
least 90% identity to an identifying sequence of SEQ ID NO: 1-1041, or 2083-2534 or a
degenerate variant or fragment thereof. The identifying sequence can be 100
base pairs in
length.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-
2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or 2083-
2534 that
uniquely identifies or represents the sequence information of SEQ ID NO: 1-
1041, or 2083-
2534.
A collection as used in this application can be a collection of only one
polynucleotide.
The collection of sequence information or identifying information of each
sequence can be
provided on a nucleic acid array. In one embodiment, segments of sequence
information are
provided on a nucleic acid array to detect the polynucleotide that contains
the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.
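
The distinction between a full-match and a mismatch array readout can be illustrated with a short sketch. It is not the array design of the invention: the function name `classify_probe`, the `max_mismatches` cutoff, and the choice to compare a probe segment directly against same-strand windows of a target sequence (rather than modeling hybridization to the complement) are all assumptions made for illustration.

```python
def classify_probe(probe: str, target: str, max_mismatches: int = 1) -> str:
    """Classify a probe segment against a target sequence.

    Hypothetical helper: slides the probe along the target, counts
    differing positions in each same-length window, and reports the
    best window as a full match, a mismatch, or no match.
    """
    probe, target = probe.upper(), target.upper()
    k = len(probe)
    if k == 0 or k > len(target):
        return "no match"
    # Smallest number of differing positions over all windows of length k.
    best = min(
        sum(a != b for a, b in zip(probe, target[i:i + k]))
        for i in range(len(target) - k + 1)
    )
    if best == 0:
        return "full match"
    if best <= max_mismatches:
        return "mismatch"
    return "no match"


if __name__ == "__main__":
    target = "TTACGTTT"
    for probe in ("ACGT", "ACGA", "GGGG"):
        print(probe, classify_probe(probe, target))
```

The cutoff of one mismatch is arbitrary; a real array would rely on hybridization conditions rather than a simple position count.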
This invention also includes the reverse or direct complement of any of the
nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic
acid sequences;
and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences
(or their reverse or direct complements) according to the invention have
numerous applications
in a variety of techniques known to those skilled in the art of molecular
biology, such as use as
hybridization probes, use as primers for PCR, use in an array, use in computer-
readable media,
use in sequencing full-length genes, use for chromosome and gene mapping, use
in the
recombinant production of protein, and use in the generation of anti-sense DNA
or RNA, their
chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-2534 or novel segments or parts of the
nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well
known in the art and exemplified by Vollrath et al., Science 258:52-59 (1992),
as expressed
sequence tags for physical mapping of the human genome.
The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:
1-1041, or 2083-2534; a polynucleotide comprising any of the full length protein coding
sequences of SEQ ID NO: 1-1041, or 2083-2534; and a polynucleotide comprising any of the
nucleotide sequences of the mature protein coding sequences of SEQ ID NO: 1-1041, or
2083-2534. The polynucleotides of the present invention also include, but are not limited
to, a polynucleotide that hybridizes under stringent hybridization conditions to (a) the
complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-1041, or
2083-2534; (b) a nucleotide sequence encoding any one of the amino acid sequences set
forth in SEQ ID NO: 1-1041, or 2083-2534; (c) a polynucleotide which is an allelic variant
of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides
comprising an amino acid sequence set forth in SEQ ID NO: 1042-2082, or 2535-2986, or
Tables 3, 5, 6, or 8.


The isolated polypeptides of the invention include, but are not limited to, a
polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing;
or the
corresponding full length or mature protein. Polypeptides of the invention
also include
polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having
a nucleotide sequence set forth in SEQ ID NO: 1-1041, or 2083-2534; or (b)
polynucleotides
that hybridize to the complement of the polynucleotides of (a) under stringent
hybridization
conditions. Biologically active variants of any of the polypeptide sequences
in the Sequence
Listing, and "substantial equivalents" thereof (e.g., with at least about 65%,
70%, 75%, 80%,
85%, 90%, 95%, 98% or 99% amino acid sequence identity) that preferably retain
biological
activity are also contemplated. The polypeptides of the invention may be
wholly or partially
chemically synthesized but are preferably produced by recombinant means using
the genetically
engineered cells (e.g. host cells) of the invention.
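
The percent-identity levels listed above for "substantial equivalents" can be made concrete with a small sketch. The text does not say how identity is to be computed, so the convention used here (identical residues divided by alignment columns, gaps counted as non-identical, input already aligned with `-` as the gap character) is an assumption, as are the names `percent_identity` and `highest_threshold_met`.

```python
# Identity levels quoted in the text for "substantial equivalents".
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences have a gap are
    ignored; any other column containing a gap counts as non-identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(a == b for a, b in columns if a != "-" and b != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest listed identity level reached, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    identity = percent_identity("MKTLLVAAGG", "MKTLLVAAGA")
    print(identity, highest_threshold_met(identity))
```

Alignment tools differ in how they choose the denominator, so a figure computed this way may not agree with one reported by a particular program.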
The invention also provides compositions comprising a polypeptide of the
invention.
Polypeptide compositions of the invention may further comprise an acceptable
carrier, such
as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a
polynucleotide of the invention.
The invention also relates to methods for producing a polypeptide of the
invention
comprising growing a culture of the host cells of the invention in a suitable
culture medium
under conditions permitting expression of the desired polypeptide, and
purifying the
polypeptide from the culture or from the host cells. Preferred embodiments
include those in
which the protein produced by such processes is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a
variety
of techniques known to those skilled in the art of molecular biology. These
techniques
include use as hybridization probes, use as oligomers, or primers, for PCR,
use for
chromosome and gene mapping, use in the recombinant production of protein, and
use in
generation of anti-sense DNA or RNA, their chemical analogs and the like. For
example,
when the expression of an mRNA is largely restricted to a particular cell or
tissue type,
polynucleotides of the invention can be used as hybridization probes to detect
the presence
of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.
The polypeptides according to the invention can be used in a variety of
conventional
procedures and methods that are currently applied to other proteins. For
example, a
polypeptide of the invention can be used to generate an antibody that
specifically binds the
polypeptide. Such antibodies, particularly monoclonal antibodies, are useful
for detecting or
quantitating the polypeptide in tissue. The polypeptides of the invention can
also be used as
molecular weight markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical
condition which comprises the step of administering to a mammalian subject a
therapeutically effective amount of a composition comprising a polypeptide of
the present
invention and a pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be
utilized,
for example, in methods for the prevention and/or treatment of disorders
involving aberrant
protein expression or biological activity.
The present invention further relates to methods for detecting the presence of
the
polynucleotides or polypeptides of the invention in a sample. Such methods
can, for
example, be utilized as part of prognostic and diagnostic evaluation of
disorders as recited
herein and for the identification of subjects exhibiting a predisposition to
such conditions.
The invention provides a method for detecting the polynucleotides of the
invention in a
sample, comprising contacting the sample with a compound that binds to and
forms a
complex with the polynucleotide of interest for a period sufficient to form
the complex and
under conditions sufficient to form a complex and detecting the complex such
that if a
complex is detected, the polynucleotide of interest is detected. The invention
also provides a
method for detecting the polypeptides of the invention in a sample comprising
contacting the
sample with a compound that binds to and forms a complex with the polypeptide
under
conditions and for a period sufficient to form the complex and detecting the
formation of the
complex such that if a complex is formed, the polypeptide is detected.
The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out
methods of the
invention. Furthermore, the invention provides methods for evaluating the
efficacy of drugs,
and monitoring the progress of patients, involved in clinical trials for the
treatment of
disorders as recited above.


The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the
polynucleotides and/or
polypeptides of the invention. Such methods can be utilized, for example, for
the
identification of compounds that can ameliorate symptoms of disorders as
recited herein.
Such methods can include, but are not limited to, assays for identifying
compounds and
other substances that interact with (e.g., bind to) the polypeptides of the
invention. The
invention provides a method for identifying a compound that binds to the
polypeptides of the
invention comprising contacting the compound with a polypeptide of the
invention in a cell
for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives
expression of a reporter gene sequence in the cell; and detecting the complex
by detecting
the reporter gene sequence expression such that if expression of the reporter
gene is detected
the compound that binds to a polypeptide of the invention is identified.
The methods of the invention also provide methods for treatment which involve
the
administration of the polynucleotides or polypeptides of the invention to
individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses
methods for
treating diseases or disorders as recited herein comprising administering
compounds and
other substances that modulate the overall activity of the target gene
products. Compounds
and other substances can affect such modulation either on the level of target
gene/protein
expression or target protein activity.
The polypeptides of the present invention and the polynucleotides encoding
them are
also useful for the same functions known to one of skill in the art as the
polypeptides and
polynucleotides to which they have homology (set forth in Table 2); for which
they have a
signature region (as set forth in Table 3); or for which they have homology to
a gene family
(as set forth in Table 4). If no homology is set forth for a sequence, then
the polypeptides
and polynucleotides of the present invention are useful for a variety of
applications, as
described herein, including use in arrays for detection.
4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS
It must be noted that as used herein and in the appended claims, the singular
forms
"a", "an" and "the" include plural references unless the context clearly
dictates otherwise.


The term "active" refers to those forms of the polypeptide which retain the
biologic
and/or immunologic activities of any naturally occurring polypeptide. According
to the
invention, the terms "biologically active" or "biological activity" refer to a
protein or peptide
having structural, regulatory or biochemical functions of a naturally
occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the
capability of
the natural, recombinant or synthetic polypeptide to induce a specific immune
response in
appropriate animals or cells and to bind with specific antibodies.
The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the
export of
secretory or enzymatic molecules as part of a normal or disease process.
The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to
the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only certain portion(s) of the nucleic
acids bind or it
may be "complete" such that total complementarity exists between the single
stranded
molecules. The degree of complementarity between the nucleic acid strands has
significant
effects on the efficiency and strength of the hybridization between the
nucleic acid strands.
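
The base-pairing rule in this definition can be checked with a short sketch. The helper names (`complement`, `reverse_complement`, `fraction_paired`) are illustrative assumptions and not part of the specification; sequences are written 5' to 3' unless noted, and N is treated as pairing with nothing.

```python
# Watson-Crick pairing for the DNA alphabet used in the Sequence Listing.
_PAIRS = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement(seq: str) -> str:
    """Base-by-base complement; the result reads 3' to 5' if seq reads 5' to 3'."""
    return seq.translate(_PAIRS)


def reverse_complement(seq: str) -> str:
    """Complementary strand written in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5' to 3', are aligned antiparallel.

    1.0 corresponds to "complete" complementarity in the sense of the text;
    a value between 0 and 1 corresponds to "partial" complementarity.
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    partner = reverse_complement(strand_b).upper()
    paired = sum(a == p and a != "N" for a, p in zip(strand_a.upper(), partner))
    return paired / len(strand_a)


if __name__ == "__main__":
    # 5'-AGT-3' pairs with 3'-TCA-5', which is 5'-ACT-3' written conventionally.
    print(complement("AGT"), reverse_complement("AGT"))
    print(fraction_paired("AGT", "ACT"), fraction_paired("AGT", "ACC"))
```

This counts pairing at fixed positions only and says nothing about hybridization strength, which also depends on length, composition, and conditions.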
The term "embryonic stem cells (ES)" refers to a cell that can give rise to
many
differentiated cell types in an embryo or an adult, including the germ cells.
The term "germ
line stem cells (GSCs)" refers to stem cells derived from primordial stem
cells that provide a
steady and continuous source of germ cells for the production of gametes. The
term
"primordial germ cells (PGCs)" refers to a small population of cells set aside
from other cell
lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis
that have the potential to differentiate into germ cells and other cells. PGCs
are the source
from which GSCs and ES cells are derived. The PGCs, the GSCs and the ES cells are
capable of self renewal. Thus these cells not only populate the germ line and
give rise to a
plurality of terminally differentiated cells that comprise the adult
specialized organs, but are
able to regenerate themselves.
The term "expression modulating fragment," EMF, means a series of nucleotides
which modulates the expression of an operably linked ORF or another EMF.
As used herein, a sequence is said to "modulate the expression of an operably
linked
sequence" when the expression of the sequence is altered by the presence of
the EMF.
EMFs include, but are not limited to, promoters, and promoter modulating sequences
(inducible elements). One class of EMFs are nucleic acid fragments which induce the
expression of an operably linked ORF in response to a specific regulatory factor or
physiological event.
The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of
nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of
genomic or
synthetic origin which may be single-stranded or double-stranded and may
represent the
sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-
like or RNA-like
material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and
N is A, C, G, or T (U) or unknown. It is contemplated that where the
polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U
(uracil).
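
As a minimal sketch of the notation just described, the following checks that a sequence uses only the symbols A, C, G, T and N and rewrites it as RNA by substituting U for T. The function names are hypothetical, and rejecting every other character (including IUPAC ambiguity codes other than N) is an assumption made to keep the example small.

```python
# Symbols the text defines for nucleic acid sequences in the listing.
ALLOWED_DNA = frozenset("ACGTN")


def check_listing_sequence(seq: str) -> str:
    """Return the upper-cased sequence, or raise if it uses other symbols."""
    seq = seq.upper()
    unexpected = sorted(set(seq) - ALLOWED_DNA)
    if unexpected:
        raise ValueError(f"unexpected symbols: {''.join(unexpected)}")
    return seq


def to_rna(seq: str) -> str:
    """Write a DNA sequence as RNA: T (thymine) becomes U (uracil).

    N is left unchanged because it stands for any base or an unknown base.
    """
    return check_listing_sequence(seq).replace("T", "U")


if __name__ == "__main__":
    print(to_rna("atgNNtga"))
```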
Generally, nucleic acid segments provided by this invention may be assembled
from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic
nucleic acid which is
capable of being expressed in a recombinant transcriptional unit comprising
regulatory
elements derived from a microbial or viral operon, or a eukaryotic gene.
The terms "oligonucleotide fragment" or a "polynucleotide fragment",
"portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a
sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at
least about 7
nucleotides, more preferably at least about 9 nucleotides, more preferably at
least about 11
nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably
less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably
less than about 100 nucleotides, more preferably less than about 50
nucleotides and most
preferably less than 30 nucleotides. Preferably the probe is from about 6
nucleotides to
about 200 nucleotides, preferably from about 15 to about 50 nucleotides, more
preferably
from about 17 to 30 nucleotides and most preferably from about 20 to 25
nucleotides.
Preferably the fragments can be used in polymerase chain reaction (PCR),
various
hybridization procedures or microarray procedures to identify or amplify
identical or related
parts of mRNA or DNA molecules. A fragment or segment may uniquely identify
each
polynucleotide sequence of the present invention. Preferably the fragment
comprises a
sequence substantially similar to any one of SEQ ID NO: 1-1041, or 2083-2534.
Probes may, for example, be used to determine whether specific mRNA molecules
are present in a cell or tissue or to isolate similar nucleic acid sequences
from chromosomal


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl
1:241-250).
They may be labeled by nick translation, Klenow fill-in reaction, PCR, or
other methods
well known in the art. Probes of the present invention, their preparation
and/or labeling are
elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory
Manual, Cold
Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989, Current
Protocols in
Molecular Biology, John Wiley & Sons, New York NY, both of which are
incorporated
herein by reference in their entirety.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1041, or 2083-
2534. The
sequence information can be a segment of any one of SEQ ID NO: 1-1041, or
2083-2534
that uniquely identifies or represents the sequence information of that
sequence of SEQ ID
NO: 1-1041, or 2083-2534, or those segments identified in Tables 3, 5, 6, and
8. One such
segment can be a twenty-mer nucleic acid sequence because the probability that
a twenty-
mer is fully matched in the human genome is 1 in 300. In the human genome,
there are three
billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there
are 300 times more twenty-mers than there are base pairs in a set of human
chromosomes.
Using the same analysis, the probability for a seventeen-mer to be fully
matched in the
human genome is approximately 1 in 5. When these segments are used in arrays
for
expression studies, fifteen-mer segments can be used. The probability that the
fifteen-mer is
fully matched in the expressed sequences is also approximately one in five
because
expressed sequences comprise less than approximately 5% of the entire genome
sequence.
Similarly, when using sequence information for detecting a single mismatch, a
segment
can be a twenty-five mer. The probability that the twenty-five mer would
appear in a human
genome with a single mismatch is calculated by multiplying the probability for
a full match
(1/4^25) times the increased probability for mismatch at each nucleotide
position (3 x 25). The
probability that an eighteen mer with a single mismatch can be detected in an
array for
expression studies is approximately one in five. The probability that a twenty-
mer with a single
mismatch can be detected in a human genome is approximately one in five.
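The arithmetic in the two preceding paragraphs can be sketched in Python as follows. This is only an illustrative model: it assumes a random sequence with equally likely bases, treats the quoted "probability" as the expected number of chance matches over the target size, and takes 5% as the expressed fraction. All function names are hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # expressed share of the genome (upper bound stated in the text)

def full_match_expectation(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance exact matches of one k-mer among target_bp positions."""
    return target_bp / 4 ** k

def single_mismatch_expectation(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected chance matches with one mismatch: (1/4^k) x (3 x k) per position."""
    return target_bp * (3 * k) / 4 ** k

def one_in(expectation: float) -> float:
    """Express an expectation p as the N in '1 in N'."""
    return 1.0 / expectation

if __name__ == "__main__":
    expressed_bp = GENOME_BP * EXPRESSED_FRACTION
    cases = {
        "20-mer, full match, genome": full_match_expectation(20),
        "17-mer, full match, genome": full_match_expectation(17),
        "15-mer, full match, expressed": full_match_expectation(15, expressed_bp),
        "25-mer, single mismatch, genome": single_mismatch_expectation(25),
        "18-mer, single mismatch, expressed": single_mismatch_expectation(18, expressed_bp),
        "20-mer, single mismatch, genome": single_mismatch_expectation(20),
    }
    for label, value in cases.items():
        print(f"{label}: about 1 in {one_in(value):,.1f}")
```

Under these assumptions the results fall in the same range as the rounded figures quoted in the text; for example, the twenty-mer case comes out at roughly 1 in 370 against the stated 1 in 300.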
The term "open reading frame," ORF, means a series of nucleotide triplets
coding for
amino acids without any termination codons and is a sequence translatable into
protein.
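The definition of an open reading frame above can be expressed as a simple check. The sketch below assumes the standard genetic code stop codons (TAA, TAG, TGA) and a sequence already positioned in the intended reading frame; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons of the standard genetic code

def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets containing no termination codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    if not seq or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)

if __name__ == "__main__":
    print(is_open_reading_frame("ATGGCCAAA"))
    print(is_open_reading_frame("ATGTAAGCC"))
```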
The terms "operably linked" or "operably associated" refer to functionally
related
nucleic acid sequences. For example, a promoter is operably associated or
operably linked
with a coding sequence if the promoter controls the transcription of the
coding sequence.
While operably linked nucleic acid sequences can be contiguous and in the same
reading
frame, certain genetic elements e.g. repressor genes are not contiguously
linked to the coding
sequence but still control transcription/translation of the coding sequence.
The term "pluripotent" refers to the capability of a cell to differentiate
into a number
of differentiated cell types that are present in an adult organism. A
pluripotent cell is
restricted in its differentiation capability in comparison to a totipotent
cell.
The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and
to naturally
occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a
stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7
amino acids, more preferably at least about 9 amino acids and most preferably
at least about
17 or more amino acids. The peptide preferably is not greater than about 200
amino acids,
more preferably less than 150 amino acids and most preferably less than 100
amino acids.
Preferably the peptide is from about 5 to about 200 amino acids. To be active,
any
polypeptide must have sufficient length to display biological and/or
immunological activity.
The term "naturally occurring polypeptide" refers to polypeptides produced by
cells
that have not been genetically engineered and specifically contemplates
various polypeptides
arising from post-translational modifications of the polypeptide including,
but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and
acylation.
The term "translated protein coding portion" means a sequence which encodes
for the
full-length protein which may include any leader sequence or any processing
sequence.
The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein
portion" means
that portion of the protein which does not include a signal or leader
sequence. The peptide
may have been produced by processing in the cell which removes any
leader/signal
sequence. The mature protein portion may or may not include the initial
methionine residue.
The methionine residue may be removed from the protein during processing in
the cell. The
peptide may be produced synthetically or the protein may have been produced
using a
polynucleotide only encoding for the mature protein coding sequence.
The term "derivative" refers to polypeptides chemically modified by such
techniques
as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and
insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do
not normally
occur in human proteins.
The term "variant" (or "analog") refers to any polypeptide differing from
naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions,
created using,
e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues
may be replaced, added or deleted without abolishing activities of interest,
may be found by
comparing the sequence of the particular polypeptide with that of homologous
peptides and
minimizing the number of amino acid sequence changes made in regions of high
homology
(conserved regions) or by replacing amino acids with consensus sequence.
Alternatively, recombinant variants encoding these same or similar
polypeptides may
be synthesized or selected by making use of the "redundancy" in the genetic
code. Various
codon substitutions, such as the silent changes which produce various
restriction sites, may
be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may
be
reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify
the properties of any part of the polypeptide, to change characteristics such
as ligand-binding
affinities, interchain affinities, or degradation/turnover rate.
Preferably, amino acid "substitutions" are the result of replacing one amino
acid with
another amino acid having similar structural and/or chemical properties,
i.e., conservative
amino acid replacements. "Conservative" amino acid substitutions may be made
on the
basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the
amphipathic nature of the residues involved. For example, nonpolar
(hydrophobic) amino
acids include alanine, leucine, isoleucine, valine, proline, phenylalanine,
tryptophan, and
methionine; polar neutral amino acids include glycine, serine, threonine,
cysteine, tyrosine,
asparagine, and glutamine; positively charged (basic) amino acids include
arginine, lysine,
and histidine; and negatively charged (acidic) amino acids include aspartic
acid and glutamic
acid. "Insertions" or "deletions" are preferably in the range of about 1 to 20
amino acids,
more preferably 1 to 10 amino acids. The variation allowed may be
experimentally
determined by systematically making insertions, deletions, or substitutions of
amino acids in
a polypeptide molecule using recombinant DNA techniques and assaying the
resulting
recombinant variants for activity.
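The four residue groupings listed above can be captured in a small lookup. The Python sketch below treats a substitution as "conservative" when both residues fall in the same listed group; this is a simplification for illustration only, and the names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical.

```python
# One-letter codes for the groups named in the text.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def residue_class(aa: str) -> str:
    """Return the name of the group containing the one-letter residue aa."""
    for name, members in RESIDUE_CLASSES.items():
        if aa.upper() in members:
            return name
    raise ValueError(f"not one of the twenty listed amino acids: {aa!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same listed group."""
    return residue_class(original) == residue_class(replacement)

if __name__ == "__main__":
    print(is_conservative("L", "I"))
    print(is_conservative("K", "E"))
```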
Alternatively, where alteration of function is desired, insertions, deletions
or
non-conservative alterations can be engineered to produce altered
polypeptides. Such
alterations can, for example, alter one or more of the biological functions or
biochemical
characteristics of the polypeptides of the invention. For example, such
alterations may
change polypeptide characteristics such as ligand-binding affinities,
interchain affinities, or
degradation/turnover rate. Further, such alterations can be selected so as to
generate
polypeptides that are better suited for expression, scale up and the like in
the host cells
chosen for expression. For example, cysteine residues can be deleted or
substituted with
another amino acid residue in order to eliminate disulfide bridges.
The terms "purified" or "substantially purified" as used herein denote that
the
indicated nucleic acid or polypeptide is present in the substantial absence of
other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight,
more preferably at least 99% by weight, of the indicated biological
macromolecules present
(but water, buffers, and other small molecules, especially molecules having a
molecular
weight of less than 1000 daltons, can be present).
The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated
from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic
acid or polypeptide in its natural source. In one embodiment, the nucleic acid
or polypeptide
is found in the presence of (if anything) only a solvent, buffer, ion, or
other component
normally present in a solution of the same. The terms "isolated" and
"purified" do not
encompass nucleic acids or polypeptides present in their natural source.
The term "recombinant," when used herein to refer to a polypeptide or protein,
means
that a polypeptide or protein is derived from recombinant (e.g., microbial,
insect, or
mammalian) expression systems. "Microbial" refers to recombinant polypeptides
or proteins
made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant
microbial" defines a polypeptide or protein essentially free of native
endogenous substances
and unaccompanied by associated native glycosylation. Polypeptides or proteins
expressed
in most bacterial cultures, e.g., E. coli, will be free of glycosylation
modifications;
polypeptides or proteins expressed in yeast will have a glycosylation pattern
in general
different from those expressed in mammalian cells.
The term "recombinant expression vehicle or vector" refers to a plasmid or
phage or
virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression
vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element
or elements having a regulatory role in gene expression, for example,
promoters or
enhancers, (2) a structural or coding sequence which is transcribed into mRNA
and
translated into protein, and (3) appropriate transcription initiation and
termination sequences.
Structural units intended for use in yeast or eukaryotic expression systems
preferably include
a leader sequence enabling extracellular secretion of translated protein by a
host cell.
Alternatively, where recombinant protein is expressed without a leader or
transport
sequence, it may include an amino terminal methionine residue. This residue
may or may
not be subsequently cleaved from the expressed recombinant protein to provide
a final
product.
The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry
the
recombinant transcriptional unit extrachromosomally. Recombinant expression
systems as
defined herein will express heterologous polypeptides or proteins upon
induction of the
regulatory elements linked to the DNA segment or synthetic gene to be
expressed. This term
also means host cells which have stably integrated a recombinant genetic
element or
elements having a regulatory role in gene expression, for example, promoters
or enhancers.
Recombinant expression systems as defined herein will express polypeptides or
proteins
endogenous to the cell upon induction of the regulatory elements linked to the
endogenous
DNA segment or gene to be expressed. The cells can be prokaryotic or
eukaryotic.
The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino
acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include
without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g.,
receptors) from the cell in
which they are expressed. "Secreted" proteins also include without limitation
proteins that
are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are
also intended to include proteins containing non-typical signal sequences
(e.g. Interleukin-1
Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and
factors
released from damaged cells (e.g. Interleukin-1 Receptor Antagonist, see
Arend, W.P. et al.
(1995) Annu. Rev. Immunol. 16:27-55).
Where desired, an expression vector may be designed to contain a "signal or
leader
sequence" which will direct the polypeptide through the membrane of a cell.
Such a
sequence may be naturally present on the polypeptides of the present invention
or provided
from heterologous protein sources by recombinant DNA techniques.
The term "stringent" is used to refer to conditions that are commonly
understood in
the art as stringent. Stringent conditions can include highly stringent
conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate
(SDS), 1
mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and
moderately stringent
conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other
exemplary hybridization
conditions are described herein in the examples.
In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate
at 37°C (for 14-base oligonucleotides), 48°C (for 17-base
oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
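For convenience, the exemplary oligonucleotide wash temperatures above can be held in a lookup table. The sketch below deliberately returns a value only for the four lengths stated in the text and does not interpolate for other lengths; the names are hypothetical.

```python
# Wash temperature (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}

def wash_temperature_c(length: int) -> int:
    """Return the listed wash temperature for an oligonucleotide of this length."""
    if length not in OLIGO_WASH_TEMP_C:
        raise ValueError(f"no exemplary condition listed for {length}-base oligonucleotides")
    return OLIGO_WASH_TEMP_C[length]

if __name__ == "__main__":
    for n in sorted(OLIGO_WASH_TEMP_C):
        print(n, wash_temperature_c(n))
```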
As used herein, "substantially equivalent" or "substantially similar" can
refer both to
nucleotide and amino acid sequences, for example a mutant sequence, that
varies from a
reference sequence by one or more substitutions, deletions, or additions, the
net effect of
which does not result in an adverse functional dissimilarity between the
reference and
subject sequences. Typically, such a substantially equivalent sequence
varies from one of
those listed herein by no more than about 35% (i.e., the number of individual
residue
substitutions, additions, and/or deletions in a substantially equivalent
sequence, as compared
to the corresponding reference sequence, divided by the total number of
residues in the
substantially equivalent sequence is about 0.35 or less). Such a sequence is
said to have
65% sequence identity to the listed sequence. In one embodiment, a
substantially
equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more
than 30% (70% sequence identity); in a variation of this embodiment, by no
more than 25%
(75% sequence identity); and in a further variation of this embodiment, by no
more than
20% (80% sequence identity) and in a further variation of this embodiment, by
no more than
10% (90% sequence identity) and in a further variation of this embodiment, by
no more than
5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences
according to the invention preferably have at least 80% sequence identity with
a listed amino
acid sequence, more preferably at least 85% sequence identity, more preferably
at least 90%
sequence identity, more preferably at least 95% sequence identity, more
preferably at least
98% sequence identity, and most preferably at least 99% sequence identity.
Substantially
equivalent nucleotide sequence of the invention can have lower percent
sequence identities,
taking into account, for example, the redundancy or degeneracy of the genetic
code.
Preferably, the nucleotide sequence has at least about 65% identity, more
preferably at least
about 75% identity, more preferably at least about 80% sequence identity, more
preferably at
least 85% sequence identity, more preferably at least 90% sequence identity,
more preferably
at least about 95% sequence identity, more preferably at least 98% sequence
identity, and
most preferably at least 99% sequence identity. For the purposes of the
present invention,
sequences having substantially equivalent biological activity and
substantially equivalent
expression characteristics are considered substantially equivalent. For the
purposes of
determining equivalence, truncation of the mature sequence (e.g., via a
mutation which
creates a new stop codon) should be disregarded. Sequence identity may be
determined,
e.g., using the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-
645).
Identity between sequences can also be determined by other methods known in
the art, e.g.
by varying hybridization conditions.
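The percent-variation measure defined above (differences divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. The example assumes the two sequences have already been aligned by some external method, with "-" marking gaps; it does not implement the Jotun Hein method, and the function names are hypothetical.

```python
def variation_fraction(reference_aln: str, subject_aln: str) -> float:
    """Substitutions, additions and deletions divided by residues in the subject sequence.

    Both arguments are rows of one pairwise alignment (equal length, '-' for gaps).
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return differences / subject_residues

def is_substantially_equivalent(reference_aln: str, subject_aln: str,
                                max_variation: float = 0.35) -> bool:
    """Apply the 'no more than about 35%' threshold (65% sequence identity)."""
    return variation_fraction(reference_aln, subject_aln) <= max_variation

if __name__ == "__main__":
    ref, sub = "ACGTACGTAC", "ACGTTCGTAC"
    v = variation_fraction(ref, sub)
    print(f"variation {v:.0%}, identity {1 - v:.0%}")
```

Passing a smaller max_variation (for example 0.05) gives the stricter embodiments listed in the text.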
The term "totipotent" refers to the capability of a cell to differentiate into
all of the
cell types of an adult organism.
The term "transformation" means introducing DNA into a suitable host cell so
that
the DNA is replicable, either as an extrachromosomal element, or by
chromosomal
integration. The term "transfection" refers to the taking up of an expression
vector by a
suitable host cell, whether or not any coding sequences are in fact expressed.
The term
"infection" refers to the introduction of nucleic acids into a suitable host
cell by use of a
virus or viral vector.
As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell.
UMFs can be
readily identified using known UMFs as a target sequence or target motif with
the
computer-based systems described below. The presence and activity of a UMF can
be
confirmed by attaching the suspected UMF to a marker sequence. The resulting
nucleic acid
molecule is then incubated with an appropriate host under appropriate
conditions and the
uptake of the marker sequence is determined. As described above, a UMF will
increase the
frequency of uptake of a linked marker sequence.
Each of the above terms is meant to encompass all that is described for each,
unless
the context dictates otherwise.
4.2 NUCLEIC ACIDS OF THE INVENTION
Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide
comprising
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534; a polynucleotide
encoding
any one of the peptide sequences of SEQ ID NO: 1-1041, or 2083-2534; and a
polynucleotide comprising the nucleotide sequence encoding the mature protein
coding
sequence of the polynucleotides of any one of SEQ ID NO: 1-1041, or 2083-2534.
The
polynucleotides of the present invention also include, but are not limited to,
a polynucleotide
that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534; (b) nucleotide sequences encoding
any one
of the amino acid sequences set forth in the Sequence Listing, or Table 8; (c)
a
polynucleotide which is an allelic variant of any polynucleotide recited
above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited
above; or (e)
a polynucleotide that encodes a polypeptide comprising a specific domain or
truncation of
the polypeptides of SEQ ID NO: 1042-2082, or 2535-2986 (for example, as set
forth in
Tables 3, 5, 6, or 8). Domains of interest may depend on the nature of the
encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-
binding,
extracellular, transmembrane, or cytoplasmic domains, or combinations thereof;
domains in
immunoglobulin-like proteins include the variable immunoglobulin-like
domains; domains
in enzyme-like polypeptides include catalytic and substrate binding domains;
and domains in
ligand polypeptides include receptor-binding domains.
The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include entire coding region of the cDNA or may represent
a portion of
the coding region of the cDNA.
The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with
known methods
using the sequence information disclosed herein. Such methods include the
preparation of
probes or primers from the disclosed sequence information for identification
and/or
amplification of genes in appropriate genomic libraries or other sources of
genomic materials.
Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full
length cDNA or genomic DNA that corresponds to any of the polynucleotides of
SEQ ID NO:
1-1041, or 2083-2534 can be obtained by screening appropriate cDNA or genomic
DNA
libraries under suitable hybridization conditions using any of the
polynucleotides of SEQ ID
NO: 1-1041, or 2083-2534 or a portion thereof as a probe. Alternatively, the
polynucleotides of
SEQ ID NO: 1-1041, or 2083-2534 may be used as the basis for suitable primer(s)
that allow
identification and/or amplification of genes in appropriate genomic DNA or
cDNA libraries.
The nucleic acid sequences of the invention can be assembled from ESTs and
sequences
(including cDNA and genomic sequences) obtained from one or more public
databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence
information, representative fragment or segment information, or novel segment
information for
the full-length gene.
The polynucleotides of the invention also provide polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides
recited above.
Polynucleotides according to the invention can have, e.g., at least about 65%,
at least about
70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least
about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%, 91%, 92%,
93%, 94%,
and even more typically at least about 95%, 96%, 97%, 98%, 99% sequence
identity to a
polynucleotide recited above.
Included within the scope of the nucleic acid sequences of the invention are
nucleic
acid sequence fragments that hybridize under stringent conditions to any of
the nucleotide
sequences of SEQ ID NO: 1-1041, or 2083-2534, or complements thereof, which
fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably
greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of,
e.g. 15, 17, or 20
nucleotides or more that are selective for (i.e. specifically hybridize to)
any one of the
polynucleotides of the invention are contemplated. Probes capable of
specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of
the invention
from other polynucleotide sequences in the same family of genes or can
differentiate human
genes from genes of other species, and are preferably based on unique
nucleotide sequences.
The sequences falling within the scope of the present invention are not
limited to these
specific sequences, but also include allelic and species variations thereof.
Allelic and species
variations can be routinely determined by comparing the sequence provided in
SEQ ID NO: 1-
1041, or 2083-2534, a representative fragment thereof, or a nucleotide
sequence at least 90%
identical, preferably 95% identical, to SEQ ID NO: 1-1041, or 2083-2534 with a
sequence from
another isolate of the same species. Furthermore, to accommodate codon
variability, the
invention includes nucleic acid molecules coding for the same amino acid
sequences as do the
specific ORFs disclosed herein. In other words, in the coding region of an
ORF, substitution of
one codon for another codon that encodes the same amino acid is expressly
contemplated.
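Whether two coding sequences differ only by such synonymous codon substitutions can be checked by translating both and comparing the results. The sketch below assumes the standard genetic code and in-frame DNA input; the compact table construction and the function names are illustrative choices, not part of the disclosure.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(orf: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a termination codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))

def encodes_same_protein(orf_a: str, orf_b: str) -> bool:
    """True if the two coding sequences specify the same amino acid sequence."""
    return translate(orf_a) == translate(orf_b)

if __name__ == "__main__":
    # CTT -> CTG and AAA -> AAG are both synonymous changes (Leu, Lys).
    print(encodes_same_protein("ATGCTTAAA", "ATGCTGAAG"))
```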


The nearest neighbor or homology results for the nucleic acids of the present
invention,
including SEQ ID NO: 1-1041, or 2083-2534 can be obtained by searching a
database using an
algorithm or a program. Preferably, a BLAST (Basic Local Alignment Search
Tool) program is
used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and
Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively a FASTA
version 3 search
against Genpept, using FASTXY algorithm may be performed.
Species homologs (or orthologs) of the disclosed polynucleotides and proteins
are
also provided by the present invention. Species homologs may be isolated and
identified by
making suitable probes or primers from the sequences provided herein and
screening a
suitable nucleic acid source from the desired species.
The invention also encompasses allelic variants of the disclosed
polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated
polynucleotide which
also encode proteins which are identical, homologous or related to that
encoded by the
polynucleotides.
The nucleic acid sequences of the invention are further directed to sequences
which
encode variants of the described nucleic acids. These amino acid sequence
variants may be
prepared by methods known in the art by introducing appropriate nucleotide
changes into a
native or variant polynucleotide. There are two variables in the construction
of amino acid
sequence variants: the location of the mutation and the nature of the
mutation. Nucleic
acids encoding the amino acid sequence variants are preferably constructed by
mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature.
These
nucleic acid alterations can be made at sites that differ in the nucleic acids
from different
species (variable positions) or in highly conserved regions (constant
regions). Sites at such
locations will typically be modified in series, e.g., by substituting first
with conservative
choices (e.g., hydrophobic amino acid to a different hydrophobic amino acid)
and then with
more distant choices (e.g., hydrophobic amino acid to a charged amino acid),
and then
deletions or insertions may be made at the target site. Amino acid sequence
deletions
generally range from about 1 to 30 residues, preferably about 1 to 10
residues, and are
typically contiguous. Amino acid insertions include amino- and/or carboxyl-
terminal
fusions ranging in length from one to one hundred or more residues, as well as
intrasequence
insertions of single or multiple amino acid residues. Intrasequence insertions
may range
generally from about 1 to 10 amino residues, preferably from 1 to 5 residues.
Examples of
terminal insertions include the heterologous signal sequences necessary for
secretion or for
intracellular targeting in different host cells and sequences such as FLAG or
poly-histidine
sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences
are
changed via site-directed mutagenesis. This method uses oligonucleotide
sequences to alter
a polynucleotide to encode the desired amino acid variant, as well as
sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of
the site being changed. In general, the techniques of site-directed
mutagenesis are well
known to those of skill in the art and this technique is exemplified by
publications such as,
Edelman et al., DNA 2:183 (1983). A versatile and efficient method for
producing
site-specific changes in a polynucleotide sequence was published by Zoller
and Smith,
Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to create amino
acid
sequence variants of the novel nucleic acids. When small amounts of template
DNA are
used as starting material, primer(s) that differ slightly in sequence from the
corresponding
region in the template DNA can generate the desired amino acid variant. PCR
amplification
results in a population of product DNA fragments that differ from the
polynucleotide
template encoding the polypeptide at the position specified by the primer. The
product DNA
fragments replace the corresponding region in the plasmid and this gives a
polynucleotide
encoding the desired amino acid variant.
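The oligonucleotide design idea in the paragraph above (a changed codon flanked by enough unchanged template sequence to form a stable duplex on either side) can be sketched as follows. This is an illustration under stated assumptions only: the template is taken to be the coding strand in frame from its first base, the default flank of 12 nucleotides is an arbitrary placeholder rather than a recommended value, and the function name is hypothetical.

```python
def mutagenic_primer(template: str, codon_index: int, new_codon: str,
                     flank: int = 12) -> str:
    """Return an oligonucleotide carrying new_codon at the 0-based codon_index,
    flanked on each side by `flank` unchanged template nucleotides."""
    template = template.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking template sequence on both sides")
    return template[start - flank:start] + new_codon + template[end:end + flank]

if __name__ == "__main__":
    # Hypothetical template: ATG GCT GCA GGT CTG AAA GAT CCG TTC GAA
    template = "ATGGCTGCAGGTCTGAAAGATCCGTTCGAA"
    # Replace codon 4 (CTG, Leu) with CCG (Pro), keeping 6 nt on each side.
    print(mutagenic_primer(template, 4, "CCG", flank=6))
```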
A further technique for generating amino acid variants is the cassette
mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other
mutagenesis techniques
well known in the art, such as, for example, the techniques in Sambrook et
al., supra, and
Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of
the genetic code, other DNA sequences which encode substantially the same or a
functionally equivalent amino acid sequence may be used in the practice of the
invention for
the cloning and expression of these novel nucleic acids. Such DNA sequences
include those
which are capable of hybridizing to the appropriate novel nucleic acid
sequence under
stringent conditions.
Polynucleotides encoding preferred polypeptide truncations of the invention
could be
used to generate polynucleotides encoding chimeric or fusion proteins
comprising one or
more domains of the invention and heterologous protein sequences.
The polynucleotides of the invention additionally include the complement of
any of
the polynucleotides recited above. The polynucleotide can be DNA (genomic,
cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include,
for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides
of the desired sequence identities.
In accordance with the invention, polynucleotide sequences comprising the
mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-1041, or 2083-
2534,
or functional equivalents thereof, may be used to generate recombinant DNA
molecules that
direct the expression of that nucleic acid, or a functional equivalent
thereof, in appropriate
host cells. Also included are the cDNA inserts of any of the clones identified
herein.
A polynucleotide according to the invention can be joined to any of a variety
of other
nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et
al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors,
e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and the like,
that are well
known in the art. Accordingly, the invention also provides a vector including
a
polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the
vector contains an origin of replication functional in at least one organism,
convenient
restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to
the invention include expression vectors, replication vectors, probe
generation vectors, and
sequencing vectors. A host cell according to the invention can be a
prokaryotic or
eukaryotic cell and can be a unicellular organism or part of a multicellular
organism.
The present invention further provides recombinant constructs comprising a
nucleic
acid having any of the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534
or a
fragment thereof or any other polynucleotides of the invention. In one
embodiment, the
recombinant constructs of the present invention comprise a vector, such as a
plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of
SEQ ID NO: 1-1041, or 2083-2534 or a fragment thereof is inserted, in a forward or reverse
orientation. In
the case of a vector comprising one of the ORFs of the present invention, the
vector may
further comprise regulatory sequences, including for example, a promoter,
operably linked to
the ORF. Large numbers of suitable vectors and promoters are known to those of
skill in the
art and are commercially available for generating the recombinant constructs
of the present
invention. The following vectors are provided by way of example: Bacterial:
pBs,
phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a
(Stratagene), pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia);
Eukaryotic:
pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, pSVL
(Pharmacia).
The isolated polynucleotide of the invention may be operably linked to an
expression
control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly.
Many suitable expression control sequences are known in the art. General
methods of
expressing recombinant proteins are also known and are exemplified in R.
Kaufman,
Methods in Enzymology 185, 537-566 (1990). As defined herein "operably linked"
means
that the isolated polynucleotide of the invention and an expression control
sequence are
situated within a vector or cell in such a way that the protein is expressed
by a host cell
which has been transformed (transfected) with the ligated
polynucleotide/expression control
sequence.
Promoter regions can be selected from any desired gene using CAT
(chloramphenicol acetyltransferase) vectors or other vectors with selectable
markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include
lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate
early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse
metallothionein-I. Selection of the appropriate vector and promoter is well
within the level
of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of
replication and selectable markers permitting transformation of the host cell,
e.g., the
ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived
from a highly expressed gene to direct transcription of a downstream
structural sequence.
Such promoters can be derived from operons encoding glycolytic enzymes such as
3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock
proteins, among
others. The heterologous structural sequence is assembled in appropriate phase
with
translation initiation and termination sequences, and preferably, a leader
sequence capable of
directing secretion of translated protein into the periplasmic space or
extracellular medium.
Optionally, the heterologous sequence can encode a fusion protein including an
amino
terminal identification peptide imparting desired characteristics, e.g.,
stabilization or
simplified purification of expressed recombinant product. Useful expression
vectors for
bacterial use are constructed by inserting a structural DNA sequence encoding
a desired
protein together with suitable translation initiation and termination signals
in operable
reading phase with a functional promoter. The vector will comprise one or more
phenotypic
selectable markers and an origin of replication to ensure maintenance of the
vector and to, if
desirable, provide amplification within the host. Suitable prokaryotic hosts
for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium
and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although
others may
also be employed as a matter of choice.
As a representative but non-limiting example, useful expression vectors for
bacterial
use can comprise a selectable marker and bacterial origin of replication
derived from
commercially available plasmids comprising genetic elements of the well known
cloning
vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3
(Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech,
Madison, WI,
USA). These pBR322 "backbone" sections are combined with an appropriate
promoter and
the structural sequence to be expressed. Following transformation of a
suitable host strain
and growth of the host strain to an appropriate cell density, the selected
promoter is induced
or derepressed by appropriate means (e.g., temperature shift or chemical
induction) and cells
are cultured for an additional period. Cells are typically harvested by
centrifugation,
disrupted by physical or chemical means, and the resulting crude extract
retained for further
purification.
Polynucleotides of the invention can also be used to induce immune responses.
For
example, as described in Fan et al., Nat. Biotech 17, 870-872 (1999),
incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies
against the encoded polypeptide following topical administration of naked
plasmid DNA or
following injection, and preferably intra-muscular injection of the DNA. The
nucleic acid
sequences are preferably inserted in a recombinant expression vector and may
be in the form
of naked DNA.
4.3 ANTISENSE
Another aspect of the invention pertains to isolated antisense nucleic acid
molecules
that are hybridizable to or complementary to the nucleic acid molecule
comprising the
nucleotide sequence of SEQ ID NO: 1-1041, or 2083-2534, or fragments, analogs
or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide
sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided
that comprise a
sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an
entire coding strand, or to only a portion thereof. Nucleic acid molecules
encoding
fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO:
1-1041, or
2083-2534 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID
NO: 1-1041, or 2083-2534 are additionally provided.
In one embodiment, an antisense nucleic acid molecule is antisense to a
"coding
region" of the coding strand of a nucleotide sequence of the invention. The
term "coding
region" refers to the region of the nucleotide sequence comprising codons
which are
translated into amino acid residues. In another embodiment, the antisense
nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a
nucleotide sequence
of the invention. The term "noncoding region" refers to 5' and 3' sequences
that flank the
coding region that are not translated into amino acids (i.e., also referred to
as 5' and 3'
untranslated regions).
Given the coding strand sequences encoding a nucleic acid disclosed herein
(e.g.,
SEQ ID NO: 1-1041, or 2083-2534), antisense nucleic acids of the invention can
be designed
according to the rules of Watson and Crick or Hoogsteen base pairing. The
antisense nucleic
acid molecule can be complementary to the entire coding region of an mRNA, but
more
preferably is an oligonucleotide that is antisense to only a portion of the
coding or noncoding
region of an mRNA. For example, the antisense oligonucleotide can be
complementary to
the region surrounding the translation start site of an mRNA. An antisense
oligonucleotide
can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides
in length. An
antisense nucleic acid of the invention can be constructed using chemical
synthesis or
enzymatic ligation reactions using procedures known in the art. For example,
an antisense
nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using
naturally occurring nucleotides or variously modified nucleotides designed to
increase the
biological stability of the molecules or to increase the physical stability of
the duplex formed
between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and
acridine substituted nucleotides can be used.
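The Watson-Crick design rule described above can be sketched as follows. This is a hedged illustration only: the mRNA sequence is hypothetical, the function names are assumptions, and the sketch covers plain Watson-Crick complementarity with unmodified nucleotides, not Hoogsteen pairing or any modified backbone.

```python
# Map each RNA or DNA base to its DNA complement (U pairs with A).
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(mrna, start_codon_index, length=20):
    """Design a DNA oligo antisense to a window around the translation start site.

    start_codon_index is the 0-based position of the A in the AUG start codon.
    The window is roughly centered on that position and clipped at the 5' end.
    """
    begin = max(0, start_codon_index - length // 2)
    window = mrna[begin:begin + length]
    return reverse_complement(window)


# Hypothetical mRNA fragment; the AUG start codon begins at index 6.
example_mrna = "GGGACCAUGGCUCUGAAA"
oligo = antisense_oligo(example_mrna, start_codon_index=6, length=10)
```

The returned oligonucleotide is written 5' to 3' and is complementary to the selected window of the sense strand. Choosing a length within the range recited above (about 5 to 50 nucleotides) is left to the caller.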
Examples of modified nucleotides that can be used to generate the antisense
nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine,
5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil,
5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v),
wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
Alternatively, the antisense nucleic acid can
be produced
biologically using an expression vector into which a nucleic acid has been
subcloned in an
antisense orientation (i.e., RNA transcribed from the inserted nucleic acid
will be of an
antisense orientation to a target nucleic acid of interest, described further
in the following
subsection).
The antisense nucleic acid molecules of the invention are typically
administered to a
subject or generated in situ such that they hybridize with or bind to cellular
mRNA and/or
genomic DNA encoding a protein according to the invention to thereby
inhibit expression of
the protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for
example, in the
case of an antisense nucleic acid molecule that binds to DNA duplexes, through
specific
interactions in the major groove of the double helix. An example of a route of
administration of antisense nucleic acid molecules of the invention
includes direct injection
at a tissue site. Alternatively, antisense nucleic acid molecules can be
modified to target
selected cells and then administered systemically. For example, for systemic
administration,
antisense molecules can be modified such that they specifically bind to
receptors or antigens
expressed on a selected cell surface, e.g., by linking the antisense nucleic
acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens. The
antisense nucleic
acid molecules can also be delivered to cells using the vectors described
herein. To achieve
sufficient intracellular concentrations of antisense molecules, vector
constructs in which the
antisense nucleic acid molecule is placed under the control of a strong pol II
or pol III
promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the invention
is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific
double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units,
the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148)
or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
4.4 RIBOZYMES AND PNA MOIETIES
In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described
in
Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically
cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having
specificity
for a nucleic acid of the invention can be designed based upon the nucleotide
sequence of a
DNA disclosed herein (i.e., SEQ ID NO: 1-1041, or 2083-2534). For example, a
derivative
of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the
active site is complementary to the nucleotide sequence to be cleaved in a
mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively,
mRNA of the invention can be used to select a catalytic RNA having a specific
ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993)
Science
261:1411-1418.
Alternatively, gene expression can be inhibited by targeting nucleotide
sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to
form triple
helical structures that prevent transcription of the gene in target cells. See
generally, Helene
(1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci.
660:27-36; and Maher (1992) Bioessays 14: 807-15.
In various embodiments, the nucleic acids of the invention can be modified at
the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the
stability,
hybridization, or solubility of the molecule. For example, the deoxyribose
phosphate
backbone of the nucleic acids can be modified to generate peptide nucleic
acids (see Hyrup
et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids"
or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the
deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four
natural
nucleobases are retained. The neutral backbone of PNAs has been shown to allow
for
specific hybridization to DNA and RNA under conditions of low ionic strength.
The
synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis
protocols as described in Hyrup et al. (1996) above; Perry-O'Keefe et al.
(1996) PNAS 93:
14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications.
For
example, PNAs can be used as antisense or antigene agents for sequence-
specific modulation
of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting
replication. PNAs of the invention can also be used, e.g., in the analysis of
single base pair
mutations in a gene by, e.g., PNA directed PCR clamping; as artificial
restriction enzymes
when used in combination with other enzymes, e.g., S1 nucleases (Hyrup B.
(1996) above);
or as probes or primers for DNA sequence and hybridization (Hyrup et al.
(1996), above;
Perry-O'Keefe (1996), above).
In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their stability or cellular uptake, by attaching lipophilic or other helper
groups to PNA, by
the formation of PNA-DNA chimeras, or by the use of liposomes or other
techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that
may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the
DNA
portion while the PNA portion would provide high binding affinity and
specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected
in terms of
base stacking, number of bonds between the nucleobases, and orientation (Hyrup
(1996)
above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup
(1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain
can be synthesized on a solid support using standard phosphoramidite coupling
chemistry,
and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-
thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA (Mag et al.
(1989)
Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a stepwise manner
to
produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment (Finn
et al.
(1996) above). Alternatively, chimeric molecules can be synthesized with a 5'
DNA
segment and a 3' PNA segment. See, Petersen et al. (1995) Bioorg Med Chem
Lett 5:
1119-1124.
In other embodiments, the oligonucleotide may include other appended groups
such
as peptides (e.g., for targeting host cell receptors in vivo), or agents
facilitating transport
across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad.
Sci. U.S.A.
86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication
No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134).
In addition, oligonucleotides can be modified with hybridization triggered
cleavage agents
(See, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents. (See, e.g.,
Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be
conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking
agent, a transport
agent, a hybridization-triggered cleavage agent, etc.
4.5 HOSTS
The present invention further provides host cells genetically engineered to
contain
the polynucleotides of the invention. For example, such host cells may contain
nucleic acids
of the invention introduced into the host cell using known transformation,
transfection or
infection methods. The present invention still further provides host cells
genetically
engineered to express the polynucleotides of the invention, wherein such
polynucleotides are
in operative association with a regulatory sequence heterologous to the host
cell which
drives expression of the polynucleotides in the cell.
Knowledge of nucleic acid sequences allows for modification of cells to
permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g.,
by
homologous recombination) to provide increased polypeptide expression by
replacing, in
whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter
so that the cells express the polypeptide at higher levels. The heterologous
promoter is
inserted in such a manner that it is operatively linked to the encoding
sequences. See, for
example, PCT International Publication No. WO94/12650, PCT International
Publication
No. WO92/20808, and PCT International Publication No. WO91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA
(e.g., ada, dhfr, and the multifunctional CAD gene which encodes carbamyl
phosphate
synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron DNA
may be
inserted along with the heterologous promoter DNA. If linked to the coding
sequence,
amplification of the marker DNA by standard selection methods results in co-
amplification
of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell,
a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell
can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation
(Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one
of the polynucleotides of the invention can be used in conventional manners
to produce the
gene product encoded by the isolated fragment (in the case of an ORF) or can
be used to
produce a heterologous protein under the control of the EMF.
Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts
such as HeLa cells,
CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts
such as E. coli and
B. subtilis. The most preferred cells are those which do not normally express
the particular
polypeptide or protein or which express the polypeptide or protein at a low
natural level.
Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other
cells under
the control of appropriate promoters. Cell-free translation systems can also
be employed to
produce such proteins using RNAs derived from the DNA constructs of the
present
invention. Appropriate cloning and expression vectors for use with
prokaryotic and
eukaryotic hosts are described by Sambrook, et al., in Molecular Cloning: A
Laboratory
Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of
which is
hereby incorporated by reference.
Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7 lines
of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines
capable of expressing a compatible vector are, for example, the C127, monkey
COS cells,
Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal
A431 cells,
human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell
lines, normal
diploid cells, cell strains derived from in vitro culture of primary tissue,
primary explants,
HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells. Mammalian
expression
vectors will comprise an origin of replication, a suitable promoter and also
any necessary
ribosome binding sites, polyadenylation site, splice donor and acceptor sites,
transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA sequences
derived
from the SV40 viral genome, for example, SV40 origin, early promoter,
enhancer, splice,
and polyadenylation sites may be used to provide the required nontranscribed
genetic
elements. Recombinant polypeptides and proteins produced in bacterial culture
are usually
isolated by initial extraction from cell pellets, followed by one or more
salting-out, aqueous
ion exchange or size exclusion chromatography steps. Protein refolding steps
can be used,
as necessary, in completing configuration of the mature protein. Finally, high
performance
liquid chromatography (HPLC) can be employed for final purification steps.
Microbial cells
employed in expression of proteins can be disrupted by any convenient method,
including
freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.
Alternatively, it may be possible to produce the protein in lower eukaryotes
such as
yeast or insects or in prokaryotes such as bacteria. Potentially suitable
yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces
strains, Candida,
or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium,
or any bacterial
strain capable of expressing heterologous proteins. If the protein is made
in yeast or
bacteria, it may be necessary to modify the protein produced therein, for
example by
phosphorylation or glycosylation of the appropriate sites, in order to obtain
the functional
protein. Such covalent attachments may be accomplished using known chemical or
enzymatic methods.
In another embodiment of the present invention, cells and tissues may be
engineered
to express an endogenous gene comprising the polynucleotides of the invention
under the
control of inducible regulatory elements, in which case the regulatory
sequences of the
endogenous gene may be replaced by homologous recombination. As described
herein, gene
targeting can be used to replace a gene's existing regulatory region with a
regulatory
sequence isolated from a different gene or a novel regulatory sequence
synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of
promoters,
enhancers, scaffold-attachment regions, negative regulatory elements,
transcriptional
initiation sites, and regulatory protein binding sites or combinations of said
sequences.
Alternatively, sequences which affect the structure or stability of the RNA or
protein
produced may be replaced, removed, added, or otherwise modified by
targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader
sequences for enhancing or modifying transport or secretion properties of the
protein, or
other sequences which alter or improve the function or stability of protein or
RNA
molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different
cell-type specificity than the naturally occurring elements. Here, the
naturally occurring
sequences are deleted and new sequences are added. In all cases, the
identification of the
targeting event may be facilitated by the use of one or more selectable marker
genes that are
contiguous with the targeting DNA, allowing for the selection of cells in
which the
exogenous DNA has integrated into the host cell genome. The identification of
the targeting
event may also be facilitated by the use of one or more marker genes
exhibiting the property
of negative selection, such that the negatively selectable marker is linked to
the exogenous
DNA, but configured such that the negatively selectable marker flanks the
targeting
sequence, and such that a correct homologous recombination event with
sequences in the
host cell genome does not result in the stable integration of the negatively
selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine
kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance
with this aspect of the invention are more particularly described in U.S.
Patent No. 5,272,071
to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated
by
reference herein in its entirety.
4.6 POLYPEPTIDES OF THE INVENTION
The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising: the amino acid sequences set forth as any one of SEQ
ID NO: 1042-
2082, or 2535-2986 or an amino acid sequence encoded by any one of the
nucleotide
sequences SEQ ID NO: 1-1041, or 2083-2534 or the corresponding full length or
mature
protein. Polypeptides of the invention also include polypeptides preferably
with biological or
immunological activity that are encoded by: (a) a polynucleotide having any
one of the
nucleotide sequences set forth in SEQ ID NO: 1-1041, or 2083-2534 or (b)
polynucleotides
encoding any one of the amino acid sequences set forth as SEQ ID NO: 1042-2082,
or 2535-
2986 or (c) polynucleotides that hybridize to the complement of the
polynucleotides of either
(a) or (b) under stringent hybridization conditions. The invention also
provides biologically
active or immunologically active variants of any of the amino acid sequences
set forth as
SEQ ID NO: 1042-2082, or 2535-2986 or the corresponding full length or mature
protein;
and "substantial equivalents" thereof (e.g., with at least about 65%, at least
about 70%, at
least about 75%, at least about 80%, at least about 85%, 86%, 87%, 88%, 89%,
at least about
90%, 91%, 92%, 93%, 94%, typically at least about 95%, 96%, 97%, more
typically at least
about 98%, or most typically at least about 99% amino acid identity) that
retain biological
activity. Polypeptides encoded by allelic variants may have a similar,
increased, or
decreased activity compared to polypeptides comprising SEQ ID NO: 1042-2082, or
2535-
2986.
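The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two amino acid sequences have already been aligned to equal length (with "-" marking gaps) and uses one simple convention, identical residues divided by alignment length; alignment programs differ in how they define the denominator, and this sketch does not reproduce any particular program. The sequences and function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; '-' denotes a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=65.0):
    """True if identity is at least the given percentage (e.g., 65, 90, 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 8-residue peptides differing at one position (7 of 8 identical).
identity = percent_identity("MKTAYIAK", "MKTAYLAK")
```

Under this convention the two example peptides are 87.5% identical, so they would satisfy an "at least about 85%" threshold but not an "at least about 90%" threshold.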
Fragments of the proteins of the present invention which are capable of
exhibiting
biological activity are also encompassed by the present invention. Fragments
of the protein
may be in linear form or they may be cyclized using known methods, for
example, as
described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in
R. S.
McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such as
molecules such as
immunoglobulins for many purposes, including increasing the valency of protein
binding
sites. Fragments are also identified in Tables 3, 5, 6, and 8.
The present invention also provides both full-length and mature forms (for
example,
without a signal sequence or precursor sequence) of the disclosed proteins.
The protein
coding sequence is identified in the sequence listing by translation of the
disclosed
nucleotide sequences. The predicted signal sequence is set forth in Table 6.
The mature
form of such protein may be obtained and confirmed by expression of a full-
length
polynucleotide in a suitable mammalian cell or other host cell and sequencing
of the cleaved
product. One of skill in the art will recognize that the actual cleavage site
may be different
than that predicted in Table 6. The sequence of the mature form of the protein
is also
determinable from the amino acid sequence of the full-length form.
Where proteins of the
present invention are membrane bound, soluble forms of the proteins are also
provided. In
such forms, part or all of the regions causing the proteins to be membrane
bound are deleted
so that the proteins are fully secreted from the cell in which they are
expressed.
Protein compositions of the present invention may further comprise an
acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The present invention further provides isolated polypeptides encoded by the
nucleic
acid fragments of the present invention or by degenerate variants of the
nucleic acid
fragments of the present invention. By "degenerate variant" is intended
nucleotide


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
33
fragments which differ from a nucleic acid fragment of the present invention
(e.g., an ORF)
by nucleotide sequence but, due to the degeneracy of the genetic code, encode
an identical
polypeptide sequence. Preferred nucleic acid fragments of the present
invention are the
ORFs that encode proteins.
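The definition of a degenerate variant above can be illustrated with a minimal sketch. It assumes the standard genetic code; the helper names and the short example ORFs used with it are hypothetical and are not taken from the sequence listing.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf: str) -> str:
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    orf = orf.upper().replace("U", "T")
    protein = []
    for pos in range(0, len(orf) - len(orf) % 3, 3):
        residue = CODON_TABLE[orf[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when two ORFs differ in nucleotide sequence but encode the same polypeptide."""
    a = orf_a.upper().replace("U", "T")
    b = orf_b.upper().replace("U", "T")
    return a != b and translate(a) == translate(b)
```

For example, the hypothetical ORFs `ATGGCTTGTAAATAA` and `ATGGCCTGCAAGTGA` differ at four nucleotide positions yet both translate to the same four-residue peptide, so the second would count as a degenerate variant of the first under this sketch.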
A variety of methodologies known in the art can be utilized to obtain any one
of the
isolated polypeptides or proteins of the present invention. At the simplest
level, the amino
acid sequence can be synthesized using commercially available peptide
synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary,
secondary or
tertiary structural and/or conformational characteristics with proteins may
possess biological
properties in common therewith, including protein activity. This technique is
particularly
useful in producing small peptides and fragments of larger polypeptides.
Fragments are
useful, for example, in generating antibodies against the native polypeptide.
Thus, they may
be employed as biologically active or immunological substitutes for natural,
purified
proteins in screening of therapeutic compounds and in immunological processes
for the
development of antibodies.
The polypeptides and proteins of the present invention can alternatively be
purified
from cells which have been altered to express the desired polypeptide or
protein. As used
herein, a cell is said to be altered to express a desired polypeptide or
protein when the cell,
through genetic manipulation, is made to produce a polypeptide or protein
which it normally
does not produce or which the cell normally produces at a lower level. One
skilled in the art
can readily adapt procedures for introducing and expressing either recombinant
or synthetic
sequences into eukaryotic or prokaryotic cells in order to generate a cell
which produces one
of the polypeptides or proteins of the present invention.
The invention also relates to methods for producing a polypeptide comprising
growing a culture of host cells of the invention in a suitable culture medium,
and purifying
the protein from the cells or the culture in which the cells are grown. For
example, the
methods of the invention include a process for producing a polypeptide in
which a host cell
containing a suitable expression vector that includes a polynucleotide of the
invention is
cultured under conditions that allow expression of the encoded polypeptide.
The
polypeptide can be recovered from the culture, conveniently from the culture
medium, or
from a lysate prepared from the host cells and further purified. Preferred
embodiments
include those in which the protein produced by such process is a full length
or mature form
of the protein.
In an alternative method, the polypeptide or protein is purified from
bacterial cells
which naturally produce the polypeptide or protein. One skilled in the art can
readily follow
known methods for isolating polypeptides and proteins in order to obtain one
of the isolated
polypeptides or proteins of the present invention. These include, but are not
limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et
al., in
Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular
Biology. Polypeptide fragments that retain biological/immunological activity
include
fragments comprising greater than about 100 amino acids, or greater than about
200 amino
acids, and fragments that encode specific protein domains.
The purified polypeptides can be used in in vitro binding assays which are
well
known in the art to identify molecules which bind to the polypeptides. These
molecules
include, but are not limited to, small molecules, molecules from
combinatorial
libraries, antibodies or other proteins. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
In addition, the peptides of the invention or molecules capable of binding to
the
peptides may be complexed with toxins, e.g., ricin or cholera, or with other
compounds that
are toxic to cells. The toxin-binding molecule complex is then targeted to a
tumor or other
cell by the specificity of the binding molecule for SEQ ID NO: 1042-2082, or
2535-2986.
The protein of the invention may also be expressed as a product of transgenic
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or
sheep which are
characterized by somatic or germ cells containing a nucleotide sequence
encoding the
protein.
The proteins provided herein also include proteins characterized by amino acid
sequences similar to those of purified proteins but into which modifications
are naturally
provided or deliberately engineered. For example, modifications, in the
peptide or DNA
sequence, can be made by those skilled in the art using known techniques.
Modifications of
interest in the protein sequences may include the alteration, substitution,
replacement,
insertion or deletion of a selected amino acid residue in the coding sequence.
For example,
one or more of the cysteine residues may be deleted or replaced with another
amino acid to
alter the conformation of the molecule. Techniques for such alteration,
substitution,
replacement, insertion or deletion are well known to those skilled in the art
(see, e.g., U.S.
Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or
deletion retains the desired activity of the protein. Regions of the protein
that are important
for the protein function can be determined by various methods known in the
art including the
alanine-scanning method which involves systematic substitution of single or
strings of
amino acids with alanine, followed by testing the resulting alanine-containing
variant for
biological activity. This type of analysis determines the importance of the
substituted amino
acid(s) in biological activity. Regions of the protein that are important for
protein function
may be determined by the eMATRIX program.
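The alanine-scanning approach described above can be sketched in code as a way of enumerating the variants to be tested. This is an illustrative sketch only; the function name, the `window` parameter, and the choice to skip segments that are already all alanine are assumptions, and the biological activity assay itself is outside the scope of the code.

```python
def alanine_scan(protein: str, window: int = 1):
    """Yield (position, original_segment, variant) for each alanine substitution.

    window=1 substitutes single residues; window>1 substitutes strings of
    consecutive residues. Segments that are already all alanine are skipped,
    since substituting them would reproduce the starting sequence.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        yield start + 1, segment, variant  # positions are 1-based
```

Each yielded variant would then be expressed and assayed; a loss of activity for a given variant suggests that the substituted residue or residues contribute to function.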
Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and are useful for
screening or other
immunological methodologies may also be easily made by those skilled in the
art given the
disclosures herein. Such modifications are encompassed by the present
invention.
The protein may also be produced by operably linking the isolated
polynucleotide of
the invention to suitable control sequences in one or more insect expression
vectors, and
employing an insect expression system. Materials and methods for
baculovirus/insect cell
expression systems are commercially available in kit form from, e.g.,
Invitrogen, San Diego,
Calif., U.S.A. (the MaxBac™ kit), and such methods are well known in the art,
as described
in Summers and Smith, Texas Agricultural Experiment Station Bulletin No.
1555 (1987),
incorporated herein by reference. As used herein, an insect cell capable of
expressing a
polynucleotide of the present invention is "transformed."
The protein of the invention may be prepared by culturing transformed host
cells
under culture conditions suitable to express the recombinant protein. The
resulting
expressed protein may then be purified from such culture (i.e., from
culture medium or cell
extracts) using known purification processes, such as gel filtration and ion
exchange
chromatography. The purification of the protein may also include an affinity
column
containing agents which will bind to the protein; one or more column steps
over such affinity
resins as concanavalin A-agarose, heparin-Toyopearl™ or Cibacron Blue 3GA
Sepharose™;
one or more steps involving hydrophobic interaction chromatography using
such resins as
phenyl ether, butyl ether, or propyl ether; or immunoaffinity chromatography.
Alternatively, the protein of the invention may also be expressed in a form
which will
facilitate purification. For example, it may be expressed as a fusion protein,
such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as
a His tag. Kits for expression and purification of such fusion proteins are
commercially
available from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway,
N.J.) and
Invitrogen, respectively. The protein can also be tagged with an epitope and
subsequently
purified by using a specific antibody directed to such epitope. One such
epitope ("FLAG®")
is commercially available from Kodak (New Haven, Conn.).
Finally, one or more reverse-phase high performance liquid chromatography (RP-
HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having
pendant
methyl or other aliphatic groups, can be employed to further purify the
protein. Some or all
of the foregoing purification steps, in various combinations, can also be
employed to provide
a substantially homogeneous isolated recombinant protein. The protein thus
purified is
substantially free of other mammalian proteins and is defined in accordance
with the present
invention as an "isolated protein."
The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids has been
deleted, inserted,
or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the
polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide
or analog is fused to another moiety or moieties, e.g., targeting moiety or
another therapeutic
agent. Such analogs may exhibit improved properties such as activity and/or
stability.
Examples of moieties which may be fused to the polypeptide or an analog
include, for
example, targeting moieties which provide for the delivery of polypeptide to
pancreatic cells,
e.g., antibodies to pancreatic cells, antibodies to immune cells such as T-
cells, monocytes,
dendritic cells, granulocytes, etc., as well as receptor and ligands expressed
on pancreatic or
immune cells. Other moieties which may be fused to the polypeptide include
therapeutic
agents which are used for treatment, for example, immunosuppressive drugs such
as
cyclosporin, FK506, azathioprine, CD3 antibodies and steroids. Also,
polypeptides may be
fused to immune modulators, and other cytokines such as alpha or beta
interferon.
4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY
Preferred identity and/or similarity are designed to give the largest match
between
the sequences tested. Methods to determine identity and similarity are
codified in computer
programs including, but not limited to, the GCG program package, including
GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA
(Altschul,
S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et
al., Nucleic
Acids Res. vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix
software (Wu
et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by
reference), eMotif
software (Nevill-Manning et al, ISMB-97, Vol. 4, pp. 202-209, herein
incorporated by
reference), Pfam software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1),
pp. 320-322
(1998), herein incorporated by reference) and the Kyte-Doolittle
hydrophobocity prediction
algorithm (J. Mo1 Biol, 157, pp. 105-31 (1982), incorporated herein by
reference).
polypeptide sequences were examined by a proprietary algorithm, SeqLoc that
separates the
proteins into three sets of locales: intracellular, membrane, or secreted.
This prediction is
based upon three characteristics of each polypeptide, including percentage of
cysteine
residues, Kyte-Doolittle scores for the first 20 amino acids of each protein,
and Kyte-
Doolittle scores to calculate the longest hydrophobic stretch of the said
protein. Values of
predicted proteins are compared against the values from a set of 592 proteins
of known
cellular localization from the Swissprot database
(http://www.expasy.ch/sprot). Predictions
are based upon the maximum likelihood estimation.
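The three sequence characteristics named above can be computed as in the following sketch. It does not reproduce SeqLoc, which is proprietary and whose scoring and maximum likelihood step are not disclosed here. The function names, the use of a mean Kyte-Doolittle score over the first 20 residues, and the definition of a hydrophobic stretch as consecutive residues with a hydropathy value above a threshold of 0.0 are all assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy index (J. Mol. Biol. 157, 105-31 (1982)).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def percent_cysteine(protein: str) -> float:
    """Percentage of residues that are cysteine."""
    return 100.0 * protein.count("C") / len(protein)


def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle value over a segment."""
    return sum(KYTE_DOOLITTLE[r] for r in segment) / len(segment)


def longest_hydrophobic_stretch(protein: str, threshold: float = 0.0) -> int:
    """Length of the longest run of residues whose hydropathy exceeds threshold."""
    best = current = 0
    for residue in protein:
        if KYTE_DOOLITTLE[residue] > threshold:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def localization_features(protein: str, n_terminal_length: int = 20) -> dict:
    """Collect the three characteristics for one polypeptide sequence."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return {
        "percent_cysteine": percent_cysteine(protein),
        "n_terminal_hydropathy": mean_hydropathy(protein[:n_terminal_length]),
        "longest_hydrophobic_stretch": longest_hydrophobic_stretch(protein),
    }
```

Feature values computed this way for a query protein could then be compared with the values of reference proteins of known localization, which is the comparison step the text describes.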
The BLAST programs are publicly available from the National Center for
Biotechnology Information (NCBI) and other sources (BLAST Manual, Altschul,
S., et al.
NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-
410
(1990)).
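As an illustration of the identity measure discussed in this section, the sketch below computes percent identity from two sequences that have already been aligned (for example, by one of the programs listed above). It is not a substitute for those programs and performs no alignment itself. The function name, the use of "-" as the gap character, and the choice of denominator (all alignment columns except those that are gaps in both sequences) are assumptions; other conventions for the denominator exist and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences; they carry no information.
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)
```

Under this convention, a six-column alignment with five identical columns and one gapped column has an identity of about 83.3%.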
4.7 CHIMERIC AND FUSION PROTEINS
The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric
protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the
invention can
correspond to all or a portion of a protein according to the invention. In one
embodiment, a
fusion protein comprises at least one biologically active portion of a protein
according to the
invention. In another embodiment, a fusion protein comprises at least two
biologically
active portions of a protein according to the invention. Within the fusion
protein, the term
"operatively linked" is intended to indicate that the polypeptide according to
the invention
and the other polypeptide are fused in-frame to each other. The polypeptide
can be fused to
the N-terminus or C-terminus, or to the middle.
For example, in one embodiment a fusion protein comprises a polypeptide
according
to the invention operably linked to the extracellular domain of a second
protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST
(i.e.,
glutathione S-transferase) sequences.
In another embodiment, the fusion protein is an immunoglobulin fusion protein
in
which the polypeptide sequences according to the invention comprise one or
more domains
fused to sequences derived from a member of the immunoglobulin protein family.
The
immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical
compositions and administered to a subject to inhibit an interaction between a
ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal
transduction in
vivo. The immunoglobulin fusion proteins can be used to affect the
bioavailability of a
cognate ligand. Inhibition of the ligand/protein interaction may be useful
therapeutically for
both the treatment of proliferative and differentiative disorders, e.g.,
cancer as well as
modulating (e.g., promoting or inhibiting) cell survival. Moreover, the
immunoglobulin
fusion proteins of the invention can be used as immunogens to produce
antibodies in a
subject, to purify ligands, and in screening assays to identify molecules that
inhibit the
interaction of a polypeptide of the invention with a ligand.
A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the
different
polypeptide sequences are ligated together in-frame in accordance with
conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for
ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive
ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by
conventional
techniques including automated DNA synthesizers. Alternatively, PCR
amplification of
gene fragments can be carried out using anchor primers that give rise to
complementary
overhangs between two consecutive gene fragments that can subsequently be
annealed and
reamplified to generate a chimeric gene sequence (see, for example, Ausubel et
al. (eds.)
CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, 1992). Moreover,
many expression vectors are commercially available that already encode a
fusion moiety
(e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be
cloned into such an expression vector such that the fusion moiety is linked in-
frame to the
protein of the invention.
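The requirement that the fused coding sequences be joined in-frame can be checked computationally before any cloning is attempted. The following is a minimal sketch under stated assumptions: both inputs are complete coding sequences in frame 0, the upstream partner's stop codon (if present) is to be removed, and an optional linker is given as nucleotides. The function names and the example sequences used with it are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list:
    """Split a sequence into consecutive three-nucleotide codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream: str, downstream: str, linker: str = "") -> str:
    """Join two coding sequences so the downstream partner stays in frame."""
    parts = {
        "upstream": upstream.upper(),
        "linker": linker.upper(),
        "downstream": downstream.upper(),
    }
    for name, part in parts.items():
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3")
    up_codons = codons(parts["upstream"])
    # Drop the upstream stop codon so translation continues into the partner.
    if up_codons and up_codons[-1] in STOP_CODONS:
        up_codons = up_codons[:-1]
    fused = "".join(up_codons) + parts["linker"] + parts["downstream"]
    # A stop codon anywhere before the final codon would truncate the fusion.
    if any(codon in STOP_CODONS for codon in codons(fused)[:-1]):
        raise ValueError("fusion contains a premature stop codon")
    return fused
```

For instance, fusing the hypothetical sequences `ATGGCTTAA` and `AAATGCTGA` removes the first stop codon and yields `ATGGCTAAATGCTGA`, in which the second sequence is read in its original frame. A sequence whose length is not a multiple of three is rejected, since appending it would shift the reading frame of everything downstream.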
4.8 GENE THERAPY
Mutations in the polynucleotides of the invention may result in loss of
normal
function of the encoded protein. The invention thus provides gene therapy to
restore normal
activity of the polypeptides of the invention; or to treat disease states
involving polypeptides
of the invention. Delivery of a functional gene encoding polypeptides of the
invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors,
and more
particularly viral vectors (e.g., adenovirus, adeno-associated virus, or a
retrovirus), or ex vivo
by use of physical DNA transfer methods (e.g., liposomes or chemical
treatments). See, for
example, Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998).
For
additional reviews of gene therapy technology see Friedmann, Science, 244:
1275-1281
(1989); Verma, Scientific American: 68-84 (1990); and Miller, Nature, 357: 455-
460 (1992).
Introduction of any one of the nucleotides of the present invention or a gene
encoding the
polypeptides of the present invention can also be accomplished with
extrachromosomal
substrates (transient expression) or artificial chromosomes (stable
expression). Cells may
also be cultured ex vivo in the presence of proteins of the present invention
in order to
proliferate or to produce a desired effect on or activity in such cells.
Treated cells can then
be introduced in vivo for therapeutic purposes. Alternatively, it is
contemplated that in other
human disease states, preventing the expression of or inhibiting the activity
of polypeptides
of the invention will be useful in treating the disease states. It is
contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression
of
polypeptides of the invention.
Other methods inhibiting expression of a protein include the introduction of
antisense
molecules to the nucleic acids of the present invention, their complements, or
their translated
RNA sequences, by methods known in the art. Further, the polypeptides of the
present
invention can be inhibited by using targeted deletion methods, or the
insertion of a negative
regulatory element such as a silencer, which is tissue specific.
The present invention still further provides cells genetically engineered in
vivo to
express the polynucleotides of the invention, wherein such polynucleotides are
in operative
association with a regulatory sequence heterologous to the host cell which
drives expression of
the polynucleotides in the cell. These methods can be used to increase or
decrease the
expression of the polynucleotides of the present invention.
Knowledge of DNA sequences provided by the invention allows for modification
of
cells to permit, increase, or decrease, expression of endogenous polypeptide.
Cells can be
modified (e.g., by homologous recombination) to provide increased
polypeptide expression by
replacing, in whole or in part, the naturally occurring promoter with all or
part of a heterologous
promoter so that the cells express the protein at higher levels. The
heterologous promoter is
inserted in such a manner that it is operatively linked to the desired protein
encoding sequences.
See, for example, PCT International Publication No. WO 94/12650, PCT
International
Publication No. WO 92/20808, and PCT International Publication No. WO
91/09955. It is also
contemplated that, in addition to heterologous promoter DNA, amplifiable
marker DNA (e.g.,
ada, dhfr, and the multifunctional CAD gene which encodes carbamyl phosphate
synthase,
aspartate transcarbamylase, and dihydroorotase) and/or intron DNA may be
inserted along with
the heterologous promoter DNA. If linked to the desired protein coding
sequence,
amplification of the marker DNA by standard selection methods results in co-
amplification of
the desired protein coding sequences in the cells.
In another embodiment of the present invention, cells and tissues may be
engineered to
express an endogenous gene comprising the polynucleotides of the invention
under the control
of inducible regulatory elements, in which case the regulatory sequences of
the endogenous
gene may be replaced by homologous recombination. As described herein, gene
targeting can
be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from
a different gene or a novel regulatory sequence synthesized by genetic
engineering methods.
Such regulatory sequences may be comprised of promoters, enhancers, scaffold-
attachment
regions, negative regulatory elements, transcriptional initiation sites,
regulatory protein binding
sites or combinations of said sequences. Alternatively, sequences which
affect the structure or
stability of the RNA or protein produced may be replaced, removed, added, or
otherwise
modified by targeting. These sequences include polyadenylation signals, mRNA
stability
elements, splice sites, leader sequences for enhancing or modifying transport
or secretion
properties of the protein, or other sequences which alter or improve the
function or stability of
protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple
deletion of a regulatory element, such as the deletion of a tissue-specific
negative regulatory
element. Alternatively, the targeting event may replace an existing element;
for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or
different cell-type
specificity than the naturally occurring elements. Here, the naturally
occurring sequences are
deleted and new sequences are added. In all cases, the identification of the
targeting event may
be facilitated by the use of one or more selectable marker genes that are
contiguous with the
targeting DNA, allowing for the selection of cells in which the exogenous DNA
has integrated
into the cell genome. The identification of the targeting event may also be
facilitated by the use
of one or more marker genes exhibiting the property of negative selection,
such that the
negatively selectable marker is linked to the exogenous DNA, but configured
such that the
negatively selectable marker flanks the targeting sequence, and such that a
correct homologous
recombination event with sequences in the host cell genome does not result in
the stable
integration of the negatively selectable marker. Markers useful for this
purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine
phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance with
this aspect of the invention are more particularly described in U.S. Patent
No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by
reference herein in its entirety.
4.9 TRANSGENIC ANIMALS
In preferred methods to determine biological functions of the polypeptides of
the
invention in vivo, one or more genes provided by the invention are either over
expressed or
inactivated in the germ line of animals using homologous recombination
[Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the
regulatory
control of exogenous or endogenous promoter elements, are known as transgenic
animals.
Animals in which an endogenous gene has been inactivated by homologous
recombination
are referred to as "knockout" animals. Knockout animals, preferably non-human
mammals,
can be prepared as described in U.S. Patent No. 5,557,032, incorporated herein
by reference.
Transgenic animals are useful to determine the roles polypeptides of the
invention play in
biological processes, and preferably in disease states. Transgenic animals are
useful as model
systems to identify compounds that modulate lipid metabolism. Transgenic
animals,
preferably non-human mammals, are produced using methods as described in U.S.
Patent No.
5,489,743 and PCT Publication No. WO94/28122, incorporated herein by
reference.
Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter
the level of
expression of the polypeptides of the invention. Inactivation can be carried
out using
homologous recombination methods described above. Activation can be achieved
by
supplementing or even replacing the homologous promoter to provide for
increased protein
expression. The homologous promoter can be supplemented by insertion of one or
more
heterologous enhancer elements known to confer promoter activation in a
particular tissue.
The polynucleotides of the present invention also make possible the
development,
through, e.g., homologous recombination or knock out strategies, of animals
that fail to
express polypeptides of the invention or that express a variant polypeptide.
Such animals are
useful as models for studying the in vivo activities of polypeptide as well as
for studying
modulators of the polypeptides of the invention.
4.10 USES AND BIOLOGICAL ACTIVITY
The polynucleotides and proteins of the present invention are expected to
exhibit one
or more of the uses or biological activities (including those associated with
assays cited
herein) identified herein. Uses or activities described for proteins of the
present invention
may be provided by administration or use of such proteins or of
polynucleotides encoding
such proteins (such as, for example, in gene therapies or vectors suitable for
introduction of
DNA). The mechanism underlying the particular condition or pathology will
dictate whether
the polypeptides of the invention, the polynucleotides of the invention or
modulators
(activators or inhibitors) thereof would be beneficial to the subject in need
of treatment.
Thus, "therapeutic compositions of the invention" include compositions
comprising isolated
polynucleotides (including recombinant DNA molecules, cloned genes and
degenerate
variants thereof) or polypeptides of the invention (including full length
protein, mature
protein and truncations or domains thereof), or compounds and other substances
that
modulate the overall activity of the target gene products, either at the level
of target
gene/protein expression or target protein activity. Such modulators include
polypeptides,
analogs (variants), including fragments and fusion proteins, antibodies and
other binding
proteins; chemical compounds that directly or indirectly activate or inhibit
the polypeptides
of the invention (identified, e.g., via drug screening assays as described
herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and
in particular
antibodies or other binding partners that specifically recognize one or more
epitopes of the
polypeptides of the invention.
The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.
4.10.1 RESEARCH USES AND UTILITIES
The polynucleotides provided by the present invention can be used by the
research
community for various purposes. The polynucleotides can be used to express
recombinant
protein for analysis, characterization or therapeutic use; as markers for
tissues in which the
corresponding protein is preferentially expressed (either constitutively or at
a particular stage
of tissue differentiation or development or in disease states); as molecular
weight markers on
gels; as chromosome markers or tags (when labeled) to identify chromosomes or
to map
related gene positions; to compare with endogenous DNA sequences in patients
to identify
potential genetic disorders; as probes to hybridize and thus discover novel,
related DNA
sequences; as a source of information to derive PCR primers for genetic
fingerprinting; as a
probe to "subtract-out" known sequences in the process of discovering other
novel
polynucleotides; for selecting and making oligomers for attachment to a "gene
chip" or other
support, including for examination of expression patterns; to raise anti-
protein antibodies
using DNA immunization techniques; and as an antigen to raise anti-DNA
antibodies or
elicit another immune response. Where the polynucleotide encodes a protein
which binds or
potentially binds to another protein (such as, for example, in a receptor-
ligand interaction),
the polynucleotide can also be used in interaction trap assays (such as, for
example, that
described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the
other protein with which binding occurs or to identify inhibitors of the
binding interaction.
The polypeptides provided by the present invention can similarly be used in
assays to
determine biological activity, including in a panel of multiple proteins for
high-throughput
screening; to raise antibodies or to elicit another immune response; as a
reagent (including
the labeled reagent) in assays designed to quantitatively determine levels of
the protein (or
its receptor) in biological fluids; as markers for tissues in which the
corresponding
polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue
differentiation or development or in a disease state); and, of course, to
isolate correlative
receptors or ligands. Proteins involved in these binding interactions can also
be used to
screen for peptide or small molecule inhibitors or agonists of the binding
interaction.
Any or all of these research utilities are capable of being developed into
reagent
grade or kit format for commercialization as research products.
Methods for performing the uses listed above are well known to those skilled
in the
art. References disclosing such methods include without limitation "Molecular
Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J.,
E. F.
Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to
Molecular
Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.
4.10.2 NUTRITIONAL USES
Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use
as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and
use as a source of
carbohydrate. In such cases the polypeptide or polynucleotide of the invention
can be added to
the feed of a particular organism or can be administered as a separate solid
or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or
capsules. In the case
of microorganisms, the polypeptide or polynucleotide of the invention can be
added to the
medium in or on which the microorganism is cultured.
4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY
A polypeptide of the present invention may exhibit activity relating to
cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either
inducing or
inhibiting) activity or may induce production of other cytokines in certain
cell populations.
A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Many protein factors discovered to date, including all known cytokines, have
exhibited
activity in one or more factor-dependent cell proliferation assays, and
hence the assays serve
as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions
of the present invention is evidenced by any one of a number of routine factor
dependent cell
proliferation assays for cell lines including, without limitation, 32D, DA2,
DA1G, T10, B9,
B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1,
Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions of the invention can
be used in
the following:
Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986;
Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al.,
Cellular Immunology
133:327-341, 1991; Bertagnolli, et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J.
Immunol. 152:1756-1761, 1994.
Assays for cytokine production and/or proliferation of spleen cells, lymph
node cells
or thymocytes include, without limitation, those described in: Polyclonal T
cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E.
e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of
mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E.
e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and
lymphopoietic cells
include, without limitation, those described in: Measurement of Human and
Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E.
In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John
Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau
et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938,
1983; Measurement of mouse and human interleukin 6--Nordan, R. In Current
Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons,
Toronto. 1991;
Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of
human
Interleukin 11--Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In
Current Protocols
in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and Sons,
Toronto. 1991;
Measurement of mouse and human Interleukin 9--Ciarletta, A., Giannotti, J.,
Clark, S. C.
and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1
pp. 6.13.1,
John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among
others,
proteins that affect APC-T cell interactions as well as direct T-cell effects
by measuring
proliferation and cytokine production) include, without limitation, those
described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies,
E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience
(Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6,
Cytokines and their
cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et
al., Proc.
Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun.
11:405-411,
1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol.
140:508-512,
1988.
4.10.4 STEM CELL GROWTH FACTOR ACTIVITY
A polypeptide of the present invention may exhibit stem cell growth factor
activity
and be involved in the proliferation, differentiation and survival of
pluripotent and totipotent
stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells
and/or germ line stem cells. Administration of the polypeptide of the
invention to stem cells
in vivo or ex vivo is expected to maintain and expand cell populations in a
totipotential or
pluripotential state which would be useful for re-engineering damaged or
diseased tissues,
transplantation, manufacture of bio-pharmaceuticals and the development of
bio-sensors.
The ability to produce large quantities of human cells has important working
applications for
the production of human proteins which currently must be obtained from non-
human sources
or donors, implantation of cells to treat diseases such as Parkinson's,
Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin,
cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea,
neural cells,
gastrointestinal cells and others; and organs for transplantation such as
kidney, liver,
pancreas (including islet cells), heart and lung.
It is contemplated that multiple different exogenous growth factors and/or
cytokines
may be administered in combination with the polypeptide of the invention to
achieve the
desired effect, including any of the growth factors listed herein, other stem
cell maintenance
factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF),
Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble IL-6
receptor fused to IL-6,
macrophage inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF,
thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth factor
(PDGF),
neural growth factors and basic fibroblast growth factor (bFGF).
Since totipotent stem cells can give rise to virtually any mature cell type,
expansion
of these cells in culture will facilitate the production of large quantities
of mature cells.
Techniques for culturing stem cells are known in the art and administration of
polypeptides
of the invention, optionally with other growth factors and/or cytokines, is
expected to
enhance the survival and proliferation of the stem cell populations. This can
be
accomplished by direct administration of the polypeptide of the invention to
the culture
medium. Alternatively, stromal cells transfected with a polynucleotide that
encodes the
polypeptide of the invention can be used as a feeder layer for the stem cell
populations in
culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone
marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or cultured
embryonic
fibroblasts (see U.S. Patent No. 5,690,926).
Stem cells themselves can be transfected with a polynucleotide of the
invention to
induce autocrine expression of the polypeptide of the invention. This will
allow for
generation of undifferentiated totipotential/pluripotential stem cell lines
that are useful as is
or that can then be differentiated into the desired mature cell types. These
stable cell lines
can also serve as a source of undifferentiated totipotential/pluripotential
mRNA to create
cDNA libraries and templates for polymerase chain reaction experiments. These
studies
would allow for the isolation and identification of differentially expressed
genes in stem cell
populations that regulate stem cell proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful
in the
treatment of many pathological conditions. For example, polypeptides of the
present
invention may be used to manipulate stem cells in culture to give rise to
neuroepithelial cells
that can be used to augment or replace cells damaged by illness, autoimmune
disease,
accidental damage or genetic disorders. The polypeptide of the invention may
be useful for
inducing the proliferation of neural cells and for the regeneration of nerve
and brain tissue,
i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as
well as mechanical and traumatic disorders which involve degeneration, death
or trauma to
neural cells or nerve tissue. In addition, the expanded stem cell populations
can also be
genetically altered for gene therapy purposes and to decrease host rejection
of replacement
tissues after grafting or implantation.
Expression of the polypeptide of the invention and its effect on stem cells
can also be
manipulated to achieve controlled differentiation of the stem cells into more
differentiated
cell types. A broadly applicable method of obtaining pure populations of a
specific
differentiated cell type from undifferentiated stem cell populations involves
the use of a cell-
type specific promoter driving a selectable marker. The selectable marker
allows only cells
of the desired type to survive. For example, stem cells can be induced to
differentiate into
cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991); Klug et
al., J. Clin.
Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L. W. In:
Principles of
Tissue Engineering, eds. Lanza et al., Academic Press (1997)). Alternatively,
directed
differentiation of stem cells can be accomplished by culturing the stem cells
in the presence
of a differentiation factor such as retinoic acid and an antagonist of the
polypeptide of the
invention which would inhibit the effects of endogenous stem cell factor
activity and allow
differentiation to proceed.
In vitro cultures of stem cells can be used to determine if the polypeptide of
the
invention exhibits stem cell growth factor activity. Stem cells are isolated
from any one of
various cell sources (including hematopoietic stem cells and embryonic stem
cells) and
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad.
Sci, U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention
alone or in
combination with other growth factors or cytokines. The ability of the
polypeptide of the
invention to induce stem cell proliferation is determined by colony formation
on semi-solid
support, e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).
4.10.5 HEMATOPOIESIS REGULATING ACTIVITY
A polypeptide of the present invention may be involved in regulation of
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell
disorders.
Even marginal biological activity in support of colony forming cells or of
factor-dependent
cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth
and proliferation of erythroid progenitor cells alone or in combination with
other cytokines,
thereby indicating utility, for example, in treating various anemias or for
use in conjunction
with irradiation/chemotherapy to stimulate the production of erythroid
precursors and/or
erythroid cells; in supporting the growth and proliferation of myeloid cells
such as
granulocytes and monocytes/macrophages (i.e., traditional CSF activity)
useful, for example,
in conjunction with chemotherapy to prevent or treat consequent
myelo-suppression; in
supporting the growth and proliferation of megakaryocytes and consequently of
platelets
thereby allowing prevention or treatment of various platelet disorders such as
thrombocytopenia, and generally for use in place of or complementary to
platelet
transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells
which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and
therefore find therapeutic utility in various stem cell disorders (such as
those usually treated
with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal
hemoglobinuria), as well as in repopulating the stem cell compartment post
irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with
bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous
or
heterologous)) as normal cells or genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic
lines are
cited above.
Assays for embryonic stem cell differentiation (which will identify, among
others,
proteins that influence embryonic differentiation hematopoiesis) include,
without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995;
Keller et al.,
Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915,
1993.


Assays for stem cell survival and differentiation (which will identify, among
others,
proteins that regulate lympho-hematopoiesis) include, without limitation,
those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y.
1994;
Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive
hematopoietic
colony forming cells with high proliferative potential, McNiece, I. K. and
Briddell, R. A. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39,
Wiley-Liss, Inc.,
New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994;
Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of
Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y.
1994; Long term
bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter,
M. and Allen,
T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp.
163-179, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term culture initiating cell assay,
Sutherland, H. J. In
Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162,
Wiley-Liss, Inc.,
New York, N.Y. 1994.
4.10.6 TISSUE GROWTH ACTIVITY
A polypeptide of the present invention also may be involved in bone,
cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in
wound healing and
tissue repair and replacement, and in healing of burns, incisions and
ulcers.
A polypeptide of the present invention which induces cartilage and/or bone
growth in
circumstances where bone is not normally formed, has application in the
healing of bone
fractures and cartilage damage or defects in humans and other animals.
Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention
may have
prophylactic use in closed as well as open fracture reduction and also in
the improved
fixation of artificial joints. De novo bone formation induced by an osteogenic
agent
contributes to the repair of congenital, trauma induced, or oncologic
resection induced
craniofacial defects, and also is useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting
bone-forming
cells, stimulating growth of bone-forming cells, or inducing
differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone
degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage
repair or by
blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast
activity, etc.) mediated by inflammatory processes may also be possible using
the
composition of the invention.
Another category of tissue regeneration activity that may involve the
polypeptide of
the present invention is tendon/ligament formation. Induction of
tendon/ligament-like tissue
or other tissue formation in circumstances where such tissue is not normally
formed, has
application in the healing of tendon or ligament tears, deformities and other
tendon or
ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in
preventing
damage to tendon or ligament tissue, as well as use in the improved fixation
of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or
ligament tissue. De
novo tendon/ligament-like tissue formation induced by a composition of the
present
invention contributes to the repair of congenital, trauma induced, or other
tendon or ligament
defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair
of tendons or ligaments. The compositions of the present invention may provide
an environment to attract tendon- or ligament-forming cells, stimulate growth of
tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to
effect tissue repair. The compositions of the invention may also be useful in
the treatment of
tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions
may also include an appropriate matrix and/or sequestering agent as a carrier
as is well
known in the art.
The compositions of the present invention may also be useful for proliferation
of
neural cells and for regeneration of nerve and brain tissue, i.e. for the
treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical
and
traumatic disorders, which involve degeneration, death or trauma to neural
cells or nerve
tissue. More specifically, a composition may be used in the treatment of
diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral
neuropathy and
localized neuropathies, and central nervous system diseases, such as
Alzheimer's,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager
syndrome. Further conditions which may be treated in accordance with the
present invention
include mechanical and traumatic disorders, such as spinal cord disorders,
head trauma and
cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from
chemotherapy or other medical therapies may also be treatable using a
composition of the
invention.
Compositions of the invention may also be useful to promote better or faster
closure
of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with
vascular insufficiency, surgical and traumatic wounds, and the like.
Compositions of the present invention may also be involved in the generation
or
regeneration of other tissues, such as organs (including, for example,
pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac)
and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells
comprising
such tissues. Part of the desired effects may be by inhibition or modulation
of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present
invention may
also exhibit angiogenic activity.
A composition of the present invention may also be useful for gut protection
or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in
various tissues, and
conditions resulting from systemic cytokine damage.
A composition of the present invention may also be useful for promoting or
inhibiting differentiation of tissues described above from precursor tissues
or cells; or for
inhibiting the growth of tissues described above.
Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those
described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International
Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No.
WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described
in:
Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T.,
eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and
Mertz, J. Invest.
Dermatol 71:382-84 (1978).
4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY
A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for
which assays are
described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting
such activities. A protein may be useful in the treatment of various immune
deficiencies and
disorders (including severe combined immunodeficiency (SCID)), e.g., in
regulating (up or
down) growth and proliferation of T and/or B lymphocytes, as well as effecting
the cytolytic
activity of NK cells and other cell populations. These immune deficiencies
may be genetic or
be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or
may result from
autoimmune disorders. More specifically, infectious diseases caused by viral,
bacterial,
fungal or other infection may be treatable using a protein of the present
invention, including
infections by HIV, hepatitis viruses, herpes viruses, mycobacteria, Leishmania
spp., malaria
spp. and various fungal infections such as candidiasis. Of course, in this
regard, proteins of
the present invention may also be useful where a boost to the immune system
generally may
be desirable, i.e., in the treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present
invention
include, for example, connective tissue disease, multiple sclerosis, systemic
lupus
erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre
syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis,
graft-versus-host disease and autoimmune inflammatory eye disease. Such a
protein (or
antagonists thereof, including antibodies) of the present invention may also
be useful in
the treatment of allergic reactions and conditions (e.g., anaphylaxis, serum
sickness, drug
reactions, food allergies, insect venom allergies, mastocytosis, allergic
rhinitis,
hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic
contact dermatitis, erythema multiforme, Stevens-Johnson syndrome, allergic
conjunctivitis,
atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant papillary
conjunctivitis and
contact allergies), such as asthma (particularly allergic asthma) or other
respiratory
problems. Other conditions, in which immune suppression is desired (including,
for
example, organ transplantation), may also be treatable using a protein (or
antagonists
thereof) of the present invention. The therapeutic effects of the polypeptides
or antagonists
thereof on allergic reactions can be evaluated by in vivo animal models such
as the
cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin
prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin
sensitization test
(Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay
(Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune
responses, in a number of ways. Down regulation may be in the form of
inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
54
an immune response. The functions of activated T cells may be inhibited by
suppressing T
cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T
cell responses is generally an active, non-antigen-specific, process which
requires continuous
exposure of the T cells to the suppressive agent. Tolerance, which involves
inducing
non-responsiveness or anergy in T cells, is distinguishable from
immunosuppression in that
it is generally antigen-specific and persists after exposure to the tolerizing
agent has ceased.
Operationally, tolerance can be demonstrated by the lack of a T cell response
upon
reexposure to specific antigen in the absence of the tolerizing agent.
Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g.,
preventing high
level lymphokine synthesis by activated T cells, will be useful in situations
of tissue, skin
and organ transplantation and in graft-versus-host disease (GVHD). For
example, blockage
of T cell function should result in reduced tissue destruction in tissue
transplantation.
Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition
as foreign by T cells, followed by an immune reaction that destroys the
transplant. The
administration of a therapeutic composition of the invention may prevent
cytokine synthesis
by immune cells, such as T cells, and thus acts as an immunosuppressant.
Moreover, a lack
of costimulation may also be sufficient to anergize the T cells, thereby
inducing tolerance in
a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking
reagents may
avoid the necessity of repeated administration of these blocking reagents. To
achieve
sufficient immunosuppression or tolerance in a subject, it may also be
necessary to block the
function of a combination of B lymphocyte antigens.
The efficacy of particular therapeutic compositions in preventing organ
transplant
rejection or GVHD can be assessed using animal models that are predictive of
efficacy in
humans. Examples of appropriate systems which can be used include allogeneic
cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of
which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in
vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al.,
Proc. Natl. Acad.
Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed.,
Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to
determine the effect of therapeutic compositions of the invention on the
development of that
disease.


Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation
of T cells that are reactive against self tissue and which promote the
production of cytokines
and autoantibodies involved in the pathology of the diseases. Preventing the
activation of
autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents
which block stimulation of T cells can be used to inhibit T cell activation
and prevent
production of autoantibodies or T cell-derived cytokines which may be involved
in the
disease process. Additionally, blocking reagents may induce antigen-specific
tolerance of
autoreactive T cells which could lead to long-term relief from the disease.
The efficacy of
blocking reagents in preventing or alleviating autoimmune disorders can be
determined
using a number of well-characterized animal models of human autoimmune
diseases.
Examples include murine experimental autoimmune encephalitis, systemic lupus
erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen
arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia
gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989,
pp.
840-856).
Upregulation of an antigen function (e.g., a B lymphocyte antigen function),
as a
means of up regulating immune responses, may also be useful in therapy.
Upregulation of
immune responses may be in the form of enhancing an existing immune response
or eliciting
an initial immune response. For example, enhancing an immune response may
be useful in
cases of viral infection, including systemic viral diseases such as influenza,
the common
cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected
patient by
removing T cells from the patient, costimulating the T cells in vitro with
viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with
a stimulatory
form of a soluble peptide of the present invention and reintroducing the in
vitro activated T
cells into the patient. Another method of enhancing anti-viral immune
responses would be to
isolate infected cells from a patient, transfect them with a nucleic acid
encoding a protein of
the present invention as described herein such that the cells express all or a
portion of the
protein on their surface, and reintroduce the transfected cells into the
patient. The infected
cells would now be capable of delivering a costimulatory signal to, and
thereby activate, T
cells in vivo.


A polypeptide of the present invention may provide the necessary stimulation
signal
to T cells to induce a T cell mediated immune response against the transfected
tumor cells.
In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected
with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion)
of an MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II
alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I
or MHC class II proteins on the cell surface. Expression of the appropriate
class I or class II
MHC in conjunction with a peptide having the activity of a B lymphocyte
antigen (e.g.,
B7-1, B7-2, B7-3) induces a T cell mediated immune response against the
transfected tumor
cell. Optionally, a gene encoding an antisense construct which blocks
expression of an MHC
class II associated protein, such as the invariant chain, can also be
cotransfected with a DNA
encoding a peptide having the activity of a B lymphocyte antigen to promote
presentation of
tumor associated antigens and induce tumor specific immunity. Thus, the
induction of a T
cell mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured
by
the following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al.,
J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol.
153:3079-3092, 1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described in:
Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro
antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J.
E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986;
Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783,
1992.
Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al.,
Journal of Experimental Medicine 172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that
regulate
lymphocyte homeostasis) include, without limitation, those described in:
Darzynkiewicz et
al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et
al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk,
Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897,
1993;
Gorczyca et al., International Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine
et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al., Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.
4.10.8 ACTIVIN/INHIBIN ACTIVITY


A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate
the release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present
invention, alone or in heterodimers with a member of the inhibin family, may be useful as a
contraceptive based on the ability of inhibins to decrease fertility in female mammals and
decrease spermatogenesis in male mammals. Administration of sufficient amounts of other
inhibins can induce infertility in these mammals. Alternatively, the polypeptide of the
invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin
group, may be useful as a fertility inducing therapeutic, based upon the ability of activin
molecules to stimulate FSH release from cells of the anterior pituitary. See, for example,
U.S. Pat. No. 4,798,885. A polypeptide of the invention may also be useful for
advancement
of the onset of fertility in sexually immature mammals, so as to increase the
lifetime
reproductive performance of domestic animals such as, but not limited to,
cows, sheep and
pigs.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods.
Assays for activin/inhibin activity include, without limitation, those
described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782,
1986; Vale et
al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage
et al., Proc.
Natl. Acad. Sci. USA 83:3091-3095, 1986.
4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY
A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts,
neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A
polynucleotide of the invention can encode a polypeptide exhibiting such
attributes.
Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a
desired cell population to a desired site of action. Chemotactic or
chemokinetic compositions
(e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular
advantages in treatment of wounds and other trauma to tissues, as well as in
treatment of
localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to
tumors or sites of infection may result in improved immune responses against the tumor or
infecting agent.
A protein or peptide has chemotactic activity for a particular cell population
if it can
stimulate, directly or indirectly, the directed orientation or movement of
such cell
population. Preferably, the protein or peptide has the ability to directly
stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a
population of
cells can be readily determined by employing such protein or peptide in any
known assay for
cell chemotaxis.
Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or
prevent
chemotaxis) consist of assays that measure the ability of a protein to induce
the migration of
cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell
population to another cell population. Suitable assays for movement and
adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E.
Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene
Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta
Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al.
APMIS 103:140-146, 1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of
Immunol. 152:5860-5867, 1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
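As a rough illustration of the membrane-migration readout described above, the following Python sketch computes a migration ratio for a test protein against medium alone. The function names, the ratio itself, and the 0.2 tolerance are assumptions introduced here for illustration; they are not taken from the specification or from the cited protocols.

```python
def chemotactic_index(migrated_with_protein, migrated_with_medium):
    """Ratio of cells crossing the membrane toward the test protein versus medium alone."""
    if migrated_with_medium <= 0:
        raise ValueError("control migration count must be positive")
    return migrated_with_protein / migrated_with_medium


def classify_migration(index, tolerance=0.2):
    """Label a ratio as increased, decreased, or unchanged migration.

    The tolerance is an arbitrary illustrative cutoff, not a value from the text.
    """
    if index > 1.0 + tolerance:
        return "increased"
    if index < 1.0 - tolerance:
        return "decreased"
    return "unchanged"
```

A ratio well above 1 would be consistent with a protein that induces migration, and a ratio well below 1 with one that reduces it, under these assumed cutoffs.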
4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY
A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders
(including hereditary disorders, such as hemophilias) or to enhance coagulation and other
hemostatic events in treating wounds resulting from trauma, surgery or other causes. A
composition of the invention may also be useful for dissolving or inhibiting formation of
thromboses and for treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of cardiac and central nervous system vessels (e.g., stroke)).
Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis
Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins 35:467-474, 1988.
4.10.11 CANCER DIAGNOSIS AND THERAPY
Polypeptides of the invention may be involved in cancer cell generation, proliferation
or metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer.
For example, the presence or increased expression of a polynucleotide/polypeptide of the
invention may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing
malignancy. Conversely, a defect in the gene or absence of the polypeptide may be
associated with a cancer condition. Identification of single nucleotide polymorphisms
associated with cancer or a predisposition to cancer may also be useful for diagnosis or
prognosis.
Cancer treatments promote tumor regression by inhibiting tumor cell proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor
growth) and/or prohibiting metastasis by reducing tumor cell motility or invasiveness.
Therapeutic compositions of the invention may be effective in adult and pediatric oncology
including in solid phase tumors/malignancies, locally advanced tumors, human soft tissue
sarcomas, metastatic cancer, including lymphatic metastases, blood cell malignancies
including multiple myeloma, acute and chronic leukemias, and lymphomas, head and neck
cancers including mouth cancer, larynx cancer and thyroid cancer, lung cancers including
small cell carcinoma and non-small cell cancers, breast cancers including small cell
carcinoma and ductal carcinoma, gastrointestinal cancers including esophageal cancer,
stomach cancer, colon cancer, colorectal cancer and polyps associated with colorectal
neoplasia, pancreatic cancers, liver cancer, urologic cancers including bladder cancer and
prostate cancer, malignancies of the female genital tract including ovarian carcinoma, uterine
(including endometrial) cancers, and solid tumor in the ovarian follicle, kidney cancers
including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant
melanoma, tumor progression of human skin keratinocytes, squamous cell carcinoma, basal
cell carcinoma, hemangiopericytoma and Kaposi's sarcoma.
Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including inhibitors and stimulators of the biological activity of the polypeptide of the
invention) may be administered to treat cancer. Therapeutic compositions can
be
administered in therapeutically effective dosages alone or in combination with
adjuvant
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and
laser
therapy, and may provide a beneficial effect, e.g. reducing tumor size,
slowing rate of tumor
growth, inhibiting metastasis, or otherwise improving overall clinical
condition, without
necessarily eradicating the cancer.
The composition can also be administered in therapeutically effective amounts
as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of
the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically acceptable carrier for delivery. The use of anti-cancer
cocktails as a cancer
treatment is routine. Anti-cancer drugs that are well known in the art and
can be used as a
treatment in combination with the polypeptide or modulator of the invention
include:
Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan, Carboplatin,
Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide, Cytarabine HCl
(Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl, Doxorubicin HCl,
Estramustine phosphate sodium, Etoposide (VP16-213), Floxuridine, 5-Fluorouracil (5-Fu),
Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon Alpha-2a, Interferon
Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), Lomustine, Mechlorethamine
HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, Methotrexate (MTX),
Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, Streptozocin,
Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.
In addition, therapeutic compositions of the invention may be used for
prophylactic
treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to
developing
cancers. Under these circumstances, it may be beneficial to treat these
individuals with
therapeutically effective doses of the polypeptide of the invention to reduce
the risk of
developing cancers.
In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays
of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987)
Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18
and Ch 21), tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst.,
52: 921-30 (1974), mobility and invasive potential of tumor cells in Boyden Chamber assays
as described in Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis
assays such as induction of vascularization of the chick chorioallantoic membrane or
induction of vascular endothelial cell migration as described in Ribatta et al., Intl. J. Dev.
Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp. Metastasis, 17:423-9 (1999), respectively.
Suitable tumor cell lines are available, e.g. from American Type Tissue Culture Collection
catalogs.
4.10.12 RECEPTOR/LIGAND ACTIVITY
A polypeptide of the present invention may also demonstrate activity as
receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A
polynucleotide of
the invention can encode a polypeptide exhibiting such characteristics.
Examples of such
receptors and ligands include, without limitation, cytokine receptors and
their ligands,
receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors
involved in cell-cell interactions and their ligands (including without limitation, cellular
adhesion molecules (such as selectins, integrins and their ligands)) and receptor/ligand pairs
involved in antigen presentation, antigen recognition and development of
cellular and
humoral immune responses. Receptors and ligands are also useful for screening
of potential
peptide or small molecule inhibitors of the relevant receptor/ligand
interaction. A protein of
the present invention (including, without limitation, fragments of receptors and ligands) may
itself be useful as an inhibitor of receptor/ligand interactions.
The activity of a polypeptide of the invention may, among other means, be
measured
by the following methods:
Suitable assays for receptor-ligand activity include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions
7.28.1-7.28.22), Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al.,
J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989;
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.


By way of example, the polypeptides of the invention may be used as a receptor
for a
ligand(s) thereby transmitting the biological activity of that ligand(s).
Ligands may be
identified through binding assays, affinity chromatography, dihybrid screening
assays,
BIAcore assays, gel overlay assays, or other methods known in the art.
Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides
of the present invention or ligand(s) thereof may be labeled by being coupled to
radioisotopes, colorimetric molecules or toxin molecules by conventional methods
("Guide to Protein Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182
(1990) Academic Press, Inc. San Diego). Examples of radioisotopes include, but are not
limited to, tritium and carbon-14. Examples of colorimetric molecules include, but are not
limited to, fluorescent molecules such as fluorescamine, or rhodamine or other colorimetric
molecules. Examples of toxins include, but are not limited to, ricin.
4.10.13 DRUG SCREENING
This invention is particularly useful for screening chemical compounds by
using the
novel polypeptides or binding fragments thereof in any of a variety of drug
screening
techniques. The polypeptides or fragments employed in such a test may either
be free in
solution, affixed to a solid support, borne on a cell surface or located
intracellularly. One
method of drug screening utilizes eukaryotic or prokaryotic host cells which
are stably
transformed with recombinant nucleic acids expressing the polypeptide or a
fragment
thereof. Drugs are screened against such transformed cells in competitive
binding assays.
Such cells, either in viable or fixed form, can be used for standard binding
assays. One may
measure, for example, the formation of complexes between polypeptides of the
invention or
fragments and the agent being tested or examine the diminution in complex
formation
between the novel polypeptides and an appropriate cell line, which are well
known in the art.
Sources for test compounds that may be screened for ability to bind to or
modulate
(i.e., increase or decrease) the activity of polypeptides of the invention
include (1) iilorganic
and organic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic
molecules.
Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or
compounds
that are identified as "hits" or "leads" via natural product screening.


The sources of natural product libraries are microorganisms (including
bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and
libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from
soil, plant or
marine microorganisms or (2) extraction of the organisms themselves. Natural
product
libraries include polyketides, non-ribosomal peptides, and (non-naturally
occurring) variants
thereof. For a review, see Science 282:63-68 (1998).
Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides
or organic compounds and can be readily prepared by traditional automated
synthesis
methods, PCR, cloning or proprietary synthetic methods. Of particular interest
are peptide
and oligonucleotide combinatorial libraries. Still other libraries of interest
include peptide,
protein, peptidomimetic, multiparallel synthetic collection, recombinatorial,
and polypeptide
libraries. For a review of combinatorial chemistry and libraries created
therefrom, see
Myers, Curr. Opin. Biotechnol. 8:701-707 (1997). For reviews and examples of
peptidomimetic libraries, see Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23 (1998); Hruby
et al., Curr Opin Chem Biol, 1(1):114-19 (1997); Dorner et al., Bioorg Med Chem,
4(5):709-15 (1996) (alkylated dipeptides).
Identification of modulators through use of the various libraries described
herein
permits modification of the candidate "hit" (or "lead") to optimize the
capacity of the "hit"
to bind a polypeptide of the invention. The molecules identified in the
binding assay are then
tested for antagonist or agonist activity in in vivo tissue culture or animal
models that are
well known in the art. In brief, the molecules are titrated into a plurality
of cell cultures or
animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g.,
ricin or
cholera, or with other compounds that are toxic to cells such as
radioisotopes. The
toxin-binding molecule complex is then targeted to a tumor or other cell by
the specificity of
the binding molecule for a polypeptide of the invention. Alternatively, the
binding
molecules may be complexed with imaging agents for targeting and imaging
purposes.
4.10.14 ASSAY FOR RECEPTOR ACTIVITY
The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For
example, expression cloning using mammalian or bacterial cells, or dihybrid screening
assays can be used to identify polynucleotides encoding binding partners. As another
example, affinity chromatography with the appropriate immobilized polypeptide of the
invention can be used to isolate polypeptides that recognize and bind polypeptides of the
invention. There are a number of different libraries used for the identification of
compounds, and in particular small molecules, that modulate (i.e., increase or decrease)
biological activity of a polypeptide of the invention. Ligands for receptor polypeptides of the
invention can also be identified by adding exogenous ligands, or cocktails of ligands to two
cell populations that are genetically identical except for the expression of the receptor of the
invention: one cell population expresses the receptor of the invention whereas the other does
not. The responses of the two cell populations to the addition of ligand(s) are then
compared. Alternatively, an expression library can be co-expressed with the polypeptide of
the invention in cells and assayed for an autocrine response to identify potential ligand(s). As
still another example, BIAcore assays, gel overlay assays, or other methods known in the art
can be used to identify binding partner polypeptides, including, (1) organic and inorganic
chemical libraries, (2) natural product libraries, and (3) combinatorial libraries comprised of
random peptides, oligonucleotides or organic molecules.
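The comparison of two otherwise identical cell populations described above can be sketched as a simple difference of mean responses. This Python example is illustrative only: the function names, the replicate-based data layout, and the `min_difference` cutoff are assumptions introduced here, and a real screen would likely apply a proper statistical test.

```python
from statistics import mean


def differential_response(expressing, control):
    """Mean response of receptor-expressing cells minus mean response of control cells."""
    return mean(expressing) - mean(control)


def candidate_ligands(responses, min_difference):
    """Rank ligands whose response depends on receptor expression.

    responses maps a ligand name to a pair of replicate lists:
    (receptor-expressing population, non-expressing population).
    """
    scored = {
        name: differential_response(expressing, control)
        for name, (expressing, control) in responses.items()
    }
    return sorted(
        (name for name, diff in scored.items() if diff >= min_difference),
        key=lambda name: scored[name],
        reverse=True,
    )
```

A ligand that provokes a similar response in both populations scores near zero and is dropped, which mirrors the logic of using the non-expressing population as the control.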
The role of downstream intracellular signaling molecules in the signaling cascade of
the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.,
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.
4.10.15 ANTI-INFLAMMATORY ACTIVITY
Compositions of the present invention may also exhibit anti-inflammatory activity.
The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in
the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for
example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the
inflammatory process, inhibiting or promoting cell extravasation, or by stimulating or
suppressing production of other factors which more directly inhibit or promote an
inflammatory response. Compositions with such activities can be used to treat inflammatory
conditions (including chronic or acute conditions), including without limitation inflammation
associated with infection (such as septic shock, sepsis or systemic inflammatory response
syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis,
complement-mediated hyperacute rejection, nephritis, cytokine or chemokine-induced lung
injury, inflammatory bowel disease, Crohn's disease or resulting from over production of
cytokines such as TNF or IL-1. Compositions of the invention may also be useful to treat
anaphylaxis and hypersensitivity to an antigenic substance or material. Compositions of this
invention may be utilized to prevent or treat conditions such as, but not limited to, sepsis,
acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid arthritis, chronic
inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, graft versus
host disease, inflammatory bowel disease, inflammation associated with pulmonary disease,
other autoimmune disease or inflammatory disease, an antiproliferative agent such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to
intrauterine infections.
4.10.16 LEUKEMIAS
Leukemias and related disorders may be treated or prevented by administration
of a
therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of
the invention. Such leukemias and related disorders include but are not
limited to acute
leukemia, acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic leukemia,
chronic
myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia (for a
review of such
disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co.,
Philadelphia).
4.10.17 NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested for
efficacy of
intervention with compounds that modulate the activity of the polynucleotides
and/or
polypeptides of the invention, and which can be treated upon thus observing an
indication of
therapeutic utility, include but are not limited to nervous system injuries,
and diseases or
disorders which result in either a disconnection of axons, a diminution or
degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient
(including human and non-human mammalian patients) according to the invention include
but are not limited to the following lesions of either the central (including spinal cord, brain)
or peripheral nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;
(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system
results in neuronal injury or death, including cerebral infarction or
ischemia, or spinal cord
infarction or ischemia;
(iii) infectious lesions, in which a portion of the nervous system is
destroyed or
injured as a result of infection, for example, by an abscess or associated
with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with
Lyme
disease, tuberculosis, syphilis;
(iv) degenerative lesions, in which a portion of the nervous system is
destroyed or
injured as a result of a degenerative process including but not limited to
degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea,
or
amyotrophic lateral sclerosis;
(v) lesions associated with nutritional diseases or disorders, in which a
portion of
the nervous system is destroyed or injured by a nutritional disorder or
disorder of
metabolism including but not limited to, vitamin B12 deficiency, folic acid
deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease
(primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;
(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus
erythematosus,
carcinoma, or sarcoidosis;
(vii) lesions caused by toxic substances including alcohol, lead, or
particular
neurotoxins; and
(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or
injured by a demyelinating disease including but not limited to multiple
sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various
etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.
Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival
or differentiation of neurons. For example, and not by way of limitation, therapeutics which
elicit any of the following effects may be useful according to the invention:
(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in
vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor
neurons; or
(iv) decreased symptoms of neuron dysfunction in vivo.
Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method
set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased
sprouting of neurons
may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or
Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay,
antibody
binding, Northern blot assay, etc., depending on the molecule to be measured;
and motor
neuron dysfunction may be measured by assessing the physical manifestation
of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or
functional disability.
In specific embodiments, motor neuron disorders that may be treated according
to the
invention include but are not limited to disorders such as infarction,
infection, exposure to
toxin, trauma, surgical damage, degenerative disease or malignancy that may
affect motor
neurons as well as other components of the nervous system, as well as
disorders that
selectively affect neurons such as amyotrophic lateral sclerosis, and
including but not limited
to progressive spinal muscular atrophy, progressive bulbar palsy, primary
lateral sclerosis,
infantile and juvenile muscular atrophy, progressive bulbar paralysis of
childhood (Fazio-
Londe syndrome), poliomyelitis and the post polio syndrome, and Hereditary
Motorsensory
Neuropathy (Charcot-Marie-Tooth Disease).
4.10.18 OTHER ACTIVITIES
A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function
of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and
other parasites;
effecting (suppressing or enhancing) bodily characteristics, including,
without limitation,
height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or
organ or body part size or shape (such as, for example, breast augmentation or
diminution,
change in bone form or shape); effecting biorhythms or circadian cycles or
rhythms;
effecting the fertility of male or female subjects; effecting the metabolism,
catabolism,
anabolism, processing, utilization, storage or elimination of dietary fat,
lipid, protein,
carbohydrate, vitamins, minerals, co-factors or other nutritional factors or
component(s);
effecting behavioral characteristics, including, without limitation, appetite,
libido, stress,
cognition (including cognitive disorders), depression (including depressive
disorders) and
violent behaviors; providing analgesic effects or other pain reducing effects;
promoting
differentiation and growth of embryonic stem cells in lineages other than
hematopoietic
lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of
the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative
disorders (such as, for example, psoriasis); immunoglobulin-like activity
(such as, for
example, the ability to bind antigens or complement); and the ability to act
as an antigen in a
vaccine composition to raise an immune response against such protein or
another material or
entity which is cross-reactive with such protein.
4.10.19 IDENTIFICATION OF POLYMORPHISMS
The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this
information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g.,
differential
predisposition or susceptibility to various disease states (such as disorders
involving
inflammation or immune response) or a differential response to drug
administration, and this
genetic information can be used to tailor preventive or therapeutic treatment
appropriately.
For example, the existence of a polymorphism associated with a predisposition
to
inflammation or autoimmune disease makes possible the diagnosis of this
condition in
humans by identifying the presence of the polymorphism.
Polymorphisms can be identified in a variety of ways known in the art which
all
generally involve obtaining a sample from a patient, analyzing DNA from the
sample,
optionally involving isolation or amplification of the DNA, and identifying
the presence of
the polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate
fragment of genomic DNA which may then be sequenced. Alternatively, the DNA
may be
subjected to allele-specific oligonucleotide hybridization (in which
appropriate
oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single
base mismatch) or to a single nucleotide extension assay (in which an
oligonucleotide that
hybridizes immediately adjacent to the position of the polymorphism is
extended with one or
more labeled nucleotides). In addition, traditional restriction fragment
length polymorphism
analysis (using restriction enzymes that provide differential digestion of the
genomic DNA
depending on the presence or absence of the polymorphism) may be performed.
Arrays with
nucleotide sequences of the present invention can be used to detect
polymorphisms. The
array can comprise modified nucleotide sequences of the present invention in
order to detect
the nucleotide sequences of the present invention. In the alternative, any one
of the
nucleotide sequences of the present invention can be placed on the array to
detect changes
from those sequences.
Alternatively, a polymorphism resulting in a change in the amino acid
sequence could
also be detected by detecting a corresponding change in amino acid sequence of
the protein,
e.g., by an antibody specific to the variant sequence.
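As an illustration of the restriction fragment length polymorphism analysis described above, the following minimal Python sketch digests two alleles in silico and compares the resulting fragment lengths. It is a sketch under stated assumptions, not a method of the invention: the two short sequences, the choice of the EcoRI recognition site (GAATTC, cut after the first base), and the function name digest are hypothetical and chosen only for illustration.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts between G and AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(len(sequence) - start)
    return fragments


# Hypothetical alleles differing by a single base inside the recognition site.
reference_allele = "ATGCCGGAATTCTTAGGCAT"
variant_allele = "ATGCCGGAGTTCTTAGGCAT"

reference_fragments = digest(reference_allele)
variant_fragments = digest(variant_allele)

# A difference in fragment pattern indicates that the polymorphism
# creates or destroys the restriction site.
polymorphism_detected = reference_fragments != variant_fragments
print(reference_fragments, variant_fragments, polymorphism_detected)
```

In this toy case the reference allele is cut once and the variant allele is not cut at all, which is the differential digestion that the analysis relies on.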
4.10.20 ARTHRITIS AND INFLAMMATION
The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the
protocol is described
by J. Holoshitz et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch.
Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single
injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in
complete Freund's adjuvant (CFA). The route of injection can vary, but rats
may be injected
at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate
buffered solution (PBS) at a dose of about 1-5 mg/kg. The control consists of
administering
PBS only.
The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately
administering the test compound and subsequent treatment every other day until
day 24. At
14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium CFA, an
overall arthritis
score may be obtained as described by J. Holoshitz above. An analysis of the
data would
reveal that the test compound would have a dramatic effect on the swelling
of the joints as
measured by a decrease of the arthritis score.
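The timing of the protocol above can be laid out explicitly. The short Python sketch below is illustrative only. It assumes that the adjuvant injection defines day 0, that the first dose is given on day 0, and that "every other day" means days 2, 4, and so on through day 24. The variable names are hypothetical and are not part of the described protocol.

```python
# Assumption: day 0 is the day of the Mycobacterium/CFA injection and first dose.
LAST_DAY = 24

# Dosing immediately after injection, then every other day until day 24.
dosing_days = list(range(0, LAST_DAY + 1, 2))

# Scoring days as stated in the text.
scoring_days = [14, 15, 18, 20, 22, 24]

# Days on which an animal is both dosed and scored under these assumptions.
dosed_and_scored = [day for day in scoring_days if day in dosing_days]

print("dosing days:", dosing_days)
print("scoring days:", scoring_days)
print("dosed and scored:", dosed_and_scored)
```

Under these assumptions, day 15 is the only scoring day that does not coincide with a dosing day.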
4.11 THERAPEUTIC METHODS
The compositions (including polypeptide fragments, analogs, variants and
antibodies
or other binding partners or modulators including antisense polynucleotides)
of the invention
have numerous applications in a variety of therapeutic methods. Examples of
therapeutic
applications include, but are not limited to, those exemplified herein.
4.11.1 EXAMPLE
One embodiment of the invention is the administration of an effective amount
of the
polypeptides or other composition of the invention to individuals affected by
a disease or
disorder that can be modulated by regulating the peptides of the invention.
While the mode
of administration is not particularly important, parenteral administration is
preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The
dosage of the
polypeptides or other composition of the invention will normally be determined
by the
prescribing physician. It is to be expected that the dosage will vary
according to the age,
weight, condition and response of the individual patient. Typically, the
amount of
polypeptide administered per dose will be in the range of about 0.01 µg/kg to
100 mg/kg of
body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg of
patient body
weight. For parenteral administration, polypeptides of the invention will be
formulated in an
injectable form combined with a pharmaceutically acceptable parenteral
vehicle. Such
vehicles are well known in the art and examples include water, saline,
Ringer's solution,
dextrose solution, and solutions consisting of small amounts of the human
serum albumin.
The vehicle may contain minor amounts of additives that maintain the
isotonicity and
stability of the polypeptide or other active ingredient. The preparation of
such solutions is
within the skill of the art.
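To make the weight-based ranges above concrete, the following Python sketch converts the stated per-kilogram ranges into total amounts per dose. It is arithmetic only and not dosing guidance. The 70 kg body weight and the function name dose_range_ug are hypothetical values chosen for illustration.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def dose_range_ug(weight_kg, low_ug_per_kg, high_ug_per_kg):
    """Return (low, high) total amount per dose in micrograms."""
    return weight_kg * low_ug_per_kg, weight_kg * high_ug_per_kg


weight_kg = 70.0  # hypothetical patient weight, not taken from the text

# Broad range: about 0.01 ug/kg to 100 mg/kg.
broad = dose_range_ug(weight_kg, 0.01, 100 * UG_PER_MG)
# Preferred range: about 0.1 ug/kg to 10 mg/kg.
preferred = dose_range_ug(weight_kg, 0.1, 10 * UG_PER_MG)

print("broad range (ug):", broad)
print("preferred range (ug):", preferred)
```

For the assumed 70 kg weight, the broad range spans roughly 0.7 µg to 7 g per dose and the preferred range roughly 7 µg to 700 mg, which shows how wide the stated ranges are and why the text leaves the actual dosage to the prescribing physician.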
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION
A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant
sources and
including antibodies and other binding partners of the polypeptides of the
invention) may be
administered to a patient in need, by itself, or in pharmaceutical
compositions where it is
mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a
variety of
disorders. Such a composition may optionally contain (in addition to protein
or other active
ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers,
solubilizers, and other
materials well known in the art. The term "pharmaceutically acceptable" means
a non-toxic
material that does not interfere with the effectiveness of the biological
activity of the active
ingredient(s). The characteristics of the carrier will depend on the route of
administration.
The pharmaceutical composition of the invention may also contain cytokines,
lymphokines,
or other hematopoietic factors such as M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3,
IL-4, IL-5,
IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14, IL-15, IFN, TNF0,
TNF1, TNF2,
G-CSF, Meg-CSF, thrombopoietin, stem cell factor, and erythropoietin. In
further
compositions, proteins of the invention may be combined with other agents
beneficial to the
treatment of the disease or disorder in question. These agents include various
growth factors
such as epidermal growth factor (EGF), platelet-derived growth factor (PDGF),
transforming
growth factors (TGF-α and TGF-β), insulin-like growth factor (IGF), as well
as cytokines
described herein.
The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement
its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or
other active
ingredient of the invention, or to minimize side effects. Conversely, protein
or other active
ingredient of the present invention may be included in formulations of the
particular clotting
factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic
factor, or anti-inflammatory agent to minimize side effects of the clotting
factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or
anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF,
corticosteroids,
immunosuppressive agents). A protein of the present invention may be active in
multimers
(e.g., heterodimers or homodimers) or complexes with itself or other proteins.
As a result,
pharmaceutical compositions of the invention may comprise a protein of the
invention in
such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the
invention
including a first protein, a second protein or a therapeutic agent may be
concurrently
administered with the first protein (e.g., at the same time, or at differing
times provided that
therapeutic concentrations of the combination of agents are achieved at the
treatment site).
Techniques for formulation and administration of the compounds of the instant
application
may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co.,
Easton, PA,
latest edition. A therapeutically effective dose further refers to that amount
of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing,
prevention or
amelioration of the relevant medical condition, or an increase in rate of
treatment, healing,
prevention or amelioration of such conditions. When applied to an individual
active
ingredient, administered alone, a therapeutically effective dose refers to
that ingredient
alone. When applied to a combination, a therapeutically effective dose refers
to combined
amounts of the active ingredients that result in the therapeutic effect,
whether administered
in combination, serially or simultaneously.
In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the
present invention
is administered to a mammal having a condition to be treated. Protein or other
active
ingredient of the present invention may be administered in accordance with the
method of
the invention either alone or in combination with other therapies such as
treatments
employing cytokines, lymphokines or other hematopoietic factors. When
co-administered
with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other
active ingredient of the present invention may be administered either
simultaneously with
the cytokine(s), lymphokine(s), other hematopoietic factor(s), thrombolytic or
anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician
will decide on the appropriate sequence of administering protein or other
active ingredient of
the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic
factor(s), thrombolytic or anti-thrombotic factors.
4.12.1 ROUTES OF ADMINISTRATION
Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including
intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct
intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections.
Administration of protein
or other active ingredient of the present invention used in the pharmaceutical
composition or
to practice the method of the present invention can be carried out in a variety
of conventional
ways, such as oral ingestion, inhalation, topical application or cutaneous,
subcutaneous,
intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient
is preferred.
Alternately, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic
joints or in
fibrotic tissue, often in a depot or sustained release formulation. In order
to prevent the
scarring process that frequently occurs as a complication of glaucoma surgery, the
compounds
may be administered topically, for example, as eye drops. Furthermore, one may
administer
the drug in a targeted drug delivery system, for example, in a liposome coated
with a specific
antibody, targeting, for example, arthritic or fibrotic tissue. The liposomes
will be targeted
to and taken up selectively by the afflicted tissue.
The polypeptides of the invention are administered by any route that delivers
an
effective dosage to the desired site of action. The determination of a
suitable route of
administration and an effective dosage for a particular indication is within
the level of skill
in the art. Preferably for wound treatment, one administers the therapeutic
compound
directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be
extrapolated from these dosages or from similar studies in appropriate animal
models.
Dosages can then be adjusted as necessary by the clinician to provide maximal
therapeutic
benefit.
4.12.2 COMPOSITIONS/FORMULATIONS
Pharmaceutical compositions for use in accordance with the present invention
thus
may be formulated in a conventional manner using one or more physiologically
acceptable
carriers comprising excipients and auxiliaries which facilitate processing of
the active
compounds into preparations which can be used pharmaceutically. These
pharmaceutical
compositions may be manufactured in a manner that is itself known, e.g., by
means of
conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying,
encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon
the route of administration chosen. When a therapeutically effective amount of
protein or
other active ingredient of the present invention is administered orally,
protein or other active
ingredient of the present invention will be in the form of a tablet, capsule,
powder, solution
or elixir. When administered in tablet form, the pharmaceutical composition of
the invention
may additionally contain a solid carrier such as a gelatin or an adjuvant. The
tablet, capsule,
and powder contain from about 5 to 95% protein or other active ingredient of
the present
invention, and preferably from about 25 to 90% protein or other active
ingredient of the
present invention. When administered in liquid form, a liquid carrier such as
water,
petroleum, oils of animal or plant origin such as peanut oil, mineral oil,
soybean oil, or
sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical
composition may further contain physiological saline solution, dextrose or
other saccharide
solution, or glycols such as ethylene glycol, propylene glycol or polyethylene
glycol. When
administered in liquid form, the pharmaceutical composition contains from
about 0.5 to 90%
by weight of protein or other active ingredient of the present invention, and
preferably from
about 1 to 50% protein or other active ingredient of the present invention.
When a therapeutically effective amount of protein or other active ingredient
of the
present invention is administered by intravenous, cutaneous or subcutaneous
injection,
protein or other active ingredient of the present invention will be in the
form of a
pyrogen-free, parenterally acceptable aqueous solution. The preparation of
such parenterally
acceptable protein or other active ingredient solutions, having due regard
to pH, isotonicity,
stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition
for intravenous, cutaneous, or subcutaneous injection should contain, in
addition to protein
or other active ingredient of the present invention, an isotonic vehicle such
as Sodium
Chloride Injection, Ringer's Injection, Dextrose Injection, Dextrose and
Sodium Chloride
Injection, Lactated Ringer's Injection, or other vehicle as known in the
art. The
pharmaceutical composition of the present invention may also contain
stabilizers,
preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For
injection, the agents of the invention may be formulated in aqueous solutions,
preferably in
physiologically compatible buffers such as Hanks's solution, Ringer's
solution, or
physiological saline buffer. For transmucosal administration, penetrants
appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are
generally known in
the art.
For oral administration, the compounds can be formulated readily by combining
the
active compounds with pharmaceutically acceptable carriers well known in the
art. Such
carriers enable the compounds of the invention to be formulated as tablets,
pills, dragees,
capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a
patient to be treated. Pharmaceutical preparations for oral use can be
obtained from a solid
excipient, optionally grinding a resulting mixture, and processing the mixture
of granules,
after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable
excipients are, in particular, fillers such as sugars, including lactose,
sucrose, mannitol, or
sorbitol; cellulose preparations such as, for example, maize starch, wheat
starch, rice starch,
potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethylcellulose,
sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents may be added, such as the cross-linked polyvinyl
pyrrolidone, agar, or
alginic acid or a salt thereof such as sodium alginate. Dragee cores are
provided with
suitable coatings. For this purpose, concentrated sugar solutions may be used,
which may
optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel,
polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for
identification or to
characterize different combinations of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules
made
of gelatin, as well as soft, sealed capsules made of gelatin and a
plasticizer, such as glycerol
or sorbitol. The push-fit capsules can contain the active ingredients in
admixture with filler
such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium
stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved
or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene
glycols. In addition, stabilizers may be added. All formulations for oral
administration
should be in dosages suitable for such administration. For buccal
administration, the
compositions may take the form of tablets or lozenges formulated in
conventional manner.
For administration by inhalation, the compounds for use according to the
present
invention are conveniently delivered in the form of an aerosol spray
presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane,
carbon dioxide
or other suitable gas. In the case of a pressurized aerosol the dosage unit
may be determined
by providing a valve to deliver a metered amount. Capsules and cartridges of,
e.g., gelatin
for use in an inhaler or insufflator may be formulated containing a powder mix
of the
compound and a suitable powder base such as lactose or starch. The compounds
may be
formulated for parenteral administration by injection, e.g., by bolus
injection or continuous
infusion. Formulations for injection may be presented in unit dosage form,
e.g., in ampules
or in multi-dose containers, with an added preservative. The compositions may
take such
forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and
may contain
formulatory agents such as suspending, stabilizing and/or dispersing agents.
Pharmaceutical formulations for parenteral administration include aqueous
solutions
of the active compounds in water-soluble form. Additionally, suspensions of
the active
compounds may be prepared as appropriate oily injection suspensions. Suitable
lipophilic
solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty
acid esters, such
as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions
may contain
substances which increase the viscosity of the suspension, such as sodium
carboxymethyl
cellulose, sorbitol, or dextran. Optionally, the suspension may also contain
suitable
stabilizers or agents which increase the solubility of the compounds to allow
for the
preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-
free water, before
use.
The compounds may also be formulated in rectal compositions such as
suppositories
or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or
other glycerides. In addition to the formulations described previously, the
compounds may
also be formulated as a depot preparation. Such long acting formulations may
be
administered by implantation (for example subcutaneously or intramuscularly)
or by
intramuscular injection. Thus, for example, the compounds may be formulated
with suitable
polymeric or hydrophobic materials (for example as an emulsion in an
acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a
sparingly soluble salt.
A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-
solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-
miscible organic
polymer, and an aqueous phase. The co-solvent system may be the VPD co-solvent
system.
VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant
polysorbate
80, and 65% w/v polyethylene glycol 300, made up to volume in absolute
ethanol. The VPD
co-solvent system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in
water
solution. This co-solvent system dissolves hydrophobic compounds well, and
itself produces
low toxicity upon systemic administration. Naturally, the proportions of a co-
solvent system
may be varied considerably without destroying its solubility and toxicity
characteristics.
Furthermore, the identity of the co-solvent components may be varied: for
example, other
low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the
fraction size of
polyethylene glycol may be varied; other biocompatible polymers may replace
polyethylene
glycol, e.g. polyvinyl pyrrolidone; and other sugars or polysaccharides may
substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds
may be employed. Liposomes and emulsions are well known examples of delivery
vehicles
or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also
may be employed, although usually at the cost of greater toxicity.
Additionally, the
compounds may be delivered using a sustained-release system, such as
semipermeable
matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of
sustained-release materials have been established and are well known by those
skilled in the
art. Sustained-release capsules may, depending on their chemical nature,
release the
compounds for a few weeks up to over 100 days. Depending on the chemical
nature and the
biological stability of the therapeutic reagent, additional strategies for
protein or other active
ingredient stabilization may be employed.
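The composition of the diluted VPD co-solvent system described above follows from simple mixing arithmetic. The Python sketch below computes the final concentrations after a 1:1 dilution of VPD with 5% dextrose in water. It assumes that the dilution is by volume, that the volumes are additive, and that the 5% dextrose figure is also w/v. The dictionary layout and the function name mix are hypothetical and for illustration only.

```python
# Concentrations in % w/v, as stated in the text for VPD.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# Diluent: 5% dextrose in water (assumed to be % w/v).
DILUENT = {"dextrose": 5.0}


def mix(solution_a, solution_b, parts_a=1.0, parts_b=1.0):
    """Return component concentrations after mixing two solutions by volume.

    Assumes additive volumes, so each concentration is a volume-weighted mean.
    """
    total = parts_a + parts_b
    names = sorted(set(solution_a) | set(solution_b))
    return {
        name: (solution_a.get(name, 0.0) * parts_a
               + solution_b.get(name, 0.0) * parts_b) / total
        for name in names
    }


final = mix(VPD, DILUENT)  # 1:1 dilution
for name, percent in final.items():
    print(f"{name}: {percent}% w/v")
```

Under these assumptions each VPD component is halved by the dilution (1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300) and the dextrose concentration in the final mixture is 2.5%. Changing parts_a and parts_b gives the corresponding figures for other proportions of the co-solvent system.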
The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are
not limited to
calcium carbonate, calcium phosphate, various sugars, starches, cellulose
derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active
ingredients of the
invention may be provided as salts with pharmaceutically compatible counter
ions. Such
pharmaceutically acceptable base addition salts are those salts which retain
the biological
effectiveness and properties of the free acids and which are obtained by
reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide,
ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium
acetate,
potassium benzoate, triethanol amine and the like.
The pharmaceutical composition of the invention may be in the form of a
complex of
the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal
to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface
immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor
(TCR)
following presentation of the antigen by MHC proteins. MHC and structurally
related
proteins including those encoded by class I and class II MHC genes on host
cells will serve
to present the peptide antigen(s) to T lymphocytes. The antigen components
could also be
supplied as purified MHC-peptide complexes alone or with co-stimulatory
molecules that
can directly signal T cells. Alternatively antibodies able to bind surface
immunoglobulin
and other molecules on B cells as well as antibodies able to bind the TCR and
other
molecules on T cells can be combined with the pharmaceutical composition of
the invention.
The pharmaceutical composition of the invention may be in the form of a
liposome in
which protein of the present invention is combined, in addition to other
pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in
aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous
solution.
Suitable lipids for liposomal formulation include, without limitation,
monoglycerides,
diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like.
Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed,
for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of
which are incorporated herein by reference.
The amount of protein or other active ingredient of the present invention in
the
pharmaceutical composition of the present invention will depend upon the
nature and
severity of the condition being treated, and on the nature of prior treatments
which the
patient has undergone. Ultimately, the attending physician will decide the
amount of protein
or other active ingredient of the present invention with which to treat each
individual patient.
Initially, the attending physician will administer low doses of protein or
other active
ingredient of the present invention and observe the patient's response. Larger
doses of
protein or other active ingredient of the present invention may be
administered until the
optimal therapeutic effect is obtained for the patient, and at that point the
dosage is not
increased further. It is contemplated that the various pharmaceutical
compositions used to
practice the method of the present invention should contain about 0.01 µg to
about 100 mg
(preferably about 0.1 µg to about 10 mg, more preferably about 0.1 µg to
about 1 mg) of
protein or other active ingredient of the present invention per kg body
weight. For
compositions of the present invention which are useful for bone, cartilage,
tendon or
ligament regeneration, the therapeutic method includes administering the
composition
topically, systemically, or locally as an implant or device. When
administered, the
therapeutic composition for use in this invention is, of course, in a pyrogen-
free,
physiologically acceptable form. Further, the composition may desirably be
encapsulated or
injected in a viscous form for delivery to the site of bone, cartilage or
tissue damage.
Topical administration may be suitable for wound healing and tissue repair.
Therapeutically
useful agents other than a protein or other active ingredient of the invention
which may also
optionally be included in the composition as described above, may
alternatively or
additionally, be administered simultaneously or sequentially with the
composition in the
methods of the invention. Preferably for bone and/or cartilage formation, the
composition
would include a matrix capable of delivering the protein-containing or other
active
ingredient-containing composition to the site of bone and/or cartilage damage,
providing a
structure for the developing bone and cartilage and optimally capable of being
resorbed into
the body. Such matrices may be formed of materials presently in use for other
implanted
medical applications.


The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The
particular
application of the compositions will define the appropriate formulation.
Potential matrices
for the compositions may be biodegradable and chemically defined calcium
sulfate,
tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides.
Other potential materials are biodegradable and biologically well-defined,
such as bone or
dermal collagen. Further matrices are comprised of pure proteins or
extracellular matrix
components. Other potential matrices are nonbiodegradable and chemically
defined, such as
sintered hydroxyapatite, bioglass, aluminates, or other ceramics. Matrices may
be comprised
of combinations of any of the above-mentioned types of material, such as
polylactic acid and
hydroxyapatite or collagen and tricalcium phosphate. The bioceramics may be
altered in
composition, such as in calcium-aluminate-phosphate and processing to alter
pore size,
particle size, particle shape, and biodegradability. Presently preferred is a
50:50 (mole
weight) copolymer of lactic acid and glycolic acid in the form of porous
particles having
diameters ranging from 150 to 800 microns. In some applications, it will be
useful to utilize
a sequestering agent, such as carboxymethyl cellulose or autologous blood
clot, to prevent
the protein compositions from disassociating from the matrix.
A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred
being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering
agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol),
polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent
useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation
weight, which
represents the amount necessary to prevent desorption of the protein from the
polymer
matrix and to provide appropriate handling of the composition, yet not so much
that the
progenitor cells are prevented from infiltrating the matrix, thereby providing
the protein the
opportunity to assist the osteogenic activity of the progenitor cells. In
further compositions,
proteins or other active ingredients of the invention may be combined with
other agents
beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question.
These agents include various growth factors such as epidermal growth factor
(EGF), platelet-derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β),
and
insulin-like growth factor (IGF).
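The weight-percent range given above for the sequestering agent can be turned into a mass for a particular batch. The following is a minimal Python sketch of that arithmetic; the function name and the 2 g example formulation are hypothetical illustrations and are not taken from the specification.

```python
def sequestering_agent_mass(total_formulation_g, wt_percent):
    """Grams of sequestering agent for a given wt % of total formulation weight."""
    # The text states a useful range of 0.5-20 wt % (preferably 1-10 wt %).
    if not 0.5 <= wt_percent <= 20:
        raise ValueError("wt % outside the 0.5-20 range stated in the text")
    return total_formulation_g * wt_percent / 100.0


# Hypothetical 2 g formulation at the bounds of the preferred 1-10 wt % range
low_g = sequestering_agent_mass(2.0, 1)
high_g = sequestering_agent_mass(2.0, 10)
```

Because the percentage is based on total formulation weight, the remaining mass (matrix, protein and any other agents) is simply the total minus the value returned.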
The therapeutic compositions are also presently valuable for veterinary
applications.
Particularly domestic animals and thoroughbred horses, in addition to humans,
are desired
patients for such treatment with proteins or other active ingredients of the
present invention.
The dosage regimen of a protein-containing pharmaceutical composition to be
used in tissue
regeneration will be determined by the attending physician considering various
factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be
formed, the site
of damage, the condition of the damaged tissue, the size of a wound, type of
damaged tissue
(e.g., bone), the patient's age, sex, and diet, the severity of any infection,
time of
administration and other clinical factors. The dosage may vary with the type
of matrix used
in the reconstitution and with inclusion of other proteins in the
pharmaceutical composition.
For example, the addition of other known growth factors, such as IGF-I
(insulin-like growth
factor I), to the final composition, may also affect the dosage. Progress can
be monitored by
periodic assessment of tissue/bone growth and/or repair, for example, X-rays,
histomorphometric determinations and tetracycline labeling.
Polynucleotides of the present invention can also be used for gene therapy.
Such
polynucleotides can be introduced either in vivo or ex vivo into cells for
expression in a
mammalian subject. Polynucleotides of the invention may also be administered
by other
known methods for introduction of nucleic acid into a cell or organism
(including, without
limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in
the presence of proteins of the present invention in order to proliferate or
to produce a
desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for
therapeutic purposes.
4.12.3 EFFECTIVE DOSAGE
Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective
amount to achieve
its intended purpose. More specifically, a therapeutically effective amount
means an amount
effective to prevent development of or to alleviate the existing symptoms of
the subject
being treated. Determination of the effective amount is well within the
capability of those
skilled in the art, especially in light of the detailed disclosure provided
herein. For any
compound used in the method of the invention, the therapeutically effective
dose can be
estimated initially from appropriate in vitro assays. For example, a dose can
be formulated in
animal models to achieve a circulating concentration range that includes the
IC50 as
determined in cell culture (i.e., the concentration of the test compound
which achieves a
half maximal inhibition of the protein's biological activity). Such
information can be used
to more accurately determine useful doses in humans.
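One simple way to estimate the half-maximal inhibitory concentration from cell culture measurements is to interpolate on a log-concentration axis between the two measured points that bracket 50% inhibition. The Python sketch below illustrates this under stated assumptions: the function name, the interpolation method and the example data are all hypothetical, and a full dose-response curve fit would normally be preferred in practice.

```python
import math


def estimate_ic50(concentrations, percent_inhibition):
    """Estimate the IC50 by log-linear interpolation between the two
    measured points that bracket 50% inhibition.

    concentrations: ascending test-compound concentrations (one common unit)
    percent_inhibition: inhibition of the protein's activity at each one
    """
    points = list(zip(concentrations, percent_inhibition))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical cell-culture data: concentration (nM) and % inhibition
conc = [1, 10, 100, 1000]
inhib = [5, 30, 70, 95]
ic50 = estimate_ic50(conc, inhib)
```

With these made-up data, 50% inhibition falls halfway between 10 nM and 100 nM on the log scale, so the estimate is the geometric mean of those two concentrations.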
A therapeutically effective dose refers to that amount of the compound that
results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity
and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical
procedures in
cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50%
of the population) and the ED50 (the dose therapeutically effective in 50% of
the population).
The dose ratio between toxic and therapeutic effects is the therapeutic index
and it can be
expressed as the ratio between LD50 and ED50. Compounds which exhibit high
therapeutic
indices are preferred. The data obtained from these cell culture assays and
animal studies
can be used in formulating a range of dosage for use in humans. The dosage of
such
compounds lies preferably within a range of circulating concentrations that
include the ED50
with little or no toxicity. The dosage may vary within this range depending
upon the dosage
form employed and the route of administration utilized. The exact formulation,
route of
administration and dosage can be chosen by the individual physician in view of
the patient's
condition. See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of
Therapeutics", Ch.
1 p.1. Dosage amount and interval may be adjusted individually to provide
plasma levels of
the active moiety which are sufficient to maintain the desired effects, or
minimal effective
concentration (MEC). The MEC will vary for each compound but can be estimated
from in
vitro data. Dosages necessary to achieve the MEC will depend on individual
characteristics
and route of administration. However, HPLC assays or bioassays can be used to
determine
plasma concentrations.
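The therapeutic index described above is a simple ratio of the dose lethal to 50% of the population to the dose effective in 50% of the population. A minimal Python sketch follows; the function name and the numeric values are hypothetical and serve only to show the arithmetic.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of the 50% lethal dose to the 50% effective dose.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg for two candidate compounds
ti_a = therapeutic_index(ld50=500.0, ed50=5.0)
ti_b = therapeutic_index(ld50=60.0, ed50=20.0)
preferred = "A" if ti_a > ti_b else "B"  # the higher index is preferred
```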
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for
10-90% of
the time, preferably between 30-90% and most preferably between 50-90%. In
cases of local
administration or selective uptake, the effective local concentration of the
drug may not be
related to plasma concentration.
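To illustrate how a dosing interval relates to the fraction of time spent above the MEC, the Python sketch below assumes a deliberately simplified model that is not part of the specification: plasma concentration falls by first-order elimination from a single peak after each dose. The function name, parameters and numbers are hypothetical.

```python
import math


def fraction_of_interval_above_mec(peak_conc, mec, half_life_h, interval_h):
    """Fraction of a dosing interval during which plasma concentration
    stays above the MEC, assuming first-order decay from a peak."""
    if peak_conc <= mec:
        return 0.0
    k = math.log(2) / half_life_h               # elimination rate constant (1/h)
    time_above = math.log(peak_conc / mec) / k  # hours until C(t) falls to the MEC
    return min(1.0, time_above / interval_h)


# Hypothetical compound: peak 16 units, MEC 1 unit, 4 h half-life, dosed every 24 h
frac = fraction_of_interval_above_mec(16.0, 1.0, 4.0, 24.0)
within_most_preferred = 0.5 <= frac <= 0.9  # the 50-90% window named in the text
```

In this example the concentration takes four half-lives (16 h) to fall from the peak to the MEC, so about two thirds of the 24 h interval is spent above it. Shortening the interval or raising the dose would increase that fraction.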


An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight
daily, with
the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight
daily, varying
in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at
longer or shorter intervals.
The amount of composition administered will, of course, be dependent on the
subject
being treated, on the subject's age and weight, the severity of the
affliction, the manner of
administration and the judgment of the prescribing physician.
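The per-kilogram ranges stated in this section scale linearly with body weight. The following Python sketch converts the preferred daily range into absolute amounts for one subject, taking the lower bound in micrograms per kilogram and the upper bound in milligrams per kilogram. The function name, default arguments and the 70 kg example are hypothetical; the sketch only shows the unit arithmetic and is not dosing guidance.

```python
UG_PER_MG = 1000.0


def daily_dose_range_mg(body_weight_kg, low_ug_per_kg=0.1, high_mg_per_kg=25.0):
    """Return (low, high) daily dose in mg for a per-kg dose range.

    Defaults follow the preferred range in the text:
    about 0.1 µg/kg to 25 mg/kg of body weight daily.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = body_weight_kg * low_ug_per_kg / UG_PER_MG  # convert µg to mg
    high_mg = body_weight_kg * high_mg_per_kg
    return low_mg, high_mg


# Hypothetical 70 kg adult, preferred range
low_mg, high_mg = daily_dose_range_mg(70.0)

# Same subject, broader range (about 0.01 µg/kg to 100 mg/kg)
broad_low_mg, broad_high_mg = daily_dose_range_mg(70.0, 0.01, 100.0)
```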
4.12.4 PACKAGING
The compositions may, if desired, be presented in a pack or dispenser device
which
may contain one or more unit dosage forms containing the active ingredient.
The pack may,
for example, comprise metal or plastic foil, such as a blister pack. The pack
or dispenser
device may be accompanied by instructions for administration. Compositions
comprising a
compound of the invention formulated in a compatible pharmaceutical carrier
may also be
prepared, placed in an appropriate container, and labeled for treatment of an
indicated
condition.
4.13 ANTIBODIES
Also included in the invention are antibodies to proteins, or fragments of
proteins of
the invention. The term "antibody" as used herein refers to immunoglobulin
molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e.,
molecules that
contain an antigen-binding site that specifically binds (immunoreacts with)
an antigen. Such
antibodies include, but are not limited to, polyclonal, monoclonal, chimeric,
single chain,
Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, an
antibody molecule
obtained from humans relates to any of the classes IgG, IgM, IgA, IgE and IgD,
which differ
from one another by the nature of the heavy chain present in the molecule.
Certain classes
have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in
humans, the light
chain may be a kappa chain or a lambda chain. Reference herein to antibodies
includes a
reference to all such classes, subclasses and types of human antibody species.
An isolated related protein of the invention may be intended to serve as an
antigen, or
a portion or fragment thereof, and additionally can be used as an immunogen to
generate
antibodies that immunospecifically bind the antigen, using standard techniques
for
polyclonal and monoclonal antibody preparation. The full-length protein can be
used or,
alternatively, the invention provides antigenic peptide fragments of the
antigen for use as
immunogens. An antigenic peptide fragment comprises at least 6 amino acid
residues of the
amino acid sequence of the full length protein, such as an amino acid sequence
shown in
SEQ ID NO: 1042-2082, or 2535-2986, or Tables 3, 5, 6, or 8, and encompasses
an epitope
thereof such that an antibody raised against the peptide forms a specific
immune complex
with the full length protein or with any fragment that contains the epitope.
Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15
amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid
residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that
are located on
its surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by
the
antigenic peptide is a surface region of the protein, e.g., a hydrophilic
region. A
hydrophobicity analysis of the human related protein sequence will indicate
which regions of
a related protein are particularly hydrophilic and, therefore, are likely to
encode surface
residues useful for targeting antibody production. As a means for targeting
antibody
production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity may be
generated by any method well known in the art, including, for example, the
Kyte Doolittle or
the Hopp Woods methods, either with or without Fourier transformation. See,
e.g., Hopp and
Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982,
J. Mol.
Biol. 157: 105-142, each of which is incorporated herein by reference in its
entirety.
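A hydropathy profile of the kind referred to above can be sketched in a few lines of Python. The example below uses the published Kyte and Doolittle (1982) residue scale with a plain sliding-window average and no Fourier transformation; the function names, the window length of 7 and the example sequence are hypothetical choices for illustration, and the sequence is not one of the SEQ ID NOs of this specification.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for each window of consecutive residues."""
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence shorter than window")
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start_index, peptide, score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence used only to exercise the functions
example = "MKTLLVAAGLLLSDEKRKDEQNPLIVVAL"
start, peptide, score = most_hydrophilic_window(example, window=7)
```

Windows with the most negative average are the most hydrophilic and are therefore candidate surface regions. A window of at least 6 residues matches the minimum antigenic peptide length mentioned earlier in this section.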
Antibodies that are specific for one or more domains within an antigenic
protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
A protein of the invention, or a derivative, fragment, analog, homolog or
ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.
The term "specific for" indicates that the variable regions of the antibodies
of the
invention recognize and bind polypeptides of the invention exclusively (i.e.,
able to
distinguish the polypeptide of the invention from other similar polypeptides
despite sequence
identity, homology, or similarity found in the family of polypeptides), but
may also interact
with other proteins (for example, S. aureus protein A or other antibodies in
ELISA
techniques) through interactions with sequences outside the variable region of
the antibodies,
and in particular, in the constant region of the molecule. Screening assays to
determine
binding specificity of an antibody of the invention are well known and
routinely practiced in
the art. For a comprehensive discussion of such assays, see Harlow et al.
(Eds), Antibodies
A Laboratory Manual; Cold Spring Harbor Laboratory; Cold Spring Harbor, NY
(1988),
Chapter 6. Antibodies that recognize and bind fragments of the polypeptides of
the
invention are also contemplated, provided that the antibodies are first and
foremost specific
for, as defined above, full-length polypeptides of the invention. As with
antibodies that are
specific for full length polypeptides of the invention, antibodies of the
invention that
recognize fragments are those which can distinguish polypeptides from the same
family of
polypeptides despite inherent sequence identity, homology, or similarity found
in the family
of proteins.
Antibodies of the invention are useful for, for example, therapeutic purposes
(by
modulating activity of a polypeptide of the invention), diagnostic purposes to
detect or
quantitate a polypeptide of the invention, as well as purification of a
polypeptide of the
invention. Kits comprising an antibody of the invention for any of the
purposes described
herein are also comprehended. In general, a kit of the invention also
includes a control
antigen for which the antibody is immunospecific. The invention further
provides a
hybridoma that produces an antibody according to the invention. Antibodies of
the
invention are useful for detection and/or purification of the polypeptides of
the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing
monoclonal
antibodies binding to the protein may also be useful therapeutics for both
conditions
associated with the protein and also in the treatment of some forms of cancer
where
abnormal expression of the protein is involved. In the case of cancerous cells
or leukemic
cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and
preventing the metastatic spread of the cancerous cells, which may be
mediated by the
protein.
The labeled antibodies of the present invention can be used for in
vitro, in vivo, and
in situ assays to identify cells or tissues in which a fragment of the
polypeptide of interest is
expressed. The antibodies may also be used directly in therapies or other
diagnostics. The
present invention further provides the above-described antibodies
immobilized on a solid
support. Examples of such solid supports include plastics such as
polycarbonate, complex
carbohydrates such as agarose and Sepharose®, and acrylic resins such as
polyacrylamide
and latex beads. Techniques for coupling antibodies to such solid supports are
well known
in the art (Weir, D.M. et al., "Handbook of Experimental Immunology" 4th Ed.,
Blackwell
Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby, W.D. et
al., Meth.
Enzym. 34 Academic Press, N.Y. (1974)). The immobilized antibodies of the
present
invention can be used for in vitro, in vivo, and in situ assays as well as
for immuno-affinity
purification of the proteins of the present invention.
Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the
invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for
example, Antibodies:
A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory
Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these
antibodies are
discussed below.
4.13.1 POLYCLONAL ANTIBODIES
For the production of polyclonal antibodies, various suitable host animals
(e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more
injections with the
native protein, a synthetic variant thereof, or a derivative of the foregoing.
An appropriate
immunogenic preparation can contain, for example, the naturally occurring
immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be
conjugated
to a second protein known to be immunogenic in the mammal being immunized.
Examples
of such immunogenic proteins include but are not limited to keyhole limpet
hemocyanin,
serum albumin, bovine thyroglobulin, and soybean trypsin inhibitor. The
preparation can
further include an adjuvant. Various adjuvants used to increase the
immunological response
include, but are not limited to, Freund's (complete and incomplete), mineral
gels (e.g.,
aluminum hydroxide), surface-active substances (e.g., lysolecithin, pluronic
polyols,
polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants usable in
humans such as
Bacille Calmette-Guerin and Corynebacterium parvum, or similar
immunostimulatory
agents. Additional examples of adjuvants that can be employed include MPL-TDM
adjuvant
(monophosphoryl Lipid A, synthetic trehalose dicorynomycolate).
The polyclonal antibody molecules directed against the immunogenic protein can
be
isolated from the mammal (e.g., from the blood) and further purified by well
known
techniques, such as affinity chromatography using protein A or protein G,
which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively,
the specific
antigen which is the target of the immunoglobulin sought, or an epitope
thereof, may be
immobilized on a column to purify the immune specific antibody by
immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by
D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA,
Vol. 14, No. 8
(April 17, 2000), pp. 25-28).
4.13.2 MONOCLONAL ANTIBODIES
The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as
used herein, refers to a population of antibody molecules that contain only
one molecular
species of antibody molecule consisting of a unique light chain gene product
and a unique
heavy chain gene product. In particular, the complementarity determining
regions (CDRs)
of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus
contain an antigen-binding site capable of immunoreacting with a particular
epitope of the
antigen characterized by a unique binding affinity for it.
Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256, 495 (1975). In a hybridoma
method, a
mouse, hamster, or other appropriate host animal, is typically immunized with
an
immunizing agent to elicit lymphocytes that produce or are capable of
producing antibodies
that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be
immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment
thereof
or a fusion protein thereof. Generally, either peripheral blood lymphocytes
are used if cells
of human origin are desired, or spleen cells or lymph node cells are used if
non-human
mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press,
(1986) pp. 59-
103). Immortalized cell lines are usually transformed mammalian cells,
particularly
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse
myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture
medium that
preferably contains one or more substances that inhibit the growth or survival
of the unfused,
immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine
phosphoribosyl transferase (HGPRT or HPRT), the culture medium for the
hybridomas
typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which
substances prevent the growth of HGPRT-deficient cells.
Preferred immortalized cell lines are those that fuse efficiently, support
stable high
level expression of antibody by the selected antibody-producing cells, and are
sensitive to a
medium such as HAT medium. More preferred immortalized cell lines are murine
myeloma
lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center,
San Diego, California and the American Type Culture Collection, Manassas,
Virginia.
Human myeloma and mouse-human heteromyeloma cell lines also have been
described for
the production of human monoclonal antibodies (Kozbor, J. Immunol., 133:3001
(1984);
Brodeur et al., Monoclonal Antibody Production Techniques and Applications,
Marcel
Dekker, Inc., New York, (1987) pp. 51-63).
The culture medium in which the hybridoma cells are cultured can then be
assayed
for the presence of monoclonal antibodies directed against the antigen.
Preferably, the
binding specificity of monoclonal antibodies produced by the hybridoma cells
is determined
by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are
known in
the art. The binding affinity of the monoclonal antibody can, for example, be
determined by
the Scatchard analysis of Munson and Pollard, Anal. Biochem., 107, 220 (1980).
Preferably,
antibodies having a high degree of specificity and a high binding affinity for
the target
antigen are isolated.
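As a rough illustration of how a binding affinity can be read from equilibrium binding data, the Python sketch below performs a classic linear Scatchard analysis for a single class of binding sites: bound/free is regressed against bound, the slope gives -1/Kd and the x-intercept gives Bmax. This is a textbook simplification and is not the computerized procedure of the cited reference; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (B, B/F); returns (Kd, Bmax).

    For a single class of sites, B/F = (Bmax - B) / Kd, so the slope
    of B/F against B is -1/Kd and the x-intercept is Bmax.
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free data generated from Kd = 2.0 nM and Bmax = 10.0 nM
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
```

A lower Kd corresponds to a higher binding affinity. With real, noisy data a direct nonlinear fit of the binding isotherm is generally more reliable than the linearized plot.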
After the desired hybridoma cells are identified, the clones can be subcloned
by
limiting dilution procedures and grown by standard methods. Suitable culture
media for this
purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in
a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or
purified from
the culture medium or ascites fluid by conventional immunoglobulin
purification procedures
such as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or affinity chromatography.
The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of
the invention can be readily isolated and sequenced using conventional
procedures (e.g., by
using oligonucleotide probes that are capable of binding specifically to genes
encoding the
heavy and light chains of murine antibodies). The hybridoma cells of the
invention serve as
a preferred source of such DNA. Once isolated, the DNA can be placed into
expression
vectors, which are then transfected into host cells such as simian COS cells,
Chinese hamster
ovary (CHO) cells, or myeloma cells that do not otherwise produce
immunoglobulin protein,
to obtain the synthesis of monoclonal antibodies in the recombinant host
cells. The DNA
also can be modified, for example, by substituting the coding sequence for
human heavy and
light chain constant domains in place of the homologous murine sequences
(U.S. Patent No.
4,816,567; Morrison, Nature 368, 812-13 (1994)) or by covalently joining to
the
immunoglobulin coding sequence all or part of the coding sequence for a non-
immunoglobulin polypeptide. Such a non-immunoglobulin polypeptide can be
substituted
for the constant domains of an antibody of the invention, or can be
substituted for the
variable domains of one antigen-combining site of an antibody of the invention
to create a
chimeric bivalent antibody.
4.13.3 HUMANIZED ANTIBODIES
The antibodies directed against the protein antigens of the invention can
further
comprise humanized antibodies or human antibodies. These antibodies are
suitable for
administration to humans without engendering an immune response by the human
against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab,
Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are
principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a
non-human immunoglobulin. Humanization can be performed following the method
of
Winter and co-workers (Jones et al., Nature, 321, 522-525 (1986); Riechmann et
al., Nature,
332, 323-327 (1988); Verhoeyen et al., Science, 239, 1534-1536 (1988)), by
substituting
rodent CDRs or CDR sequences for the corresponding sequences of a human
antibody. (See
also U.S. Patent No. 5,225,539). In some instances, Fv framework residues of
the human
immunoglobulin are replaced by corresponding non-human residues. Humanized
antibodies
can also comprise residues that are found neither in the recipient antibody
nor in the
imported CDR or framework sequences. In general, the humanized antibody will
comprise
substantially all of at least one, and typically two, variable domains, in
which all or
substantially all of the CDR regions correspond to those of a non-human
immunoglobulin
and all or substantially all of the framework regions are those of a human
immunoglobulin
consensus sequence. The humanized antibody optimally also will comprise at
least a portion
of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin
(Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct.
Biol., 2, 593-596
(1992)).
4.13.4 HUMAN ANTIBODIES
Fully human antibodies relate to antibody molecules in which essentially the
entire
sequences of both the light chain and the heavy chain, including the CDRs,
arise from
human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies"
herein. Human monoclonal antibodies can be prepared by the trioma technique;
the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72)
and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al.,
1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Human
monoclonal antibodies may be utilized in the practice of the present invention
and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci
USA 80,
2026-2030) or by transforming human B-cells with Epstein Barr Virus in
vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
77-96).
In addition, human antibodies can also be produced using additional
techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227,
381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can
be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice
in which the
endogenous immunoglobulin genes have been partially or completely inactivated.
Upon
challenge, human antibody production is observed, which closely resembles that
seen in
humans in all respects, including gene rearrangement, assembly, and antibody
repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807;
5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-
783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994)); Morrison (Nature
368, 812-13
(1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger
(Nature
Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol.
13, 65-93
(1995)).
Human antibodies may additionally be produced using transgenic nonhuman
animals
that are modified so as to produce fully human antibodies rather than the
animal's
endogenous antibodies in response to challenge by an antigen. (See PCT
publication
WO 94/02602). The endogenous genes encoding the heavy and light immunoglobulin
chains
in the nonhuman host have been incapacitated, and active loci encoding human
heavy and
light chain immunoglobulins are inserted into the host's genome. The human
genes are
incorporated, for example, using yeast artificial chromosomes containing the
requisite
human DNA segments. An animal which provides all the desired modifications is
then
obtained as progeny by crossbreeding intermediate transgenic animals
containing fewer than
the full complement of the modifications. The preferred embodiment of such a
nonhuman
animal is a mouse, and is termed the Xenomouse™ as disclosed in PCT
publications WO
96/33735 and WO 96/34096. This animal produces B cells that secrete fully
human
immunoglobulins. The antibodies can be obtained directly from the animal after
immunization with an immunogen of interest, as, for example, a preparation of
a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal,
such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding
the
immunoglobulins with human variable regions can be recovered and expressed to
obtain the
antibodies directly, or can be further modified to obtain analogs of
antibodies such as, for
example, single chain Fv molecules.
An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in
U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J
segment genes
from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent
rearrangement of the locus and to prevent formation of a transcript of a
rearranged
immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector
containing a gene encoding a selectable marker; and producing from the
embryonic stem cell
a transgenic mouse whose somatic and germ cells contain the gene encoding the
selectable
marker.
A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression
vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host
cell in
culture, introducing an expression vector containing a nucleotide sequence
encoding a light
chain into another mammalian host cell, and fusing the two cells to form a
hybrid cell. The
hybrid cell expresses an antibody containing the heavy chain and the light
chain.
In a further improvement on this procedure, a method for identifying a
clinically
relevant epitope on an immunogen, and a correlative method for selecting an
antibody that
binds immunospecifically to the relevant epitope with high affinity, are
disclosed in PCT
publication WO 99/53049.
4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES
According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see
e.g., U.S. Patent
No. 4,946,778). In addition, methods can be adapted for the construction of
Fab expression
libraries (see e.g., Huse, et al., 1989 Science 246, 1275-1281) to allow rapid
and effective
identification of monoclonal Fab fragments with the desired specificity for a
protein or
derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the
idiotypes to a protein antigen may be produced by techniques known in the art
including, but
not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule;
(ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2
fragment; (iii) an
Fab fragment generated by the treatment of the antibody molecule with papain
and a reducing
agent; and (iv) Fv fragments.
4.13.6 BISPECIFIC ANTIBODIES
Bispecific antibodies are monoclonal, preferably human or humanized,
antibodies
that have binding specificities for at least two different antigens. In the
present case, one of
the binding specificities is for an antigenic protein of the invention. The
second binding
target is any other antigen, and advantageously is a cell-surface protein or
receptor or
receptor subunit.
Methods for making bispecific antibodies are known in the art. Traditionally,
the
recombinant production of bispecific antibodies is based on the co-expression
of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different
specificities (Milstein and Cuello, Nature, 305, 537-539 (1983)). Because of
the random
assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas)
produce a potential mixture of ten different antibody molecules, of which only
one has the
correct bispecific structure. The purification of the correct molecule is
usually accomplished
by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829,
published 13 May 1993, and in Traunecker et al., 1991 EMBO J., 10, 3655-3659.
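The figure of ten different molecules follows from simple counting. The sketch below is illustrative only: the chain labels H1, H2, L1 and L2 are hypothetical names, and it assumes that every heavy chain can pair with either light chain and that an antibody is an unordered pair of heavy/light half-molecules.

```python
from itertools import combinations_with_replacement, product

# Hypothetical labels: two heavy chains and two light chains in one quadroma.
heavy_chains = ["H1", "H2"]
light_chains = ["L1", "L2"]

# Each half-molecule is one heavy chain paired with one light chain.
half_molecules = [h + l for h, l in product(heavy_chains, light_chains)]

# A complete antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific molecule pairs each heavy chain with its own light chain.
bispecific = ("H1L1", "H2L2")

print(len(antibodies))            # number of distinct chain assortments
print(bispecific in antibodies)   # the correct molecule is one of them
```

Under these assumptions the four possible half-molecules combine into ten distinct antibodies, only one of which carries both correctly paired binding sites, which is why purification is needed.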
Antibody variable domains with the desired binding specificities (antibody-
antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The
fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising
at least part
of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-
chain constant
region (CH1) containing the site necessary for light-chain binding present in
at least one of
the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if
desired, the
immunoglobulin light chain, are inserted into separate expression vectors, and
are co-
transfected into a suitable host organism. For further details of generating
bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121, 210
(1986).
According to another approach described in WO 96/27011, the interface between
a
pair of antibody molecules can be engineered to maximize the percentage of
heterodimers
that are recovered from recombinant cell culture. The preferred interface
comprises at least
a part of the CH3 region of an antibody constant domain. In this method, one
or more small
amino acid side chains from the interface of the first antibody molecule are
replaced with
larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of
identical or
similar size to the large side chain(s) are created on the interface of the
second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g.
alanine or
threonine). This provides a mechanism for increasing the yield of the
heterodimer over other
unwanted end-products such as homodimers.
Bispecific antibodies can be prepared as full-length antibodies or antibody
fragments
(e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from
antibody fragments have been described in the literature. For example,
bispecific antibodies
can be prepared using chemical linkage. Brennan et al., Science 229, 81 (1985)
describe a
procedure wherein intact antibodies are proteolytically cleaved to generate
F(ab')2
fragments. These fragments are reduced in the presence of the dithiol
complexing agent
sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation.
The Fab' fragments generated are then converted to thionitrobenzoate (TNB)
derivatives.
One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol by
reduction with
mercaptoethylamine and is mixed with an equimolar amount of the other Fab'-TNB
derivative to form the bispecific antibody. The bispecific antibodies produced
can be used
as agents for the selective immobilization of enzymes.
Additionally, Fab' fragments can be directly recovered from E. coli and
chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175, 217-225 (1992)
describe the production of a fully humanized bispecific antibody F(ab')2
molecule. Each
Fab' fragment was separately secreted from E. coli and subjected to directed
chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody
thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T
cells, as well as
trigger the lytic activity of human cytotoxic lymphocytes against human breast
tumor targets.
Various techniques for making and isolating bispecific antibody fragments
directly
from recombinant cell culture have also been described. For example,
bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5), 1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked
to the Fab'
portions of two different antibodies by gene fusion. The antibody homodimers
were reduced
at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody
homodimers.
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci.
USA 90,
6444-6448 (1993) has provided an alternative mechanism for making bispecific
antibody
fragments. The fragments comprise a heavy-chain variable domain (VH) connected
to a
light-chain variable domain (VL) by a linker which is too short to allow
pairing between the
two domains on the same chain. Accordingly, the VH and VL domains of one
fragment are
forced to pair with the complementary VL and VH domains of another fragment,
thereby
forming two antigen-binding sites. Another strategy for making bispecific
antibody
fragments by the use of single-chain Fv (sFv) dimers has also been reported.
See, Gruber et
al., J. Immunol. 152, 5368 (1994).
Antibodies with more than two valencies are contemplated. For example,
trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147, 60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least
one of
which originates in the protein antigen of the invention. Alternatively, an
anti-antigenic arm
of an immunoglobulin molecule can be combined with an arm which binds to a
triggering
molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7),
or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16)
so as to focus cellular defense mechanisms to the cell expressing the
particular antigen.
Bispecific antibodies can also be used to direct cytotoxic agents to cells
which express a
particular antigen. These antibodies possess an antigen-binding arm and an arm
which binds
a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA, DOTA, or
TETA.
Another bispecific antibody of interest binds the protein antigen described
herein and further
binds tissue factor (TF).
4.13.7 HETEROCONJUGATE ANTIBODIES
Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies.
Such
antibodies have, for example, been proposed to target immune system cells to
unwanted cells
(U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO
91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared
in vitro using
known methods in synthetic protein chemistry, including those involving
crosslinking
agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction
or by forming a thioether bond. Examples of suitable reagents for this purpose
include
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for
example, in U.S.
Patent No. 4,676,980.
4.13.8 EFFECTOR FUNCTION ENGINEERING
It can be desirable to modify the antibody of the invention with respect to
effector
function, so as to enhance, e.g., the effectiveness of the antibody in
treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby
allowing
interchain disulfide bond formation in this region. The homodimeric antibody
thus
generated can have improved internalization capability and/or increased
complement-
mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC). See
Caron et
al., J. Exp. Med., 176, 1191-1195 (1992) and Shopes, J. Immunol., 148, 2918-2922 (1992).
Homodimeric antibodies with enhanced anti-tumor activity can also be prepared
using
heterobifunctional cross-linkers as described in Wolff et al. Cancer Research,
53, 2560-
2565 (1993). Alternatively, an antibody can be engineered that has dual Fc
regions and can
thereby have enhanced complement lysis and ADCC capabilities. See Stevenson et
al.,
Anti-Cancer Drug Design, 3, 219-230 (1989).
4.13.9 IMMUNOCONJUGATES
The invention also pertains to immunoconjugates comprising an antibody
conjugated
to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active
toxin of bacterial, fungal, plant, or animal origin, or fragments thereof),
or a radioactive
isotope (i.e., a radioconjugate).
Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that
can be used
include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A
chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin,
Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of
radioconjugated
antibodies. Examples include ²¹²Bi, ¹³¹I, ¹³¹In, ⁹⁰Y, and ¹⁸⁶Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-
pyridyldithiol) propionate
(SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl
adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as
glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-
diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-ethylenediamine),
diisocyanates
(such as toluene 2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-
2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in
Vitetta et al., Science, 238: 1098 (1987). Carbon-14-labeled 1-
isothiocyanatobenzyl-3-
methyldiethylene triaminepentaacetic acid (MX-DTPA) is an exemplary chelating
agent for
conjugation of radionucleotide to the antibody. See WO 94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-
receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the
circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin)
that is in turn
conjugated to a cytotoxic agent.
4.14 COMPUTER READABLE SEQUENCES
In one application of this embodiment, a nucleotide sequence of the present
invention
can be recorded on computer readable media. As used herein, "computer readable
media"
refers to any medium which can be read and accessed directly by a computer.
Such media
include, but are not limited to: magnetic storage media, such as floppy discs,
hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM;
electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. A skilled artisan can readily appreciate how
any of the
presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide
sequence of the
present invention. As used herein, "recorded" refers to a process for storing
information on
computer readable medium. A skilled artisan can readily adopt any of the
presently known
methods for recording information on computer readable medium to generate
manufactures
comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for
creating a
computer readable medium having recorded thereon a nucleotide sequence of the
present
invention. The choice of the data storage structure will generally be based on
the means
chosen to access the stored information. In addition, a variety of data
processor programs
and formats can be used to store the nucleotide sequence information of the
present
invention on computer readable medium. The sequence information can be
represented in a
word processing text file, formatted in commercially-available software such
as WordPerfect
and Microsoft Word, or represented in the form of an ASCII file, stored in a
database
application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any
number of data processor structuring formats (e.g. text file or database) in
order to obtain
computer readable medium having recorded thereon the nucleotide sequence
information of
the present invention.
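The two storage styles named above, an ASCII text file and a database table, can be sketched as follows. This is a minimal illustration under stated assumptions: the record name and residues are hypothetical placeholders rather than entries from the sequence listing, the FASTA-style layout is one possible text format, and Python's built-in sqlite3 module stands in for the database applications mentioned in the text.

```python
import os
import sqlite3
import tempfile

# Hypothetical placeholder record; not an actual sequence of the invention.
records = {"example_seq_1": "ATGGCCAAGTTGACCTGA"}

def write_ascii(records, path):
    """Store each sequence as a FASTA-style ASCII text record."""
    with open(path, "w", encoding="ascii") as handle:
        for name, residues in records.items():
            handle.write(f">{name}\n{residues}\n")

def read_ascii(path):
    """Read the FASTA-style file back into a name -> residues dict."""
    result, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                result[name] = ""
            elif name is not None:
                result[name] += line
    return result

def write_database(records, connection):
    """Store the same records in a relational table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    connection.executemany(
        "INSERT INTO sequences VALUES (?, ?)", list(records.items())
    )
    connection.commit()

def read_database(connection):
    """Read every stored record back into a name -> residues dict."""
    return dict(connection.execute("SELECT name, residues FROM sequences"))

# Round trip through a temporary text file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sequences.txt")
    write_ascii(records, path)
    from_file = read_ascii(path)

# Round trip through an in-memory database.
connection = sqlite3.connect(":memory:")
write_database(records, connection)
from_db = read_database(connection)
connection.close()
```

Either representation returns the same sequence information; which one is preferable depends, as noted above, on how the stored information will be accessed.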
By providing any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534
or
a representative fragment thereof; or a nucleotide sequence at least 95%
identical to any of
the nucleotide sequences of SEQ ID NO: 1-1041, or 2083-2534 in computer
readable form, a
skilled artisan can routinely access the sequence information for a variety of
purposes.
Computer software is publicly available which allows a skilled artisan to
access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol.
Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search
algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a
nucleic acid sequence. Such ORFs may be protein-encoding fragments and may be
useful in
producing commercially important proteins such as enzymes used in fermentation
reactions
and in the production of commercially useful metabolites.
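As a simplified illustration of ORF identification, the sketch below scans the three forward reading frames of a sequence for stretches that begin with ATG and end at the first in-frame stop codon. It is not the BLAST or BLAZE software cited above; the function name, the minimum-length parameter, and the example sequence are assumptions made for illustration, and the reverse strand is not examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    start is 0-based; end is exclusive and includes the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, sequence[start:i + 3]))
                start = None                   # look for the next ORF
    return orfs

# Hypothetical example sequence with one short ORF in the third frame.
example = "CCATGGCCAAGTTGACCTGAGG"
print(find_orfs(example))
```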
As used herein, "a computer-based system" refers to the hardware means,
software
means, and data storage means used to analyze the nucleotide sequence
information of the
present invention. The minimum hardware means of the computer-based systems of
the
present invention comprises a central processing unit (CPU), input means,
output means, and
data storage means. A skilled artisan can readily appreciate that any one of
the currently
available computer-based systems are suitable for use in the present
invention. As stated
above, the computer-based systems of the present invention comprise a data
storage means
having stored therein a nucleotide sequence of the present invention and the
necessary
hardware means and software means for supporting and implementing a search
means. As
used herein, "data storage means" refers to memory which can store nucleotide
sequence
information of the present invention, or a memory access means which can
access
manufactures having recorded thereon the nucleotide sequence information of
the present
invention.
As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or
target structural
motif with the sequence information stored within the data storage means.
Search means are
used to identify fragments or regions of a known sequence which match a
particular target
sequence or target motif. A variety of known algorithms are disclosed publicly
and a variety
of commercially available software for conducting search means are available and can be
used in the
computer-based systems of the present invention. Examples of such software
include, but
are not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA
(NPOLYPEPTIDEIA). A skilled artisan can readily recognize that any one of the
available
algorithms or implementing software packages for conducting homology searches
can be
adapted for use in the present computer-based systems. As used herein, a
"target sequence"
can be any nucleic acid or amino acid sequence of six or more nucleotides or
two or more
amino acids. A skilled artisan can readily recognize that the longer a target
sequence is, the
less likely a target sequence will be present as a random occurrence in the
database. The
most preferred sequence length of a target sequence is from about 10 to 300
amino acids,
more preferably from about 30 to 100 nucleotide residues. However, it is well
recognized
that searches for commercially important fragments, such as sequence fragments
involved in
gene expression and protein processing, may be of shorter length.
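A search means of the kind described can be sketched with a bare-bones Smith-Waterman local alignment score. The scoring values (match, mismatch, gap), the function names, and the two database entries below are assumptions chosen for illustration; production search software uses tuned scoring matrices and heuristics that are not reproduced here.

```python
def smith_waterman_score(target, subject, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score using a linear gap penalty."""
    previous = [0] * (len(subject) + 1)
    best = 0
    for t in target:
        current = [0]
        for j, s in enumerate(subject, start=1):
            diagonal = previous[j - 1] + (match if t == s else mismatch)
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best

def search(target, database):
    """Rank stored sequences by their local alignment score to the target."""
    scores = {
        name: smith_waterman_score(target, residues)
        for name, residues in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical stored sequences and a six-nucleotide target sequence.
database = {"entry_a": "TTGACCATGGCC", "entry_b": "GGGGGGGGGGGG"}
ranking = search("ACCATG", database)
print(ranking)
```

With these settings an exact six-base match scores 12, so the entry containing the target ranks first, while a longer target would raise the score needed for a match to arise by chance.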
As used herein, "a target structural motif," or "target motif," refers to any
rationally
selected sequence or combination of sequences in which the sequence(s) are
chosen based on
a three-dimensional configuration which is formed upon the folding of the
target motif.
There are a variety of target motifs known in the art. Protein target motifs
include, but are
not limited to, enzyme active sites and signal sequences. Nucleic acid target
motifs include,
but are not limited to, promoter sequences, hairpin structures and inducible
expression
elements (protein binding sequences).
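One of the nucleic acid target motifs listed above, the hairpin, can be searched for at the sequence level by looking for a short stem followed, after a loop, by its reverse complement. The sketch below is only a rough illustration: the stem and loop length limits are arbitrary assumptions, the example sequence is hypothetical, and no folding energetics are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_hairpins(sequence, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) pairs for candidate hairpins."""
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - 2 * stem - min_loop + 1):
        left = sequence[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = sequence[j:j + stem]
            # A hit needs a full-length right arm that pairs with the left arm.
            if len(right) == stem and right == reverse_complement(left):
                hits.append((i, loop))
    return hits

# Hypothetical sequence: stem GGAC, loop AAAA, complementary stem GTCC.
print(find_hairpins("TTGGACAAAAGTCCTT"))
```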
4.15 TRIPLE HELIX FORMATION
In addition, the fragments of the present invention, as broadly described, can
be used
to control gene expression through triple helix formation or antisense DNA or
RNA, both of
which methods are based on the binding of a polynucleotide sequence to DNA or
RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40
bases in length and
are designed to be complementary to a region of the gene involved in
transcription (triple
helix-see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science
241, 456
(1988); and Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself
(antisense-
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense
Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation
optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA
hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have
been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide.
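To illustrate how sequence information feeds into antisense design, the sketch below derives a DNA oligonucleotide complementary to a chosen 20 to 40 base window of an mRNA. The function name, the window position, and the example mRNA are hypothetical; practical design would also weigh target accessibility, specificity, and oligonucleotide chemistry, none of which is modeled here.

```python
# Map each mRNA base to its complementary DNA base.
DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")

def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (5' to 3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept DNA-style input by converting T to U before selecting the window.
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(DNA_COMPLEMENT)[::-1]

# Hypothetical 24-base mRNA; the oligo targets its first 20 bases.
example_mrna = "AUGGCCAAGUUGACCGGAUCCUAA"
oligo = antisense_oligo(example_mrna, 0, 20)
print(oligo)
```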
4.16 DIAGNOSTIC ASSAYS AND KITS
The present invention further provides methods to identify the presence or
expression
of one of the ORFs of the present invention, or homolog thereof, in a test
sample, using a
nucleic acid probe or antibodies of the present invention, optionally
conjugated or otherwise
associated with a suitable label.
In general, methods for detecting a polynucleotide of the invention can
comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the
complex, so
that if a complex is detected, a polynucleotide of the invention is detected
in the sample.
Such methods can also comprise contacting a sample under stringent
hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the
invention under
such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is
amplified, a polynucleotide of the invention is detected in the sample.
In general, methods for detecting a polypeptide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polypeptide for a period sufficient to form the complex, and detecting the
complex, so that if
a complex is detected, a polypeptide of the invention is detected in the
sample.
In detail, such methods comprise incubating a test sample with one or more of
the
antibodies or one or more of the nucleic acid probes of the present invention
and assaying
for binding of the nucleic acid probes or antibodies to components within the
test sample.
Conditions for incubating a nucleic acid probe or antibody with a test sample
vary.
Incubation conditions depend on the format employed in the assay, the
detection methods
employed, and the type and nature of the nucleic acid probe or antibody used
in the assay.
One skilled in the art will recognize that any one of the commonly available
hybridization,
amplification or immunological assay formats can readily be adapted to employ
the nucleic
acid probes or antibodies of the present invention. Examples of such assays
can be found in
Chard, T., An Introduction to Radioimmunoassay and Related Techniques,
Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al.,
Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983),
Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory
Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam,
The
Netherlands (1985). The test samples of the present invention include cells,
protein or
membrane extracts of cells, or biological fluids such as sputum, blood, serum,
plasma, or
urine. The test sample used in the above-described method will vary based on
the assay
format, nature of the detection method and the tissues, cells or extracts used
as the sample to
be assayed. Methods for preparing protein extracts or membrane extracts of
cells are well
known in the art and can readily be adapted in order to obtain a sample
which is
compatible with the system utilized.
In another embodiment of the present invention, kits are provided which
contain the
necessary reagents to carry out the assays of the present invention.
Specifically, the
invention provides a compartment kit to receive, in close confinement, one or
more
containers which comprises: (a) a first container comprising one of the probes
or antibodies
of the present invention; and (b) one or more other containers comprising one
or more of the
following: wash reagents, reagents capable of detecting presence of a bound
probe or
antibody.
In detail, a compartment kit includes any kit in which reagents are contained
in
separate containers. Such containers include small glass containers, plastic
containers or
strips of plastic or paper. Such containers allow one to efficiently transfer
reagents from
one compartment to another compartment such that the samples and reagents are
not
cross-contaminated, and the agents or solutions of each container can be added
in a
quantitative fashion from one compartment to another. Such containers will
include a
container which will accept the test sample, a container which contains the
antibodies used
in the assay, containers which contain wash reagents (such as phosphate
buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect
the bound
antibody or probe. Types of detection reagents include labeled nucleic acid
probes, labeled
secondary antibodies; or in the alternative, if the primary antibody is
labeled, the enzymatic,
or antibody binding reagents which are capable of reacting with the labeled
antibody. One
skilled in the art will readily recognize that the disclosed probes and
antibodies of the present
invention can be readily incorporated into one of the established kit formats
which are well
known in the art.
4.17 MEDICAL IMAGING
The novel polypeptides and binding partners of the invention are useful in
medical
imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the
invention is involved in the immune response, for imaging sites of
inflammation or
infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods
involve
chemical attachment of a labeling or imaging agent, administration of the
labeled
polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging
the labeled
polypeptide in vivo at the target site.
4.18 SCREENING ASSAYS
Using the isolated proteins and polynucleotides of the invention, the present
invention further provides methods of obtaining and identifying agents which
bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences
set forth
in SEQ ID NO: 1-1041, or 2083-2534, or bind to a specific domain of the
polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:
(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.
In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide
of the invention for a time sufficient to form a polynucleotide/compound
complex, and
detecting the complex, so that if a polynucleotide/compound complex is
detected, a
compound that binds to a polynucleotide of the invention is identified.
Likewise, in general, therefore, such methods for identifying compounds that
bind to
a polypeptide of the invention can comprise contacting a compound with a
polypeptide of
the invention for a time sufficient to form a polypeptide/compound complex,
and detecting
the complex, so that if a polypeptide/compound complex is detected, a compound
that binds
to a polypeptide of the invention is identified.
Methods for identifying compounds that bind to a polypeptide of the invention
can
also comprise contacting a compound with a polypeptide of the invention in a
cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives
expression
of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a
compound
that binds a polypeptide of the invention is identified.
Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its
activity, relative to
activity observed in the absence of the compound). Alternatively, compounds
identified via
such methods can include compounds which modulate the expression of a
polynucleotide of
the invention (that is, increase or decrease expression relative to expression
levels observed
in the absence of the compound). Compounds, such as compounds identified via
the
methods of the invention, can be tested using standard assays well known to
those of skill in
the art for their ability to modulate activity/expression.
The agents screened in the above assay can be, but are not limited to,
peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents
can be
selected and screened at random or rationally selected or designed using
protein modeling
techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents
and the like are selected at random and are assayed for their ability to bind
to the protein
encoded by the ORF of the present invention. Alternatively, agents may be
rationally
selected or designed. As used herein, an agent is said to be "rationally
selected or designed"
when the agent is chosen based on the configuration of the particular protein.
For example,
one skilled in the art can readily adapt currently available procedures to
generate peptides,
pharmaceutical agents and the like, capable of binding to a specific peptide
sequence, in
order to generate rationally designed antipeptide peptides, for example see
Hruby et al.,
"Application of Synthetic Peptides: Antisense Peptides," In Synthetic Peptides,
A User's
Guide, W.H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al.,
Biochemistry
28:9230-8 (1989), or pharmaceutical agents, or the like.
In addition to the foregoing, one class of agents of the present invention, as
broadly
described, can be used to control gene expression through binding to one of
the ORFs or
EMFs of the present invention. As described above, such agents can be randomly
screened
or rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design
sequence specific or element specific agents, modulating the expression of
either a single
ORF or multiple ORFs which rely on the same EMF for expression control. One
class of
DNA binding agents are agents that contain base residues which hybridize to, or
form a triple helix with, DNA or RNA. Such agents can be based on the classic
phosphodiester, ribonucleic acid backbone, or can be a variety of sulfhydryl or
polymeric derivatives which have base attachment capacity.
Agents suitable for use in these methods preferably contain 20 to 40 bases and
are
designed to be complementary to a region of the gene involved in transcription
(triple helix -
see Lee et al., Nucl. Acids Res. 6, 3073 (1979); Cooney et al., Science 241,
456 (1988); and
Dervan et al., Science 251, 1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56, 560 (1991); Oligodeoxynucleotides as Antisense
Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple
helix formation
optimally results in
a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks
translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the
sequences of
the present invention is necessary for the design of an antisense or triple
helix
oligonucleotide and other DNA binding agents.
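As an illustration of how sequence information feeds the design of an antisense agent, the following is a minimal sketch, not the method of the specification. It assumes the agent is simply the reverse complement of a chosen mRNA window of 20 to 40 bases; the function name `antisense_oligo` and the example sequence are hypothetical.

```python
# Hypothetical helper: the name and the example mRNA below are illustrative only.
COMPLEMENT = str.maketrans("ACGU", "UGCA")  # Watson-Crick pairing for RNA bases


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the antisense RNA (written 5'->3') for mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("the text prefers agents of 20 to 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("target window runs past the end of the sequence")
    if set(target) - set("ACGU"):
        raise ValueError("unexpected character in mRNA sequence")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


mrna = "AUGGCUAGCUUGGACCGUAACGGAUCCGAUUACGGCAUGCUAGCUA"  # made-up sequence
oligo = antisense_oligo(mrna, 0, 20)
print(oligo)
```

A real design would also weigh target accessibility, specificity and backbone chemistry, none of which this sketch models.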
Agents which bind to a protein encoded by one of the ORFs of the present
invention
can be used as a diagnostic agent. Agents which bind to a protein encoded by
one of the
ORFs of the present invention can be formulated using known techniques to
generate a
pharmaceutical composition.
4.19 USE OF NUCLEIC ACIDS AS PROBES
Another aspect of the subject invention is to provide for polypeptide-specific
nucleic
acid hybridization probes capable of hybridizing with naturally occurring
nucleotide
sequences. The hybridization probes of the subject invention may be derived
from any of
the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534. Because the
corresponding
gene is only expressed in a limited number of tissues, a hybridization probe
derived from
any of the nucleotide sequences SEQ ID NO: 1-1041, or 2083-2534 can be used as
an
indicator of the presence of RNA of a cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in
situ
hybridization. PCR as described in U.S. Patent Nos. 4,683,195 and 4,965,188
provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used
in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both.
The probe will comprise a discrete nucleotide sequence for the detection of
identical
sequences or a degenerate pool of possible sequences for identification of
closely related
genomic sequences.
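The idea of a degenerate pool can be made concrete with a short sketch. It assumes the pool is written with the standard IUPAC ambiguity codes and expands it into every discrete sequence it represents; the probe string and the function name `expand_degenerate` are hypothetical and not taken from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list[str]:
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]


pool = expand_degenerate("ATGRCNY")  # made-up probe: R, N and Y are degenerate
print(len(pool), pool[:4])
```

The pool size is the product of the number of bases allowed at each position, so it grows quickly as degenerate positions are added.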
Other means for producing specific hybridization probes for nucleic acids
include the
cloning of nucleic acid sequences into vectors for the production of mRNA
probes. Such
vectors are known in the art and are commercially available and may be used to
synthesize
RNA probes in vitro by means of the addition of the appropriate RNA polymerase,
such as T7 or SP6 RNA polymerase, and the appropriate radioactively labeled
nucleotides. The nucleotide sequences may be used to construct hybridization
probes for mapping their respective genomic sequences. The nucleotide sequence
provided herein may be mapped to a chromosome or specific regions of a
chromosome using well-known genetic and/or
chromosomal mapping techniques. These techniques include in situ
hybridization, linkage
analysis against known chromosomal markers, hybridization screening with
libraries or
flow-sorted chromosomal preparations specific to known chromosomes, and the
like. The
technique of fluorescent in situ hybridization of chromosome spreads has been
described,
among other places, in Verma et al (1988) Human Chromosomes: A Manual of Basic
Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other
physical
chromosome mapping techniques may be correlated with additional genetic map
data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal
map and a specific disease (or predisposition to a specific disease) may help
delimit the
region of DNA associated with that genetic disease. The nucleotide sequences
of the subject
invention may be used to detect differences in gene sequences between normal,
carrier or
affected individuals.
4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES
Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared
by, for
example, directly synthesizing the oligonucleotide by chemical means, as is
commonly
practiced using an automated oligonucleotide synthesizer.
Support bound oligonucleotides may be prepared by any of the methods known to
those
of skill in the art using any suitable support such as glass, polystyrene or
Teflon. One strategy
is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can
be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6), 1469-72); using UV light (Nagata et al., 1985; Dahlen et al.,
1987; Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent
binding of base modified DNA (Keller et al., 1988; 1989); all references being
specifically incorporated herein.
Another strategy that may be employed is the use of the strong
biotin-streptavidin interaction as a linker. For example, Broude et al. (1994)
Proc. Natl. Acad. Sci. USA 91(8), 3072-6, describe the use of biotinylated
probes, although these are duplex
probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads
may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is
applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from
various sources,
such as, e.g., Operon Technologies (Alameda, CA).
Nunc Laboratories (Naperville, IL) is also selling suitable material that
could be used.
Nunc Laboratories have developed a method, termed CovaLink NH, by which DNA can
be covalently bound to the microwell surface. CovaLink NH is a polystyrene
surface
grafted with
secondary amino groups (>NH) that serve as bridgeheads for further covalent
coupling.
CovaLink Modules may be purchased from Nunc Laboratories. DNA molecules may be
bound
to CovaLink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal.
Biochem. 198(1) 138-42).
The use of CovaLink NH strips for covalent binding of DNA molecules at the
5'-end has been described (Rasmussen et al., (1991)). In this technology, a
phosphoramidate bond is
employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is
beneficial as
immobilization using only a single covalent bond is preferred. The
phosphoramidate bond joins the DNA to the CovaLink NH secondary amino groups,
which are positioned at the ends of 2 nm long spacer arms covalently grafted
onto the polystyrene surface. To link an oligonucleotide to CovaLink NH via a
phosphoramidate bond, the oligonucleotide terminus must have a 5'-end phosphate
group. It is, perhaps, even possible for biotin
to be covalently
bound to CovaLink and then streptavidin used to bind the probes.
More specifically, the linkage method includes dissolving DNA in water
(7.5 ng/µl) and denaturing for 10 min. at 95°C and cooling on ice for 10 min.
Ice-cold 0.1 M 1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final
concentration of 10 mM 1-MeIm7. The ssDNA solution is then dispensed into
CovaLink NH strips (75 µl/well) standing on ice.
Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 µl added per well. The strips
are incubated for 5 hours at 50°C. After incubation the strips are washed
using, e.g., Nunc-Immuno Wash; first the wells are washed 3 times, then they
are soaked with washing solution for 5 min., and finally they are washed 3
times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
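The per-well quantities implied by this protocol can be worked out with simple dilution arithmetic. The sketch below is an illustration under stated assumptions: it reads 7.5 ng/µl as the DNA concentration before the 1-methylimidazole stock is added, and it treats volumes as additive. All names are hypothetical.

```python
def dilute(conc: float, vol_in: float, vol_total: float) -> float:
    """Concentration after bringing vol_in of a solution to vol_total (C1*V1 = C2*V2)."""
    return conc * vol_in / vol_total


# Values quoted in the protocol text.
DNA_STOCK_NG_PER_UL = 7.5     # DNA dissolved in water
MEIM_STOCK_M = 0.1            # ice-cold 1-methylimidazole stock
MEIM_FINAL_M = 0.010          # final 1-methylimidazole concentration
DNA_VOL_PER_WELL_UL = 75.0    # ssDNA solution dispensed per well
EDC_STOCK_M = 0.2             # freshly made EDC
EDC_VOL_PER_WELL_UL = 25.0    # EDC added per well

# Assumption: the 1-MeIm stock makes up this fraction of the DNA/1-MeIm mix.
meim_fraction = MEIM_FINAL_M / MEIM_STOCK_M
dna_in_mix_ng_per_ul = DNA_STOCK_NG_PER_UL * (1.0 - meim_fraction)
dna_per_well_ng = dna_in_mix_ng_per_ul * DNA_VOL_PER_WELL_UL

well_volume_ul = DNA_VOL_PER_WELL_UL + EDC_VOL_PER_WELL_UL
edc_final_m = dilute(EDC_STOCK_M, EDC_VOL_PER_WELL_UL, well_volume_ul)

print(f"DNA per well: about {dna_per_well_ng:.0f} ng")
print(f"Final EDC concentration: {edc_final_m:.3f} M in {well_volume_ul:.0f} µl")
```

Because the EDC is itself dissolved in 10 mM 1-methylimidazole, adding it leaves the 1-methylimidazole concentration in the well unchanged.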
It is contemplated that a further suitable method for use with the present
invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos),
incorporated
herein by reference. This method of preparing an oligonucleotide bound to a
support involves
attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link
to aliphatic hydroxyl groups carried by the support. The oligonucleotide is
then synthesized on
the supported nucleoside and protecting groups removed from the synthetic
oligonucleotide
chain under standard conditions that do not cleave the oligonucleotide from
the support.
Suitable reagents include nucleoside phosphoramidite and nucleoside hydrogen
phosphonate.
An on-chip strategy for the preparation of DNA probe arrays may be employed.
For example, addressable laser-activated
photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass
surface, as described
by Fodor et al. (1991) Science 251(4995), 767-73, incorporated herein by
reference. Probes
may also be immobilized on nylon supports as described by Van Ness et al.
(1991) Nucleic
Acids Res., 19(12) 3345-50; or linked to Teflon using the method of Duncan &
Cavalier (1988)
Anal. Biochem. 169(1), 104-8; all references being specifically incorporated
herein.
Linking an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective
activation of the 5'-amine of oligonucleotides with cyanuric chloride.
One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) Proc. Natl. Acad.
Sci., USA 91(11), 5022-6, incorporated herein by reference. These authors used
current
photolithographic
techniques to generate arrays of immobilized oligonucleotide probes (DNA
chips). These
methods, in which light is used to direct the synthesis of oligonucleotide
probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A
matrix of 256
spatially defined oligonucleotide probes may be generated in this manner.
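As an arithmetic aside, 256 equals 4^4, the number of distinct sequences obtainable by varying four base positions combinatorially. The sketch below enumerates such a set and assigns each member an address on a 16 x 16 grid. The grid layout and the function name `all_kmers` are assumptions for illustration only; this is not a description of the cited synthesis.

```python
from itertools import product


def all_kmers(k: int, alphabet: str = "ACGT") -> list[str]:
    """Every sequence of length k over the alphabet, in lexicographic order."""
    return ["".join(p) for p in product(alphabet, repeat=k)]


probes = all_kmers(4)

# Hypothetical addressing: probe i sits at (row, column) = divmod(i, 16).
layout = {probe: divmod(i, 16) for i, probe in enumerate(probes)}

print(len(probes), layout["ACGT"])
```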
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook
et al. (1989) describe three protocols for the isolation of high molecular
weight DNA from
mammalian cells (p. 9.14-9.23).
DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification
methods.
Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of
DNA
samples may be prepared in 2-500 ml of final volume.
The nucleic acids would then be fragmented by any of the methods known to
those of
skill in the art including, for example, using restriction enzymes as
described at 9.24-9.28 of
Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.
Low pressure shearing is also appropriate, as described by Schriefer et al.
(1990) Nucleic Acids Res. 18(24), 7455-6, incorporated herein by reference. In
this
method, DNA
samples are passed through a small French pressure cell at a variety of low to
intermediate
pressures. A lever device allows controlled application of low to intermediate
pressures to the
cell. The results of these studies indicate that low-pressure shearing is a
useful alternative to
sonic and enzymatic DNA fragmentation methods.
One particularly suitable way for fragmenting DNA is contemplated to be that
using the
two base recognition endonuclease, CviJI, described by Fitzgerald et al.
(1992) Nucleic Acids
Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and
fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun
cloning and sequencing.
The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which
alter the
specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al.
(1992) quantitatively evaluated the randomness of this fragmentation strategy,
using a CviJI** digest of pUC19 that was size fractionated by a rapid gel
filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI**
restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and that new sequence
data is
accumulated
at a rate consistent with random fragmentation.
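The site specificities described above can be illustrated with a short sketch that locates cut positions in a sequence. It assumes only what the text states: normal CviJI cuts PuGCPy between the G and the C, and the relaxed CviJI** activity additionally cuts PyGCPy and PuGCPu. The function names and the test sequence are hypothetical, and partial digestion is not modeled.

```python
import re

PU, PY = "[AG]", "[CT]"  # purine and pyrimidine character classes

# Lookahead patterns so that overlapping sites are all reported.
NORMAL = re.compile(f"(?={PU}GC{PY})")
RELAXED = re.compile(f"(?={PU}GC{PY}|{PY}GC{PY}|{PU}GC{PU})")


def cut_positions(seq: str, pattern: re.Pattern) -> list[int]:
    """Index of the C in each matched NGCN site; the cut falls just before it."""
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq: str, pattern: re.Pattern) -> list[str]:
    """Blunt-ended fragments from a complete digest of a linear sequence."""
    cuts = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


seq = "TTAGCTAACGCCAATGCAGG"  # made-up sequence with three GC dinucleotides
print(cut_positions(seq, NORMAL), cut_positions(seq, RELAXED))
print(fragments(seq, RELAXED))
```

In this example the third GC is flanked as PyGCPu, which neither pattern matches, so it is left uncut.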
As reported in the literature, advantages of this approach compared to
sonication and agarose gel fractionation include: smaller amounts of DNA are
required (0.2-0.5 µg instead of 2-5 µg); and fewer steps are involved (no
preligation, end repair, chemical
extraction, or
agarose gel electrophoresis and elution are needed).
Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared,
it is important to denature the DNA to give single stranded pieces available
for hybridization.
This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The
solution is then cooled quickly to 2°C to prevent renaturation of the DNA
fragments
before they are
contacted with the chip. Phosphate groups must also be removed from genomic
DNA by
methods known in the art.
4.22 PREPARATION OF DNA ARRAYS
Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the
positions of which correspond to an array of wells in a microtiter plate) to
repeatedly transfer about 20 nl of a DNA solution to a nylon membrane. By
offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm²,
depending on the
type of label used. By avoiding spotting in some preselected number of rows
and columns,
separate subsets (subarrays) may be formed. Samples in one subarray may be the
same genomic
segment of DNA (or the same gene) from different individuals, or may be
different, overlapped
genomic clones. Each of the subarrays may represent replica spotting of the
same samples. In
one example, a selected gene segment may be amplified from 64 patients. For
each patient, the
amplified gene segment may be in one 96-well plate (all 96 wells containing
the same sample).
A plate for each of the 64 patients is prepared. By using a 96-pin device, all
samples may be
spotted on one 8 x 12 cm membrane. Subarrays may contain 64 samples, one from
each patient.
Where the 96 subarrays are identical, the dot span may be 1 mm² and there may
be a 1 mm
space between subarrays.
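The 64-patient example can be checked with a small layout calculation. This is a sketch under assumptions that are not stated in the text: the 96 pins are taken to sit at the standard 9 mm well spacing of a 96-well plate, each subarray is taken to be an 8 x 8 block of dots on a 1 mm pitch, and patient plates are printed at successive offsets. All names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device, one pin per well
PIN_PITCH_MM = 9.0           # assumed: standard 96-well plate spacing
DOT_PITCH_MM = 1.0           # assumed reading of the 1 mm dot span
SUB_SIDE = 8                 # 8 x 8 = 64 dots per subarray, one per patient


def dot_position(patient: int, pin_row: int, pin_col: int) -> tuple[float, float]:
    """(x, y) in mm of the dot a given pin leaves when printing a patient's plate."""
    off_row, off_col = divmod(patient, SUB_SIDE)  # offset within every subarray
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return (x, y)


positions = {
    (p, r, c): dot_position(p, r, c)
    for p in range(64)
    for r in range(PIN_ROWS)
    for c in range(PIN_COLS)
}

width_mm = max(x for x, _ in positions.values()) + DOT_PITCH_MM
height_mm = max(y for _, y in positions.values()) + DOT_PITCH_MM
gap_mm = PIN_PITCH_MM - SUB_SIDE * DOT_PITCH_MM

print(len(positions), width_mm, height_mm, gap_mm)
```

Under these assumptions the 96 subarrays of 64 dots each leave a 1 mm gap between neighbours and fit within an 8 x 12 cm membrane.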
Another approach is to use membranes or plates (available from NUNC,
Naperville,
Illinois) which may be partitioned by physical spacers e.g. a plastic grid
molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom
of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for
imaging by exposure
to flat phosphor-storage screens or x-ray films.
The present invention is illustrated in the following examples. Upon
consideration of
the present disclosure, one of skill in the art will appreciate that many
other embodiments and
variations may be made in the scope of the present invention. Accordingly, it
is intended that
the broader aspects of the present invention not be limited to the disclosure
of the following
examples. The present invention is not to be limited in scope by the
exemplified embodiments
which are intended as illustrations of single aspects of the invention, and
compositions and
methods which are functionally equivalent are within the scope of the
invention. Indeed,
numerous modifications and variations in the practice of the invention are
expected to occur to
those skilled in the art upon consideration of the present preferred
embodiments. Consequently,
the only limitations which should be placed upon the scope of the invention
are those which
appear in the appended claims.
All references cited within the body of the instant specification are hereby
incorporated
by reference in their entirety.
5.0 EXAMPLES
5.1 EXAMPLE 1
Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared
from
various human tissues and in some cases isolated from a genomic library
derived from human
chromosome using standard PCR, SBH sequence signature analysis and Sanger
sequencing
techniques. The inserts of the library were amplified with PCR using primers
specific for the
vector sequences which flank the inserts. Clones from cDNA libraries were
spotted on nylon
membrane filters and screened with oligonucleotide probes (e.g., 7-mers) to
obtain signature
sequences. The clones were clustered into groups of similar or identical
sequences.
Representative clones were selected for sequencing.
In some cases, the 5' sequence of the amplified inserts was then deduced using
a typical
Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377
Applied
Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences.
5.2 EXAMPLE 2
Assemblage of Novel Contigs
The contigs of the present invention, designated as SEQ ID NO: 2083-2534, were
assembled using an EST sequence as a seed. Then a recursive algorithm was used
to extend the
seed EST into an extended assemblage, by pulling additional sequences from
different
databases (i.e., Hyseq's database containing EST sequences, dbEST, gb pri, and
UniGene, and
exons from public domain genomic sequences predicted by GenScan) that belong
to this
assemblage. The algorithm terminated when there were no additional sequences
from the
above databases that would extend the assemblage. Further, inclusion of
component sequences
into the assemblage was based on a BLASTN hit to the extending assemblage with
BLAST
score greater than 300 and percent identity greater than 95%.
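The seed-and-extend procedure can be sketched as a loop that runs until no new sequence qualifies. This is an illustration only: the original is described as recursive and proprietary, whereas the version below is a simple iterative equivalent, and the `search` callback stands in for a BLASTN run whose interface is assumed. The thresholds are the ones stated in the text; the identifiers and toy hits are made up.

```python
from typing import Callable, Iterable

MIN_SCORE = 300        # BLAST score must exceed this
MIN_IDENTITY = 95.0    # percent identity must exceed this

Hit = tuple[str, float, float]  # (sequence id, BLAST score, percent identity)


def extend_assemblage(seed: str,
                      search: Callable[[set[str]], Iterable[Hit]]) -> set[str]:
    """Grow an assemblage from a seed until no further sequence qualifies."""
    members = {seed}
    while True:
        new = {
            seq_id
            for seq_id, score, identity in search(members)
            if score > MIN_SCORE and identity > MIN_IDENTITY
            and seq_id not in members
        }
        if not new:  # termination: nothing left that extends the assemblage
            return members
        members |= new


# Toy stand-in for database searches: hits keyed by the query sequence.
TOY_HITS = {
    "EST_seed": [("EST_2", 450.0, 98.0), ("EST_3", 120.0, 99.0)],
    "EST_2": [("exon_7", 310.0, 96.5), ("EST_4", 500.0, 90.0)],
    "exon_7": [("EST_seed", 450.0, 98.0)],
}


def toy_search(members: set[str]) -> Iterable[Hit]:
    for member in sorted(members):
        yield from TOY_HITS.get(member, [])


assemblage = extend_assemblage("EST_seed", toy_search)
print(sorted(assemblage))
```

Here EST_3 is rejected on score and EST_4 on identity, so the loop stops after two rounds of extension.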
Table 8 sets forth the novel predicted polypeptides (including proteins)
encoded by the
novel polynucleotides (SEQ ID NO: 2083-2534) of the present invention, and
their
corresponding translation start and stop nucleotide locations to each of SEQ
ID NO: 2083-2534.
Table 8 also indicates the method by which the polypeptide was predicted.
Method A refers to
a polypeptide obtained by using a software program called FASTY (available
from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a
comparison of the
translated novel polynucleotide to known polynucleotides (W.R. Pearson,
Methods in
Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B
refers to a
polypeptide obtained by using a software program called GenScan for
human/vertebrate
sequences (available from Stanford University, Office of Technology Licensing)
that predicts
the polypeptide based on a probabilistic model of gene structure/compositional
properties (C.
Burge and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by
reference).
Method C refers to a polypeptide obtained by using a Hyseq proprietary
software program that
translates the novel polynucleotide and its complementary strand into six
possible amino acid
sequences (forward and reverse frames) and chooses the polypeptide with the
longest open
reading frame.
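Method C relies on a proprietary program, so the following is only a sketch of the general idea of six-frame translation followed by selection of the longest open reading frame. It assumes the standard genetic code and defines an open reading frame, for the purposes of the example, as a stretch that starts at a methionine and runs to the next stop codon or the end of the frame. All names and the test sequence are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> list[str]:
    """Three forward and three reverse-strand translations."""
    dna = dna.upper()
    strands = (dna, reverse_complement(dna))
    return [translate(s[offset:]) for s in strands for offset in range(3)]


def longest_orf(dna: str) -> str:
    """Longest Met-initiated stretch without a stop codon, over all six frames."""
    best = ""
    for protein in six_frames(dna):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best


print(longest_orf("CCATGGCTTGGAAATAGCC"))  # made-up sequence
```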


5.3 EXAMPLE 3
Novel Nucleic Acids
The novel nucleic acids of the present invention SEQ ID NO: 1-1041 were
assembled
from Hyseq's proprietary EST sequences as described in Example 1 and human
genome
sequences that are available from the public databases
(http://www.ncbi.nlm.nih.gov/).
Exons were predicted from human genome sequences using GenScan
(http://genes.mit.edu/GENSCANinfo.html); HMMgene
(http://www.cbs.dtu.dk/services/HMMgene/hmmgene1_1.html); and GeneMark.hmm
(http://genemark.biology.gatech.edu/GeneMark/whmm_info.html). The Hyseq
proprietary
EST sequences and the predicted exons were assembled based on a BLASTN hit to
the
extending assemblage with BLAST score greater than 300 and percent identity
greater than
95%. Then, the predicted genes were analyzed using Neural Network SignalP V1.1
program
(from Center for Biological Sequence Analysis, The Technical University of
Denmark) for
presence of a signal peptide. These sequences were further analyzed for
absence of a
transmembrane region using the TMpred program
(http://www.ch.embnet.org/software/TMPRED_form.html).
Table 1 shows the various tissue sources of SEQ ID NO: 1-1041.
The homologs for polypeptides SEQ ID NO: 1042-2082, that correspond to
nucleotide sequences SEQ ID NO: 1-1041, were obtained by a BLASTP version 2.0al
19MP-WashU search against Genpept release 124 using the BLAST algorithm. The
results
showing
homologues for SEQ ID NO: 1042-2082 from Genpept 124 are shown in Table 2.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J.
Comp. Biol., Vol. 6, 219-235 (1999), http://motif.stanford.edu/ematrix-search/
herein
incorporated by reference), all the polypeptide sequences were examined to
determine
whether they had identifiable signature regions. Scoring matrices of the
eMatrix software
package are derived from the BLOCKS, PRINTS, PFAM, PRODOM, and DOMO
databases. Table 3 shows the accession number of the homologous eMatrix
signature found
in the indicated polypeptide sequence, its description, and the results
obtained which include
accession number subtype; raw score; p-value; and the position of signature in
amino acid
sequence.
Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences
were examined for domains with homology to certain peptide domains. Table 4
shows the
name of the Pfam model found, the description, the e-value and the Pfam score
for the
identified model within the sequence. Further description of the Pfam models
can be found
at http://pfam.wustl.edu/.
The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San
Diego,
CA) was used to predict the three-dimensional structure models for the
polypeptides
encoded by SEQ ID NO 1-1041 (i.e. SEQ ID NO: 1042-2082). Models were generated
by
(1) PSI-BLAST which is a multiple alignment sequence profile-based searching
developed
by Altschul et al, (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2) High
Throughput Modeling
(HTM) (Molecular Simulations Inc. (MSI) San Diego, CA,) which is an automated
sequence
and structure searching procedure (http://www.msi.com/), and (3) SeqFold™
which is a fold recognition method described by Fischer and Eisenberg (J. Mol.
Biol. 209, 779-791 (1998)).
This analysis was carried out, in part, by comparing the polypeptides of the
invention with
the known NMR (nuclear magnetic resonance) and x-ray crystal three-dimensional
structures
as templates. Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier
given to
template structure; "Chain ID", identifier of the subcomponent of the PDB
template
structure; "Compound Information", information of the PDB template structure
and/or its
subcomponents; "PDB Function Annotation", the function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); the start and end amino
acid positions of the protein sequence aligned; the PSI-BLAST score; the verify
score; the SeqFold score; and the Potential(s) of Mean Force (PMF). The verify
score, produced by GeneAtlas™ software (MSI), is based on the Profile-3D
threading program developed in Dr. David Eisenberg's laboratory (US patent no.
5,436,850 and Luthy, Bowie, and Eisenberg, Nature, 356:83-85 (1992)) and a
publication by R. Sanchez and A. Sali, Proc. Natl. Acad. Sci. USA,
95:13597-13602. The verify score produced by GeneAtlas normalizes the verify
score for
proteins with different lengths so that a unified cutoff can be used to select
good models as
follows:
Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)
The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence
identity in the
alignment used to build the model, pairwise and surface mean force potentials
(MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best,
represents a good model. Similarly, a PMF score between 0 and 1.0, with 1 being
the best, represents a good model. A SeqFold™ score of more than 50 is
considered significant. A good model may also be determined by one of skill in
the art based on all the information in
Table 5 taken in
totality.
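The normalization and the stated cutoffs can be expressed in a few lines. This is a sketch, not the GeneAtlas implementation: the normalization follows the formula given above (raw score minus half the high score, divided by half the high score), while requiring all three criteria at once is an assumption made for illustration, since the text leaves the overall judgment to one of skill in the art. All names are hypothetical.

```python
from dataclasses import dataclass


def normalized_verify(raw_score: float, high_score: float) -> float:
    """(raw - high/2) / (high/2), so the same cutoff applies to proteins of any length."""
    half = 0.5 * high_score
    return (raw_score - half) / half


@dataclass
class Model:
    verify: float    # normalized verify score
    pmf: float       # Potential(s) of Mean Force score
    seqfold: float   # SeqFold score


def looks_good(model: Model) -> bool:
    """Assumed screening rule: all three stated criteria must hold together."""
    return (0.0 <= model.verify <= 1.0
            and 0.0 <= model.pmf <= 1.0
            and model.seqfold > 50)


candidate = Model(verify=normalized_verify(75.0, 100.0), pmf=0.8, seqfold=62.0)
print(candidate.verify, looks_good(candidate))
```

With this normalization a raw score equal to the high score maps to 1, and a raw score of half the high score maps to 0.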
Table 6 shows the position of the signal peptide in each of the polypeptides
and the
maximum score and mean score associated with that signal peptide using Neural
Network
SignalP V1.1 program (from Center for Biological Sequence Analysis, The
Technical
University of Denmark). The process for identifying prokaryotic and eukaryotic
signal peptides and their cleavage sites is also disclosed by Henrik Nielsen,
Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of
their cleavage sites," Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997),
incorporated herein by reference. A maximum S score and a mean S score, as
described in the Nielsen et al. reference, were obtained for the polypeptide
sequences.
Table 7 correlates each of SEQ ID NO: 1-1041 to a specific chromosomal
location.
Table 9 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, their corresponding polypeptide sequences SEQ ID NO: 1042-2082,
their
corresponding priority contig nucleotide sequences SEQ ID NO: 2083-2534, their
corresponding priority contig polypeptide sequences SEQ ID NO: 2535-2986, and
the US
serial number of the priority application in which the contig sequence was
filed.
Table 10 is a correlation table of the novel polynucleotide sequences SEQ ID
NO: 1-1041, the novel polypeptide sequences SEQ ID NO: 1042-2082, and the
corresponding SEQ
ID NO in which the sequence was filed in priority US application 60/311,261.
Table 1
Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO:
adrenal gland | Clontech | ADR002 | 13 23 34 45 77 111 115 122 187 194 210-211 249-250 255 290 320 357-358 362 420 443 451 492 499 551 577 630 698 702 713 718 805 808 819 841-843 845 861 896 899 909 924 937 949 985 1037
adult bladder | Invitrogen | BLD001 | 9 87 189 320-321 358 563 768 840 970
adult brain | Clontech | ABR001 | 184-186 277 282 352 558 849 871 898 958
adult brain | Clontech | ABR006 | 30 45 170 199 210 226 260 292-294 340 357 413 443-444 478 499 551-552 579 582 584-588 632-637 646 654-655 676 683 731-732 755-756 777 813-827 861 872 874 880 883 1002 1012
adult brain | Clontech | ABR008 | 15 45 54 61 67 81 87 101 106 108 122-123 143-144 170 181-183 195-209 215 222 245-248 261-270 283-289 292-293 296 306 308-310 327 340 358 370 394-407 409 421 428 440 442 459 477-478 496 531-547 551-552 556 565-566 578-579 606 618 620-621 629-630 651 653-655 664 667-668 707 713-714 729 745 750 753 756 772 779 788 790 793-794 799-800 802 808 812 823 826-827 849-850 859 862 872 883 885 898 917 919 921 930 935-936 947 974 985-986 992 1002 1006 1012 1028 1030 1036 1039
adult brain | Clontech | ABR011 | 1012
adult brain | GIBCO | AB3001 | 23 57-58 67 85 296 492 499 579 853 898-899 950 1012
adult brain | GIBCO | ABD003 | 45 59-62 67 72 82 85-88 156 179-180 182 296 299 355-356 440 458 474 483 499 563 823 840 852 860 885 898 992 999 1012
adult brain | Invitrogen | ABR014 | 45 115 238 470 599 653 974-976
adult brain | Invitrogen | ABR015 | 45 600 885 1012
adult brain | Invitrogen | ABR016 | 599 1012
adult brain | Invitrogen | ABT004 | 34 45 54 74 84 118 138-143 170-171 180-181 208 255 277 359 379 428 438 499 501 536 715 731 783 793 799 805 809 824 862 898 912 977 998 1012
adult cervix | BioChain | CVX001 | 23 26 48 54 57 67 77 118 121 177 183 238 255 271-272 296 303 311-319 325 352 361-362 411-412 419-420 424 428 440 447 478 541 567 569 599-600 622 699 793 805 813 831 836-837 839 844-845 848 863 872 913 928-929 944 958 965 970 973 1001 1004


adult colon | Invitrogen | CLN001 | 250 322-325 429 630 788 970 985
adult heart | GIBCO | AHR001 | 28-30 45 61 67 90-94 118 122 150-151 183 193 250-251 279 349-351 369-370 410 419 474 483 485 490 493 552 563 719 773 835-836 853 861 961 976 1030
adult kidney | GIBCO | AKD001 | 24 31-34 44-46 48 55 62 67 81 121 144 151 162 176-178 183 251 255 258 277 352 358 369-370 386 408 420 429 483 490 536 546 579 599-600 602 645 698 793 805 874 898 913
adult kidney | Invitrogen | AKT002 | 32 53-54 67 85 177 251 260 341 386 408 419-420 431-436 478 490 493 507 561 582 596-599 698 728 788 805 819 837 844-848 885 898 969 989 1013
adult liver | Clontech | ALV003 | 101 121 193 579 638-639 729 890-893 919 1007 1017
adult liver | Invitrogen | ALV002 | 75 157 173 183 212-214 236 240 263 292 323 335 386 408 415 495-499 552 577 589 599 727 782 858 869 898-900 924 968
adult lung | GIBCO | ALG001 | 67 77 152 369 386 419 443 483 583 732 849 907
adult ovary | Invitrogen | AOV001 | 5 26 34 43 45 48 55 61-62 64-67 77 87 101-102 105 115 118 122-129 143 151 155-163 170 174-175 177 181-183 193 251-252 286 292 338 347 353-354 369 381 410 415 420 424 451 458 483 489 497 499 515 536 541 546 552 577 579 595 599-600 604 647 658 661 665 699 744 782-783 800 805-806 814 831 835 839-840 844 853 874 895 898-899 913 924 929 941-942 949 973 977 994 1004 1007 1012 1016 1031 1037
adult placenta | Clontech | APL001 | 67 419 688 728 848 930
adult spleen | Clontech | SPLc01 | 82 101 187 255 260 358 370 447 483 489 579 586 648 768 835 845 848 853-857 863 885 913 917 962 986
adult spleen | GIBCO | ASP001 | 87 105 108 122 158 172 215 299 380 492 499 552 599 622 785 830 840 850 889
adult testis | GIBCO | ATS001 | 68-69 106 183 251 301 360 386 520 541 570 753 788 832 840 890 916
bone marrow | Clontech | BMD001 | 10-12 16-19 24-26 35 46 48 58 77 85 95-96 98-99 122 156 164 172 187 222 251 385 424 429 458 478 483 489 519 568-569 599 622-623 630-631 696 700 758 765 794 844 914 919 924 944 971 985 992 1001 1017


bone marrow GF BMD002 23 45 81-82 104-105
115 136


144 156 170 172-173
181 183


247 287 292 306 319-320
327


362 370 418 478-483
489 492


536 548-552 565 569-570
572


579 596 599 614-622
630 640-


641 643 653 668 691
699 708


715-718 726 743 756
758 772


789 841 889 917 920
947 958


994 1006 1010 1037
1039


cultured preadipocytesStratagene ADP001 121 255 400 490-494
511 629


689 758 793 835 861
913 944


949 984


endothelial cellsStratagene EDT001 34 45 54 58 67 120-122
144 151-


154 183 193 299 385
440 451


458 483 490 499 515
552 563


569 577 579 599 622-623
752


793 800 844-845 898-899
942


944 949


fetal brain Clontech FBR001 139 168 356 599 702
712 831


845 850 872-873 898
921 1037


fetal brain Clontech FBR004 138 168 250 363 873-875
882


fetal brain Clontech FBR006 14 29 45 51 81 87
101 104 118


131 143-144 157 171
177 206


208-209 215 229 238
251 261


273 279 283 291-293
326-332


358 362 370-371 397
400 402


413 419 428 461 472
485 551-


560 568-569 579 618
620 629-


630 653-657 659-661
663-673


675 700 714 739-742
744-746


766 779 793 809 815
819 822


840 850 859 862 872
875-885


930 958 972 995 1002
1006 1028


1030-1031 1038


fetal brain GIBCO HFB001 13-15 54-57 62 67
70-72 84 121


174 177 180 183 410
417 424


485 518 520 542 552
578-579


599 785 793 805 831-832
840


858 871 883 898-899
977 1012


fetal brain Invitrogen FBT002 7 45 49 144-149 157
180 255 263


356 493 501 600 630
707 748


832 845 858 913 1012


fetal heart Invitrogen FHR001 24 45 81-82 104 114-115
118


121 144 152 181 239
247 288


292 327 362 370 381
419 428


444 453 458 478 486
493 503


569 571 576 582 596
618 640


668 674-688 719-722
731 744


753 762 772 784 794
819 823


836 850 885 914 944
949 957-


958 1017






fetal kidney Clontech FKD001 82 107 208 458 483
485 536 758


760 819 836 894 1017


fetal kidney Clontech FKD002 61 101 105 183 189
238 247 263


292 327 340 370 405
416 419


517 569 586 620 648
668 689-


691 731 746-752 763
771-772


787-788 819 840 842
854 861


872 944 958 961 969


fetal kidney Invitrogen FKD007 116


fetal liver Clontech FLV002 410 429 454 692-695
704 781


805 894-895 1017


fetal liver Clontech FLV004 67 107 115 118 151
187 241 255


287 370 466 478 492
518 548


552 569 582 589 630
653 668


696-699 752-757 784
789 805


885 908 985


fetal liver Invitrogen FLV001 45 101 130-137 157
222 240 337


386 428-429 492 552
589 693


727 840


fetal liver-spleen Columbia FLS001 1-9 18 20-23 27 34
36-38 45 55


University 67 70 83 89 94 118
122 158 164


172-173 177 183 219
238 240


246 251 292 299 323
335 338


358 369 376 385-386
397 408


416 419 421-422 429
451 456-


460 466 472 478 483
489-490


493 516 536 543 546
551 569-


573 579 586 588-589
593-595


599-603 619 622 668
676 691


699 702 724 731 734
743 787


789 794 800 805 834-835
840


848 853 874 880 885
890-891


899 908 910 923 926-927
930


939-940 944 949 958
973 980


992 999 1004 1007
1009 1013


fetal liver-spleen Columbia FLS002 3 8 17 22 36-37 46
55 61 63 70


University 72 85 89-90 94 106
122 148 156


158 165 172 177 181
194 213


215 219 246 251 292
299 304-


307 323-324 338 346
355 366


371 374 380-381 386
392 397


410 417 421 440 455
462-464


466-468 489-490 492-493
507-


521 536 552 565-566
569 571-


576 592 596 599 619
630 650


655 661 688 698-699
712 718


723-729 731 735-737
753 767


783 824 831 834 840
845 871


885 891 894 899 902
906-909


913 923-930 940 943
949 958


973 980 992 999 1003
1007 1017


1032 1040-1041


fetal liver-spleen Columbia FLS003 23 67 106 150 158
193 338 374


University 376 411 443 478 493
546 565


569-570 582 589 609-613
630


661 699 724 727-734
767 809


812 834-835 845 880
890 910






929-930 958 973 980
985 1013


fetal lung Clontech FLG001 728 824 1008


fetal lung Clontech FLG004 115 668


fetal lung Invitrogen FLG003 120 183 322 333-336
476 516


691 831 835 850 1012


fetal muscle Invitrogen FMS001 45 338-339 365 369
386 429 431


496-497 789 793 856
970 1008


1019 1033 1035


fetal muscle Invitrogen FMS002 45 115 171 247 327
365 370 405


536 642-652 668 710-711
719


726 758-761 765 836
899 901


907 913 948 965 1037


fetal skin Invitrogen FSK001 29 57 67 74 81 118
152 177 180


193 294 340-342 345
375 397


419 437-443 445-451
454 475


532 541 546 565 598
604 630


650 668 728 742 772
789 793


804-805 823 828-830
837 840


849 899 901 922 958
970 1007


1022 1033


fetal skin Invitrogen FSK002 34 45 77 81 85 115
173 200 279


292-293 360 370 381
419 428-


429 451 466 490 551
569-570


579 600 604 630 647
668 698


700-706 729 731 746
750 758


762-766 768-773 780
794 840


850 859 861 885 901
911 913


957 961 965 973 1038


fibroblast Stratagene LFB001 55 72 143 255 490
502-505 587


599 627 861 863 885
984 1037


induced neuron-cells Stratagene NTD001 30 82 111 124 181
206 356 392


410 417 484-488 578
831-834


898 977 1036 1039


infant brain Columbia IB2002 18 21 45 66 73-75
100-103 118


University 152 168-171 177 180
241-242


252 292-295 340 345
366-367


413 438 454 499 501
542 561-


562 578-580 599 668
702 728-


729 745 765 768 772
793 796-


799 823-824 863 874
887 899


948-949 967 975 977
981 983


992 995 1012


infant brain Columbia IB2003 81 101 113 118 177
180 241 252


University 293 340 345 367 371
379 381


400 417 499-501 536
562 578


580-581 629-630 702
713 745


796-805 824 831 837
840 845


874 885 967 977 981
985 1012


1030


infant brain Columbia IBM002 168 358 413-414 913


University


infant brain Columbia IBS001 415 417 533 581 886-888
977


University


leukocyte Clontech LUG003 77 619 889 949


leukocyte GIBCO LUC001 34 36 38-42 50-52
55 67 77 81-


83 85 121 137 144
158 172 183






223 226 251 254 258
291 324


368-374 378 424 429
443 483


492 536 552 564 600
602 732


760 768 782 785 805
838 844-


845 848 850 889 898
905 908


946 973 992


leg 55 72 143
255 490


502-505 587
599


627 861 863
885


984 1037


lung tumor Invitrogen LGT002 55 61 65 77-79 82
102 105 115


156-157 165-167 170
182-183


197 243-244 251 253
296-297


325 370 386 418-419
421-425


478 483 492 499 520
531 533


541 569 577 582 600
788 844-


845 848 874 899 911
913 916-


918 939 944 949 956
970 976


lymph node Clontech ALN001 47 63 104-105 183
483 492 691


894 1017


lymphocytes ATCC LPC001 45 53 77 158 193 251
392 421


455 469-474 483 507
536 546


579 581 618 621 640
765 780-


787 793 838 845 875
924 968


978 999


macrophage Invitrogen HMP001 122 147 157 183 251
255 493


738 898-899 903-905


mammary gland Invitrogen MMG001 45 64 67 83-84 101
113 143 148


152 158 164 177 181-183
189


216-218 253 255 258
263 274


299 336 419 421 423
426-430


440 466 478 490 520
533 536


564 569 579 582 630
646 753


768 782 789 800 835
840 848


850 883 912-913 944
950 958


melanoma from cell line Clontech MEL004 62 158 181 298 362
364 402 419


ATCC #CRL-1424 515 536 896-897 958
973 1004


1008


*Mixture of 16 Various Vendors CGd010 353 358 823 942 982
tissues - 1020


mRNA


*Mixture of 16 Various Vendors CGd011 569 630 944 955 999
tissues -


mRNA


*Mixture of 16 Various Vendors CGd012 9 38 59 63 80 85 122-123
tissues - 152


mRNA 154 177 195 217 232
246 250


296 300 306 323-324
381 427


434 438-439 478 489
499 507


517 538 558 565 571
575 630


657 681 701 736 762
792 800


802 823-824 861 871-872
899


929 941 955 968 974
985-1003


1006 1011-1012 1033


*Mixture of 16 Various Vendors CGd013 232 434 748 956-958
tissues - 992


mRNA


*Mixture of 16 Various Vendors CGd015 18 69 115 324 335
tissues - 548 551 569


mRNA 582 600 622 731 819
899 911


944 957-958 1012 1017-1018






*Mixture of 16 Various Vendors CGd016 46 172 183 323 371
tissues - 481 493 565


mRNA 569 571 596 599 630
654 698


745 762 786 849 907
944 1004-


1013 1037 1039


neuronal cells Stratagene NTU001 7 33 45 107 113 121
150 183 286


385 440 478 483 485
487 489


536 569 582 756 768
772 819


836 944 958 966 1001


pituitary gland Clontech PIT004 158 222 255 345 356
370 379


569 579 819 831 861-862
885


898 922 1017


placenta Clontech PLA003 7 36 61 279 419 478
489 582 586


599 641 647 668 681
707-711


774-779 1001


placenta Invitrogen APL002 57 173 536 728 793
800


prostate Clontech PRT001 26 219-222 229 412
599 665 762


835 837 860 878 951
1031


rectum Invitrogen REC001 9 292 343-346 431
546 714 800


863 918


retinoic acid-induced Stratagene NTR001 112 400 478 569 582
629 756


neuronal-cells 758 800 819 831 835-836
850


906 944 958


salivary gland Clontech SAL001 58 61 77 118 150 158
294 347-


348 483 492-493 546
752 830


915


skeletal muscle Clontech SKM001 80 118 247 365 483
719 805 812


823


small intestine Clontech SIN001 34 37 45 52 60 93
106 119 121


138 144 177 180 208
223-225


238 247 294 323 335-336
343


362 370 380 386 397
409-411


416 420 440 451 455
478 489


493 536 571 577 579
590 602


604-608 614 622 624-628
655


668 688 700 714 805-812
831


841 872 894 899 914
924 926


929 958 961 965 973
991 998


1017


spinal cord Clontech SPC001 51 164 182-183 190
226-228


255-257 275-277 286
296 299


451 454 542 552 579
591 728


753 770 786 790 831
835 849-


852 898 907 958 1000
1012


stomach Clontech STO001 72 222 232 247 258
366 645


thalamus Clontech THA002 45 49 113 155 164
180 183 191-


192 208 229-232 238
345 417


443 512 551 558 592
630 728


800 823 840 858-860
885 898


976 1012


thymus Clontech THM001 45 141 160 183 258
360 378-379


418 451 460 569 602
619 731


788-790 819 835 845
958 965


1004


thymus Clontech THMc02 47 108 115 121 144
157 173 247


259-260 300 327 340
358 362


375-393 409 453 455
461 478-






479 489 551 565 569-570
579


582 615 630 640 653
668 708


744 752 758 766 790-795
810


819 823 835-836 845
850 853


861 885 911 919 938
958 962


994 1001 1027


thyroid gland Clontech THR001 46 58 67 80 82 144
160 177 183


193-194 233-235 251
255 263


268 278-280 286 299
301-303


324 358 370 386 397
408 410


420 440 474 483 493
506 519-


520 533 594 599-600
602 658


661 719 758 772 785
788 793


830 851 853 864-867
898 904


909 924 929 961 973
991 998


1001 1009


trachea Clontech TRC001 45 154 236 238 281
323 416 571


602 868-869 913


umbilical cord BioChain FUC001 34 45 54 58 67 70
85 152 154


177 180 188 208 251
299 370


409 415 419 434 451-455
483


596 599 647 661 733
742 793


808 839-840 845 849-850
861


888 911 913 992


uterus Clontech UTR001 177 237-239 255 258
417 493


520 567 599 604 646
844 870


874 898 973


young liver GIBCO ALV001 45 419 440 443 490
653 732 753


805 845 898 904


*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA (Invitrogen), 4) Normal adult liver mRNA (Invitrogen), 5) Normal fetal kidney mRNA (Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) Normal fetal skin mRNA (Invitrogen), 8) Human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) Human lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) Human thyroid mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human conceptional umbilical cord mRNA (BioChain).


Table 2
SEQ ID NO: Accession No. Species Description Score Identity


1044 AAB32400 Homo SapiensHUMA- Human secreted 339 100
protein


sequence encoded by
gene 30 SEQ ID


NO:86.


1044 AAM74711 Homo SapiensMOLE- Human bone marrow335 100


expressed probe encoded
protein SEQ


ID NO: 35017.


1044 AAM61909 Homo SapiensMOLE- Human brain expressed335 100
single


exon probe encoded protein
SEQ ID


NO: 34014.


1045 gi3859599Arabidopsis similar to class I chitinases74 27
(Pfam:


thaliana PF00182, E=1.2e-142,
N=1)


1045 gi15292107Drosophila LD38671p 74 33


melanogaster


1045 gi2258324Fusarium yellowing-associated 73 32
protein


oxysporum
f. Sp.


ciceris


1046 gi17428204Ralstonia CONSERVED HYPOTHETICAL 74 32


solanacearumPROTEIN


1046 gi4314432Homo Sapienssimilar to phosphatidylinositol71 30


(4,5)bisphosphate 5-phosphatase;


match to PID:g1399105


1046 gi|17545909|ref|NP_519311.1 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 74 32


1047 gi9756017Actinoplanesalpha-amylase 69 38
Sp.


50/110


1047 gi|6572499|gb|AAF17291.1| Homo sapiens LHX3 protein 67 26


1047 gi|18572988|ref|XP_029170.2 Homo sapiens LIM homeobox protein 3 67 26


1048 AAY28474-Homo SapiensUYJO Human Capon protein.721 99


1048 gi2895555Homo sapienscarboxyl-terminal PDZ 721 99
ligand of


neuronal nitric oxide
synthase


1048 gi2895557Rattus carboxyl-terminal PDZ 654 92
ligand of


norvegicus neuronal nitric oxide
synthase


1049 gi19713721FusobacteriumGTP-binding protein 66 28
era


nucleatum
subsp.


nucleatum


ATCC 25586


1050 131291 Homo sapiens fumarylacetoacetase 175 70
(AA 1-349)


1050 g1182393 Homo sapiens fumarylacetoacetate 175 70
hydrolase


1050 g112803409 Homo sapiens fumarylacetoacetate 175 70


1052 g14680089 Human immunodeficiency virus type 1 envelope glycoprotein 79 26


1052 g13868997Ephydatia EFPDE2 74 20


fluviatilis


1052 g14679590 Human immunodeficiency virus type 1 envelope glycoprotein 74 25


1054 g13844648Mycoplasma glycerol kinase (glpK) 71 28


genitalium






1054 gi18448155Ipomoea AC3 70 27
leaf


curl virus


1054 gi|12044888|ref|NP_072698.1 Mycoplasma genitalium glycerol kinase (glpK) 71 28


1056 AAM56747 Homo SapiensMOLE- Human brain expressed229 72
single


exon probe encoded protein
SEQ ID


NO: 28852.


1056 AAM67067 Homo SapiensMOLE- Human bone marrow 224 69


expressed probe encoded
protein SEQ


ID NO: 27373.


1056 AAM54664 Homo SapiensMOLE- Human brain expressed224 69
, single


exon probe encoded protein
SEQ ID


NO: 26769.


1058 gi|13310191|gb|AAK18189.1|AF331500_1 multiple sclerosis associated retrovirus element recombinant envelope protein 228 79


1058 gi|21103962|gb|AAM33141.1 Homo sapiens enverin-2 209 77


1058 gi|8272468|gb|AAF74215.1|AF156963_1 Homo sapiens envelope protein 198 75


1059 120380199 Homo sapiens Similar to LOC168246 251 100


1059 gi|8388692|emb|CAB94042.1| Leishmania major probable DNA-binding protein 67 46


1060 gi|21292780|gb|EAA04925.1| Anopheles gambiae str. PEST agCP4203 70 39


1061 g1330862 Equine membrane glycoprotein 179 30


herpesvirus
1


1061 g117221106Equine glycoprotein gp2 178 34


herpesvirus
1


1061 AAE03643 Homo SapiensINCY- Human extracellular175 29
matrix and


cell adhesion molecule-7
(XMAD-7).


1062 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 334 66


1062 gi|1335205|emb|CAA36480.1| Homo sapiens ORFII 332 66


1063 g121323402CorynebacteriumABC-type transporter, 70 36
periplasmic


glutamicum component


ATCC 13032


1063 gi|19551869|ref|NP_599871.1| Corynebacterium glutamicum COG1464: ABC-type uncharacterized transport systems, periplasmic component 70 36


1063 gi|17551878|ref|NP_499090.1 Caenorhabditis elegans TPR Domain 67 37


1064 gi2308977Aspergilluschitin synthase 66 29


nidulans


1065 gi18076958Yarrowia Optl protein 74 30


lipolytica


1065 gi786145 Walleye envelope polyprotein 73 28
dermal


sarcoma
virus


1065 gi2801522Walleye gPr env 73 28
dermal


sarcoma
virus


1066 gi9294279ArabidopsisTal l-like non-LTR retroelement67 32


thaliana protein-like; CHP-rich
zinc finger


rotein-like


1066 gi|20848817|ref|XP_138010.1 Mus musculus similar to HEAT SHOCK COGNATE PROTEIN 80 83 69


1069 AAM77637 Homo SapiensMOLE- Human bone marrow 96 65


expressed probe encoded
protein SEQ


ID NO: 37943.


1069 AAM64901 Homo SapiensMOLE- Human brain expressed96 65
single


exon probe encoded protein
SEQ ID


NO: 37006.


1069 gig 17473741Homo Sapienssimilar to Meningioma-expressed112 56
~


ref~~ antigen 6/11 (MEA6) (MEAL
0623 l)


80.1


1070 gi296288 Homo Sapienshistone H1 77 44


1070 15923857 Artemisia s ualene synthase 75 35
annua


1070 AAO08837 Homo SapiensHYSE- Human polypeptide 73 39
SEQ ID


NO 22729.


1071 g121483554Drosophila SD02058p 72 29


melano aster


1071 g18515845Homo Sapienshepatocellular carcinoma71 38
associated


rotein TD26


1071 gi|21483554|gb|AAM52752.1| Drosophila melanogaster SD02058p 72 29


1072 g15902896Streptomycestype I polyketide synthase74 50
AVES 4


avermitilis


1072 gi|21301752|gb|EAA13897.1 Anopheles gambiae str. PEST agCP8235 70 34


1073 AAV30916_aa1 Homo sapiens GEMY Human secreted protein AR415 4 cDNA. 99 66


1073 ABB89113 Homo SapiensHUMA- Human polypeptide 99 66
SEQ ID


NO 1489.


1073 AAB90679 Homo SapiensGEMY Human AR415 4 protein99 66


sequence SEQ ID 35.


1074 AAG99338 Homo SapiensTAKE Human atypical tachykinin380 92
~


rotein fragment SEQ ID
NO: 20.


1074 AAG99336 Homo SapiensTAKE Human atypical tachykinin329 91


rotein fragment SEQ ID
NO: 13.


1074 AAG99333 Homo SapiensTAKE Human atypical tachykinin324 91


protein fra ment SEQ
ID NO: 3.


1075 g117945760Drosophila RE33302p 305 29


melanogaster






1075 gi1039447SaccharomycesLpblp 91 25


cerevisiae


1075 AAB64777 Homo SapiensHUMA- Human secreted 78 77
protein


sequence encoded by gene
5 SEQ ID


N0:63.


1076 AAB50261 Homo SapiensCORI- Human breast cancer308 39
associated


B726P-20 rotein.


1076 AAB50244 Homo SapiensLORI- Human breast cancer308 39
associated


B726P-79 rotein.


1076 AAB84702 Homo SapiensCORR Amino acid sequence308 39
of a


human cancer associated
antigen.


1077 12529735 Gorilla 1 co horin BlE recursor 71 31
orilla


1077 AAB74724 Homo SapiensINCY- Human membrane 70 31
associated


protein MEMAP-30.


1077 g14164424Scluzosaccharomsimilar to yeast cytoskeleton70 24
control


yces ombe protein Bnilp


1078 g118145107Clostridiumprobable transcriptional71 28
regulator


perfringens


1078 gi|9581801|emb|CAC00546.1 Plasmodium falciparum guanylyl cyclase 69 24


1078 gi|16805032|ref|NP_473061.1 Plasmodium falciparum Ser/Thr protein kinase 69 26


1079 gi|20886321|ref|XP_140614.1 Mus musculus similar to olfactory receptor, family 5, subfamily V, member 1; olfactory receptor, family 5, subfamily V member 1 72 34


1081 g19650824Petroselinumcommon plant regulatory 76 28
factor 5


Iris um


1081 g1559695 Hydrolagus This CDS feature is included74 31
to show


colliei the translation of the
corresponding


C_region. Presently translation


qualifiers on C region
features are


illega1


1081 g1476622 Hydrolagus immunoglobulin light 74 31
chain


colliei


1082 AAM39205 Homo SapiensHYSE- Human polypeptide 363 71
SEQ ID


NO 2350.


1082 AA007159 Homo SapiensHYSE- Human polypeptide 357 76
SEQ ID


NO 21051.


1082 AAM40991 Homo SapiensHYSE- Human polypeptide 343 79
SEQ ID


NO 5922.


1083 gi|17229222|ref|NP_485770.1 Nostoc sp. PCC 7120 similar to HetF protein 72 30


1084 g117221628Felis catusT-lym hocyte surface 76 38
CD2 antigen


1084 g118565073Crimean-Congoenvelope glycoprotein 74 29
precursor


hemorrhagic


fevervirus


1084 gi|17221628|dbj|BAB78475.1 Felis catus T-lymphocyte surface CD2 antigen 76 38


1085 117430213 Ralstonia solanacearum PUTATIVE HEMAGGLUTININ-RELATED PROTEIN 74 26


1087 gi2323287multiple polyprotein 618 79


sclerosis


associated


retrovirus


1087 gi|4996596|dbj|BAA78549.1| Human endogenous retrovirus W polyprotein 317 74


1087 gi|9630708|ref|NP_047255.1 Feline leukemia virus gag-pol precursor polyprotein gPr80 293 38


1088 gi15075953SinorhizobiumPUTATIVE MOLYBDENUM 70 56


meliloti TRANSPORT SYSTEM PERMEASE


ABC TRANSPORTER PROTEIN


1088 gi2288880Arthrobactertransmembrane protein 67 56


nicotinovorans


1088 gi17298547BradyrllizobiumModB 67 56


japonicum


1089 AAY95660Homo sa iensZYMO Human Zntr2 protein.231 61


1089 AAU83682Homo SapiensGETH Human PRO protein, 210 59
Seq ID No


182.


1089 AAY99386Homo SapiensGETH Human PR01305 (UNQ671)210 59


amino acid sequence SEQ
ID N0:153.


1090 gi7688355Solanum Dof zinc finger protein 70 31


tuberosum


1090 gi4389445Drosophila transcription factor 67 32


melanogaster


1090 gi|7688355|emb|CAB89831.1 Solanum tuberosum Dof zinc finger protein 70 31


1092 AAG78884Homo SapiensBIOW- Human ribosomal 90 44
protein s5-


17.


1092 AAM91239Homo SapiensHUMA- Human 72 53


immune/haematopoietic
antigen SEQ


ID NO:18832.


1092 AAM95026Homo sapiensHUMA- Human reproductive72 48
system


related antigen SEQ ID
NO: 3684.


1094 gi18676450Homo sa iensFLJ00122 protein 69 38


1094 gi18073428Homo sa iensstabilin-2 69 38


1094 gi|20806091|ref|NP_060034.8 Homo sapiens stabilin-2; CD44-like precursor FELL 69 38


1095 gi20906397Methanosarcinaconserved protein 76 44


mazei Goel


1095 gi|21299784|gb|EAA11929.1| Anopheles gambiae str. PEST agCP6531 75 30


1095 gi|17549046|ref|NP_522386.1 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 73 32


1096 AAB58317Homo SapiensROSE/ Lung cancer associated678 100


of eptide sequence SEQ
ID 655.


1096 gi862600Drosophila male-specific lethal-1 176 25
protein


melanogaster






1096 gi601930 Oryctolagus neurofilament-H 115 24


cuniculus


1097 AAU83109 Homo SapiensZYMO Novel secreted 76 85
protein


Z701935G4P.


1097 gi|20348496|ref|XP_111712.1 Mus musculus similar to RIKEN cDNA 9030605E16 72 57


1098 gi18031887Mus musculusFanconi anemia complementation77 29


gr ou G


1098 112002137Mus musculusFanconi anemia grou 77 29
G rotein


1098 AAB72381 Homo sapiensLEEM/ Human hairy and 75 28
enhancer of


S lit homolo a amino
acid se uence.


1099 g18217648Homo SapiensdJ579F20.1 (high-mobility159 70
group


(nonhistone chromosomal)
protein 1-


like 1)


1099 g15815432Gallus gallushi h mobility group 154 70
protein HMGl


1099 14140289 Gallus allushigh mobility group 154 70
1 rotein


1100 ABB 11527Homo SapiensHYSE- Human apolipoprotein84 26
B


rece for homolo ue,
SEQ ID N0:1897.


1100 1487347 Homo sa iensbrea oint cluster region81 32
rotein


1100 g1144050 Bordetella filamentous hemagglutinin78 30


periussis


1102 AAM68946 Homo SapiensMOLE- Human bone marrow327 81


expressed probe encoded
protein SEQ


ID NO: 29252.


1102 AAM79768 Homo SapiensHYSE- Human protein 324 80
SEQ ID NO


3414.


1102 AAM78784 Homo SapiensHYSE- Human protein 324 80
SEQ ID NO


1446.


1103 AAZ11186 Homo SapiensSAGA Gene encoding transmembrane143 68


_ domain containing protein
aal clone


HP02239.


1103 AAD31079_Homo SapiensINCY- Human cornichon 143 68
protein


aal (CORN) cDNA.


1103 AAA88439_Homo SapiensGETH Antitumour PR0181 143 68
cDNA


aal clone DNA23330-1390.


1104 ABB07527 Homo sapiensINCY- Human drug metabolizing562 100


enzyme (DME) (ID: 5643401CD1).


1104 ABB07515 Homo SapiensINCY- Human drug metabolizing562 100


enzyme (DME) ID: 8097779CD1).


1104 113161409Mus musculusfamily 4 cytochrome 431 76
P450


1107 g113542874Mus musculusSimilar to CGI-67 protein677 64


1107 AAU81978 Homo sa iens1NCY- Human secreted 665 65
protein SECP4.


1107 AAU77137 Homo SapiensMILL- Human alpha/beta 665 65
hydrolase


38618 polypeptide.


1108 113620885Homo Sapiensmitochondrial ribosomal323 100
protein S6


1108 113620887Mus musculusmitochondrial ribosomal284 82
protein S6


1108 g119713140FusobacteriumFusobacterium outer 79 28
membrane protein


nucleatum family
subsp.


nucleatum


ATCC 25586


1109 g118378673Homo SapiensPATE 607 89


1109 g1530'5193Rattus sperm protein 10 108 30


norvegicus






1109 gi969103 Mus musculusmSP-10 107 27


1110 12462979 Bos taurus Tenascin-X 119 34


1110 g13413958Homo SapiensLDL rece for related 110 27
rotein 105


1110 g113938519Homo Sapienslow density lipoprotein110 27
receptor-related


protein 3


1111 g117981053Mus musculustranscri tion factor 82 32
NFATS


1111 g115425825Mus musculustonicity-responsive 82, 32
enhancer binding


rotein


1111 g16911148Mus musculustranscription factor 82 32
NFATS isoform b


1112 g16634473Metarhizium adenylate cyclase, ACY 73 . 30


anisopliae
var.


anisopliae


1113 AAU19759 Homo SapiensHUMA- Human novel extracellular900 70


matrix rotein, Seq ID
No 409.


1113 g13171934Mus musculusneuronal-STOP rotein 886 52


1113 g12769587Mus musculusSTOP protein 885 52


1114 g118652188Oenococcus OppF 72 41
oeni


1115 g19119 Drosophila fos-related anti en 69 37
s .


1115 g17769652Drosophila Fos-related antigen 69 37


melanogaster


1115 g117862946Drosophila SD04477p 69 37


melanogaster


1116 121212948Mus musculusperoxisomal rotein (PeP)243 83


1116 12347114 Mus musculusCC chemokine receptor-572 28


1116 12431976 Mus musculusCCRS 72 28


1117 gi|20825251|ref|XP_131998.1| Mus musculus similar to RE1-silencing transcription factor; neuron restrictive silencer factor; repressor binding to the X2 box 77 40


1117 gi|15597871|ref|NP_251365.1 Pseudomonas aeruginosa probable type II secretion system protein 69 41


1118 gi|3860513|emb|CAA13574.1| Mus famulus reverse transcriptase 303 82


1118 gi|3860536|emb|CAA13577.1| Mus saxicola reverse transcriptase 303 81


1118 gi|3860510|emb|CAA13573.1 Mus dunni reverse transcriptase 298 63


1119 AA004758 Homo SapiensHYSE- Human polypeptide234 59
SEQ ID


NO 18650.


1119 AAM69569 Homo sapiensMOLE- Human bone marrow220 63


expressed probe encoded
protein SEQ


ID NO: 29875.


1119 AAM67717 Homo SapiensMOLE- Human bone marrow219 49


expressed probe encoded
protein SEQ


ID NO: 28023.


1120 g121107877Xanthomonas cytochrome C 78 27


axonopodis
pv.


citri str.
306


1120 g115292331Drosophila LD47230p . 77 42


melanogaster


1120 115072444 Avian paramyxovirus 6 phosphoprotein 72 38


1121 AAB44126 Homo SapiensHUMA- Human cancer associated150 83


protein sequence SEQ
ID N0:1571.


1121 gi550015 Homo sapiensribosomal protein L21 150 83


1121 gi619788 Homo sa L21 ribosomal protein 150 83
iens


1122 AAU74448 Homo SapiensOULU- Human protein sequence125 100
of


lysyl hydroxylase 1 (LH
1 ).


1122 1190074 Homo sa lysyl hydroxylase 125 100
iens


1122 g15817297Homo Sapienslysyl hydroxylase 1 125 100


1123 g121281601CaenorhabditisC. elegans PQN-44 protein78 34


ele ans (corresponding sequence
F55A12.9c)


1123 g114578225CaenorhabditisC. elegans PQN-44 protein76 38


elegans (comes ondin se uence
F55A12.9b)


1123 g12088669CaenorhabditisC. elegans PQN-44 protein76 38


elegans comes ondin se uence
F55A12.9a)


1125 AAU17301 Homo SapiensHUMA- Novel signal transduction344 88


athway rotein, Se ID
866.


1125 AAE11776 Homo SapiensINCY- Human kinase (PKIN)-10344 88


protein.


1125 AAU17304 Homo SapiensHUMA- Novel signal transduction340 86


athway rotein, Se ID
869.


1126 AAM41712 Homo sapiensHYSE- Human polypeptide 152 96
SEQ ID


NO 6643.


1126 AAM39926 Homo SapiensHYSE- Human polypeptide 152 96
SEQ ID


NO 3071.


1126 AAM79067 Homo SapiensHYSE- Human protein SEQ 152 96
ID NO


1729.


1127 AAE02938 Homo SapiensMILL- Human adenylate 252 98
cyclase


25678.


1127 AAB02006 Homo sapiensTEXA Adenylyl cyclase 252 98
type II-C2 C2


al ha domain.


1127 g1202752 Rattus adenylyl cyclase type 252 98
II


norvegicus


1128 AAA94860_Homo SapiensTEXA Human caspase activator96 100
Smac


aal codin se uence.


1128 AAU78447 Homo SapiensUYJE- Inhibitor of apoptosis96 100
(IAP)


roteiii Smac.


1128 AAB26210 Homo sa TEXA Human cas ase activator96 100
iens Smac.


1129 g13874765CaenorhabditisSimilarity to Drosophila97 30
acetylcholine


elegans receptor protein


(SW:ACH1 DROME), contains


similarity to Pfam domain:
PF00065


(Neurotransmitter-gated
ion-channel),


Score=296.9, E-value=5e-86,
N=3


1129 g16681597Yaba monkeysimilar to vaccinia G8R 72 28


tumor virus


1129 gi|17548199|ref|NP_509932.1| Caenorhabditis elegans acetylcholine receptor 97 30


1130 gi|17564116|ref|NP_506484.1 Caenorhabditis elegans tyrosine-protein kinase 73 29


1131 113925613Homo sa insulinoma-associated 88 27
iens protein IA-6


1131 g1158485 Drosophila melanogaster son of sevenless protein 85 24


1131 gi728778205-Feb-1998symbol=Sos; 85 24


synonym=BG:DS00941.4;


match=method:"sim4",
score:"1000.0",


desc:"GenBank::M83931:Drosophila


melanogaster son of sevenless
(Sos)


mRNA, complete cds. CDS:346..5133;


PID:g158485.", species:"Drosophila


melanogaster' ;


match=method: "BLASTX",


version:"2.Oa19MP-WashU
[Build


so12.5-ultra 01:47:30


1132 gi9696 Mytilus of henolic adhesive protein75 25
edulis


1134 gi13562016Plectreurysfibroin 2 72 29
tristis


1134 gi1129074Bacillus beta-N-acetylglucosaminidase69 28
subtilis


1134 gi2636104Bacillus N-acetylglucosaminidase 69 28
subtilis (major


autolysin (CWBP90)


1135 AAB58870 Homo SapiensHUMA- Breast and ovarian72 80
cancer


associated antigen protein
sequence


SEQ ID 578.


1135 111595476Homo sa RPBllblbeta protein 72 80
iens


1135 AAB44840 Homo SapiensHUMA- Human secreted 69 45
protein


encoded by gene 11.


1137 g1206985 Rattus troponin I 70 46


norve icus


1137 g116945895Takifugu SUN-like 1 70 31


rubri es


1137 gi|8394466|ref|NP_058881.1 Rattus norvegicus troponin I, skeletal, fast 2 70 46


1140 AA004998 Homo SapiensHYSE- Human polypeptide 277 96
SEQ ID


NO 18890.


1140 g119917538MethanosarcinamttA/Hcf106 protein 80 28


acetivorans
str.


C2A]


[Methanosarcina


acetivorans
C2A


1140 14959705 Mus musculusfibulin-2 76 28


1141 g110141010Vesicular non-structural polyprotein91 31


exanthema
of


swiiia virus


1141 g16566147Drosophila large Forked protein 85 30


melanogaster


1141 g12317953murid glycoprotein 150 79 28


he esvirus
4


1142 AAB54067 Homo SapiensHUMA- Human pancreatic 218 56
cancer


antigen protein sequence
SEQ ID


N0:519.


1142 g11710365Mus musculusnoggin 89 29


1142 g121105761Equus caballusno gin 89 29


1143 gi|21295753|gb|EAA07898.1| Anopheles gambiae str. PEST agCP1560 69 26


1144 g1505094 Homo sapiens similar to an actin bundling protein, dematin. 127 35


1144 gi2337952Homo Sapiensactin-binding double-zinc-finger122 36


rotein


1144 gi21304227Oryza sativaovule development aintegumenta-like76 29


rotein BNM3


1145 gi|21298336|gb|EAA10481.1| Anopheles gambiae str. PEST agCP2121 68 37


1146 AAW22049 Homo SapiensINCY- Interferon gamma 221 100
inducing


factor-2 (IGIF-2) alternate
transcript


variant.


1146 AAV05368_Homo SapiensSCHE cDNA encoding human167 84


aal interleukin-1-gamma.


1146 AAH78060-Homo SapiensSTRD Nucleotide sequence167 84
of human


aal interleukin 18 (IL-18).


1147 AAY57937 Homo SapiensINCY- Human transmembrane123 100
protein


HTMPN-61.


1147 gi|20345904|ref|XP_109823.1 Mus musculus similar to delta-like homolog (Drosophila) 105 86


1148 gi19069293 Encephalitozoon cuniculi similarity to ADP/ATP CARRIER PROTEIN 75 32
1148 gi8978336 Arabidopsis thaliana contains similarity to CHP-rich zinc finger protein~gene_id:K23F3.4 74 26
1148 gi19716318 Aspergillus flavus antigenic cell wall protein MP1 74 32
1149 gi5456699 Emericella nidulans ATP-binding cassette multidrug transport protein ATRC 70 35


1149 gi~20898840~Mus musculussimilar to HSPC038 protein69 0 31


re~XP_1393


87.1 ~


1150 gi3883128 Arabidopsis thaliana arabinogalactan-protein 96 32
1150 gi17429208 Ralstonia solanacearum CONSERVED HYPOTHETICAL PROTEIN 92 26
1150 gi4063766 Emericella nidulans chitinase 91 27
1151 gi13561058 Homo sapiens dJ1108D11.1 (novel protein similar to C. elegans T22C1.7) 107 31
1151 gi21105299 Mytilus galloprovincialis precollagen-NG 105 26
1151 gi14164347 Oncorhynchus mykiss collagen a1(I) 96 28


1152 gi18479434 Mus musculus olfactory receptor MOR188-1 76 33
1152 gi2653915 Oran virus glycoprotein G1 and G2 precursor; envelope glycoprotein precursor 72 46
1152 gi18479436 Mus musculus olfactory receptor MOR188-2 72 33
1153 gi3403167 Homo sapiens GBAS 161 86
1153 gi12804791 Homo sapiens glioblastoma amplified sequence 161 86
1153 AAB57149 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1727. 134 81


1154 g117742234Agrobacteriumhistidase 87 35


tumefaciens
str.


C58 (U.




Washington)


1154 gi15159496 Agrobacterium tumefaciens str. C58 (Cereon) AGR_L_1400GMp 87 35
1154 gi158521 Drosophila melanogaster seven-up protein type 2 80 32
1155 gi|10441551|gb|AAG17099.1|AF189115_1 Cryptotermes domesticus cytochrome b 65 28
1156 AA012089 Homo sapiens HYSE- Human polypeptide SEQ ID NO 25981. 475 98


1156 gi20147787 Xenopus laevis nuclear receptor corepressor 74 25
1156 gi19881705 Oryza sativa Putative transposable element 72 32
1157 gi9963851 Homo sapiens HT019 80 34
1157 AAB93530 Homo sapiens HELI- Human protein sequence SEQ ID NO:12884. 77 34
1157 gi1040970 Homo sapiens fus-like protein 77 42
1158 gi9795254 Sepia officinalis GABA-A receptor beta subunit 71 27
1158 gi15026157 Clostridium acetobutylicum amidase, germination specific (cwlC/cwlD B.subtilis ortholog) 68 34
1158 gi|9795254|gb|AAF97816.1 Sepia officinalis GABA-A receptor beta subunit 71 27


1159 AAB93423 Homo sapiens HELI- Human protein sequence SEQ ID NO:12641. 336 100
1159 gi13097768 Homo sapiens Similar to RIKEN cDNA 2900073H19 gene 336 100
1159 gi20071708 Mus musculus RIKEN cDNA 2900073H19 gene 334 96
1160 AAM72558 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 32864. 274 100
1160 AAM59959 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32064. 274 100
1161 AAB07704 Homo sapiens INMR Protein encoded by the endogenetic fragment of HERV-W. 139 36


1161 g18272464Homo sa iensag 139 36


1161 gi|5726238|gb|AAD48375.1|AF123881_1 multiple sclerosis associated retrovirus element gag polyprotein 131 35
1162 AAU25448 Homo sapiens INCY- Human mddt protein from clone LG:1083264.1:2000MAY19. 346 79
1162 AAU11265 Homo sapiens BODE- Human zinc finger protein 51. 319 65
1162 AAB95637 Homo sapiens HELI- Human protein sequence SEQ ID NO:18371. 314 67
1163 gi14189950 Homo sapiens connexin 58 536 84
1163 gi9957542 Homo sapiens connexin 59 536 84
1163 gi10946367 Danio rerio connexin 55.5 485 81
1164 gi755700 Bombyx mori sericin1B 76 27
1164 gi19569861 Dictyostelium discoideum RTOA protein (Ratio-A). 76 28




1164 gi10580635 Halobacterium sp. NRC-1 Vng1087c 76 25


1165 gi19915386 Methanosarcina acetivorans str. C2A WD-domain containing protein [Methanosarcina acetivorans C2A] 89 28


1165 gi5639663 Homo sapiens WD repeat protein WDR3 83 28
1165 gi11544739 Homo sapiens dJ776P7.2 (WD repeat domain 3 83 28
1166 AAM69338 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29644. 72 31
1166 AAM56953 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29058. 72 31
1166 gi20197507 Arabidopsis thaliana expressed protein 67 39


1167 g15802812Homo SapiensGa rotein 83 30


1167 gi7160650 Bordetella bronchiseptica pertactin (P.68) 79 31
1167 gi13173444 Bordetella bronchiseptica pertactin 79 31
1168 gi1495029 Danio rerio protein kinase CK2 alpha' 84 24
1168 gi643443 Penicillium chrysogenum PHOG 82 32
1168 gi|18858419|ref|NP_571315.1 Danio rerio casein kinase 2 alpha 2 84 24
1169 gi206716 Rattus norvegicus salivary proline-rich protein 90 31
1169 gi15029903 Mus musculus Similar to proline-rich protein BstNI subfamily 2 89 36
1169 gi53182 Mus musculus proline rich protein 81 34


1170 gi|17553370|ref|NP_498318.1 Caenorhabditis elegans F40H6.5.p 78 33
1170 gi|15215731|gb|AAK91411.1 Arabidopsis thaliana AT4g36780/C7A10 580 73 30
1171 gi340446 Homo sapiens zinc finger protein 7 (ZFP7) 218 61
1171 AAB43928 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:1373. 216 58
1171 AAB21040 Homo sapiens INCY- Human nucleic acid-binding protein, NuABP-44. 213 48
1172 AAE04368 Homo sapiens INCY- Human kinase (PKIN)-9. 120 85
1172 AAM79153 Homo sapiens HYSE- Human protein SEQ ID NO 1815. 120 85
1172 AAE10614 Homo sapiens CURA- Human novel STE20-like protein, NOV-3d. 120 85
1173 gi218572 Pan troglodytes prot GOR 74 29
1173 gi243898 Pan GOR 74 29
1173 gi1666473 Mus musculus NOV protein 71 50
1174 gi5901830 Drosophila melanogaster BcDNA.GH07910 74 31




1174 AAM80237 Homo sapiens HYSE- Human protein SEQ ID NO 3883. 71 38
1174 ABB11528 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:1898. 71 38
1175 gi|12054759|emb|CAC20748.1 Podospora anserina catalase A 65 33
1176 AAM93289 Homo sapiens HELI- Human polypeptide, SEQ ID NO: 2777. 145 100
1176 gi17431512 Ralstonia solanacearum PUTATIVE OUTER MEMBRANE CHANNEL LIPOPROTEIN TRANSMEMBRANE 71 26
1176 gi15823991 Streptomyces avermitilis modular polyketide synthase 70 51
1177 AAM41939 Homo sapiens HYSE- Human polypeptide SEQ ID NO 6870. 84 61


1177 gi870751 Homo sapiens N-acetylgalactosamine 6-sulfate sulfatase (GALNS) 84 61
1177 gi618426 Homo sapiens N-acetylgalactosamine 6-sulphatase 84 61
1178 gi435855 Mus sp. CREB-binding protein; CBP 89 22
1178 AAW40058 Homo sapiens USSH Cellular transcriptional factor CBP. 87 22
1178 gi17944308 Drosophila melanogaster RE12101p 86 26
1179 AAM25814 Homo sapiens HYSE- Human protein sequence SEQ ID NO:1329. 73 93
1179 AAM25290 Homo sapiens HYSE- Human protein sequence SEQ ID NO:805. 73 93
1179 AAM79441 Homo sapiens HYSE- Human protein SEQ ID NO 3087. 73 93
1180 AAB88388 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0131. 719 97
1180 gi20810493 Homo sapiens Similar to RIKEN cDNA 2810417M05 gene 716 96
1180 AAD30543_aa1 Homo sapiens MILL- Human B7RP-2 DNA. 83 38
1181 ABB14686 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3343. 190 97
1181 gi14329731 Secale cereale high molecular weight glutenin subunit x 88 27
1181 gi14329761 Triticum aestivum high molecular weight glutenin subunit x 84 26


1182 gi11692645 Mus musculus aspartyl beta-hydroxylase 74 28
1182 gi11878112 Mus musculus aspartyl beta-hydroxylase 6.6 kb transcript 74 28
1182 gi11878110 Mus musculus aspartyl beta-hydroxylase 4.5 kb transcript 74 28
1183 gi15485622 Homo sapiens Q9H4T4 like 80 25
1183 gi19714949 Fusobacterium nucleatum subsp. nucleatum ATCC 25586 TonB protein 78 32
1183 gi7717375 Homo sapiens human CHD2-52 down syndrome cell adhesion molecule 71 23




1184 AAU83667 Homo sapiens GETH Human PRO protein, Seq ID No 152. 388 100
1184 AAG89161 Homo sapiens GEST Human secreted protein, SEQ ID NO: 281. 388 100
1184 AAY99348 Homo sapiens GETH Human PRO1194 (UNQ607) amino acid sequence SEQ ID NO:29. 388 100
1185 AAB93506 Homo sapiens HELI- Human protein sequence SEQ ID NO:12830. 543 100
1185 AAB87570 Homo sapiens GETH Human PRO1268. 426 95
1185 AAY78808 Homo sapiens PROT- Hydrophobic domain containing protein clone HP10537 protein sequence. 426 95
1187 gi15823978 Streptomyces avermitilis modular polyketide synthase 75 41
1187 AAB66657 Homo sapiens HSCR- Human elastin protein without signal peptide. 71 39
1187 AAY69137 Homo sapiens UNSY Amino acid sequence of a human tropoelastin derivative. 71 39


1188 gi6907090 Oryza sativa (japonica cultivar-group) Similar to Oryza sativa root-specific RCc3 mRNA. (L27208) 76 30
1188 AAY36063 Homo sapiens GEST Extended human secreted protein sequence, SEQ ID NO. 448. 74 26
1188 AAY35971 Homo sapiens GEST Extended human secreted protein sequence, SEQ ID NO. 220. 73 26
1189 gi9827989 Leishmania major possible CG12797 protein 72 36
1189 gi|13625467|gb|AAK35068.1 Leishmania donovani LACK protective antigen 68 27
1190 gi17027071 Xiphocentron sp. UMSP000029372-Costa Rica elongation factor-1 alpha 107 27
1190 gi310665 Strongylocentrotus purpuratus Nf Y-A subunit 88 24
1190 gi21743 Triticum aestivum high molecular weight glutenin subunit 1Ax1 86 23


1191 gi16878287 Homo sapiens Similar to C-terminal modulator protein 167 96
1191 gi15866714 Homo sapiens C-terminal modulator protein 167 96
1191 AA006984 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20876. 132 83
1192 AAD05496_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 5 cDNA clone HHBCS39, SEQ ID NO:15. 859 100
1192 AAE01707 Homo sapiens HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:119. 859 100
1192 AAE01676 Homo sapiens HUMA- Human gene 5 encoded secreted protein HHBCS39, SEQ ID NO:88. 859 100
1193 gi18650588 Homo sapiens retinoic acid early transcript 1 1312 99
1193 AAB15540 Homo sapiens INCY- Human immune system molecule from Incyte clone 3402252. 1283 97


1193 ABB84887 Homo SapiensGETH Human PR0791 protein1234 94




se uence SEQ ID N0:142.


1195 11196427 Homo sa a 2 protein 248 50
iens


1195 gi1780975 Human endogenous retrovirus K gag protein 248 50
1195 gi1556397 Human endogenous retrovirus K gag 248 50
1196 gi556256 Leishmania donovani G protein alpha subunit 72 22
1197 AAY07237 Homo sapiens ISTF Wild type monocyte chemotactic protein 2. 121 100
1197 AAY05300 Homo sapiens ISTF C-C chemokine, MCP2. 121 100


1197 AAW42072 Homo sa INCY- Human MC roprotein.121 100
iens


1198 ABB57423 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 69. 187 79
1198 ABB57394 Homo sapiens HUMA- Human secreted protein encoding polypeptide SEQ ID NO 40. 187 79
1198 AAY59757 Homo sapiens META- Human normal ovarian tissue derived protein 34. 187 79
1199 AAY72603 Homo sapiens INCY- Human Electron Transfer Protein, ETRN-1. 155 100
1199 AAB88465 Homo sapiens HELI- Human membrane or secretory protein clone PSEC0259. 155 100
1199 AAE03926 Homo sapiens HUMA- Human gene 29 encoded secreted protein HTADC63, SEQ ID NO:89. 155 100


1200 gi6458884 Deinococcus radiodurans chorismate mutase/prephenate dehydratase 73 42
1201 gi20803920 Mesorhizobium loti HYPOTHETICAL PROTEIN 68 32
1201 gi|17545158|ref|NP_518560.1 Ralstonia solanacearum PUTATIVE LIPASE/ESTERASE PROTEIN 66 31
1202 AAM67586 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27892. 69 30
1202 AAM55191 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 27296. 69 30
1202 gi849219 Saccharomyces cerevisiae Pro1p: Glutamate 5-kinase (Swiss Prot. accession number P32264) 69 33
1203 gi18676554 Homo sapiens FLJ00174 protein 269 84
1203 gi|20913341|ref|XP_126763.1 Mus musculus similar to FLJ00174 protein 125 81
1203 gi|20850247|ref|XP_136664.1 Mus musculus similar to proline-rich protein 121 33
1204 AAM68056 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28362. 140 84


1204 AAM55676 Homo SapiensMOLE- Hurnan brain expressed140 84
single


exon probe encoded rotein
SEQ ID




NO: 27781.


1205 gi541624 Drosophila virilis pdm2 71 39
1205 gi9955855 Aspergillus oryzae RNA polymerase II largest subunit 69 38
1205 gi662296 Rattus norvegicus MIBP1 68 32
1206 ABB50703 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:651. 260 94
1206 AAW88802 Homo sapiens HUMA- Polypeptide fragment encoded by gene 52. 260 94
1206 ABB50706 Homo sapiens HUMA- Human secreted protein encoded by gene 52 SEQ ID NO:654. 143 96
1207 AAM79588 Homo sapiens HYSE- Human protein SEQ ID NO 3234. 72 41
1207 AAM78604 Homo sapiens HYSE- Human protein SEQ ID NO 1266. 72 41
1207 AAB58944 Homo sapiens HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 652. 72 41
1208 AAE03429 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 112. 575 64
1208 gi19110438 Homo sapiens polycystin-1L1 575 64
1208 AAE03463 Homo sapiens HUMA- Human gene 3 encoded secreted protein HETDB76, SEQ ID NO: 146. 185 97


1209 16760015 Homo sa brain rotein 1114 85
iens


1209 gi1747306 Mus musculus SDR2 151 31
1209 gi20381292 Mus musculus stromal cell derived factor receptor 2 151 31
1211 gi14043211 Homo sapiens Similar to RIKEN cDNA 4931428F04 gene 460 89
1211 gi190508 Homo sapiens salivary proline-rich protein precursor 113 28
1211 gi12862320 Homo sapiens WDC146 102 28
1212 AAO14407 Homo sapiens FARB Human 11 beta-hydroxysteroid dehydrogenase 1-like enzyme. 291 63
1212 AAM79592 Homo sapiens HYSE- Human protein SEQ ID NO 3238. 217 45
1212 gi4581319 Homo sapiens dJ28O10.3 (HSD11B1 (hydroxysteroid (11-beta) dehydrogenase 1) 217 45
1213 AAR06514 Homo sapiens STRI Natural human Platelet Factor-4var1 encoded by EcoRI fragment. 238 64
1213 gi292390 Homo sapiens platelet factor 4 238 64
1213 AAZ28361_aa1 Homo sapiens SMIK Platelet factor-4 (PF-4) nucleotide sequence. 200 56
1214 AAD12580_aa1 Homo sapiens SAGA Human protein having hydrophobic domain encoding cDNA clone HP10753. 162 82
1214 AAD08193_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 3 cDNA clone HNTAC64, SEQ ID NO:13. 162 82
1214 AAD05544_aa1 Homo sapiens HUMA- Human secreted protein-encoding gene 12 cDNA clone HNTAC64, SEQ ID NO:63. 162 82




1215 gi21429094 Drosophila melanogaster LD38004p 354 49
1215 gi15292155 Drosophila melanogaster LD40717p 354 49
1215 AAG75596 Homo sapiens HUMA- Human colon cancer antigen protein SEQ ID NO:6360. 294 50


1216 gi7248894Xeno us laevisAr rotein-tyrosine kinase84 35


1216 gi402191 Mus musculus HNF-3beta 80 26
1216 gi404764 Mus musculus fork head related protein 80 26
1218 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 559 74
1218 AAO03505 Homo sapiens HYSE- Human polypeptide SEQ ID NO 17397. 502 81
1218 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 467 66
1220 AA001188 Homo sapiens HYSE- Human polypeptide SEQ ID NO 15080. 248 86


1220 AAY73334 Homo sapiens1NCY- HT1ZM clone 180506179 35
protein


se uence.


1220 gi20249 Oryza sativa gt-2 77 32
1221 gi4519619 Haliotis discus collagen pro alpha-chain 90 28
1221 gi7380690 Neisseria meningitidis Z2491 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37
1221 gi7225645 Neisseria meningitidis MC58 UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase 90 37
1222 ABA05334_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 coding sequence. 2154 99
1222 AAM47905 Homo sapiens MILL- Human fucosyltransferase family member 32132. 2154 99
1222 ABA05333_aa1 Homo sapiens MILL- Human fucosyltransferase family member 32132 encoding cDNA. 2154 99
1223 AAY21852 Homo sapiens INCY- Human signal peptide-containing protein (SIGP) (clone ID 2652271). 150 100
1223 AAY48563 Homo sapiens META- Human breast tumour-associated protein 24. 150 100
1223 AAW75103 Homo sapiens HUMA- Human secreted protein encoded by gene 47 clone HMCBP63. 150 100
1224 AAM67078 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27384. 517 99
1224 AAM54676 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26781. 517 99
1224 gi17467358 Sus scrofa MIF2 suppressor 184 80
1225 gi9454237 Cochliobolus sativus DNA binding protein MAT-1 73 30
1225 gi21428792 Drosophila melanogaster GH03582p 72 38




1225 gi6633838 Arabidopsis thaliana F2K11.15 70 31
1226 gi21430124 Drosophila melanogaster HL01222p 76 28
1226 AAM77437 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37743. 72 33
1226 AAM64659 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36764. 72 33
1227 AAM50715 Homo sapiens MILL- Human TRP-like calcium channel-5 (TLCC-5). 243 83
1227 gi|20874183|ref|XP_131003.1 Mus musculus similar to hornerin 80 29
1227 gi|17864717|gb|AAK15791.1 Mus musculus hornerin 80 29
1229 gi4019247 Ateline herpesvirus 3 thymidine kinase 71 46
1229 gi2760368 Drosophila melanogaster Shar pei/DRhoGEF2 70 26
1229 gi17862944 Drosophila melanogaster SD04476p 70 26


1230 gi4559296 Mus musculus silencing mediator of retinoic acid and thyroid hormone receptor extended isoform 80 30
1230 gi18181872 Mus musculus GATA-2 protein 78 41
1230 gi18033511 Rattus norvegicus transcription factor GATA-2 78 41
1231 gi13365501 Cyprinus carpio integrin beta2-chain 75 27
1231 gi3322933 Treponema pallidum DNA ligase (lig) 73 32
1231 gi|13365501|dbj|BAB39130.1 Cyprinus carpio integrin beta2-chain 75 27
1232 AAM79791 Homo sapiens HYSE- Human protein SEQ ID NO 3437. 78 35
1232 AAM78807 Homo sapiens HYSE- Human protein SEQ ID NO 1469. 78 35
1232 AAB19338 Homo sapiens INCY- Amino acid sequence of a human fibrous protein (FIBR). 78 35
1233 AAU21459 Homo sapiens HUMA- Human novel foetal antigen, SEQ ID NO 1703. 87 26
1233 gi15081227 Arabidopsis thaliana glycine-rich protein GRP20 75 37
1233 gi2645433 Homo sapiens CHD3 74 30
1234 AAU83676 Homo sapiens GETH Human PRO protein, Seq ID No 170. 178 97
1234 ABB84911 Homo sapiens GETH Human PRO1244 protein sequence SEQ ID NO:190. 178 97


1234 AAB62403 Homo sapiensCURA- Human MBSP7 polypeptide178 97


(clone 3499605Ø64 .


1235 ABB 10348Homo SapiensHUMA- Human cDNA SEQ 409 61
ID NO:




656.


1235 AAU18012 Homo sapiens HUMA- Human immunoglobulin polypeptide SEQ ID No 157. 178 83
1235 ABB89226 Homo sapiens HUMA- Human polypeptide SEQ ID NO 1602. 78 82
1236 gi10566951 Rattus norvegicus s-gicerin/MUC18 85 45
1236 gi10566949 Rattus norvegicus l-gicerin/MUC18 85 45
1236 AAB90798 Homo sapiens NOJI/ Human shear stress-response protein SEQ ID NO: 96. 84 42
1238 gi21464300 Drosophila melanogaster GH20068p 95 36
1238 gi3868879 Xenopus laevis Zic-related-2 88 35
1238 gi1841756 Mus musculus GATA-5 cardiac transcription factor 87 52
1239 gi17946266 Drosophila melanogaster RE61793p 96 40
1239 gi15636898 Gallus gallus formin binding protein 11-related protein 91 27
1239 gi780454 African swine fever virus pB407L 88 30
1240 AAE05302 Homo sapiens MILL- Human TANGO 457 protein. 1331 100
1240 AAE05303 Homo sapiens MILL- Human mature TANGO 457 protein. 1207 100
1240 AAE05305 Homo sapiens MILL- Human TANGO 457 protein cytoplasmic domain. 1201 100


1241 gi5640111 Lycopersicon esculentum RAD23 protein 84 25
1241 gi17131739 Nostoc sp. PCC 7120 polyketide synthase type I 76 33
1241 gi|5640111|emb|CAB51544.1 Lycopersicon esculentum RAD23 protein 84 25
1242 AAG03496 Homo sapiens GEST Human secreted protein, SEQ ID NO: 7577. 67 39
1242 gi|13876270|gb|AAK26055.1 Mus musculus protocadherin alpha 8 66 35
1243 AAE16665 Homo sapiens MILL- Human calcium channel family member, 21784 protein. 196 87
1243 AAB62248 Homo sapiens WARN Human calcium channel alpha2delta subunit. 196 87
1243 AAY92320 Homo sapiens WARN Human alpha-2-delta-C calcium channel subunit polypeptide. 196 87
1244 gi|4102990|gb|AAD01637.1 Aspergillus nidulans DNA polymerase epsilon homolog 70 30
1245 gi5917666 Zea mays extensin-like protein 94 26
1245 gi19481644 shrimp white spot syndrome virus WSSV052 89 36
1245 gi17016928 shrimp white spot syndrome virus wsv001 89 36




1246 AA012623 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26515. 169 69
1246 AA012822 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26714. 153 75
1246 AAO02255 Homo sapiens HYSE- Human polypeptide SEQ ID NO 16147. 123 65
1247 gi1653353 Synechocystis sp. PCC 6803 nodulation protein 75 28
1247 gi4468626 Mus musculus TEF-5 74 26
1247 gi17430764 Ralstonia solanacearum SKWP PROTEIN 5 74 23
1248 gi15139973 Sinorhizobium meliloti CONSERVED HYPOTHETICAL PROTEIN 77 47
1249 gi7191078 Leishmania major L712.2 99 29
1249 gi17384256 Homo sapiens mucin 5 85 31
1249 gi5821153 Homo sapiens RNA binding protein 83 33
1250 AAY36495 Homo sapiens HUMA- Fragment of human secreted protein encoded by gene 27. 124 86
1250 AA012122 Homo sapiens HYSE- Human polypeptide SEQ ID NO 26014. 123 91
1250 AAB95063 Homo sapiens HELI- Human protein sequence SEQ ID NO:16901. 121 90
1252 gi|15839838|ref|NP_334875.1 Mycobacterium tuberculosis CDC1551 membrane protein, MmpL family 68 27


1254 AAG00399 Homo sapiens GEST Human secreted protein, SEQ ID NO: 4480. 328 100
1254 gi21428466 Drosophila melanogaster LD22609p 85 24


1254 gi19914274 Methanosarcina acetivorans str. C2A sensory transduction histidine kinase [Methanosarcina 85 26


1256 gi14161094 Choloepus didactylus von Willebrand Factor 80 24
1256 gi14161092 Cyclopes didactylus von Willebrand Factor 78 23
1256 gi13872552 Acomys cahirinus von Willebrand Factor 77 23
1258 gi7008025 Callithrix jacchus prochymosin 715 64
1258 gi11990126 Camelus dromedarius chymosin 634 57
1258 gi491952 synthetic construct preprochymosin 618 56


1259 gi~21402709~Bacillus AMP-binding, AMP-binding72 34
enzyme


ref~NP_6586anthracis [Bacillus anthracis
A2012


94.1


1260 gi|4505431|ref|NP_002510.1 Homo sapiens nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33
1260 gi|15309894|ref|XP_040846.2 Homo sapiens similar to nuclear protein, ataxia-telangiectasia locus; NPAT gene; E14 gene 64 33




1260 gi|1304114|dbj|BAA11861.1 Homo sapiens NPAT 64 33
1261 gi4519535 Homo sapiens Leukotriene B4 omega-hydroxylase 133 49
1261 gi1857022 Homo sapiens leukotriene B4 omega-hydroxylase 133 49
1261 gi18266446 Homo sapiens cytochrome P450, subfamily IVF, polypeptide 2 133 49
1262 gi13363530 Escherichia coli O157:H7 cell division protein HflB/FtsH protease 79 26
1262 gi746401 Escherichia coli ATP-binding protein 79 26
1262 gi146028 Escherichia coli ftsH 79 26
1263 AAW67859 Homo sapiens HUMA- Human secreted protein encoded by gene 53 clone HBMCL41. 283 100
1264 gi11066248 Helix lucorum presenilin 85 21
1264 gi|19115422|ref|NP_594510.1 Schizosaccharomyces pombe ribonuclease II RNB family protein; dis3-like 69 30
1264 gi|14720912|ref|XP_038204.1 Homo sapiens similar to Matrin 3 69 32


1265 gi5757703 Mus musculus syntrophin-associated serine-threonine protein kinase 82 38
1265 gi4996035 Human herpesvirus 6 69.8% identical to U47 gene of strain U1102 of HHV-6 76 42
1265 gi330951 Gallid herpesvirus 1 ICP4 76 36
1266 gi|17511177|ref|NP_493324.1| Caenorhabditis elegans ZK1053.3.p 75 40
1266 gi|17538077|ref|NP_495159.1 Caenorhabditis elegans ZK1248.2.p 69 34
1267 gi915540 Ovis aries pregnancy-specific antigen 85 25
1267 gi6179989 Capra hircus pregnancy-associated glycoprotein-2 84 25
1267 gi9798658 Rhinolophus ferrumequinum pepsinogen A 80 23
1268 gi|15789526|ref|NP_279350.1 Halobacterium sp. NRC-1 serine proteinase; HtrA 69 30
1269 gi9988674 Influenza A virus (A/Swine/Wisconsin/14094/99(H3N2)) hemagglutinin protein 70 24
1269 gi6552676 Influenza A virus (A/Bangkok/1/97(H3N2)) hemagglutinin 70 25
1269 gi6552638 Influenza A virus (A/Trinidad/51/96(H3N2)) hemagglutinin 70 24
1270 gi3378527 Zea mays anther specific protein 87 41
1270 AAW15787 Homo sapiens PENN- Human metastasis suppressor KISS-1. 85 28
1270 gi21410770 Homo sapiens Similar to RIKEN cDNA 1500005K14 gene 84 46




1271 gi1335527 Human poliovirus 1 reading frame VP3 75 38
1271 gi61253 Human poliovirus 1 polyprotein 75 38
1271 gi|17453412|ref|XP_063132.1 Homo sapiens similar to 60S ribosomal protein L7A (Surfeit locus protein 3) 76 40
1272 AAU87081 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-11. 69 43
1272 AAU87077 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3d. 69 43
1272 AAU87076 Homo sapiens BRIM Sialic acid-binding Ig-related lectin, Siglec-BMS-L3c. 69 43
1273 AAA09121_aa1 Homo sapiens CURA- Clone 2355875 cDNA (update), encodes syncollin homologue. 720 100
1273 AAY92233 Homo sapiens CURA- Clone 2355875f - syncollin homologue. 720 100
1273 AAB54267 Homo sapiens HUMA- Human pancreatic cancer antigen protein sequence SEQ ID NO:719. 715 100


1274 gi15559064 Mus musculus SNAG1 198 59
1274 AAU17435 Homo sapiens HUMA- Novel signal transduction pathway protein, Seq ID 1000. 131 62
1274 AAW99023 Homo sapiens MOUN 1762 peptide sequence. 131 62
1275 gi|6753732|ref|NP_034243.1| Mus musculus epidermal growth factor 65 30
1275 gi|50801|emb|CAA24115.1 Mus musculus polyprotein 65 30
1275 gi|20341089|ref|XP_109385.1 Mus musculus epidermal growth factor 65 30
1276 AAM39205 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2350. 447 78
1276 AAM40991 Homo sapiens HYSE- Human polypeptide SEQ ID NO 5922. 424 74
1276 AA007159 Homo sapiens HYSE- Human polypeptide SEQ ID NO 21051. 401 75
1277 gi13905120 Mus musculus RIKEN cDNA 0610013I17 gene 134 35
1277 gi13936283 Mus musculus TRH3 134 35
1277 AAB92625 Homo sapiens HELI- Human protein sequence SEQ ID NO:10921. 127 35
1279 AAM66940 Homo sapiens MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27246. 362 85
1279 AAM54534 Homo sapiens MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26639. 362 85
1279 gi|208153|gb|AAA73184.1| synthetic construct crystal toxin 79 40
1280 AAE05187 Homo sapiens INCY- Human drug metabolising enzyme (DME-18) protein. 484 100




1280 AAU12266 Homo sapiens GETH Human PRO5780 polypeptide sequence. 484 100
1280 AAY91631 Homo sapiens HUMA- Human secreted protein sequence encoded by gene 24 SEQ ID NO:304. 484 100
1281 AAH46856_aa1 Homo sapiens HUMA- Human serine/threonine phosphatase encoding cDNA (clone ID HLD0020. 238 100
1281 AAG77801 Homo sapiens HUMA- Human HLD0020 serine/threonine phosphatase protein sequence. 238 100
1281 AAB85476 Homo sapiens HUMA- Human serine/threonine phosphatase (clone ID HLD0020). 238 100
1282 gi|14762786|ref|XP_047871.1 Homo sapiens GS2 gene 70 30
1283 gi3860165 Arabidopsis thaliana disease resistance protein RPP1-WsB 69 38
1283 AA009033 Homo sapiens HYSE- Human polypeptide SEQ ID NO 22925. 68 38
1283 gi6967115 Arabidopsis thaliana disease resistance protein homlog 68 38
1285 gi1055252 Rattus norvegicus pheromone receptor VN5 78 32
1285 gi2746733 Drosophila virilis circadian clock protein 73 26
1285 gi2641617 Drosophila virilis TIM 73 26


1286 gi6013135 Rattus norvegicus coxsackie-adenovirus-receptor homolog 86 67
1286 AAV50429_aa1 Homo sapiens UYNY Human coxsackievirus and Ad2 and Ad5 receptor (HCAR) cDNA. 83 75
1286 AAV28845_aa1 Homo sapiens DAND Human coxsackievirus and adenovirus receptor encoding DNA. 83 75
1287 AAU83224 Homo sapiens ZYMO Novel secreted protein Z930757G12P. 642 100
1287 AAY70692 Homo sapiens DAND Human soluble attractin-2. 84 54
1287 AAY70691 Homo sapiens DAND Human membrane attractin-2. 84 54
1288 AAW70326 Homo sapiens GEMY Secreted protein DU123 1. 1655 99
1288 ABB12473 Homo sapiens HYSE- Human bone marrow expressed protein SEQ ID NO: 312. 547 72
1288 gi5689736 Homo sapiens Myopodin protein 475 100
1289 gi4103543 Tomato chlorosis virus heat shock protein 70 73 29
1289 gi12247413 Cristatella mucedo cytochrome b 72 30
1289 gi|4103543|gb|AAD01790.1| Tomato chlorosis virus heat shock protein 70 73 29
1291 AAB94128 Homo sapiens HELI- Human protein sequence SEQ ID NO:14383. 520 98
1291 AAY85576 Homo sapiens JANC Hs-UNC-53/1 fragment/GFP fusion insert of plasmid pGI3150. 520 98


1291 AAY85564 Homo Sapiens~ JANC Human homologue ~ 520 ~ 98
of UNC-53




(Hs-UNC-53/1) se uence.


1292 AAY01413 Homo sapiens HUMA- Secreted protein encoded by gene 31 clone HHBAG64. 207 97
1292 AAY05324 Homo sapiens GEMY Human secreted protein 1j167 5. 207 97
1292 gi15157864 Agrobacterium tumefaciens str. C58 (Cereon) AGR_C_4816p 71 34
1294 AAB12146 Homo sapiens PROT- Hydrophobic domain protein from clone HP10672 isolated from Thymus cells. 219 100
1295 gi|17228767|ref|NP_485315.1 Nostoc sp. PCC 7120 probable glycogen phosphorylase 78 34
1295 gi|10835203|ref|NP_001127.1| Homo sapiens advanced glycosylation end product-specific receptor 65 58
1295 gi|190846|gb|AAA03574.1| Homo sapiens receptor for advanced glycosylation end products 65 58


1296 g117511816Homo SapiensSimilar to RIKEN cDNA 1268 99
1110032022


ene


1296 AAB88440 Homo sapiensHELI- Human membrane 688 100
or secretory


rotein clone PSEC0222.


1296 g17211438Homo sa golgin-67 94 30
iens


1298 g118314436Homo SapiensSimilar to RIKEN cDNA 481 79
4921511004


gene


1298 11872546 Mus musculusNIK 86 25


1298 g15533305Homo Sapienssomatostatin receptor 85 29
interacting


rotein s lice variant
a


1299 11334643 Xeno us APEG recursor roteiii 105 27
laevis


1299 g117428053Ralstonia PROBABLE RIBONUCLEASE 100 32
E


solanacearum(RNASE E) PROTEIN


1299 g16690017HerpesvirusNTR 96 25


apio


1300 AAB87346 Homo SapiensHUMA- Human gene 5 encoded586 74


secreted protein HDPIE85,
SEQ ID


N0:87.


1300 AAB44298 Homo SapiensGETH Human PR0706 (UNQ370)586 74


rotein sequence SEQ ID
N0:385.


1300 AAY41742 Homo SapiensGETH Human PR0706 protein586 74


sequence.


1301 g1218572 Pan troglodytesprot GOR 1344 62


1301 1243898 Pan GOR 1040 68


1301 g117862570Drosophila LD38414p 486 45


melano aster


1302 g113276598Homo sapiensdJ614O4.7 (Novel rotein)260 28


1302 g113397804Homo SapiensdJ616B8.3 (novel gene) 230 30


1302 AAB56641 Homo sapiens ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1219. 226 30


1303 gi603989 Drosophila melanogaster salivary gland glue protein 149 23


1303 gi13324584 Borrelia burgdorferi LMP1 129 17






1303 gi161956 Trypanosoma cruzi surface antigen 128 13


1304 g113569248Human gag protein 81 34


immunodeficienc


y virus
a 1


1304 g14324832Human gag-pol polyprotein 80 29


immunodeficienc


y virus
a 1


1304 gi11691875 Mus musculus ADP-ribosylation factor 1 GTPase activating protein 79 22


1305 AA006469 Homo sapiens HYSE- Human polypeptide SEQ ID NO 20361. 191 100


1305 gi3608368 Xenopus laevis origin recognition complex associated protein p81 69 30


1305 ABB15196 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 3853. 68 36


1306 AAE03657 Homo sapiens INCY- Human extracellular matrix and cell adhesion molecule-21 (XMAD-21). 109 27


1306 ABB11890 Homo sapiens HYSE- Human protocadherin Flamingo 1 homologue, SEQ ID NO:2260. 109 27


1306 13449298 Homo SapiensMEGF2 109 27


1308 g19294050Arabidopsisprotein kinase-like protein84 32


thaliana


1308 g115983765ArabidopsisAT3g24550/MOB24 8 84 32


thaliana


1308 g113877617Arabidopsisprotein kinase-like protein84 32


thaliana


1309 AAU00375 Homo sapiens BERN/ Human stem cell growth factor receptor. 127 54


1309 AAE07145 Homo sapiens SALK Human Kit/stem cell factor receptor kinase insert region. 127 54


1309 13236223 Equus caballus tyrosine kinase receptor homolog 127 50


1310 g121449343Actinosynnemapolyketide synthase 77 46


pretiosum
subsp.


auranticum


1310 g121114513Xanthomonastranscriptional regulator75 36


campestris
pv.


campestris
str.


ATCC 33913


1310 gi13364364 Escherichia coli O157:H7 acetylglutamate kinase 73 36


1311 gi20146220 Oryza sativa (japonica cultivar-group) similar to splicing factor/activator protein 110 33


1311 gi206712 Rattus norvegicus salivary proline-rich protein 104 27


1311 AAY84592 Homo sapiens UNIW Amino acid sequence of a human artemin polypeptide. 103 34


1312 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 530 69


1312 gi|10834720|gb|AAG23790.1|AF258587_1 Homo sapiens PP565 249 66


1312 gi|13194728|gb|AAK15526.1|AF329451_1 Gallus gallus pol-like protein ENS-3 115 21


1313 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 147 58


1313 gi1339910 Homo sapiens DOCK180 protein 147 58


1313 gi1504002 Homo sapiens similar to a human major CRK-binding protein DOCK180. 111 43


1314 gi12007418 Mus musculus B3 olfactory receptor 76 38


1314 118480290 Mus musculus olfactory receptor MOR260-3 76 38


1314 112007432 Mus musculus B3 olfactory receptor 76 38


1315 g1483581Mus musculusNotch 3 82 26


1315 g118159668Pyrobaculum paREP2b 81 29


aerophilum


1315 g14584086Spermatozopsisp210 protein 79 25


similis


1316 AAM71305Homo SapiensMOLE- Human bone marrow 422 98


expressed probe encoded
protein SEQ


ID NO: 31611.


1316 AAM58790Homo SapiensMOLE- Human brain expressed422 98
single


exon probe encoded protein
SEQ ID


NO: 30895.


1316 g1149490Lactococcus sucrose-6-phosphate hydrolase72 31


lactis


1317 g11620040Paramecium Asp-rich 72 28


bursaria


Chlorella
virus 1


1317 13721615 Cyprinus carpio MEF2C 71 25


1317 gi|9631936|ref|NP_048725.1 Paramecium bursaria Chlorella virus 1 Asp-rich 72 28


1318 gi|21291797|gb|EAA03942.1| Anopheles gambiae str. PEST agCP3974 74 35


1319 gi21306283 Chlamydomonas reinhardtii iron transporter Ftr1 74 30


1319 AAB60461 Homo sapiens INCY- Human cell cycle and proliferation protein CCYPR-9, SEQ ID NO:9. 73 33


1319 g16013155Homo Sapiensp35s ' 73 33


1320 g19717245Mus musculuscytoplasmic dynein heavy430 94
chain


1320 g1402528Rattus cytoplasmic dynein heavy430 94
chain


norvegicus


1320 g1294543Rattus dynein heavy chain 430 94


norvegicus


1323 gi|17221411|emb|CAD12639.1| Burkholderia cepacia kdo transferase 70 34


1324 gi1698601 Cricetulus griseus beta-1,6-N-acetylglucosaminyltransferase 440 38


1324 gi349091 Rattus norvegicus N-acetylglucosaminyltransferase V 438 43


1324 118997007Mus musculusN-acetylglucosaminyltransferase438 43
V






1325 AAM70545 Homo SapiensMOLE- Human bone marrow 115 47


expressed probe encoded
protein SEQ


ID NO: 30851.


1325 AAM58098 Homo SapiensMOLE- Human brain expressed115 47
single


exon probe encoded protein
SEQ ID


NO: 30203.


1325 AAM72994 Homo SapiensMOLE- Human bone marrow 111 28


expressed probe encoded
protein SEQ


ID NO: 33300.


1326 gi12724969Lactococcusphenolic acid decarboxylase77 46


lactis subsp.


lactis


1327 AAB53097 Homo sapiens GETH Human angiogenesis-associated protein PRO1246, SEQ ID NO:167. 372 63


1327 AAU12416 Homo sapiens GETH Human PRO1246 polypeptide sequence. 372 63


1327 AAY99377 Homo sapiens GETH Human PRO1246 (UNQ630) amino acid sequence SEQ ID NO:132. 372 63


1328 gi6014505Hepatitis polyprotein 76 43
GB


virus B


1328 gi765145 Hepatitis polypeptide 68 41
GB


virus B


1328 gi|20544059|ref|XP_086220.4 Homo sapiens similar to U4/U6-associated RNA splicing factor 294 100


1329 AAV42689_aa1 Homo sapiens SIBI- DNA encoding human calcium channel alpha-2 subunit. 158 91


1329 AAQ84667_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2c. 158 91


1329 AAQ84664_aa1 Homo sapiens SALK Human neuronal calcium channel subunit alpha 2b. 158 91


1330 gi19923 Nicotiana pistil extensin like 71 38
protein, partial CDS


tabacum


1330 gi|144429|gb|AAA56792.1| Cellulomonas fimi beta-1,4-xylanase 67 30


1331 12388676 Mytilus edulis precollagen P 85 35


1331 gi17862044 Drosophila melanogaster LD06016p 75 30


1331 g113879780MycobacteriumPE_PGRS family protein 74 30


tuberculosis


CDC1551


1333 AA000015 Homo SapiensHYSE- Human polypeptide 442 61
SEQ ID


NO 13907.


1333 AAB82479 Homo sapiens ZYMO Human RING finger protein Zapop2. 81 31


1333 120975274Homo sapiensskeletrophin 81 31


1334 ABB11819 Homo sapiens HYSE- Human secreted protein homologue, SEQ ID NO:2189. 367 82


1334 AAW80398 Homo SapiensGEMY A secreted protein 130 67
encoded by


clone cw1543 3.


1334 gi5081693 Samanea saman pulvinus inward-rectifying channel SPICK2 70 34


1335 ABB89969 Homo sapiens HUMA- Human polypeptide SEQ ID NO 2345. 142 96


1335 AAB38385 Homo SapiensHUMA- Human secreted 142 96
protein


encoded by gene 18 clone
HTLEJ24.


1335 AAB38338 Homo SapiensHUMA- Human secreted 142 96
protein


encoded by gene 18 clone
HTLFE57.


1336 gi|14590195|ref|NP_142260.1 Pyrococcus horikoshii asparaginyl-tRNA synthetase 70 37


1337 gi3879419Caenorhabditiscontains similarity to 69 29
Pfam domain:


elegans PF00102 (Protein-tyrosine


phosphatase), Score=51.6,
E-


value=1.8e-14, N=1


1337 gi|17563828|ref|NP_505965.1 Caenorhabditis elegans protein tyrosine phosphatase 69 29


1338 gi~2072960~gHomo Sapiensp40 138 33


b~AACS
126


8.1~


1338 gi|4185940|emb|CAA76880.1| Human endogenous retrovirus K env protein 124 75


1338 gi|757872|emb|CAA57723.1| Human endogenous retrovirus env 124 75


1340 gi1491979Molluscum MC036R 78 33


contagiosum


virus subtype
1


1340 gi|9628968|ref|NP_043987.1 Molluscum contagiosum virus MC036R 78 33


1341 gi18676514Homo SapiensFLJ00154 protein 1560 100


1341 AAB84252 Homo sapiens HUMA- Amino acid sequence of a human cytokine receptor-like protein. 572 63


1341 AAB84251 Homo SapiensHUMA- Human cytokine 572 63
receptor-like


protein fragment.


1342 AAY27757 Homo SapiensHUMA- Human secreted 152 71
protein


encoded by gene No. 47:


1342 AAB27551 Homo SapiensMYRI- Human tumour suppressor77 32


BRG1 encoded by cDNA
mutated at


base 1705.


1342 AAB27550 Homo sapiensMYRI- Human tumour suppressor77 32


BRG1 protein from cell
lines DU145


and NCI-H 1300.


1344 gi21464394Drosophila RE18651p 78 26


melanogaster


1344 AAM39065 Homo SapiensHYSE- Human polypeptide 77 21
SEQ ID


NO 2210.


1344 1338290 Homo Sapiensson3 protein 77 21


1345 12202 Canis sp. Clox 135 37


1345 gi3879551 Caenorhabditis elegans contains similarity to Pfam domain: PF01391 (Collagen triple helix repeat (20 copies)), Score=56.4, E-value=2e-13, N=2; PF01484 (Nematode cuticle collagen N-terminal domain), Score=87.2, E-value=1.1e-22, N=1 125 33


1345 gi158695 Drosophila tropomyosin isoform 118 30
33 (9C)


melanogaster


1346 gi7862077 Giardia intestinalis 3-hydroxy-3-methylglutaryl-coenzyme A reductase 90 26


1346 gi1098615Mycoplasma adhesin-related 30 kDa 87 23
protein


pneumoniae


1346 gi20380058 Homo sapiens Similar to PRAM-1 protein 84 28


1347 113905302Mus musculusSimilar to ATPase, class736 85
II, type 9A


1347 g117862322Drosophila LD22119p 633 72


melanogaster


1347 AAM25271 Homo SapiensHYSE- Human protein 572 100
sequence SEQ


ID N0:786.


1348 g1456319 Bacteriophage74kDa protein 75 33


FC1


1348 g11524115Lycopersiconsubtilisin-like endoprotease73 28


esculentum


1348 g14200334LycopersiconP69A protein 73 28


esculentum


1349 gi21391988 Drosophila melanogaster HL08052p 78 31


1349 g120148339Arabidopsis cyclin delta-3 77 25


thaliana


1349 gi|17647607|ref|NP_523423.1 Drosophila melanogaster maroon-like; bronzy; section 5 78 31


1351 gi18676524 Homo sapiens FLJ00159 protein 164 52


1351 g121392066Drosophila RE04357p 139 34


melanogaster


1351 AAB92637 Homo SapiensHELI- Human protein 81 43
sequence SEQ


ID N0:10953.


1352 g119071965Aspergillus chitin synthase 79 28


oryzae


1352 gi17945592 Drosophila melanogaster RE26660p 78 41


1352 gi16184663 Drosophila melanogaster LD28370p 74 22


1353 gi|11037117|gb|AAG27485.1|AF194537_1 Homo sapiens NAG13 307 65


1353 gi|1335205|emb|CAA36480.1 Homo sapiens ORFII 305 65


1354 gi1388166 Drosophila melanogaster Bowel 80 32


1354 gi15553187 Scyliorhinus canicula homeodomain protein Otx1 79 22


1354 AAY85573 Homo sapiensJANC Hs-UNC-53/3 fragment/GFP78 26


fusion insert of plasmid
pGI3303.


1358 gi|21288288|gb|EAA00609.1| Anopheles gambiae str. PEST agCP9766 71 30


1358 gi|17465558|ref|XP_069888.1 Homo sapiens similar to mucin 68 36


1359 gi|21302892|gb|EAA15037.1 Anopheles gambiae str. PEST agCP5020 70 31


1361 gi15080686Lentinula CDCS 79 26
edodes


1361 gi495516 Plasmodium circumsporozoite protein77 31


vivax


1361 gi21070569 Dictyostelium discoideum VSAE2 (FRAGMENT). 3/101 76 31


1362 gi8953400 Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase-like protein 73 23


1362 gi|15239030|ref|NP_196699.1| Arabidopsis thaliana 1-D-deoxyxylulose 5-phosphate synthase - like protein 73 23


1363 gi2444430Xenopus laevisdeacetylase 327 81


1363 gi602098 Xenopus laevis yeast RPD3 homologue 324 80


1363 AAB49954 Homo SapiensMETH- Human histone 323 80
deacetylase


HDAC-1.


1364 AAM69686 Homo SapiensMOLE- Human bone marrow418 55


expressed probe encoded
protein SEQ


ID NO: 29992.


1364 AAM57281 Homo SapiensMOLE- Human brain expressed418 55
single


exon probe encoded protein
SEQ ID


NO: 29386.


1364 gi|1780971|emb|CAA71416.1| Human endogenous retrovirus K gag protein 172 37


1365 gi437084 Gallus gallusvitamin D3 hydroxylase 510 41
associated


protein


1365 12149156 Homo Sapiensfatty acid amide hydrolase477 38


1365 AAW57783 Homo SapiensSCRI Human fatty acid 468 38
amide


hydrolase.


1366 g13510695Homo SapiensDNA polymerase theta 77 21


1366 g1309132 Mus musculuscalnexin 72 22


1366 g115214567Mus musculusSimilar to calnexin 72 22


1367 gi|17508849|ref|NP_491426.1| Caenorhabditis elegans helicase 73 40


1368 g15457567Pyrococcus Na+/H+ antiporter (napA-1)76 33


abyssi


1368 gi8247211 Candida albicans She9 protein 69 31


1368 gi|14590079|ref|NP_142143.1 Pyrococcus horikoshii Na(+)/H(+) antiporter 76 30


1369 gi17644260 Homo sapiens bB206I21.1 (ATPase, Class VI, type 11C). 305 98


1369 AA014200 Homo SapiensINCY- Human transporter166 50
and ion


channel TRICH-17.


1369 g15080816Arabidopsis Putative ATPase 166 49


thaliana


1370 gi|18573281|ref|XP_095933.1 Homo sapiens similar to 40S ribosomal protein S3A 70 38






1372 gi6683562 Mus musculus heparan sulfate 6-sulfotransferase 3 886 91


1372 gi6683558 Mus musculus heparan sulfate 6-sulfotransferase 2 265 72


1372 ABL39900_aa1 Homo sapiens SEGK Human HS6ST2v encoding cDNA SEQ ID NO:1. 262 71


1373 gi|20882231|ref|XP_139203.1 Mus musculus similar to LIM domain only 7 76 24


1373 gi|20302988|gb|AAM18948.1|AF498989_1 Medicago sativa nodule-specific glycine-rich protein 3 72 26


1373 gi|9965267|gb|AAG10008.1| infectious hypodermal and hematopoietic necrosis virus non-structural protein 2 72 24


1374 13355835 Rhizobium RBSK 78 32
etli


1374 g17453560Polyangium epoD 73 28


cellulosum


1374 gi1749684 Schizosaccharomyces pombe similar to Saccharomyces cerevisiae porphobilinogen deaminase, SWISS-PROT Accession Number P28789 72 28


1375 116973455Danio reriobeta-3-galactosyltransferase1050 63


1375 AAB24035 Homo sapiens GETH Human PRO4397 protein sequence SEQ ID NO:42. 725 46


1375 AAB88404 Homo SapiensHELI- Human membrane 709 43
or secretory


protein clone PSEC0159.


1376 g17668 Drosophila bsg25D protein 73 33


melanogaster


1376 g120177037Drosophila LD21844p 73 33


melanogaster


1376 gi1353669 Caenorhabditis elegans UNC-24 69 43


1379 AAS16182_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA. 245 67


1379 AAU10534 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) polypeptide. 245 67


1379 AAS16825_aa1 Homo sapiens GENA- Human apolipoprotein C1 (APOC1) DNA coding sequence. 245 67


1380 AAY36290 Homo sapiensHUMA- Human secreted 177 74
protein


encoded by gene 67.


1380 g116551305Tatianyx DNA-directed RNA polymerase71 38
beta'


arnacites subunit 2


1380 13411013 Candida protein mannosyltransferase68 35
albicans 1


1381 AAM80132 Homo SapiensHYSE- Human protein SEQ 173 66
ID NO


3778.


1381 g14731867Dictyosteliumsterol glucosyltransferase107 30


discoideum


1381 AAB74726 Homo SapiensINCY- Human membrane 89 41
associated


protein MEMAP-32.


1382 AAB62100 Homo sapiens WIST- Human bridging integrator-2 (Bin2) protein. 78 27


1382 g16527168Homo Sapiensbreast cancer associated78 27
protein


BRAP 1


1382 gi5852834 Homo sapiens bridging integrator-2 78 27






1383 gi7670050 Xenopus laevis type I collagen alpha 1 92 27


1383 AA001606 Homo SapiensHYSE- Human polypeptide 85 29
SEQ ID


NO 15498.


1383 gi17738485 Agrobacterium tumefaciens str. C58 (U. Washington) biopolymer transport protein 85 28


1384 gi20451261 Caenorhabditis elegans C. elegans GCY-17 protein (corresponding sequence W03F11.2) 71 26


1384 gi2665714AgrobacteriummoaC 71 29


tumefaciens


1384 gi|20864452|ref|XP_150076.1| Mus musculus RIKEN cDNA 2410018E23 130 59


1385 AAY94938 Homo sapiens GEMY Human secreted protein clone ye78 1 protein sequence SEQ ID NO:82. 103 25


1385 gi12831176Agelaius gamma filamin protein 96 29


phoeniceus


1385 AAU81998 Homo sapiensINCY- Human secreted 87 27
protein


SECP24.


1386 gi10440468Homo SapiensFLJ00070 protein 102 41


1386 gi11136912 Danio rerio RPTP-alpha protein 94 32


1386 120377083Homo Sapiensp78 92 36


1387 AAM40810 Homo SapiensHYSE- Human polypeptide 190 59
SEQ ID


NO 5741.


1387 AAM39024 Homo sapiens HYSE- Human polypeptide SEQ ID NO 2169. 190 59


1387 gi15080474 Homo sapiens Similar to RIKEN cDNA 1700023011 gene 190 59


1388 gi12802591 Bovine herpesvirus 4 tegument protein 82 30


1388 g1950226 SaccharomycesTrf4p ' 73 26


cerevisiae


1388 gi|13095641|ref|NP_076556.1 Bovine herpesvirus 4 tegument protein 82 30


1389 AAI67224_aa1 Homo sapiens CORI- BS11S cDNA sequence. 363 100


1389 AAF85500_aa1 Homo sapiens EOSB- Nucleotide sequence of a human breast cancer protein designated BCH1. 363 100


1389 AAA54120_aa1 Homo sapiens EOSB- Breast cancer protein BCH1 coding sequence. 363 100


1390 g1184653 Homo SapiensIFN-alpha responsive 74 30
transcription


factor


1390 gi|2580453|gb|AAB82336.1| Xenopus laevis Xbap 68 47


1391 AAB88456 Homo SapiensHELI- Human membrane 85 52
or secretory


protein clone PSEC0246.


1391 AAB62392 Homo sapiens LEXI- Human LDL receptor family protein (LDLP). 85 52


1392 ABB12009 Homo sapiens HYSE- Human RAMP1 homologue, SEQ ID NO:2379. 90 100


1392 gi3171910 Homo sapiens RAMP1 90 100


1392 gi12653551 Homo sapiens receptor (calcitonin) activity modifying protein 1 90 100


1394 gi4467343 Drosophila melanogaster EG:140G11.1 70 27


1394 gi6018879Drosophila BACN4L24.d 70 27


melanogaster


1394 gi157993 Drosophila developmental protein 70 27


melanogaster


1395 gi4928919 Arabidopsis thaliana zinc finger protein 2 86 26


1395 gi2702272Arabidopsisexpressed protein 86 26


thaliana


1396 AAM25276 Homo sapiensHYSE- Human protein sequence729 93
SEQ


ID N0:791.


1396 AAE14340 Homo sapiensINCY- Human protease 528 33
PRTS-5


protein.


1396 AAB47561 Homo sapiens INCY- Protease PRTS-3. 528 33


1397 gi18369843Infectious P6 89 40


salmon anemia


virus


1397 gi4092530Infectious NS1 protein 87 39


salmon anemia


virus


1397 gi14009648Infectious NS1 87 39


salmon anemia


virus


1398 AAW63707 Homo sapiens UYOR- Human hSK2 protein. 331 91


1398 gi1575663 Rattus norvegicus calcium-activated potassium channel rSK2 331 91


1398 gi15082148 Homo sapiens small-conductance calcium-activated potassium channel 331 91


1399 AAB01381 Homo sapiens INCY- Neuron-associated protein. 1653 68


1399 gi18157547Mus musculuspecanex-like 3 1620 66


1399 16650377 Mus musculus pecanex 1 1277 51


1400 gi|20887681|ref|XP_140575.1 Mus musculus similar to melastatin 1 468 91


1400 gi|3243075|gb|AAC80000.1| Homo sapiens melastatin 1 355 75


1400 gi|20552333|ref|XP_007662.9 Homo sapiens similar to melastatin 1 355 75


1401 AAU15955 Homo SapiensHUMA- Human novel secreted931 92
protein,


Seq ID 908.


1401 g13978441Homo SapiensPITSLRE protein kinase 95 24
alpha SV9


isoform


1401 gi1517914 Homo sapiens monocytic leukaemia zinc finger protein 91 28


1402 gi1289326 Mus musculus ROR-alpha 1 84 25


1402 gi530878 Chlamydomonas eugametos amino acid feature: N-glycosylation sites, aa 41 .. 43, 46 .. 48, 51 .. 53, 72 .. 74, 107 .. 109, 128 .. 130, 132 .. 134, 158 .. 160, 163 .. 165; amino acid feature: Rod protein domain, aa 169 .. 340; amino acid feature: globular protein domain, aa 32 .. 168 79 32


1402 gi220763 Rattus norvegicus HES-3 factor 79 52


1403 gi|20479430|ref|XP_114955.1 Homo sapiens similar to olfactory receptor MOR231-1 71 32


1403 gi|20480897|ref|XP_115014.1| Homo sapiens similar to olfactory receptor MOR234-3 71 32


1404 AAA88548_aa1 Homo sapiens SMIK Human CASB616 cDNA. 89 100


1404 AAB19591 Homo sapiens SMIK Human CASB616. 89 100


1404 11100110 Homo sapiens protein-tyrosine kinase 89 100


1405 g14206753Oryctolagushomeodomain-containing 74 24
protein


cuniculus


1405 gi13445253 Mus musculus orphan Gpr37-like protein 1 72 33


1405 g13080552Mus musculusHoxa-9 71 50


1406 AAM50585 Homo SapiensNISB Benign prostatic 325 100
hyperplasia


associated protein JT460914.


1406 g118031947Homo SapiensSOCS box protein ASB-5 325 100


1406 AAU20593 Homo sapiensHUMA- Human secreted 316 100
protein, Seq


ID No 585.


1407 AAU83222 Homo SapiensZYMO Novel secreted protein895 97


Z930005G2P.


1407 AAY02712 Homo SapiensHUMA- Human secreted 91 56
protein


encoded by gene 63 clone
HBJFV28.


1407 AA000641 Homo SapiensHYSE- Human polypeptide 86 64
SEQ ID


NO 14533.


1408 ABB17944 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6601. 81 53


1408 AAM77906 Homo SapiensMOLE- Human bone marrow 72 40


expressed probe encoded
protein SEQ


ID NO: 38212.


1408 AAM65199 Homo SapiensMOLE- Human brain expressed72 40
single


exon probe encoded protein
SEQ ID


NO: 37304.


1409 g15230847Vitreoscillaglutamine synthetase 68 33
Sp. homolog


C1


1409 gi8515736 Drosophila melanogaster highwire 67 35


1409 g13138797Sulfolobus Ssh7b 65 48


shibatae


1410 AAW23309 Homo sapiensEIJI- Human Werner's 151 96
syndrome WS-2


protein.


1410 g11913785Homo SapiensRep-8 151 96


1410 gi18089098 Homo sapiens reproduction 8 151 96


1411 gi|21297468|gb|EAA09613.1 Anopheles gambiae str. PEST agCP15537 166 56


1411 gi|20983200|ref|XP_135812.1 Mus musculus RIKEN cDNA 1810030007 73 24


1412 gi532572 Hordeum lipoxygenase 1 82 28


vulgare


1412 gi945419 Mus musculushepatoma derived growth 77 35
factor


(HDGF)


1412 gi17932895 stork hepatitis B virus preC/core antigen 77 26


1413 gi2370143Homo Sapiensimmunoglobulin-like domain-169 42


containing 1


1413 gi2645890 Homo sapiens IGSF1 169 42


1413 AAB40232 Homo SapiensHUMA- Human secreted 162 40
protein


sequence encoded by gene
46 SEQ ID


N0:142.


1414 gi21204314 Staphylococcus aureus subsp. aureus MW2 proline-tRNA ligase 78 32


1414 gi14247033Staphylococcusproline-tRNA ligase 78 32


aureus subsp.


aureus Mu50


1414 gi13701063Staphylococcusproline-tRNA ligase 78 32


aureus subsp.


aureus N315


1415 gi9948469Pseudomonasprobable non-ribosomal 78 31
peptide


aeruginosa synthetase


1415 AAE19251 Homo sapiens BIOI- SOS1 protein sequence from PS462. 75 23


1415 AAU84311 Homo sapiens BAAI~/ Protein ABCB2 differentially expressed in breast cancer tissue. 74 30


1416 gi18676710 Homo sapiens FLJ00254 protein 623 75


1416 gi2065210 Mus musculus Pro-Pol-dUTPase polyprotein 583 69


1416 gi|18676710|dbj|BAB85007.1| Homo sapiens FLJ00254 protein 623 75


1417 AAR85785 Homo SapiensUYNY Human GRB-10. 77 32


1417 gi841210 Mus musculusgrowth factor receptor 77 32
binding protein


Grb 10


1417 AAM90963 Homo SapiensHUMA- Human 74 32


immune/haematopoietic
antigen SEQ


ID N0:18556.


1419 AAM79990 Homo SapiensHYSE- Human protein SEQ 82 100
ID NO


3636.


1419 AAM79006 Homo SapiensHYSE- Human protein SEQ 82 100
ID NO


1668.


1419 AAR28494 Homo SapiensXIAM/ Sequence encoded 82 100
by the


CAMPATH-1 antigen cDNA.


1420 AAU01383 Homo SapiensMILL- Human TANGO 499 828 73
form 2,


variant 1 amino acid
sequence.


1420 AAU01382 Homo sapiens MILL- Human TANGO 499 form 2, variant 4 amino acid sequence. 828 73


1420 AAU01380 Homo sapiens MILL- Human TANGO 499 form 2, amino acid sequence. 828 73


1421 gi19069609 Encephalitozoon cuniculi PROTEASOME REGULATORY SUBUNIT YTA6 OF THE AAA FAMILY OF ATPASES 76 26


1422 AAM66177 Homo SapiensMOLE- Human bone marrow199 72


expressed probe encoded
protein SEQ


ID NO: 26483.


1422 AAM53791 Homo SapiensMOLE- Human brain expressed199 72
single


exon probe encoded protein
SEQ ID


NO: 25896.


1422 AAM68472 Homo SapiensMOLE- Human bone marrow176 81


expressed probe encoded
protein SEQ


ID NO: 28778.


1423 11800227 Oryza sativa Bowman-Birk proteinase inhibitor 74 34


1423 g110141005San Miguel non-structural polyprotein74 26
sea


lion virus


1423 gi|17490177|ref|XP_062300.1| Homo sapiens similar to RING finger protein 18 (Testis-specific ring-finger protein) 76 28


1424 g1461336 Pyrenomonas hsp70 75 29


salina


1424 g113880037Mycobacteriummembrane protein, MmpL 75 24
family


tuberculosis


CDC1551


1424 g11449306MycobacteriummmpL2 75 24


tuberculosis


H37Rv


1425 gi15600 Enterobacteria phage T7 gene 7.3, host range 79 30


1425 g116198065Drosophila LD28477p 77 30


melanogaster


1425 g111870012Drosophila xnp/atr-x DNA helicase 77 30


melanogaster


1426 gi16185397 Drosophila melanogaster LD39815p 204 44


1426 g12244793Arabidopsis disease resistance N 86 30
like protein


thaliana


1426 AAU84280 Homo sapiens BGHM Human endometrial cancer related protein, HERC1. 77 26


1427 AAY36302 Homo SapiensHUMA- Human secreted 183 79
protein


encoded by gene 79.


1427 AAB88359 Homo SapiensHELI- Human membrane 178 80
or secretory


protein clone PSEC0087.


1427 AAM41635 Homo SapiensHYSE- Human polypeptide178 80
SEQ ID


NO 6566.


1428 AAU82008 Homo sapiens INCY- Human secreted protein SECP34. 114 64


1428 AAB32391 Homo SapiensHUMA- Human secreted 114 64
protein


sequence encoded by
gene 21 SEQ ID


N0:77.


1428 AAY08306 Homo sapiens FIBR- Human collagen IX alpha-3 chain protein. 74 45


1429 g12792523Ralstonia alternative RNA sigma 69 30
factor RpoS


solanacearum


1429 gi17428221 Ralstonia solanacearum RNA POLYMERASE SIGMA S (SIGMA-38) FACTOR TRANSCRIPTION REGULATOR PROTEIN 69 33


1429 gi|5032313|ref|NP_004014.1 Homo sapiens dystrophin Dp140bc isoform; Dystrophin (muscular dystrophy, Duchenne and Becker types) 73 26


1433 gi9954445 Rattus norvegicus TEMO 171 62


1433 gi14030260 maize rayado fino virus polyprotein 79 32


1433 AAB95656 Homo sapiensHELI- Human protein 77 36
sequence SEQ


ID N0:18419.


1434 AAR04212 Homo sapiens CALB- Human 32K alveolar surfactant protein. 391 43


1434 AAP60661 Homo SapiensKUSH/ Genomic sequence 386 43
of human


alveolar surfactant
protein


(hASP)encoded by genomic
DNA.


1434 AAB58135 Homo sapiens ROSE/ Lung cancer associated polypeptide sequence SEQ ID 473. 366 42


1435 gi17224904 Mus musculus immunoglobulin superfamily member 9 180 48


1435 gi20988778 Homo sapiens Similar to immunoglobulin superfamily, member 9 173 53


1435 gi14149050Drosophila turtle protein, isoform114 36
4


melanogaster


1436 gi1465855 Caenorhabditis elegans C. elegans PQN-57 protein (corresponding sequence R09F10.7) 85 23


1436 gi1465856 Caenorhabditis elegans C. elegans PQN-56 protein (corresponding sequence R09F10.2) 85 23


1436 117864717Mus musculushornerin 83 26


1437 gi~21292574~Anopheles agCP3449 66 33


gb~EAA047gambiae str.


19.1 PEST


1438 ABB 10160Homo SapiensHUMA- Human cDNA SEQ 166 62
ID NO:


468.


1438 g19657279Vibrio choleraeaspartokinase II/homoserine71 28


dehydrogenase, methionine-sensitive


1439 gi4582571 Gallus gallus Hyperion protein, 419 kD isoform 75 24


1439 g113165 Oenothera ATPase alpha-subunit 72 26
(aa 1-511)


biennis


1439 g1903838 Oenothera F-1-ATPase alpha subunit72 26


berteriana


1440 g14558758Homo Sapienstestis-specific chromodomain233 62
Y-like


protein


1440 gi4558762 Mus musculus testis-specific chromodomain Y-like protein 231 36


1440 g13342716Homo Sapienstestis-specific ChromoDomain195 36
Y


isoform 1


1441 g1155627 Acanthamoebamyosin I heavy chain 118 42


castellanii


1441 gi13093370 Mycobacterium leprae initiation factor IF-2 116 33


1441 AAY20289 Homo sapiens UYRO- Human apolipoprotein E mutant protein fragment 5. 114 39


1442 g12253707Mus musculusDaxx 84 36


1442 g11934970Plasmodium AARP1 protein 79 65


falciparum






1442 14050098 Mus musculus Fas-binding protein 78 34


1443 g12425111DictyosteliumZipA 90 26


discoideum


1443 AAY06119 Homo sapiens HARD Human CIITA interacting protein 104 (CIP104). 88 26


1443 gi5420387 Leishmania major proteophosphoglycan 86 21


1444 g1893355 AcinetobacterL-2,4-diaminobutyrate 77 26
decarboxylase


baumannii


1445 ABB55744 Homo sapiensFECH/ Human polypeptide 135 47
SEQ ID


NO 94.


1445 AAU39035 Homo SapiensGEMY Human secreted protein135 47


nh328 5.


1445 AAY28679 Homo sapiens GEMY Human nh328 5 secreted protein. 135 47


1446 g119744390Homo sapiensretinoic acid inducible 247 54
in


neuroblastoma cells RAINB
1 d


1446 g119744388Homo Sapiensretinoic acid inducible 247 54
in


neuroblastoma cells RAINB
1


1446 AAY85565 Homo sapiens JANC Human homologue of UNC-53 (Hs-UNC-53/2) sequence. 240 52


1447 AAU19716 Homo SapiensHUMA- Human novel extracellular71 31


matrix protein, Seq ID
No 366.


1447 gi18025476 cercopithicine herpesvirus 15 BPLF1 71 38


1447 AAS14575_aa1 Homo sapiens MILL- Human cDNA encoding G protein-coupled receptor, GPCR, 52872. 69 62


1448 g114027507Mesorhizobiumsalicylate hydroxylase 69 31


loti


1449 AAG64798 Homo sapiens SREH- Human peptide methionine sulphoxide reductase (hPMSR). 192 71


1449 AAB81893 Homo SapiensSEQU- Human genomic database192 71


related protein SEQ ID
NO: 38.


1449 AAM42046 Homo SapiensHYSE- Human polypeptide 192 71
SEQ ID


NO 6977.


1450 g118249657Mus musculusNC8 1063 80


1450 1406748 Mus musculuszinc finger protein 250 37


1450 AAB43498 Homo sapiens HUMA- Human cancer associated protein sequence SEQ ID NO:943. 249 37


1451 ABB89331 Homo SapiensHUMA- Human polypeptide 732 88
SEQ ID


NO 1707.


1451 g113421927CaulobacterMaoC family protein 273 42


crescentus
CB15


1451 gi19338616 Methylobacterium extorquens R-specific enoyl-CoA hydratase 261 44


1452 gi|20908171|ref|XP_139715.1 Mus musculus similar to NADPH oxidase 3; NADPH oxidase catalytic subunit-like 3 68 30


1452 gi|17533619|ref|NP_495516.1 Caenorhabditis elegans F32A5.8.p 67 42


1453 gi|15614051|ref|NP_242354.1| Bacillus halodurans sodium-dependent phosphate transporter 65 34


1454 gi|17551878|ref|NP_499090.1 Caenorhabditis elegans TPRDomain 76 29


1455 AAM40727 Homo SapiensHYSE- Human polypeptide 191 56
SEQ ID


NO 5658.


1455 AAM38941 Homo SapiensHYSE- Human polypeptide 191 56
SEQ ID


NO 2086.


1455 gi19702127 Homo sapiens P-Rex1 protein 191 56


1456 ABB05666 Homo sapiens GEHU- Human nucleic acid management protein clone amy2 l 1n4. 496 91


1456 AAE03372 Homo SapiensHUMA- Human gene 18 encoded496 91


secreted protein fragment,
SEQ ID


N0:152.


1456 AAE03371 Homo SapiensHUMA- Human gene 18 encoded496 91


secreted protein fragment,
SEQ ID


N0:150.


1457 AAM66940 Homo SapiensMOLE- Human bone marrow 290 77


expressed probe encoded
protein SEQ


ID NO: 27246.


1457 AAM54534 Homo SapiensMOLE- Human brain expressed290 77
single


exon probe encoded protein
SEQ ID


NO: 26639.


1457 AAM64410 Homo SapiensMOLE- Human brain expressed287 77
single


exon probe encoded protein
SEQ ID


NO: 36515.


1458 AAB53445 Homo sapiens HUMA- Human colon cancer antigen protein sequence SEQ ID NO:985. 335 100


1458 AAY30055 Homo SapiensARIA- Amino acid sequence165 91
of a


FK506-binding protein
(FKBP).


1458 AAQ52277_aa1 Homo sapiens VERT- FK506 binding protein (FKBP12A) cDNA. 159 100


1460 AAU20255 Homo SapiensHUMA- Human novel endocrine104 76


antigen, SEQ ID No 312.


1460 ABB17663 Homo sapiens HUMA- Human nervous system related polypeptide SEQ ID NO 6320. 94 77


1460 AA002331 Homo SapiensHYSE- Human polypeptide 88 61
SEQ ID


NO 16223.


1461 AAM65951 Homo SapiensMOLE- Human bone marrow 97 57


expressed probe encoded
protein SEQ


ID NO: 26257.


1461 AAM53568 Homo SapiensMOLE- Human brain expressed97 57
single


exon probe encoded protein
SEQ ID


NO: 25673.


1461 AAU83199 Homo sapiensZYMO Novel secreted protein96 38


Z891639G1P.


1463 15565687 Homo sapiens topoisomerase-related 514 75
function protein


1463 15139669 Homo SapiensLAK-1 468 75


1463 g121430468 Drosophila LP06848p 332 51


melanogaster


1464 AAY91421 Homo sapiensHUMA- Human secreted 109 35
protein


sequence encoded by gene
7 SEQ ID


N0:142.


1464 AAY91396 Homo SapiensHUMA- Human secreted 109 35
protein






sequence encoded by gene
7 SEQ ID


N0:117.


1464 AAY91352 Homo SapiensHUMA- Human secreted 109 35
protein


sequence encoded by gene
7 SEQ ID


N0:73.


1465 AAU15978 Homo SapiensHUMA- Human novel secreted575 100
protein,


Seq ID 931.


1465 AAU15958 Homo SapiensHUMA- Human novel secreted575 100
protein,


Seq ID 911.


1465 116041675 Homo sapiens joined to JAZF1 575 100


1466 AA001502 Homo SapiensHYSE- Human polypeptide 173 66
SEQ ID


NO 15394.


1466 gi~10947038~ Homo sapiens ankyrin 1, isoform 1; 74 28
ankyrin-1,


ref~NP erythrocytic; ankyrin-R
0652


09.1 ~


1466 gi~10947036~Homo Sapiensankyrin 1, isoform4; 74 28
ankyrin-1,


reflNP erythrocytic; ankyrin-R
0652


08.1


1467 g119354550Mus musculussimilar to src homology 842 91
three (SH3)


and cysteine rich domain


1467 AAU17352 Homo SapiensHUMA- Novel signal transduction361 98


pathway protein, Seq ID
917.


1467 g11799566Mus musculusstet 302 44


1468 g113506771Mus musculusstructural protein FBF1 767 74


1468 g17549210 Babesia 200 kDa antigen p200 213 29


bigemina


1468 g11747 Oryctolagustrichohyalin 191 30


cuniculus


1469 111345048Homo SapiensSCAN domain-containing 86 32
protein 2


1469 111320940Homo SapiensSCAND2 86 32


1469 g114210722Tupaia t41 86 30


herpesvirus


1470 AAY88278 Homo SapiensMILL- Human TANGO 188 1442 100
protein.


1470 114336711Homo Sapienssimilar to C. Elegans 1442 100
protein F17C8.5


1470 AAA39947'Homo SapiensMILL- Human TANGO 188 1438 99
cDNA.


aal


1471 AAE10204 Homo sapiens HYSE- Human bone marrow 71 44
derived


contig protein, SEQ ID
NO: 69.


1471 AAA23458 Homo SapiensALPH- cDNA encoding human67 46


_ secreted protein vpl5_l,
aal SEQ ID


N0:71.


1471 AAB80228 Homo sapiens GETH Human PRO269 protein. 67 46


1472 AAB88433 Homo SapiensHELI- Human membrane 136 86
or secretory


protein clone PSEC0210.


1472 AAB95155 Homo SapiensHELI- Human protein sequence136 86
SEQ


ID N0:17188.


1472 AAE01745 Homo SapiensHUMA- Human gene 2 encoded136 86


secreted protein HOGCS52
variant,


SEQ ID N0:160.


1473 g19294201Arabidopsisdisease resistance protein70 24


thaliana


1474 AAE19157 Homo sapiens THOR/ Human kinase polypeptide 631 98


(PKIN-15).


1474 AAM79131 Homo sapiens HYSE- Human protein SEQ 494 72
ID NO






1793.


1474 AAW19920 Homo sapiens REGC Human Ksr (kinase 494 72
suppressor


of Ras).


1475 AAD 12609_Homo SapiensSAGA Human protein having657 73


aal hydrophobic domain encoding
cDNA


clone HP03974.


1475 AA014199Homo Sapiens1NCY- Human transporter 657 73
and ion


channel TRICH-16.


1475 AAE06614Homo SapiensSAGA Human protein having657 73


hydrophobic domain, HP03974.


1476 113905246Mus musculusRIKEN cDNA 2410024K20 71 34
gene


1476 gi~17505208~Mus musculusCD2 antigen (cytoplasmic71 34
tail) binding


ref~NP'0816 protein 2; 1500011B02Rik


29.1
~


1477 g1806491 Rattus guanylyl cyclase 140 65


norvegicus


1477 g12648066Canis familiarisguanylate cyclase E 118 55


1477 g12623074Bos taurus rod outer segment guanylate116 55
cyclase


precursor


1478 12065210Mus musculusPro-Pol-dUTPase polyprotein585 73


1478 118676710Homo SapiensFLJ00254 protein 408 69


1478 AA004042Homo SapiensHYSE- Human polypeptide 392 75
SEQ ID


NO 17934.


1479 AAU05396Homo SapiensGEHO Human titin (connectin)208 29
protein


sequence.


1479 g11212992Homo SapiensProtein sequence and 208 29
annotation


available soon via Swiss-Prot;
available


at present via e-mail
from


LABEIT EMBL-Heidelber
.DE


1479 g117066105 Homo sapiens Titin 208 29


1480 AAV44685,Homo SapiensTEXA Osteoclast inhibitor94 41
protein,


aal OIP-1, coding sequence.


1480 AAB35287 Homo sapiens UROG- Human stem cell 94 41
antigen-2.


1480 AAY99709Homo SapiensREGC Human stem cell 94 41
antigen-2,


hSCA-2.


1481 AAB57094Homo SapiensROSE/ Human prostate 122 100
cancer antigen


protein sequence SEQ
ID N0:1672.


1481 g132672 Homo sapiens interferon alpha/beta 122 100
receptor


1481 AAQ49625-Homo SapiensEUBI- Human interferon 118 96
receptor


aal extracellular domain
coding sequence.


1482 AAD17516_Homo SapiensSENO- Human taste receptor,890 94
hTlR1


aal cDNA coding sequence.


1482 ABB77319Homo Sapiens1NCY- Human G-protein 890 94
coupled


receptor SEQ ID NO 3.


1482 AAE10372Homo SapiensSEND- Human taste receptor,890 94
hTlR1


protein.


1483 g118376312Neurospora related to SSD1 protein 109 39


crassa


1483 g12645173 Schizosaccharomyces sts5+ 99 42


pombe


1483 g12459997 Candida albicans protein phosphatase Ssd1 99 40
homolog


1484 gi~18569064~Homo Sapienssimilar to 40S RIBOSOMAL319 96


ref~XP-0953 PROTEIN S3A (V-FOS


78.1 TRANSFORMATION EFFECTOR
~






PROTEIN


1484 gi~20539276~Homo Sapienssimilar to olfactory 259 94
receptor MOR145-


ref~XP_0952 2


20.2


1484 gi~21295882~Anopheles agCP1347 68 32


gb~EAA080gambiae
str.


27.1 PEST


1485 ABB 11761Homo SapiensHYSE- Human secreted 197 36
protein


homologue, SEQ ID NO:2131.


1485 gi930259 Woolly monkeyreverse transcriptase 148 33
(476 AA)


sarcoma
virus


1485 gi18076262porcine Pol protein 147 38


endogenous


retrovirus


1486 AAM74887 Homo SapiensMOLE- Human bone marrow 172 100


expressed probe encoded
protein SEQ


ID NO: 35193.


1486 AAM62085 Homo sapiensMOLE- Human brain expressed172 100
single


exon probe encoded protein
SEQ ID


NO: 34190.


1486 1152661 Plasmid neomycin resistance protein 75 26
SB24.2


1487 112653493 Homo sapiens Similar to brain acid-soluble 75 34
protein 1


1487 g117428832 Ralstonia PROBABLE AVRBS3-LIKE 75 33


solanacearum PROTEIN


1487 g17329672Arabidopsisphosphatidate cytidylyltransferase-like72 46


thaliana protein


1488 AAU74754 Homo SapiensINCY- Human protease 2042 83
PRTS-14


protein sequence.


1488 AAU74752 Homo SapiensINCY-Human protease PRTS-12476 39


protein sequence.


1488 111935122Mus musculusa ilin 431 40


1489 gi~17543712~CaenorhabditisYSSF3C.8.p 72 32


ref~NP-4999elegans


76.1


1489 gi~20344600~Mus musculusRIKEN cDNA 4933431K05 70 30


ref~XP_1095


79.1


1489 gi~11692798~Xenopus ataxia telangiectasia 69 26
laevis and Rad3-related


gb~AAG400 protein


02.1 ~AF320


125 1


1490 AAB95817 Homo SapiensHELI- Human protein sequence256 63
SEQ


ID N0:18817.


1490 ABB06369 Homo SapiensBODE- Human neurogenesis173 64
related


protein 12 SEQ ID NO:2.


1490 AAB44394 Homo sapiensHUMA- Gene 10 encoded 83 66
human


secreted protein fragment
as BLASTX


query sequence.


1491 g1438795 Mus musculusserotonin 1A receptor 73 26


1491 g11066326Mus musculusserotoninlA receptor 72 26


1491 gi~438795~gbMus musculusserotonin 1A receptor 73 26
.


AAA 16850.


1~


1492 g116198083 Drosophila LD29875p 87 33






melanogaster


1492 gi2327063Pneumocystisprotease 1 75 34


carinii f.
Sp.


carinii


1492 120420 Prunus dulcisextensin 75 34


1493 AAG67087 Homo SapiensSHAN- Human ATP-dependent106 67
serine


protein hydrolase 13.


1493 AAM76636 Homo SapiensMOLE- Human bone marrow103 68


expressed probe encoded
protein SEQ


ID NO: 36942.


1493 AAM63822 Homo SapiensMOLE- Human brain expressed103 68
single


exon probe encoded protein
SEQ ID


NO: 35927.


1494 AAY31225 Homo SapiensAVET Human RNA helicase73 38
p135


protein.


1494 g13123906 Homo sapiens pre-mRNA splicing factor 73 38


1494 g113278975Homo Sapienspre-mRNA splicing factor73 38
similar to S.


cerevisiae P 16


1495 gi~17568307~Caenorhabditiscollagen 74 35


ref~NP-5098elegans


37.1 ~


1496 12065210 Mus musculusPro-Pol-dUTPase polyprotein410 81


1496 gi~10834720~Homo SapiensPP565 301 77


gb~AAG237


90.1~AF258


587 1


1496 gi~6753924~rMus musculusFriend virus susceptibility127 37
1


ef~NP_0343


74.1


1497 g120901968CaenorhabditisC. elegans RPL-36 protein71 34


elegans (corresponding sequence
F37C12.4)


1497 gig 17554754CaenorhabditisRibosomal protein YL39 71 34


ref~NP elegans
4985


73.1


1498 g15305335Mycobacteriumproline-rich mucin homolog102 27


tuberculosis


1498 g1330130 human latency associated transcript97 37
(LAT)


herpesvirus ORF-2
1


1498 AAU83682 Homo SapiensGETH Human PRO protein,94 30
Seq ID No


182.


1499 AAY57937 Homo Sapiens1NCY- Human transmembrane199 81
protein


HTMPN-61.


1499 AAY36295 Homo SapiensHUMA- Human secreted 151 100
protein


encoded by gene 72.


1499 AAG75708 Homo SapiensHUMA- Human colon cancer141 92
antigen


protein SEQ ID NO:6472.


1500 g121428712Drosophila SD05267p 165 54


melanogaster


1500 g120975274Homo Sapiensskeletrophin 114 40


1500 g119773434Mus musculusskeletrophin 99 52


1501 ABB 17830Homo SapiensHUMA- Human nervous 82 37
system related


polypeptide SEQ ID NO
6487.


1501 AA012929 Homo SapiensHYSE- Human polypeptide73 43
SEQ ID


NO 26821.






1502 gi8778340ArabidopsisF15O4.13 77 39


thaliana


1503 AAW03515 Homo sapiens SHKJ Human DOCK180 protein. 144 33


1503 11339910 Homo sapiens DOCK180 protein 144 33


1503 113195147Mus musculusHCH 129 25


1505 AAM70790 Homo SapiensMOLE- Human bone marrow 77 53


expressed probe encoded
protein SEQ


ID NO: 31096.


1505 AAM58316 Homo SapiensMOLE- Human brain expressed77 53
single


exon probe encoded protein
SEQ ID


NO: 30421.


1505 gi~21302711~Anopheles agCP4916 77 30


gb~EAA148gambiae
str.


56.1 PEST


1506 AAU75102 Homo sapiens MYRI- Heat shock protein 592 79
8 (HspB).


1506 AAB82535 Homo SapiensUYCO- Human heat shock 592 79
protein


Hsc70.


1506 AAE12987 Homo SapiensSRIV/ Human Hsp70 family592 79


homologue, Hsc70.


1507 ABL53627 Homo SapiensGENO- Breast protein-eukaryotic213 92


_ conserved gene 1 (BSTP-ECG1)
aal


cDNA.


1507 ABB75677 Homo SapiensGENO- Breast protein-eukaryotic213 92


conserved gene 1 (BSTP-ECG1)


protein.


1507 AAY99421 Homo sapiensGETH Human PRO1433 (UNQ738)213 92


amino acid sequence SEQ
ID N0:292.


1508 AAW 15565Homo SapiensUYJO Human intracellular79 29
tyrosine


kinase Tnk1-alpha.


1508 g1233062 Gallus gallus src downstream region 78 33


1508 g118376366Neurospora related to ribosomal 72 30
protein S 15


crassa precursor (mitochondrial)


1509 gi~21297482~Anopheles agCP15541 68 36


gb~EAA096gambiae
str.


27.1 PEST


1510 AAM41631 Homo SapiensHYSE- Human polypeptide 127 37
SEQ ID


NO 6562.


1510 AAM39845 Homo sapiensHYSE- Human polypeptide 127 37
SEQ ID


NO 2990.


1510 AAM79502 Homo SapiensHYSE- Human protein SEQ 127 37
ID NO


3148.


1511 g121217669 Mus musculus myosin IIIA 70 28


1511 gi~21302393~Anopheles agCP8799 71 36


gb~EAA145gambiae
str.


38.1 PEST


1511 gi~20822589~Mus musculussimilar to myosin IIIA 70 28


ref~XP,1408


54.1 ~


1512 g16911049Babesia p9.6.2-like variant erythrocyte82 28
bovis surface


antigen-la


1512 g16911045Babesia p9.6.2 variant erythrocyte82 28
bovis surface


antigen-la


1512 g16911047Babesia p8.4.1 variant erythrocyte81 28
bovis surface


antigen-la






1513 gi10174843Bacillus maltose transport system77 25
(permease)


halodurans


1513 gi56312 Rattus Gephyrin 76 31


norvegicus


1513 gi4325371Arabidopsis contains similarity to 74 28
Medicago


thaliana truncatula N7 protein
(GB:Y17613)


1514 AAY14196Homo SapiensTAKEI T cell receptor 95 100
zeta chain


protein sequence.


1514 1623042 Homo SapiensT-cell receptor zeta 95 100
chain


1514 14960202Sus scrofa CD3 zeta chain 95 100


1515 ABB07508Homo SapiensINCY- Human aminoacyl 726 100
tRNA


synthetase (ATRS) polypeptide
(ID:


7474756CD 1 ).


1515 AAB43670Homo SapiensHUMA- Human cancer associated604 82


protein sequence SEQ ID
NO:1115.


1515 g11464742 Homo sapiens threonyl-tRNA synthetase 604 82


1516 g121109348Xanthomonas cytochrome B561 77 29


axonopodis
pv.


citri str.
306


1516 g121114046Xanthomonas cytochrome B561 76 28


campestris
pv.


campestris
str.


ATCC 33913


1516 gi~21243760~Xanthomonas cytochrome B561 77 29


reflIVP-6433axonopodis
pv.


42.1 citri str.
306


1517 ABB 11450Homo SapiensHYSE- Human neurotoxin 119 33
homologue,


SEQ ID N0:1820.


1517 18809770Mus musculusLy-6I.1 94 30


1517 18809768Mus musculuslymphocyte antigen LY6I 94 30
precursor


1519 gi~59977~emHuman tripartite fusion transcript171 67
PLA2L


b~CAA7866endogenous


2.1 ~ retrovirus


1519 gi~17826947~Pseudomonas beta-1,4-xylanase 73 34
sp.


dbj~BAB792ND137


87.1
~


1519 gi~21232680~Xanthomonas ribonuclease PH 72 30


ref~NP_6385campestris
pv.


97.1 campestris
~ str.


ATCC 33913


1520 AAM78023Homo sapiensMOLE- Human bone marrow 190 100


expressed probe encoded
protein SEQ


ID NO: 38329.


1520 AAM65326Homo sapiensMOLE- Human brain expressed190 100
single


exon probe encoded protein
SEQ ID


NO: 37431.


1520 g113447468Emericella FH1/FH2 protein homolog 121 49


nidulans


1522 AAG81417Homo SapiensZYMO Human AFP protein 287 100
sequence


SEQ ID N0:352.


1523 AAY90349Homo SapiensSMII~ Human fatty acid 158 85
synthase


(FAS) protein sequence.


1523 AAB43871 Homo sapiens HUMA- Human cancer associated 158 85


protein sequence SEQ ID
NO:1316.






1523 1915392 Homo Sapiensfatty acid synthase 158 85


1525 AAG03819 Homo SapiensGEST Human secreted protein,93 100
SEQ ID


NO: 7900.


1525 11311466 Homo sapiens 24-kDa subunit of Complex I 93 100


1525 g1188852 Homo sapiens NADH-ubiquinone reductase 93 100


1526 AAD02855_Homo SapiensSUKA Human platelet membrane73 31


aal glycoprotein VI (GPVI)
cDNA.


1526 AAB49403 Homo SapiensMERE Human glycoprotein 73 31
VI mature


protein.


1526 AAB61257 Homo SapiensMILL- Mature human TANGO73 31
268


protein.


1527 g117864896 Mus musculus protocadherin 18 precursor 81 31


1527 g115980222Yersinia aconitate hydratase 1 79 30
pestis


1527 g112248353Fasciola NADH dehydrogenase subunit75 56
hepatica 5


1528 g12440214Trypanosomainvariant surface glycoprotein83 28
100


brucei brucei


1528 g110567463Rhizobium probable viral gene 78 22


rhizogenes
.


1529 g12231279Porcine envelope protein 66 31


reproductive
and


respiratory


syndrome
virus


1530 gi~199851~gb Mus musculus pol protein 257 42


~AAA39757.


1~


1530 gi~1498648~gMus musculusGag-Pol polyprotein 257 42


b~AAB0645


0.1~


1530 gi~331995~gb AKV murine gag-pol polyprotein (tag 257 42
amber codon


~AAB03091.leukemia at 2250-2252 inserts
virus Gln in Mo-MuLV)


1~


1533 g1435698 Homo sapiens CD44SP 136 100


1533 AAV63461_Homo SapiensGEHO Human CD44 antigen 130 100
cDNA.


aal


1533 AAT14724_Homo SapiensGEHO Human haematopoietic130 100
CD44


aal cDNA clone CD44.5.


1534 g12622165Methanothermobacetyltransferase 71 29


acter


thermautotrophic


us str.
Delta H


1534 gi~15679078~Methanothermobacetyltransferase 71 29


ref~NP_2761acter


95.1 ~ thermautotrophic


us


1535 g17777 Drosophila protein H 73 28


melanogaster


1535 g1457146 Plasmodium rhoptry protein 73 38


yoelii


1535 g113195258Plasmodium 235 kDa rhoptry protein 73 38


yoelii yoelii


1536 ABB09740 Homo sapiensBODE- Amino acid sequence132 43
of human


protein phosphatase 11.66.


1536 gi~20830386~Mus musculussimilar to importin alpha72 35
1b


reflXP
1456






42.1


1537 gi14039907Rattus cytochrome P450 monooxygenase353 39


norvegicus CYP2T1


1537 gi2920650Mus musculuscytochrome P450 CYP2B19 275 44


1537 12353336 Capra hircuscytochrome P450 271 31


1538 AAU83175 Homo SapiensZYMO Novel secreted protein282 100


Z874015G4P.


1538 g16714803Streptomycesintegral membrane protein.77 26


coelicolor
A3(2)


1539 g112963397Prunus x ribulose-1,5-bisphosphate74 32


yedoensis carboxylase/oxygenase
large subunit


1539 g1466436 SaccharomycesBOI1 69 31


cerevisiae


1539 g15833897Besleria ribulose 1,5-bisphosphate69 31
affinis carboxylase


large subunit


1542 AAY32193 Homo SapiensINCY- Human receptor 73 26
molecule


(REC) encoded by Incyte
clone


044150.


1542 g17576677HelicobacterIceAl 72 44


pylori


1542 gi~20841498~Mus musculussimilar to MUF1 protein 73 26


re~XP_l
315


41.1


1546 114581448Homo SapiensFSHD Region Gene 2 protein73 42


1546 g115982852ArabidopsisAT5g66850/MUD21_ll 71 34


thaliana


1546 gi~14581448~Homo SapiensFSHD Region Gene 2 protein73 42


gb~AAK219


77.1 ~


1547 g118676660 Homo sapiens FLJ00229 protein 192 92


1547 AAU21409 Homo SapiensHUMA- Human novel foetal179 100
antigen,


SEQ ID NO 1653.


1547 AAM42128 Homo SapiensHYSE- Human polypeptide 114 53
SEQ ID


NO 7059.


1548 AAG64494 Homo SapiensSHAN- Human natriuretic 539 100
peptide


receptor 18.


1548 118676710 Homo sapiens FLJ00254 protein 268 77


1548 AAB28764 Homo SapiensHUMA- Sequence homologous249 72
to


protein fragment encoded
by gene 21.


1549 AAB67055 Homo Sapiens1NCY- Human immune response606 82


molecule (IMUN) protein
SEQ ID NO:


9.


1549 AA001862 Homo SapiensHYSE- Human polypeptide 404 72
SEQ ID


NO 15754.


1549 gi~6753924~rMus musculusFriend virus susceptibility213 36
1


ef~NP
0343
_


74.1 ~


1550 1190129 Homo Sapiens70kDa peroxisomal membrane92 100
protein


1550 g1825711 Homo sapiens 70kD peroxisomal integral 92 100
membrane


protein


1550 g1220862 Rattus PMP70 89 94


norvegicus


1551 AAM69543 Homo SapiensMOLE- Human bone marrow 228 100


expressed probe encoded
protein SEQ






ID NO: 29849.


1551 AAM57148 Homo SapiensMOLE- Human brain expressed228 100
single


exon probe encoded protein
SEQ ID


NO: 29253.


1551 AAB93944 Homo SapiensHELI- Human protein 94 57
sequence SEQ


ID N0:13960.


1552 gi4884924Rangiferine glycoprotein C 75 34


herpesvirus
1


1552 gi~18556240~Homo sapienssimilar to Salivary 78 30
glue protein SGS-3


ref~~ precursor
0676


28.2


1552 gi~4884924~gRangiferine glycoprotein C 75 34


b~AAD3187herpesvirus
1


6.1~


1553 gi~2193870~d Mus musculus reverse transcriptase 176 35


bj ~BAA2041


9.1


1553 gi~2731767~gMus musculusendonuclease/reverse 176 35
transcriptase


b~AAC5354


2.1


1554 ABB08776 Homo SapiensBODE- Human neuregulin 75 29
55 SEQ ID


NO 2.


1554 AAM92816 Homo SapiensHUMA- Human digestive 71 29
system


antigen SEQ ID NO: 2165.


1554 gi~6322838~rSaccharomycesProtein required for 70 27
cell viability;


ef~NP cerevisiae Yk1014cp
0129


_
11.1


1555 gi7528184Drosophila bicoid-interacting protein78 28
B1N3


melanogaster


1555 gi15292595Drosophila SD09926p 78 28


melanogaster


1555 gi4514620Mus musculusRor2 71 24


1557 ABA91504_Homo SapiensEYEE- Human epidermal 144 93
growth factor


aal receptor precursor cDNA.


1557 AAF85332_Homo SapiensNOVS Nucleotide sequence144 93
of wild


aal a EGFRl.


1557 AAM50768 Homo SapiensEPEE- Human epidermal 144 93
growth factor


receptor precursor.


1558 AAB99950 Homo SapiensSHAN- Human alkylated-DNA-protein221 100


cysteine methyltransferase
14.


1558 AAU16267 Homo SapiensHUMA- Human novel secreted221 100
protein,


Seq ID 1220.


1558 ABB 11507Homo SapiensHYSE- Human secreted 183 97
protein


homologue, SEQ ID N0:1877.


1559 gi14599730Sachea correaematurase 71 28


1559 gi14599648Blepharandramaturase 71 30


hetero etala


1559 gi14599673Galphimia maturase 70 28


acilis


1560 gi2323287multiple polyprotein 340 83


sclerosis


associated


retrovirus


1560 gi 13310191multiple recombinant envelope 260 70
protein






gb~AAK181sclerosis


89.1~AF331associated


500_1 retrovirus


element


1560 gi~21103962~Homo Sapiensenverin-2 248 84


gb~AAM331


41.1


1561 AAB94698 Homo SapiensHELI- Human protein sequence107 95
SEQ


ID NO:15680.


1561 AAU18480 Homo SapiensHUMA- Human endocrine 107 95
polypeptide


SEQ ID No 435.


1561 ABB 10288Homo sapiensHUMA- Human cDNA SEQ 107 95
ID NO:


596.


1562 gi969078 Drosophila S-adenosylhomocysteine 73 26
hydrolase


melanogaster


1562 gi21064553Drosophila RE58316p 73 26


melanogaster


1562 AAM41205 Homo SapiensHYSE- Human polypeptide 72 30
SEQ ID


NO 6136.


1563 gi1778844DictyosteliumLimA 71 34


discoideum


1563 gi~20985456~Mus musculussimilar to actin beta 75 36
chain - human


ref~XP-1421


11.1


1563 gi~1778844~gDictyosteliumLimA 71 34


b~AAB4092discoideum


9.1~


1564 gi~9507757~rPlasmid resolvase 507 91
F


etlNP_0614


23.1


1564 gi~148589~gbPlasmid Protein D 507 91
F


~AAA24900.


1~


1564 gi~10955295~Escherichiaresolvase 501 90
coli


retlNP_0526


36.1


1565 gi7649370Arabidopsisguanine nucleotide-exchange-like77 38


thaliana protein


1565 gi1674160Mycoplasma involved in cytadherence,71 35
see:


pneumoniae MPN142


1565 gi~15229258~Arabidopsisguanine nucleotide-exchange77 38
- like


ref~NP_1899thaliana protein


16.1


1566 gi1799600SwissProt similar to 1051 99


Accession


Number P31458


1566 gi13814506Sulfolobus Mandelate racemase /muconate286 35


solfataricuslactonizing enzyme related
protein


(MR/MLE)


1566 gi10640034Thermoplasmastarvation-sensing protein270 35
rspA related


acidophilum protein


1567 gi13359972Escherichiaacridine efflux pump 573 98
coli


0157:H7


1567 gi1773144Escherichiaprobable transmembrane 573 98
coli protein AcrE






1567 gi532311 Escherichia 114 kDa protein 573 98
coli


1569 gi8918871YccA of 96 pct identical to gp:AB021078288 98
plasmid 30


ColIb-P9]


[Plasmid
F


1569 gi~17136976~Drosophila repo-P1; Antibody RK2 71 33


ref~NP_4770melanogaster


26.1)


1569 gi~6502544~gGlomus homeobox protein HB 1 70 31


b~AAF14351intraradices


.1~AF11019


81


1570 gi13363792Escherichiazinc-transporting ATPase410 87
coli


0157:H7


1570 gi466605 EscherichiaNo definition line found410 87
coli


1570 gi12518128Escherichiazinc-transporting ATPase410 87
coli


0157:H7


EDL933


1571 AAU83186 Homo SapiensZYMO Novel secreted protein1006 100


Z887014G7P.


1571 gi7248459Zea mays arabinogalactan protein 85 29


1571 gi3513742Arabidopsiscontains similarity to 82 35
Zea mays


thaliana embryogenesis transmembrane
protein


(GB:X97570)


1572 gi12597465CaenorhabditisCED-1 72 44


elegans


1572 gi19571666Caenorhabditissimilar to EGF-like domain72 44


elegans


1572 gi4883938Drosophila laminin alphal,2 67 31


melanogaster


1573 ABB12490 Homo sapiensHYSE- Human bone marrow 106 38
expressed


protein SEQ ID NO: 329.


1574 11478205 Mus musculus PNG protein 75 41


1574 AAM40148 Homo SapiensHYSE- Human polypeptide 69 56
SEQ ID


NO 3293.


1574 AAM79341 Homo SapiensHYSE- Human protein SEQ 69 35
ID NO


2987.


1576 gi~20882651~Mus musculusATPase, class 2, member 234 91
b


ref~XP_1233


03.1


1576 gi~7656918~rMus musculusATPase, class 2, member 234 91
b; ATPase


ef]NP_0566 9B, class II; ATPase
9B, p type


20.1 ~


1577 g118143418Alteromonaschitinase A 77 39
Sp.


O-7


1577 g115426105Leishmania probable surface antigen75 24
protein


major


1578 119702241Homo Sapiensrabconnectin 439 93


1578 g17452946Homo SapiensX-like 1 protein 132 41


1578 g11279384Drosophila X 109 29


melanogaster


1580 AAE20337 Homo SapiensHUMA- Human B7-H11 protein122 23


mature extracellular
domain.


1580 AAE20336 Homo SapiensHUMA- Human B7-H11 protein122 23


extracellular domain.






1580 gi2062702 Homo sapiens butyrophilin 122 23


1581 AAE18640 Homo SapiensINCY- Human G-protein 70 35
coupled


receptor (GCREC-1).


1581 118369751 Oryza sativa ethylene responsive protein 70 50


1581 g115217292Oryza sativa]Putative AP2 domain containing70 50


[Oryza sativaprotein


(japonica


cultivar-group)


1583 g16468047 Homo sapiens Kruppel-like factor 85 73


1583 g15916096 Homo sapiens Kruppel-like factor LKLF 85 73


1583 g14583418 Homo sapiens Kruppel-like zinc finger 85 73
transcription


factor


1585 g12570021 Homo sapiens paired box containing 77 37
transcription


factor


1585 13115988 Homo SapiensdJ394P2-1.1 (PAX-7) 77 37


1585 12570015 Homo sapiens alternative 77 37


1586 g17861533Rattus retina specific protein 72 43
PAL


norvegicus


1586 g120977028 Xenopus mitotic phosphoprotein 72 34
laevis 39


1586 AAB58458 Homo SapiensROSE/ Lung cancer associated68 39


polypeptide sequence
SEQ ID 796.


1587 g15901864Drosophila BcDNA.LD27873 81 24


melanogaster


1587 g115458514StreptococcusPneumococcal histidine 78 27
triad protein D


pneumoniae precursor
R6


1587 15042400 Homo sapiens NFI-X3=transcription 75 30
factor AA


1592 g14210501 Homo sapiens BC85722_1 253 61


1592 g114794910 Homo sapiens capicua protein 253 61


1592 114794914 Mus musculus capicua protein 253 61


1593 gi~8131854~gTrypanosomaantigen JL8 69 34


b~AAF73108cruzi


.1 CAF
14795


61


1595 g118892729Pyrococcus 3-hydroxyisobutyrate 70 27
dehydrogenase


furiosus
DSM


3638


1595 gi~20847046~Mus musculussimilar to Transcription70 28
factor BTF3


ref~XP_1366 (RNA polymerase B transcription


21.1 factor 3)


1595 gi~18977088~Pyrococcus 3-hydroxyisobutyrate 70 27
dehydrogenase


ref~NP_5784furiosus
DSM


45.1 3638


1597 AAU83621 Homo SapiensGETH Human PRO protein, 151 42
Seq ID No


60.


1597 AA005826 Homo SapiensHYSE- Human polypeptide 146 83
SEQ ID


NO 19718.


1597 AAM41346 Homo SapiensHYSE- Human polypeptide 102 46
SEQ ID


NO 6277.


1598 AAM79503 Homo SapiensHYSE- Human protein SEQ 80 35
ID NO


3149.


1598 AAM78519 Homo SapiensHYSE- Human protein SEQ 80 35
ID NO


1181.


1598 g118676526 Homo sapiens FLJ00160 protein 80 35


1599 g12149640 Arabidopsis Argonaute protein 72 33






thaliana


1599 gi15027491respiratoryglycoprotein 71 32


syncytial
virus


1599 gig 15221177Arabidopsisleaf development protein72 33
Argonaute


reflNP-1752thaliana


74.1


1601 gi17130010Nostoc Sp. WD-40 repeat protein 136 28
PCC


7120


1601 gi1653631Synechocystisbeta transducin-like 131 26
protein


sp. PCC
6803


1601 gi17135261Nostoc Sp. WD-40 repeat protein 115 27
PCC


7120


1602 gi1103853Rattus rHAPl-A 89 33


norvegicus


1602 gi1103851Rattus huntingtin associated 89 33
protein


norvegicus


1602 gi14579673Takifugu pericentriolar material 87 30
1 protein


rubripes


1603 gi537446 ArabidopsisAtHSP101 75 31


thaliana


1603 gi12324908Arabidopsisheat shock protein 101; 75 31
13093-16240


thaliana


1603 gi6715468Arabidopsisheat shock protein 101 75 31


thaliana


1604 12190531 Vibrio cholerae methyl accepting chemotaxis 71 26
protein


1604 g19657614Vibrio choleraehemolysin secretion protein71 26
HyIB


1604 g19655306 Vibrio cholerae heat shock protein E 70 35


1605 g13912936Geobacillusornithine carbamoyltransferase68 31


stearothermophil


us


1606 g18797 Drosophila CYS3HIS finger protein 678 51


melanogaster


1606 g115291975Drosophila LD33756p 617 65


melanogaster


1606 g16967181Homo Sapiensc399E4.1 (similar to 549 75
D.melanogaster


unkempt protein.)


1607 gi~21301783~Anopheles agCP8730 72 35


gb~EAA139gambiae
str.


28.1 PEST


1607 gi~21361276~Homo Sapiensinterferon-stimulated 68 29
transcription


ref~NP_0060 factor 3, gamma (48kD);
interferon-


75.2~ stimulated gene factor
3, gamma


subunit (48 kD)


1609 g12661094Spinacia cold acclimation protein76 32


oleracea


1612 gi~1780975~eHuman gag protein 312 34


mb~CAA714endogenous


18.1 ~ retrovirus
K


1612 gi~5802810~gHomo SapiensGag-Pro-Pol protein 309 34


b~AAD5179


1.1~


1612 gi~887448~eHuman gag 309 34


mb~CAA513endogenous


06.1 ~ retrovirus






1613 AA013889Homo SapiensHYSE- Human polypeptide 73 42
SEQ ID


NO 27781.


1614 111065727 Homo sapiens dJ493F7.1 (similar to 347 100
murine BET3)


1614 g12791806Mus musculusbeta 253 69


1614 113277654 Mus musculus Bet3 homolog (S. cerevisiae) 253 69


1615 g11122901SaccharomycesMSP8 77 20


cerevisiae


1615 g1825546SaccharomycesCatBp 77 20


cerevisiae


1615 g117978563 Xenopus laevis Sp1-like zinc-finger 75 40
protein XSPR-1


1616 AAY02536Homo SapiensICOS- Human ICAM-6 protein458 98


sequence.


1616 g112248907 Homo sapiens TCAM-1 458 98


1616 g14579740 Rattus testicular cell adhesion 366 76
molecule 1


norvegicus (TCAM1)


1617 AAM67067Homo SapiensMOLE- Human bone marrow 271 64


expressed probe encoded
protein SEQ


ID NO: 27373.


1617 AAM54664Homo SapiensMOLE- Human brain expressed271 64
single


exon probe encoded protein
SEQ ID


NO: 26769.


1617 AAM56747Homo SapiensMOLE- Human brain expressed229 69
single


exon probe encoded protein
SEQ ID


NO: 28852.


1618 g15802814 Homo sapiens Gag-Pro-Pol-Env protein 532 52


1618 g11780973 Human pol protein 531 52


endogenous


retrovirus
K


1618 15802821 Homo sapiens Gag-Pro-Pol protein 531 52


1619 g12769587 Mus musculus STOP protein 662 86


1619 g11370291 Rattus STOP protein 662 92


norvegicus


1619 g13287265 Rattus E-STOP protein 662 92


norvegicus


1620 AAM65980Homo sapiensMOLE- Human bone marrow 266 100


expressed probe encoded
protein SEQ


ID NO: 26286.


1620 AAM53601Homo SapiensMOLE- Human brain expressed266 100
single


exon probe encoded protein
SEQ ID


NO: 25706.


1620 gi~20270271~Mus musculusRIKEN cDNA 1190017012 198 80


ref~NP_6200


82.1


1621 g111862941Mus musculusDDM36E 74 33


1621 111862939Mus musculusDDM36 74 33


1621 g17650186 Mus musculus neighbor of Punc e11 73 33
protein


1622 g13157464 Thermus sp. integral membrane protein 74 38
A4


1623 gi~59977~emHuman tripartite fusion transcript129 82
PLA2L


b~CAA7866endogenous


2.1 ~ retrovirus


1623 gi~20161147~Oryza sativaVsaA -like protein 88 32


dbj~BAB900(japonica


75.1 cultivar-group)
~


1623 gi~17864474~ Drosophila domino 87 41






ref~NP_5248melanogaster


33.1


1626 AA000498 Homo SapiensHYSE- Human polypeptide99 43
SEQ ID


NO 14390.


1627 g114041733Xenorhabdus XptA2 protein 70 23


nematophila


1627 gi~15641593~Vibrio choleraecatalase 69 23


re~NP_2312


25.1


1628 g119888204MethanopyrusSite-specific DNA methylase80 27


kandleri
AV 19


1628 g16358691Simian Pol protein 78 32


immunodeficienc


y virus


1628 gi~20094956~MethanopyrusSite-specific DNA methylase80 27


ref~NP-6148kandleri
AV19


03.1 ~


1629 AAB07704 Homo Sapiens1NMR Protein encoded 594 67
by the


endogenetic fragment
of HERV-W.


1629 g18272464 Homo sapiens gag 594 67


1629 AAB07703 Homo SapiensINMR Protein encoded 590 66
by the


endogenetic fragment
of HERV-W.


1630 g132498 Homo sapiens precursor (AA -23 to 145 100
476)


1630 1339595 Homo sapiens triglyceride lipase 145 100
precursor


1630 1386859 Homo sapiens hepatic lipase 145 100


1631 g18777465Rattus cytoplasmic dynein heavy703 77
chain


norvegicus


1631 g117019507Tripneustes dynein heavy chain isotype505 53
1B


gratilla


1631 AAB93815 Homo SapiensHELI- Human protein 457 71
sequence SEQ


ID N0:13606.


1632 AAM68837 Homo SapiensMOLE- Human bone marrow122 48


expressed probe encoded
protein SEQ


ID NO: 29143.


1632 AAM56460 Homo SapiensMOLE- Human brain expressed122 48
single


exon probe encoded protein
SEQ ID


NO: 28565.


1632 g117861826Drosophila GM01964p 90 51


melano aster


1633 gi~21300783~Anopheles ebiP1105 77 33


gb~EAA129gambiae str.


28.1 ~ PEST


1633 gi~19880523~Bactrocera vitellogenin 1 precursor68 27


gb~AAM003dorsalis


72.1 ~AF3
68


053 1


1633 gi~21070999~Homo Sapiensstromal interaction 68 39
molecule 2


ref~NP-0659 precursor


11.1


1637 g12323287multiple polyprotein 289 91


sclerosis


associated


retrovirus


1637  gi|21103962|gb|AAM33141.1|  Homo sapiens  enverin-2  261  82


1637  gi|13310191|gb|AAK18189.1|AF331500_1  multiple sclerosis associated retrovirus element  recombinant envelope protein  259  82
1638  AAR58809  Homo sapiens  UYNY Human RPTP-gamma.  86  26
1638  gi292411  Homo sapiens  receptor-type protein tyrosine phosphatase gamma  86  26
1638  gi1263069  Homo sapiens  receptor tyrosine phosphatase gamma  86  26
1639  gi9857054  Leishmania major  possible CG7055 protein  74  27
1639  gi|20853034|ref|XP_125962.1|  Mus musculus  expressed sequence AI447519  73  35
1639  gi|7008003|dbj|BAA90874.1|  Mus musculus  transcription factor MAZR  73  35
1640  AAG03810  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 7891.  220  95
1640  gi186800  Homo sapiens  ribosomal protein L12  220  95
1640  gi57680  Rattus rattus  ribosomal protein L12  220  95
1641  AAB44286  Homo sapiens  GETH Human PRO1072 (UNQ529) protein sequence SEQ ID NO:303.  1709  100
1641  AAY41730  Homo sapiens  GETH Human PRO1072 protein sequence.  1709  100
1641  gi14602625  Homo sapiens  PAN2 protein  1709  100
1642  gi20147241  Arabidopsis thaliana  AT5g09850/MYH9_6  74  32
1642  gi14329782  Homo sapiens  dJ1121G12.3 (Novel gene)  72  28
1642  gi|16648730|gb|AAL25557.1|  Arabidopsis thaliana  AT5g09850/MYH9_6  74  32
1643  gi2952340  Rattus norvegicus  insulin receptor substrate 2  89  31
1643  gi2653351  Bovine herpesvirus type 1.1  product of latency-related gene  83  30
1643  gi4511969  Homo sapiens  insulin receptor substrate-2  82  26
1644  gi9964099  Chlamydia trachomatis  inclusion membrane protein  73  35
1644  gi19171028  Encephalitozoon cuniculi  ATP DEPENDENT DNA BINDING HELICASE (RAD3/XPD SUBFAMILY OF HELICASES)  67  29
1644  gi|9964095|gb|AAG09821.1|AF279362_1  Chlamydia trachomatis  inclusion membrane protein  73  35
1646  gi|10863995|ref|NP_067011.1|  Homo sapiens  clones 23667 and 23775 zinc finger protein  67  42
1647  gi1196425  Homo sapiens  envelope protein  93  39
1647  gi200296  Mus musculus  perlecan  85  26






1647  gi8131894  Homo sapiens  mitofilin  84  27
1648  gi1573040  Haemophilus influenzae Rd  aspartokinase I / homoserine dehydrogenase I (thrA)  73  36
1648  gi8778726  Arabidopsis thaliana  T25N20.14  73  31
1648  gi|16272063|ref|NP_438262.1|  Haemophilus influenzae Rd  aspartokinase I / homoserine dehydrogenase I (thrA)  73  36
1649  gi295642  Saccharomyces cerevisiae  phospholipase C  79  36
1649  gi7548846  Saccharomyces cerevisiae  delta class phosphoinositide-specific phospholipase C homolog  77  36
1649  gi161104  Schistosoma mansoni  engrailed-like homeodomain protein  74  35
1651  gi|13129464|gb|AAK13122.1|AC080019_14  Oryza sativa (japonica cultivar-group)  Polyprotein  66  40
1652  AAG81446  Homo sapiens  ZYMO Human AFP protein sequence SEQ ID NO:410.  249  100
1652  gi18032212  Homo sapiens  histone acetyltransferase MOZ2  89  34
1652  AAR34936  Homo sapiens  UYJO CENP-B.  77  35
1653  gi20145484  Bos taurus  SCO-spondin  71  29
1655  AAM86382  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:13975.  129  55
1655  ABB03887  Homo sapiens  HUMA- Human musculoskeletal system related polypeptide SEQ ID NO 1834.  118  62
1655  AAM75964  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36270.  85  56
1659  gi38035  Homo sapiens  p25 protein  110  45
1659  gi330915  Equine herpesvirus 1  IR4 protein  99  28
1659  gi156606  Chironomus tentans  SpId  84  30
1660  gi9654641  Vibrio cholerae  3-deoxy-D-manno-octulosonic-acid transferase  84  23
1660  gi|20835446|ref|XP_144409.1|  Mus musculus  similar to STARP antigen  73  25
1660  gi|15596880|ref|NP_250374.1|  Pseudomonas aeruginosa  probable sugar aldolase  72  26
1661  gi4062318  Escherichia coli  Heat-responsive regulatory protein  79  36
1661  gi976025  Escherichia coli  HrsA  79  36
1661  gi1786951  Escherichia coli K12  protein modification enzyme, induction of ompC  79  36
1662  AAM68588  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 28894.  155  100


1662  AAM56212  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28317.  155  100


1662  gi3845169  Plasmodium falciparum 3D7  phosphatase (acid phosphatase family)  66  52
1663  AAG89215  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 335.  218  100
1663  gi20070921  Mus musculus  RIKEN cDNA 2410008M22 gene  130  55
1663  AAR77602  Homo sapiens  FORSI Human circulating cytokine CC-1 C-terminal fragment.  92  44
1664  AAE18212  Homo sapiens  CURA- Human MOL4 protein.  75  47
1664  AAM00966  Homo sapiens  HYSE- Human bone marrow protein, SEQ ID NO: 442.  72  35
1665  AAB92828  Homo sapiens  HELI- Human protein sequence SEQ ID NO:11365.  74  93
1665  AAG63852  Homo sapiens  INCY- Amino acid sequence of human GTPase activating protein GTPAP2.  74  93
1665  AAG63851  Homo sapiens  INCY- Amino acid sequence of human GTPase activating protein GTPAP1.  74  93
1666  AAM72897  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33203.  135  65
1666  AAM60268  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32373.  135  65
1666  gi4007097  Homo sapiens  dJ1118D24.2 (60S Ribosomal Protein L10 LIKE)  135  65
1667  gi212267  Gallus gallus  cartilage link protein  917  49
1667  gi2010  Sus scrofa  link protein precursor (AA -15 to 339)  913  51
1667  gi459439  Equus caballus  link protein  910  51
1668  gi10443237  Mus musculus  splicing factor 3a, subunit 2  276  36
1668  gi396743  Podocoryne carnea  Pod-EPPT  276  30
1668  gi294131  Plasmodium falciparum  circumsporozoite protein  266  22
1669  AAM49641  Homo sapiens  BOEH Human tumour-associated antigen B345 protein SEQ ID NO 4.  132  65
1669  AAU12252  Homo sapiens  GETH Human PRO5773 polypeptide sequence.  132  65
1669  AAY91592  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 6 SEQ ID NO:265.  132  65
1670  gi4835383  Homo sapiens  alias DLC1  226  47
1670  gi4704343  Homo sapiens  alias DLC1; candidate tumor suppressor gene  226  47
1670  gi155627  Acanthamoeba castellanii  myosin I heavy chain  118  42
1671  ABB12490  Homo sapiens  HYSE- Human bone marrow expressed protein SEQ ID NO: 329.  237  88
1671  gi6002932  Streptomyces fradiae  glycosyltransferase  67  35
1671  gi|9634613|ref|NP_038150.1|  Human papillomavirus type 69  L1  65  39
1672  gi13938013  Homo sapiens  Similar to RIKEN cDNA 2610509612 gene  333  66






1672  gi2388970  Schizosaccharomyces pombe  tat-binding homolog 7, AAA ATPase family protein  235  41
1672  gi6850321  Arabidopsis thaliana  Contains similarity to YTA7 ATPase gene from Saccharomyces cerevisiae gb|X81072, and contains Bromodomain PF|00439, AAA PF|00004, and Sigma-54 PF|00158 transcription factor domains.  214  40
1673  gi11066113  Drosophila melanogaster  Misexpression suppressor of ras 4  71  29
1673  gi|20829387|ref|XP_129540.1|  Mus musculus  RIKEN cDNA 4930455F23  77  27
1673  gi|17647635|ref|NP_523775.1|  Drosophila melanogaster  Misexpression suppressor of ras 4  71  29
1674  gi|20535935|ref|XP_115787.1|  Homo sapiens  similar to splicing coactivator subunit SRm300; RNA binding protein; AT-rich element binding factor  75  37
1674  gi|17544226|ref|NP_500151.1|  Caenorhabditis elegans  Y76B12C.4.p  72  34
1674  gi|17559826|ref|NP_505799.1|  Caenorhabditis elegans  sepB domain  70  26
1675  gi5708067  Oryctolagus cuniculus  hyperpolarization activated cation channel  99  27
1675  gi402558  Canis familiaris  mucin  98  27
1675  gi10636484  Homo sapiens  polyglutamine-containing protein  96  26
1676  AAM95365  Homo sapiens  HUMA- Human reproductive system related antigen SEQ ID NO: 4023.  73  26
1676  AAB56709  Homo sapiens  ROSEI Human prostate cancer antigen protein sequence SEQ ID NO:1287.  72  34
1676  gi1881288  Bacillus subtilis  FUNCTION UNKNOWN, SIMILAR PRODUCT IN E.COLI, H. INFLUENZAE AND NEISSERIA MENINGITIDIS.  71  30
1677  gi|15892512|ref|NP_360226.1|  Rickettsia conorii  phosphatidate cytidylyltransferase [EC:2.7.7.41]  65  34
1679  gi14231  Saccharomyces cerevisiae  NADH dehydrogenase (ubiquinone)  75  31
1679  gi805022  Saccharomyces cerevisiae  Ndi1p  73  31
1679  gi1353352  Chlamydomonas reinhardtii  alanine aminotransferase  70  27
1680  gi1805421  Bacillus subtilis  surfactin production  77  36
1680  gi396482  Bacillus subtilis  srfA2  77  36
1680  gi516360  Bacillus subtilis  surfactin synthetase  77  36
1681  AAG64494  Homo sapiens  SHAN- Human natriuretic peptide receptor 18.  156  80
1681  AAE16275  Homo sapiens  INCY- Human kinase PKIN-21 protein.  154  73


1681  AAM40599  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 5530.  154  73


1682  gi2323287  multiple sclerosis associated retrovirus  polyprotein  1646  75
1682  gi|2351212|dbj|BAA22064.1|  Friend murine leukemia virus  gag-pol polyprotein (precursor protein)  807  40
1682  gi|9626961|ref|NP_057933.1|  Murine leukemia virus  Pr180  802  40
1683  AAM39205  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 2350.  457  53
1683  gi3033415  Gibbon ape leukemia virus  gag polyprotein  353  38
1683  gi|6524623|gb|AAF15097.1|  Phascolarctos cinereus  gag protein  343  38
1684  gi19110438  Homo sapiens  polycystin-1L1  712  98
1684  gi6361629  Periplaneta americana  vitellogenin  81  25
1684  gi3115393  Rana pipiens  guanylate cyclase inhibitory protein  80  35
1686  AAY91542  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 92 SEQ ID NO:215.  212  84
1686  gi1279841  Bos taurus  glycine transporter  72  36
1686  gi19879917  Oryza sativa  acid phosphatase  70  35
1687  gi12056568  Homo sapiens  MSTP063  212  88
1687  gi13539684  Homo sapiens  zinc finger protein 291  212  88
1687  gi|12056568|gb|AAG47945.1|AF119814_1  Homo sapiens  MSTP063  212  88
1689  gi5689766  Homo sapiens  zinc finger 2.2  222  91
1689  AAU16267  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 1220.  178  58
1689  AAB99950  Homo sapiens  SHAN- Human alkylated-DNA-protein cysteine methyltransferase 14.  177  60
1690  gi3328880  Chlamydia trachomatis  Protein Export  73  29
1690  gi2832232  Brucella melitensis biovar Abortus  flagellin; FliC  67  29
1690  gi17984285  Brucella melitensis  FLAGELLIN  67  29
1692  gi4927443  Haemophilus influenzae  hemoglobin/hemoglobin-haptoglobin binding protein  93  80
1692  gi4204775  Haemophilus influenzae  hemoglobin and hemoglobin-haptoglobin binding protein  93  80
1692  gi3647226  Haemophilus influenzae  hemoglobin binding protein  93  80
1694  AAW95631  Homo sapiens  GEMY Homo sapiens secreted protein gene clone hj968_2.  102  100
1694  gi13162186  Homo sapiens  calsyntenin-3 protein  102  100






1695  AAO04205  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 18097.  81  37
1695  gi160180  Plasmodium cynomolgi  circumsporozoite antigen  81  29
1695  gi495522  Plasmodium simiovale  circumsporozoite protein  80  30
1696  AAM80223  Homo sapiens  HYSE- Human protein SEQ ID NO 3869.  252  66
1696  AAM79239  Homo sapiens  HYSE- Human protein SEQ ID NO 1901.  252  66
1696  gi3688394  Homo sapiens  triple LIM domain protein  252  66
1697  gi19887715  Methanopyrus kandleri AV19  Predicted membrane protein  74  28
1698  AAM93184  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 2552.  269  87
1698  gi18044066  Mus musculus  RIKEN cDNA 5033406L14 gene  226  76
1698  AAB95302  Homo sapiens  HELI- Human protein sequence SEQ ID NO:17538.  194  78
1699  ABB17279  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 5936.  110  56
1699  AAO13013  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 26905.  101  71
1699  gi|7650258|gb|AAF65960.1|AF207770_1  Hepatitis C virus  polyprotein  74  28
1700  gi12697585  Arabidopsis thaliana  4-(cytidine 5'-phospho)-2-C-methyl-D-erithritol kinase  69  40
1701  gi16740569  Homo sapiens  Similar to thymus expressed gene 3  84  27
1701  gi17940760  Mus musculus  cask-interacting protein 2  79  26
1701  gi17940758  Homo sapiens  cask-interacting protein 1  77  26
1702  gi17385401  Homo sapiens  TPIP alpha lipid phosphatase  234  62
1702  AAU75783  Homo sapiens  INCY- Human protein phosphatase 1 (PP1) protein sequence.  208  57
1702  AAG67638  Homo sapiens  HELI- Amino acid sequence of a human protein.  202  56
1703  AAO07887  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 21779.  246  85
1703  AAO08651  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 22543.  239  83
1703  AAO08732  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 22624.  221  80
1704  AAB94588  Homo sapiens  HELI- Human protein sequence SEQ ID NO:15392.  82  52
1704  gi3288914  Mus musculus  aortic carboxypeptidase-like protein ACLP  82  24
1704  AAM93437  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 3074.  81  32
1706  AAM86104  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:13697.  179  100
1706  gi10039425  Equus caballus  ALR protein  120  40
1706  gi20502826  Eimeria maxima  cGMP-dependent protein kinase  115  35


1707  AAM70251  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30557.  115  78


1707  AAM57834  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 29939.  115  78
1707  gi15450860  Arabidopsis thaliana  serine/threonine-protein kinase Mak (male germ cell-associated kinase)-like protein  71  56
1708  gi1620403  Homo sapiens  SF1-Bo isoform  82  41
1708  gi19072991  Hypocrea virens  class III chitinase precursor  82  40
1708  gi18765873  Hypocrea virens  class III chitinase  82  40
1709  AAM52240  Homo sapiens  INCY- Human MFAP4 SEQ ID NO 3.  1384  100
1709  gi790817  Homo sapiens  microfibril-associated glycoprotein 4  1384  100
1709  AAM52239  Homo sapiens  INCY- Human MAG4V SEQ ID NO 1.  1374  100
1710  gi16769882  Drosophila melanogaster  SD07884p  67  27
1710  gi|17545505|ref|NP_518907.1|  Ralstonia solanacearum  CONSERVED HYPOTHETICAL PROTEIN  66  41
1711  AAU82954  Homo sapiens  ANAD- Human homologue of MPT1 protein target for antifungal compound.  111  27
1711  gi2058326  Homo sapiens  subunit of RNA polymerase II transcription factor TFIID  111  27
1711  gi13559031  Homo sapiens  bA11M20.1 (TATA box binding protein (TBP)-associated factor, RNA polymerase II, C1, 130kD)  108  26
1712  AAB65626  Homo sapiens  SUGE- Novel protein kinase, SEQ ID NO: 152.  209  82
1712  AAM25283  Homo sapiens  HYSE- Human protein sequence SEQ ID NO:798.  209  82
1712  AAU17269  Homo sapiens  HUMA- Novel signal transduction pathway protein, Seq ID 834.  176  67
1713  gi18256065  Mus musculus  Similar to ATPase, class II, type 9A  127  67
1713  AAM76495  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36801.  123  70
1713  AAM63681  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35786.  123  70
1714  gi8096269  Nicotiana tabacum  KED  149  28
1714  gi1752736  Saccharomyces cerevisiae  gene required for phosphoylation of oligosaccharides/ has high homology with YJR061w  148  30
1714  gi2292986  Rattus norvegicus  cyclic nucleotide-gated channel beta subunit  141  28
1715  AAM72995  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 33301.  158  47
1715  AAM60359  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 32464.  158  47


1715  gi|13539605|emb|CAC35733.1|  Paramecium tetraurelia  cyclophilin-RNA interacting protein  144  45


1716  AAM71015  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31321.  251  64
1716  AAM58517  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30622.  251  64
1716  AAU19766  Homo sapiens  HUMA- Human novel extracellular matrix protein, Seq ID No 416.  161  44
1718  gi1420924  Zea mays  IN1  75  27
1718  gi|14521970|ref|NP_127447.1|  Pyrococcus abyssi  O-sialoglycoprotein endopeptidase  73  35
1719  gi20513851  Hordeum vulgare  BPM  74  35
1719  gi21039126  Cryptosporidium parvum  60 kDa glycoprotein  74  26
1719  gi207158  Rattus norvegicus  big tau  73  36
1720  gi18181943  Caenorhabditis elegans  heparan sulfate GlcNAc transferase-I/II  67  34
1720  gi2058699  Caenorhabditis elegans  multiple exostoses homolog 2  67  34
1720  gi|17554740|ref|NP_499368.1|  Caenorhabditis elegans  MULTIPLE EXOSTOSES HOMOLOG 2  67  34
1721  AAM69150  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29456.  200  38
1721  AAM56769  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28874.  200  38
1721  gi4185947  Human endogenous retrovirus K  pol protein  196  38
1722  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  615  60
1722  gi18676710  Homo sapiens  FLJ00254 protein  592  60
1722  gi|20469453|ref|XP_114040.1|  Homo sapiens  similar to FLJ00254 protein  283  50
1723  gi13881755  Mycobacterium tuberculosis CDC1551  cation efflux system protein  74  30
1724  AAG78866  Homo sapiens  SHAN- Human zinc finger protein 15.  141  68
1724  ABB17928  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 6585.  99  53
1724  gi|21295712|gb|EAA07857.1|  Anopheles gambiae str. PEST  agCP1631  75  26
1725  gi21104340  Homo sapiens  obscurin  1586  83
1725  gi7024535  Gallus gallus  structural muscle protein titin  207  24
1725  gi1513030  Gallus gallus  connectin/titin  207  24
1727  AAE19162  Homo sapiens  THOR/ Human kinase polypeptide (PKIN-20).  1096  99






1727  gi2736151  Rattus norvegicus  mytonic dystrophy kinase-related Cdc42-binding kinase  902  78
1727  gi1695873  Homo sapiens  ser-thr protein kinase PK428  896  77
1728  AAY99411  Homo sapiens  GETH Human PRO1487 (UNQ756) amino acid sequence SEQ ID NO:260.  862  67
1728  gi15617453  Homo sapiens  chondroitin synthase  862  67
1728  AAE15959  Homo sapiens  EUMO- Human 4589624/92-303 protein, member of Fringe and Brainiac family.  761  79
1729  gi|15804980|ref|NP_290960.1|  Escherichia coli O157:H7 EDL933  Uncharacterized conserved protein  71  33
1731  gi14268490  Musca domestica  hunchback  82  33
1731  AAM93401  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 3002.  76  27
1731  gi2076606  Musca domestica  hunchback zinc finger protein  73  30
1732  AAY91949  Homo sapiens  INCY- Human cytoskeleton associated protein 4 (CYSKP-4).  1047  57
1732  ABB90754  Homo sapiens  UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240.  1043  57
1732  gi619577  Gallus gallus  cardiac muscle tensin  1043  56
1733  gi3090889  Homo sapiens  synapsin IIIa  70  38
1733  gi6572355  Homo sapiens  cE86D10.1 (synapsin III)  70  38
1733  gi|19924105|ref|NP_003481.2|  Homo sapiens  synapsin III, isoform IIIa  70  38
1734  AAB85144  Homo sapiens  HUMA- Human NKCR polypeptide (clone ID HMSOM53).  1506  93
1734  gi4973126  Mus musculus castaneus  high affinity immunoglobulin gamma Fc receptor I  490  39
1734  gi4973124  Mus musculus  high affinity immunoglobulin gamma Fc receptor I  489  39
1735  gi|15597595|ref|NP_251089.1|  Pseudomonas aeruginosa  pyoverdine synthetase D  69  30
1736  gi14488302  Oryza sativa  Putative transposon protein  81  24
1736  gi3851516  Phytophthora infestans  cyst germination specific acidic repeat protein precursor  72  33
1736  gi|14488302|gb|AAK63883.1|AC074105_12  Oryza sativa  Putative transposon protein  81  24
1737  AAB85357  Homo sapiens  INCY- Human phosphatase (PP) (clone ID 3402521CD1).  1591  100
1737  gi21205864  Homo sapiens  T-cell activation protein phosphatase 2C; TA-PP2C  1591  100
1737  gi21464366  Drosophila melanogaster  RE06653p  758  52
1738  gi7271811  Drosophila melanogaster  GTPase activating protein  292  38
1738  AAM76430  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36736.  246  100


1738  AAM63615  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 35720.  246  100


1739  ABB50365  Homo sapiens  HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:313.  272  87
1739  AAW88598  Homo sapiens  HUMA- Secreted protein encoded by gene 65 clone HFVHY45.  272  87
1739  ABB50764  Homo sapiens  HUMA- Human secreted protein encoded by gene 65 SEQ ID NO:716.  143  92
1740  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  1210  58
1740  gi|10834720|gb|AAG23790.1|AF258587_1  Homo sapiens  PP565  274  80
1740  gi|385615|gb|AAB26708.1|  Mus sp.  fibulin gene homolog  248  75
1741  ABB90748  Homo sapiens  UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 228.  2116  97
1741  gi15987493  Homo sapiens  tumor endothelial marker 6  2116  97
1741  ABB90754  Homo sapiens  UYJO Human Tumour Endothelial Marker polypeptide SEQ ID NO 240.  530  37
1742  ABB11753  Homo sapiens  HYSE- Human NOV/plexin-A1 homologue, SEQ ID NO:2123.  291  90
1742  gi1665757  Mus musculus  plexin 1  291  90
1742  gi6010217  Homo sapiens  NOV/plexin-A1 protein  291  90
1743  AAM79514  Homo sapiens  HYSE- Human protein SEQ ID NO 3160.  149  90
1743  AAM78530  Homo sapiens  HYSE- Human protein SEQ ID NO 1192.  149  90
1743  gi1244510  Homo sapiens  p311 protein  149  90
1744  AAG93324  Homo sapiens  NISC- Human protein HP10370.  83  41
1744  gi21064771  Drosophila melanogaster  RH61467p  83  46
1744  gi18676554  Homo sapiens  FLJ00174 protein  77  41
1745  gi4128039  Homo sapiens  TL132 protein  81  29
1745  gi17983118  Brucella melitensis  METAL DEPENDENT HYDROLASE  74  23
1745  AAU75578  Homo sapiens  UYNA- Human ubiquitin specific protease 10 (USP10).  71  31
1746  gi15074154  Sinorhizobium meliloti  PUTATIVE FATTY ACID/PHOSPHOLIPID SYNTHESIS PROTEIN  76  25
1746  gi1869833  human herpesvirus 2  myristylated tegument protein  75  27
1746  gi20516045  Thermoanaerobacter tengcongensis  Chemotaxis response regulator CheB, consists of CheY-like receiver domain and a methylesterase (demethylase) domain  69  20
1747  gi18025496  cercopithecine herpesvirus 15  EBNA-1  124  37
1747  gi5821153  Homo sapiens  RNA binding protein  123  29
1747  gi6649242  Homo sapiens  splicing coactivator subunit SRm300  123  29


1748  gi|4321764|gb|AAD15819.1|  Mus musculus  MAP kinase kinase 7 alpha 2  65  30


1748  gi|20859704|ref|XP_133986.1|  Mus musculus  mitogen activated protein kinase kinase 7  65  30
1748  gi|4321768|gb|AAD15821.1|  Mus musculus  MAP kinase kinase 7 beta 2  65  30
1749  AAB50964  Homo sapiens  GETH Human PRO1313 protein.  439  89
1749  AAB47290  Homo sapiens  GETH PRO1313 polypeptide.  439  89
1749  AAB24431  Homo sapiens  GETH Human PRO1313 protein sequence SEQ ID NO:216.  439  89
1750  AAU00502  Homo sapiens  MILL- Human TANGO 437 protein.  115  91
1750  gi20384654  Homo sapiens  two-pore calcium channel protein 2  115  91
1750  AAM91059  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:18652.  93  64
1751  gi10440494  Homo sapiens  FLJ00092 protein  252  97
1751  AAM40956  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 5887.  80  30
1751  gi|10440494|dbj|BAB15780.1|  Homo sapiens  FLJ00092 protein  252  97
1752  gi15980036  Yersinia pestis  2-dehydro-3-deoxyphosphooctonate aldolase  77  46
1752  gi11322261  Diceros bicornis  alpha adrenergic receptor 2B  74  26
1752  gi20516240  Thermoanaerobacter tengcongensis  methylaspartate mutase  73  25
1753  gi19684014  Homo sapiens  similar to brain-specific angiogenesis inhibitor 3 (H. sapiens)  1387  99
1753  AAB88367  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0101.  1380  99
1753  gi1469936  Mus musculus  FGF-binding protein  158  29
1754  AAB01397  Homo sapiens  INCY- Neuron-associated protein.  435  92
1754  gi21218140  Homo sapiens  rab effector MYRIP  435  92
1754  gi21320161  Mus musculus  exophilin 8  378  77
1755  AAM74815  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 35121.  253  75
1755  AAM62013  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 34118.  253  75
1755  AAM70390  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 30696.  228  62
1756  gi6460201  Deinococcus radiodurans  phenylacetic acid degradation protein PaaA  85  27
1756  gi3309543  Takifugu rubripes  MLL  79  34
1756  AAT10059_aa1  Homo sapiens  USSH erbB-3 cDNA clone E3-16.  74  31
1757  gi18676406  Homo sapiens  FLJ00021 protein  70  36
1758  gi13423395  Caulobacter crescentus CB15  NADH dehydrogenase I, M subunit  78  37






1758  gi|17506337|ref|NP_491390.1|  Caenorhabditis elegans  D1007.15.p  82  24
1758  gi|16126181|ref|NP_420745.1|  Caulobacter crescentus CB15  NADH dehydrogenase I, M subunit  78  37
1759  gi19881193  chimpanzee cytomegalovirus  transcriptional transactivator TRS1  83  29
1759  gi19881161  chimpanzee cytomegalovirus  transcriptional transactivator IRS1  83  29
1759  gi556297  Mus musculus  alpha-1 type IV collagen  81  33
1760  gi18033185  Danio rerio  UNC45-related protein  702  79
1760  AAG77802  Homo sapiens  HUMA- Human HOGEN50 serine/threonine phosphatase protein sequence.  603  65
1760  AAM40290  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3435.  603  65
1761  gi6634123  Drosophila melanogaster  SoxNeuro  70  24
1762  gi|14245700|dbj|BAB56142.1|  Giardia intestinalis  kinesin-like protein 4  69  26
1762  gi|165011|gb|AAA31246.1|  Oryctolagus cuniculus  eucaryotic release factor (eRF)  69  24
1762  gi|15559188|emb|CAC03424.2|  Homo sapiens  dJ45P21.3 (butyrophilin, subfamily 3, member A1)  69  26
1763  AAM93661  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 3536.  186  80
1763  AAM64398  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36503.  154  76
1763  gi|20556958|ref|XP_061562.5|  Homo sapiens  similar to PAM COOH-terminal interactor protein 1  73  43
1764  AAU17223  Homo sapiens  HUMA- Novel signal transduction pathway protein, Seq ID 788.  211  87
1765  gi1334546  Podospora anserina  Dod COI 113 grp IB protein  71  37
1765  gi5679307  Mus musculus  ROR gamma t  70  27
1765  gi4186077  Mus musculus  ROR gamma T protein  70  27
1766  gi17864081  Mus musculus  PPAR gamma coactivator-1beta protein  74  26
1766  gi44795  Methanococcus voltae  polyferredoxin  71  28
1766  gi14279670  Lycopersicon esculentum  verticillium wilt disease resistance protein  71  31
1768  AAE06588  Homo sapiens  SAGA Human protein having hydrophobic domain, HP10778.  165  100
1768  AAM40979  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 5910.  165  100
1768  AAB24542  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 27 SEQ ID NO:168.  73  30






1769  gi6174840  Achromobacter xylosoxidans subsp. xylosoxidans  low-specificity D-threonine aldolase  78  33
1769  gi16769806  Drosophila melanogaster  SD02660p  75  23
1769  gi1098473  Rattus norvegicus  insulin-like growth factor binding protein  73  31
1770  AAP94684  Homo sapiens  CHIL Amino acid sequence encoded by part of human mannose binding protein (hMBP) genomic DNA.  79  56
1770  gi|15790548|ref|NP_280372.1|  Halobacterium sp. NRC-1  cobyric acid synthase; CbiP  69  36
1770  gi|11467609|ref|NP_050661.1|  Guillardia theta  Clp protease ATP binding subunit  69  27
1772  gi5532460  Shigella flexneri  ShiF  66  32
1773  gi11544663  Arabidopsis thaliana  PTPKIS1  75  42
1773  gi11595504  Arabidopsis thaliana  PTPKIS1 protein  75  42
1773  gi18389331  Mus musculus  2',5'-oligoadenylate synthetase-like 10  73  42
1774  AAM06519  Homo sapiens  HYSE- Human foetal protein, SEQ ID NO: 250.  414  90
1774  gi|18552248|ref|XP_092510.1|  Homo sapiens  similar to latent transforming growth factor beta binding protein 1; latent TGF beta binding protein  69  37
1775  gi4884924  Rangiferine herpesvirus 1  glycoprotein C  67  60
1775  AAB94152  Homo sapiens  HELI- Human protein sequence SEQ ID NO:14435.  65  34
1775  AAB93253  Homo sapiens  HELI- Human protein sequence SEQ ID NO:12271.  65  34
1776  gi13424176  Caulobacter crescentus CB15  N-carbamyl-L-amino acid amidohydrolase  89  24
1776  gi514267  Homo sapiens  proto-oncogene tyrosine-protein kinase  86  29
1776  gi28237  Homo sapiens  p150 protein (AA 1-1130)  84  28
1777  gi63370  Gallus gallus  dystrophin (AA 1 - 3660)  68  31
1777  gi|3046783|emb|CAA68033.1|  Scyliorhinus canicula  dystrophin  67  29
1777  gi|2342682|gb|AAB70406.1|  Arabidopsis thaliana  Contains similarity to Rattus AMP-activated protein kinase (gb|X95577).  67  31
1778  AAE16176  Homo sapiens  INCY- Human G-protein coupled receptor 7 (GCREC-7) protein.  1419  100
1778  AAE18021  Homo sapiens  CURA- Human G-protein coupled receptor-8a (GPCR-8a) protein.  1419  100
1778  AAG72411  Homo sapiens  VEDA Human OR-like polypeptide query sequence, SEQ ID NO: 2092.  1419  100
1779  AAM76040  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 36346.  93  48






1779 AAM63227 Homo SapiensMOLE- Human brain expressed93 48
single


exon probe encoded protein
SEQ ID


NO: 35332.


1779 gi12620576BradyrllizobiumID342 87 24


' a onicum


1780 gi2459833Rattus Maxpl 81 31


norvegicus


1780 AAB65650 Homo SapiensSUGE- Novel protein kinase,- 80 35
SEQ ID


NO: 177.


1780 AAM39805 Homo sapiensHYSE- Human polypeptide 80 36
SEQ ID


NO 2950.


1781 14877963 Mus musculusNF-ka aB inducin kinase 69 39


1781 115077865Mus musculusbullous emphi oid antigen67 35
1-b


1781 g115077863Mus musculusbullous emphi oid anti 67 35
en 1-a


1782 g14138265Nicotiana Avr9 elicitor response 76 27
protein


tabacum


1782 g112725153LactococcusSOS ribosomal protein 75 32
L3


lactis subsp.


lactis


1782 AAB21008 Homo SapiensINCY- Human nucleic acid-binding73 32


protein, NuABP-12.


1783 g13947714Streptococcusinitiation factor IF2 86 20


agalactiae


1783 g19558387Streptococcusinitiation factor 2 86 20


a alactiae


1783 g19558369Streptococcusinitiation Factor 2 86 20


a alactiae


1786 g1435855 Mus s . CREB-binding protein; 75 22
CBP


1786 g12911464Leishmania sodium stibogluconate 75 34
resistance


tarentolae rotein


1786 g119547887Mus musculusCREB-binding rotein 75 22


1787 13747099 Mus musculusC1 -related factor 616 61


1787 114278927Mus musculusgliacolin ' 615 64


1787 g110566471Mus musculusGliacolin 615 64


1788  gi|21291197|gb|EAA03342.1  Anopheles gambiae str. PEST  agCP7579  71  20
1788  gi|20803964|emb|CAD31541.1  Mesorhizobium loti  HYPOTHETICAL PROTEIN  69  43


1789  AAM41125  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 6056.  320  80
1789  AAM39339  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 2484.  320  80
1789  AAM79857  Homo sapiens  HYSE- Human protein SEQ ID NO 3503.  320  80


1790  gi1143585  Paracentrotus lividus  2 alpha fibrillar collagen  69  23
1791  gi9837427  Lytechinus variegatus  embryonic blastocoelar extracellular matrix protein precursor  116  34
1791  gi14089698  Mycoplasma pulmonis  OLIGOPEPTIDE ABC TRANSPORTER PERMEASE PROTEIN  71  23


1791  gi6572111  Bartonella quintana  riboflavin synthase alpha chain  69  29


1792  gi|4506023|ref|NP_002710.1  Homo sapiens  protein phosphatase 2, regulatory subunit B (B56), gamma isoform  68  39


1793  AAM71170  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 31476.  180  82
1793  AAM58664  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 30769.  180  82
1793  AAM65679  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 37784.  168  71


1794  AAG00072  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 4153.  125  80
1794  AAW34618  Homo sapiens  IMUT- Human C3 protein mutant DV-7N.  125  80
1794  AAW34617  Homo sapiens  IMUT- Human C3 protein mutant DV-6.  125  80


1795  AAY05069  Homo sapiens  SMIK Human PIGR-2 protein sequence.  1055  85
1795  gi396170  Homo sapiens  CMRF-35 antigen  406  45
1795  gi18490143  Homo sapiens  CMRF35 leukocyte immunoglobulin-like receptor  406  45


1796  gi|6723273|dbj|BAA89659.1  Baboon endogenous virus strain M7  gag-pol precursor polyprotein  421  41
1796  gi|13940448|gb|AAK50381.1|U43202_2  Murine leukemia virus  pol precursor protein  421  41
1796  gi|331995|gb|AAB03091.1  AKV murine leukemia virus  gag-pol polyprotein (tag amber codon at 2250-2252 inserts Gln in Mo-MuLV)  421  41


1797  gi21411325  Homo sapiens  Similar to LOC205103  260  73
1797  gi|4835878|gb|AAD30280.1|AF134838_1  Homo sapiens  endocytic receptor Endo180  77  31
1797  gi|16076075|emb|CAC94295.1  Leishmania donovani donovani  trypanothione reductase  70  30


1798  gi927721  Saccharomyces cerevisiae  Sip1p: SNF1 protein kinase substrate; YDR422C; CAI: 0.13  72  34
1798  gi172604  Saccharomyces cerevisiae  protein kinase  72  34
1798  gi|6320630|ref|NP_010710.1  Saccharomyces cerevisiae  SNF1 protein kinase substrate; Sip1p  72  34


1799  gi|20839768|ref|XP_130311.1  Mus musculus  similar to GDP-fucose transporter 1  71  29


1801  gi|17461642|ref|XP_066249.1  Homo sapiens  similar to Ig kappa chain  78  23


1801  gi|6325342|ref|NP_015410.1  Saccharomyces cerevisiae  Protein required for cell viability; Ypr085cp  76  22
1801  gi|9635081|ref|NP_057809.1  Gallid herpesvirus 2  UL47  74  26


1802  AAB94148  Homo sapiens  HELI- Human protein sequence SEQ ID NO:14427.  250  56
1802  AAG64564  Homo sapiens  SHAN- Human zinc-finger protein 60.  250  56
1802  AAM79356  Homo sapiens  HYSE- Human protein SEQ ID NO 3002.  250  56


1803  AAW81754  Homo sapiens  BOEF Human Fanconi anaemia-associated gene II protein.  631  85
1803  gi2407911  Homo sapiens  differentially expressed in Fanconi anemia  555  74
1803  gi6013073  Mus musculus  HemT-3 protein  89  24


1805  gi14189735  Homo sapiens  ATP-binding cassette transporter family A member 12  1508  90
1805  gi1943947  Bos taurus  ABC transporter  404  31
1805  AAZ94734_aa1  Homo sapiens  FARB Human ATP binding cassette ABCA1 (ABC1) cDNA.  395  33


1806  AAU12234  Homo sapiens  GETH Human PRO4350 polypeptide sequence.  859  100
1806  AAA96344_aa1  Homo sapiens  GETH cDNA encoding a novel polypeptide designated PRO4357.  498  48
1806  AAU12445  Homo sapiens  GETH Human PRO4357 polypeptide sequence.  498  48


1807  gi190396  Homo sapiens  profilaggrin  76  29
1808  AAB88367  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0101.  74  30
1808  gi19684014  Homo sapiens  similar to brain-specific angiogenesis inhibitor 3 (H. sapiens)  74  30
1808  gi|18576362|ref|XP_084481.1  Homo sapiens  similar to fibroblast growth factor binding protein 1  74  30


1809  gi530876  Chlamydomonas reinhardtii  amino acid feature: Rod protein domain, aa 266 .. 468; amino acid feature: globular protein domain, aa 32 .. 265  126  35
1809  gi6578849  Myxococcus xanthus  FrgA  126  29
1809  gi2429362  Santalum album  proline rich protein  122  27


1810  gi17428288  Ralstonia solanacearum  PROBABLE CATION-TRANSPORTING ATPASE LIPOPROTEIN TRANSMEMBRANE  75  28
1810  gi21483422  Drosophila melanogaster  LD34142p  71  29
1810  ABB90042  Homo sapiens  HUMA- Human polypeptide SEQ ID NO 2418.  70  32


1811  gi|20915248|ref|XP_145160.1  Mus musculus  similar to Collagen alpha 1(VI) chain precursor  148  74


1812  gi2104558  Rattus norvegicus  CCA3  1150  90


1812  AAB64963  Homo sapiens  ROSE/ Human secreted protein sequence encoded by gene 24 SEQ ID NO:141.  172  37
1812  gi12963869  Mus musculus  gene trap ankyrin repeat containing protein  172  37


1813  AAB65201  Homo sapiens  GETH Human PRO1009 (UNQ493) protein sequence SEQ ID NO:194.  208  100
1813  AAY66678  Homo sapiens  GETH Membrane-bound protein PRO1009.  208  100
1813  AAB24068  Homo sapiens  GETH Human PRO1009 protein sequence SEQ ID NO:36.  208  100


1815  AAG89314  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 434.  191  100
1815  gi6460052  Deinococcus radiodurans  dipeptidyl peptidase IV-related protein  66  60


1816  gi1052594  Drosophila melanogaster  trithorax protein trxI  75  26
1816  gi1052593  Drosophila melanogaster  trithorax protein trxII  75  26
1816  gi158818  Drosophila melanogaster  zinc-binding protein  75  26


1817  AAB49765  Homo sapiens  HELI- Human proliferation differentiation factor amino acid sequence.  229  94
1817  AAB88393  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0137.  229  94
1817  gi18446895  Drosophila melanogaster  AT05866p  73  25


1818  gi6573212  Giardia intestinalis  variant-specific surface protein H7-1  73  32
1818  gi159143  Giardia intestinalis  variant-specific surface protein H7  73  32
1818  gi15144254  Micrurus corallinus  neurotoxin homologue 8  72  32


1819  gi161857  Tetrahymena thermophila  surface antigen  69  35
1821  gi913964  Carcinoscorpius rotundicauda  factor C  80  26
1821  gi217397  Tachypleus tridentatus  limulus factor C precursor  80  26
1821  gi18542425  Tachypleus tridentatus  factor C precursor  80  26


1822  gi9309473  Mus musculus  DNMT1 associated protein-1  74  37
1822  gi1666895  Homo sapiens  CHL1 protein  74  23
1822  gi16923930  Mus musculus  MAT1-mediated transcriptional repressor  74  37


1823  gi9058659  Canis familiaris  skeletal muscle chloride channel ClC-1  73  34
1823  gi433182  Drosophila melanogaster  receptor protein tyrosine phosphatase  72  26
1823  gi20429105  Paracoccus zeaxanthinifaciens  decaprenyl diphosphate synthase  72  27


1824  gi13374178  Mus musculus  TAFII140 protein  612  88






1824  gi17861888  Drosophila melanogaster  GM10839p  246  49
1824  gi6634096  Drosophila melanogaster  BIP2 protein  242  48
1825  gi16605480  Homo sapiens  G6b-C protein  1159  100
1825  gi16605484  Homo sapiens  G6b-E protein  1009  90
1825  gi5304877  Homo sapiens  immunoglobulin receptor  1003  83


1826  AAB94636  Homo sapiens  HELI- Human protein sequence SEQ ID NO:15515.  105  37
1826  AAU15903  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 856.  105  37
1826  gi21430928  Drosophila melanogaster  SD27341p  93  39


1827  AAR33270  Homo sapiens  WIST- T cell receptor alpha chain clone alpha1.3.  329  92
1827  gi1806100  Homo sapiens  T cell receptor alpha chain  329  92
1827  gi2358032  Homo sapiens  TCRAV8S3  329  92


1828  gi20513851  Hordeum vulgare  BPM  73  45
1828  AA001897  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 15789.  70  35
1828  AAE16477  Homo sapiens  OSTE- Human collagen alpha1 (II) protein.  69  31


1829  AAG66837  Homo sapiens  SHAN- Human ATP-dependent serine proteinase 31.  356  100
1829  AAG66838  Homo sapiens  SHAN- Human ATP-dependent serine proteinase 31 N-terminal peptide.  89  100
1829  gi5881591  Gallus gallus  homeodomain protein  77  38


1830  AAB94294  Homo sapiens  HELI- Human protein sequence SEQ ID NO:14745.  951  99
1830  gi10504968  Drosophila melanogaster  rho guanine nucleotide exchange factor 4  180  22
1830  gi16197921  Drosophila melanogaster  LD03170p  180  22


1831  ABB12353  Homo sapiens  HYSE- Human bone marrow expressed protein SEQ ID NO: 107.  199  30
1831  gi20452161  Canis familiaris  retinitis pigmentosa GTPase regulator  143  24
1831  gi2062609  Xenopus laevis  middle molecular weight neurofilament protein NF-M(1)  140  24


1832  AAB29778  Homo sapiens  RHOD- Human MSF-derived tribonectin.  148  18
1832  gi142161  Anaplasma marginale  surface antigen Amf105  141  25
1832  gi4808177  Drosophila subobscura  largest subunit of the RNA polymerase II complex  141  20


1833  AAM66321  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26627.  424  51
1833  AAM53933  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26038.  424  51
1833  gi|6723273|dbj|BAA89659.1  Baboon endogenous virus strain M7  gag-pol precursor polyprotein  357  47






1834  AAM88756  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:16349.  208  100
1834  gi20417  Persea americana  cellulase  77  34
1834  gi153337  Streptomyces tenebrarius  kanamycin-apramycin resistance methylase  69  26


1837  AAY02893  Homo sapiens  HUMA- Fragment of human secreted protein encoded by gene 92.  76  41
1837  AAY99429  Homo sapiens  GETH Human PRO1563 (UNQ769) amino acid sequence SEQ ID NO:317.  73  35
1837  gi6634084  Drosophila melanogaster  malate dehydrogenase (NADP-dependent oxaloacetate decarboxylating), malic enzyme  73  39


1838  gi2865602  Saccharopolyspora sp.  SapI M2 methyltransferase  77  37
1838  gi3089358  Rattus norvegicus  MARRLC2A  75  33
1838  gi|2865602|gb|AAC97182.1  Saccharopolyspora sp.  SapI M2 methyltransferase  77  37


1839  AAM69149  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 29455.  154  96
1839  AAM56768  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 28873.  154  96
1839  AAW96209  Homo sapiens  SMIK Amyloid precursor protein (APP) C-terminal fragment.  102  78


1840  gi9946563  Pseudomonas aeruginosa  probable type II secretion system protein  81  36
1840  gi21108565  Xanthomonas axonopodis pv. citri str. 306  pseudouridylate synthase  75  35
1840  ABB04714  Homo sapiens  SHAN- Human PP1744 protein SEQ ID NO:23.  74  31


1841  gi1491949  Molluscum contagiosum virus subtype 1  MC006L  85  30
1841  AAM42085  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 7016.  81  27
1841  AAM40299  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3444.  81  27


1842  gi20381413  Homo sapiens  Similar to LOC160680  216  44
1842  gi13592175  Leishmania major  ppg3  144  24
1842  gi5420387  Leishmania major  proteophosphoglycan  140  23


1843  AAB87181  Homo sapiens  MILL- Human secreted protein MANGO 349 E41D variant, SEQ ID NO:231.  278  42
1843  AAB87128  Homo sapiens  MILL- Human secreted protein MANGO 349, SEQ ID NO:130.  278  42


1843  AAB87179  Homo sapiens  MILL- Human secreted protein MANGO 349 I21K variant, SEQ ID NO:227.  276  41


1844  AAE14341  Homo sapiens  INCY- Human protease PRTS-6 protein.  886  93
1844  gi16768276  Drosophila melanogaster  GH27809p  290  41
1844  gi2655204  Mus musculus  ubiquitin-specific protease  258  35


1846  AAY88300  Homo sapiens  MILL- Human TANGO 187-3 protein.  1334  90
1846  gi13097780  Homo sapiens  Similar to RIKEN cDNA 2810037014 gene  1326  90
1846  AAY88296  Homo sapiens  MILL- Human TANGO 187-2/3 protein.  1312  87


1847  AAG74984  Homo sapiens  HUMA- Human colon cancer antigen protein SEQ ID NO:5748.  75  32
1847  gi17352449  Rattus norvegicus  ErbB3/Her3 precursor  74  38
1847  gi|20860870|ref|XP_125664.1  Mus musculus  similar to H4(D10S170) protein  75  32


1848  gi3123530  Fowlpox virus  I3L, orthologue of vaccinia I3L  75  27
1848  gi5902659  Drosophila melanogaster  ring canal protein  70  27
1848  gi|18110218|ref|NP_476589.2  Drosophila melanogaster  kel-P2  70  27


1849  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  614  78
1849  AAM65715  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26021.  548  73
1849  AAM53338  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25443.  548  73


1850  gi10999071  Lophognathus longirostris  NADH dehydrogenase subunit 2  74  23
1850  gi18537243  Human immunodeficiency virus type 1  envelope glycoprotein  74  29
1850  gi|10999071|gb|AAG00622.2|AF128462_2  Lophognathus longirostris  NADH dehydrogenase subunit 2  74  23


1851  gi|17448210|ref|XP_068503.1  Homo sapiens  similar to 60 kDa heat shock protein, mitochondrial precursor (Hsp60) (60 kDa chaperonin) (CPN60) (Heat shock protein 60) (HSP-60) (Mitochondrial matrix protein P1) (P60 lymphocyte protein) (HuCHA60)  72  28


1852  gi1164937  Saccharomyces cerevisiae  YOR3160w  74  31
1852  gi3176662  Arabidopsis thaliana  Similar to mannosyl-oligosaccharide glucosidase gb|X87237 from Homo sapiens.  73  31
1852  gi13398928  Arabidopsis thaliana  alpha-glucosidase 1  73  31


1853  gi|20889364|ref|XP_138429.1  Mus musculus  similar to hepatitis A virus cellular receptor 1; T cell immunoglobin domain and mucin domain protein 1  76  36


1853  gi|21288202|gb|EAA00523.1  Anopheles gambiae str. PEST  agCP9342  71  32


1854  AAB88481  Homo sapiens  HELI- Human membrane or secretory protein clone PSEC0251.  776  99
1854  AAE03835  Homo sapiens  HUMA- Human gene 18 encoded secreted protein HFKHW50, SEQ ID NO: 81.  776  99
1854  AAE03863  Homo sapiens  HUMA- Human gene 18 encoded secreted protein HFKHW50, SEQ ID NO:109.  716  97


1855  gi1663748  Chlamydomonas reinhardtii  dynein heavy chain 7  82  29
1855  gi1663744  Chlamydomonas reinhardtii  dynein heavy chain 5  80  28
1855  gi1663738  Chlamydomonas reinhardtii  dynein heavy chain 2  80  27


1856  gi18032120  Gallus gallus  shal-like voltage-gated potassium channel  75  23
1856  gi1408569  Haemophilus influenzae  adhesion and penetration protein  71  28
1856  gi|18032120|gb|AAL56633.1|AF075160_1  Gallus gallus  shal-like voltage-gated potassium channel  75  23


1857  AAM67180  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27486.  129  44
1857  AAM54795  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26900.  129  44
1857  gi|21040255|ref|NP_631907.1  Homo sapiens  splicing factor, arginine/serine-rich 12  109  29


1858  gi21392190  Drosophila melanogaster  RE74758p  71  39
1858  gi9954108  Trypanosoma cruzi  RNA binding protein RGGm  68  40
1858  gi20302994  Medicago truncatula  nodule-specific glycine-rich protein 1C  66  32


1859  gi|20536244|ref|XP_060505.4  Homo sapiens  similar to autoantigen La  72  30
1860  gi|17541362|ref|NP_502409.1  Caenorhabditis elegans  K08E7.S.p  103  29


1860  gi|17446900|ref|XP_065833.1  Homo sapiens  similar to DNA-directed RNA polymerase (EC 2.7.7.6) II largest chain - Mastigamoeba invertens (fragment)  100  34


1860  gi|9628166|ref|NP_042752.1  African swine fever virus  CD2 homolog  98  30


1861  AAY70691  Homo sapiens  DAND Human membrane attractin-2.  162  40
1861  AAY70690  Homo sapiens  DAND Human membrane attractin-1.  162  40
1861  gi12275390  Rattus norvegicus  membrane attractin  162  40


1862  gi10039425  Equus caballus  ALR protein  81  28
1862  gi13529521  Mus musculus  Similar to elastin microfibril interface located protein  80  32
1862  AAM40414  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3559.  79  39


1863  gi|16588389|gb|AAL26787.1|AF304442_1  Homo sapiens  B lymphocyte activation-related protein BC-1514  247  52
1863  gi|20479028|ref|XP_113729.1  Homo sapiens  similar to B lymphocyte activation-related protein BC-1514  117  68
1863  gi|21301715|gb|EAA13860.1  Anopheles gambiae str. PEST  agCP8366  85  41


1864  AAU15851  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 804.  1275  78
1864  AAU16312  Homo sapiens  HUMA- Human novel secreted protein, Seq ID 1265.  1123  76
1864  AAG02054  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 6135.  308  91


1865  AAB94953  Homo sapiens  HELI- Human protein sequence SEQ ID NO:16485.  86  29
1865  gi3746787  Homo sapiens  SYT interacting protein SIP  86  29
1865  gi15022507  Homo sapiens  coactivator activator  86  29


1866  gi17133332  Nostoc sp. PCC 7120  preprotein translocase Sect subunit  68  43
1866  gi|13489110|ref|NP_068773.1  Homo sapiens  gap junction protein, alpha 3, 46kD (connexin 46)  66  40


1867  gi706930  Rattus norvegicus  cyclic GMP stimulated phosphodiesterase  191  95
1867  AAV54762_aa1  Homo sapiens  UNIW Human cGS-PDE cDNA DNA sequence.  137  100
1867  AAV36157_aa1  Homo sapiens  UNIW Human cyclic-GMP-nucleotide phosphodiesterase cDNA.  137  100


1868  AAB95695  Homo sapiens  HELI- Human protein sequence SEQ ID NO:18516.  112  27
1868  AAY91447  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:168.  112  27
1868  AAY91393  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:114.  112  27


1870  AAU07886  Homo sapiens  WHED Polypeptide sequence for human hspGlS.  1454  94
1870  gi13603891  Homo sapiens  MOV10-like 1  1454  94
1870  gi13603857  Mus musculus  MOV10-like 1  954  77


1871  AAM96652  Homo sapiens  HUMA- Human reproductive system related antigen SEQ ID NO: 5310.  484  96


1871  gi18676652  Homo sapiens  FLJ00225 protein  433  95
1871  gi21386760  Berneuxia thibetica  maturase R  70  32


1872  AAQ90304_aa1  Homo sapiens  NISR Human thyroid peroxidase gene.  73  29
1872  AAW48781  Homo sapiens  RSRR- Thyroid peroxidase.  73  29
1872  AAR75689  Homo sapiens  NISR Human thyroid peroxidase.  73  29


1873  AAG03774  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 7855.  228  90
1873  gi338288  Homo sapiens  preprosomatostatin I  228  90
1873  gi342299  Macaca fascicularis  preprosomatostatin  228  90


1875  AAR30418  Homo sapiens  DAND Nearly complete p107 protein.  76  30
1875  gi347378  Homo sapiens  p107  76  30
1875  gi157871  Drosophila melanogaster  P glycoprotein  76  24


1876  ABB17955  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 6612.  186  40
1876  AAS17764_aa1  Homo sapiens  GENA- Human Genomic DNA for CRYBB1.  167  39
1876  AA002331  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 16223.  165  42


1877  gi|59977|emb|CAA78662.1  Human endogenous retrovirus  tripartite fusion transcript PLA2L  224  76


1878  ABB84943  Homo sapiens  GETH Human PRO1556 protein sequence SEQ ID NO:254.  1056  93
1878  AAB31670  Homo sapiens  PROT- Amino acid sequence of a human protein having a hydrophobic domain.  1056  93
1878  AAB47295  Homo sapiens  GETH PRO1556 polypeptide.  1056  93


1879  ABB15861  Homo sapiens  HUMA- Human nervous system related polypeptide SEQ ID NO 4518.  73  36
1880  AAU83117  Homo sapiens  ZYMO Novel secreted protein Z799543G2P.  66  54
1880  gi12723186  Lactococcus lactis subsp. lactis  outer membrane lipoprotein precursor  66  26


1881  gi609624  Vibrio cholerae  E SC  73  29
1882  gi12667456  Rattus norvegicus  synaptotagmin VIId  86  32
1882  gi12667454  Rattus norvegicus  synaptotagmin VIII  85  33
1882  gi334072  Pseudorabies virus  ORF-3 protein  83  35


1883  gi1747  Oryctolagus cuniculus  trichohyalin  119  29
1883  gi2072290  Xenopus laevis  XL-INCENP  100  27
1883  gi12584554  Human coxsackievirus B3  polyprotein  96  25


1884  gi|15601413|ref|NP_233044.1  Vibrio cholerae  sucrose-6-phosphate dehydrogenase  65  55


1885  gi16878287  Homo sapiens  Similar to C-terminal modulator protein  74  35
1885  gi15866714  Homo sapiens  C-terminal modulator protein  74  35
1885  AA006984  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 20876.  70  60


1887  AAW25939  Homo sapiens  CNRS T-cell receptor V-beta-5.1 peptide fragment.  601  99
1887  gi36973  Homo sapiens  T-cell receptor beta-chain  601  99
1887  gi1552498  Homo sapiens  V_segment translation product  600  100


1888  gi18874468  Homo sapiens  partitioning-defective 3-like protein splice variant c  198  73
1888  gi16903870  Homo sapiens  partitioning-defective 3-like protein splice variant b  198  73
1888  gi16903868  Homo sapiens  partitioning-defective 3-like protein splice variant a  198  73
1889  gi21489377  Homo sapiens  MAPA protein  1620  99
1889  gi21489330  Bos taurus  MAPA protein  833  56
1889  gi21489379  Mus musculus  MAPA protein  630  48


1890  AAY10874  Homo sapiens  HUMA- Amino acid sequence of a human secreted protein.  503  100
1890  gi17429674  Ralstonia solanacearum  PROBABLE LIPOPROTEIN  73  44


1891  gi15723141  Homo sapiens  c349E10.1.1 (novel protein, isoform 1)  180  46
1891  AAB59006  Homo sapiens  HUMA- Breast and ovarian cancer associated antigen protein sequence SEQ ID 714.  174  47
1891  gi19353342  Mus musculus  RIKEN cDNA 9530058802 gene  162  47


1892  AAM86086  Homo sapiens  HUMA- Human immune/haematopoietic antigen SEQ ID NO:13679.  95  53
1892  AA005973  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 19865.  94  82
1892  AA009418  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 23310.  91  70


1893  gi8778607  Arabidopsis thaliana  FSM15.23  71  25
1894  AAM65951  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26257.  69  38
1894  AAM53568  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 25673.  69  38
1894  gi|20832567|ref|XP_133524.1  Mus musculus  similar to Heterogeneous nuclear ribonucleoprotein A3 (hnRNP A3) (D10S102)  163  76


1895  AAM66299  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 26605.  440  83
1895  AAM53913  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26018.  440  83
1895  gi|6723273|dbj|BAA89659.1  Baboon endogenous virus strain M7  gag-pol precursor polyprotein  270  45






1896  gi4883988  Bartonella clarridgeiae  cell division protein FtsZ  68  28


1897  AA013209  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 27101.  142  54
1897  AAM66708  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 27014.  124  46
1897  AAM54310  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 26415.  124  46


1898  gi2565268  Drosophila virilis  pore-forming protein MIP family  75  27
1898  gi7453547  Homo sapiens  glioma tumor suppressor candidate region protein 1  75  31
1898  gi3218331  Metarhizium anisopliae  nitrogen response regulator  74  26


1899  gi9656609  Vibrio cholerae  chemotaxis protein CheA  73  32
1899  gi|20908537|ref|XP_127414.1  Mus musculus  RIKEN cDNA 1700001L19  443  80
1899  gi|15642063|ref|NP_231695.1  Vibrio cholerae  chemotaxis protein CheA  73  32


1900  gi|18586105|ref|XP_091400.1  Homo sapiens  similar to sca1  203  84
1900  gi|20888279|ref|XP_146508.1  Mus musculus  similar to spinocerebellar ataxia type 1  199  82


1901  gi338033  Homo sapiens  serum protein  90  32
1901  gi4808221  Homo sapiens  dJ1177I5.2 (serum constituent protein MSE55)  90  32
1901  gi4098993  Mus musculus  polyhomeotic 2  88  30


1902  AAB19933  Homo sapiens  INCY- Human oxidoreductase OXRD-8.  250  100
1902  gi19713043  Fusobacterium nucleatum subsp. nucleatum ATCC 25586  iron/zinc/copper-binding protein  73  22
1902  gi|20342079|ref|XP_110614.1  Mus musculus  RIKEN cDNA 1700003E16  77  25


1903  gi342279  Macaca nemestrina  opiomelanocortin  231  49
1903  gi28342  Homo sapiens  proopiomelanocortin  230  49
1903  gi190183  Homo sapiens  opiomelanocortin  230  49


1904  gi|11037117|gb|AAG27485.1|AF194537_1  Homo sapiens  NAG13  180  53


1905  gi5360984  Homo sapiens  dJ228H13.1 (similar to Ribosomal protein L21e)  152  72
1905  AAB44126  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:1571.  150  83






1905  gi550015  Homo sapiens  ribosomal protein L21  150  83
1906  gi2654610  Pseudomonas aeruginosa  arginine/ornithine succinyltransferase AI subunit  79  25
1906  gi17226812  Botryotinia fuckeliana  histidine kinase  72  33
1906  gi16904238  Botryotinia fuckeliana  two-component osmosensing histidine kinase BOS1  72  33


1908  gi330359  Human herpesvirus 4  nuclear antigen precursor  91  37
1908  gi1632793  Human herpesvirus 4  EBNA3C (EBNA 4B) latent protein  91  37
1908  gi1184677  Candida albicans  hyphal wall protein 1  90  38


1909  gi13177635  Rattus norvegicus  phospholipase C beta-3  72  26
1909  gi1150880  Mus musculus  phospholipase C beta3  71  26
1909  gi17105044  Simian adenovirus 25  10.1 kDa  71  31


1910  gi9857054  Leishmania major  possible CG7055 protein  71  47
1910  gi1617560  Leishmania major  LCFACASS; L5701.2  67  33
1910  gi|9857054|emb|CAC04011.1  Leishmania major  possible CG7055 protein  71  47


1911  AAY87278  Homo sapiens  INCY- Human signal peptide containing protein HSPP-55 SEQ ID NO:55.  501  82
1911  AAB18912  Homo sapiens  GETH A novel polypeptide designated PRO1889.  501  82
1911  AAU27659  Homo sapiens  ZYMO Human protein AFP513481.  416  77


1912  gi2065210  Mus musculus  Pro-Pol-dUTPase polyprotein  434  80
1912  gi|18676710|dbj|BAB85007.1  Homo sapiens  FLJ00254 protein  270  64


1913  gi5713196  Caenorhabditis elegans  liprin-alpha homolog SYD-2  479  38
1913  gi930343  Homo sapiens  LAR-interacting protein 1b  467  39
1913  gi930341  Homo sapiens  LAR-interacting protein 1a  467  39


1914  gi6651021  Mus musculus  semaphorin cytoplasmic domain-associated protein 3B  274  63
1914  gi6651019  Mus musculus  semaphorin cytoplasmic domain-associated protein 3A  274  63
1914  AAM25720  Homo sapiens  HYSE- Human protein sequence SEQ ID NO:1235.  266  61


1915  gi902214  Zea mays  RNA polymerase beta' subunit-2  72  24
1915  gi12482  Zea mays  RNA polymerase beta-2 subunit (AA 1-1527)  72  24
1915  gi|11467184|ref|NP_043017.1  Zea mays  RNA polymerase beta' subunit-2  72  24


1916  gi1655432  Mus musculus  plexin 2  1135  58
1916  AAM93435  Homo sapiens  HELI- Human polypeptide, SEQ ID NO: 3070.  1132  57
1916  gi961515  Xenopus laevis  plexin  1126  54






1917  gi15559064  Mus musculus  SNAG1  86  38
1917  gi|20863586|ref|XP_141581.1  Mus musculus  similar to dJ551D2.5 (novel protein)  88  30
1917  gi|18644890|ref|NP_570614.1  Mus musculus  sorting nexin associated golgi protein 1  86  38


1918  gi19528383  Drosophila melanogaster  RE04404p  67  32
1919  AAM77461  Homo sapiens  MOLE- Human bone marrow expressed probe encoded protein SEQ ID NO: 37767.  189  79
1919  AAM64684  Homo sapiens  MOLE- Human brain expressed single exon probe encoded protein SEQ ID NO: 36789.  189  79
1919  gi|17477135|ref|XP_063415.1  Homo sapiens  similar to embryonal stem cell specific gene 1  263  75


1920  gi2623757  Rattus norvegicus  neurabin  172  97
1920  gi2827450  Gallus gallus  KS5 protein  154  88
1920  gi13991829  Xenopus laevis  neurabin  145  83


1923  gi5532302  Heterocapsa triquetra  PSII CP47 apoprotein  75  29
1923  gi1881335  Bacillus subtilis  SIMILAR TO YQFU, YXKD, YITB OF B. SUBTILIS.  68  38
1923  gi|5532302|gb|AAD44701.1  Heterocapsa triquetra  PSII CP47 apoprotein  75  29


1924  gi6855429  Leishmania major  possible mucin 1 precursor  77  33
1924  gi5832816  Caenorhabditis elegans  contains similarity to Pfam domain: PF01694 (Rhomboid family), Score=61.7, E-value=5.1e-15, N=1  74  34
1924  AAB51976  Homo sapiens  HUMA- Human secreted protein sequence encoded by gene 48 SEQ ID NO:108.  72  38


1925  AAB51635  Homo sapiens  ROSE/ Human secreted protein sequence encoded by gene 16 SEQ ID NO:75.  205  31
1925  AAB47128  Homo sapiens  INCY- CDIFF-6, Incyte ID No. 2009435CD1.  199  34
1925  ABB55766  Homo sapiens  FECH/ Human polypeptide SEQ ID NO 138.  197  38


1926  AAG89279  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 399.  330  44
1926  AAB70690  Homo sapiens  SREN- Human hDPP protein sequence SEQ ID NO:7.  319  44
1926  gi13182757  Homo sapiens  HTPAP  319  44


1927  gi13177290  Ectocarpus siliculosus virus  EsV-1-8  69  36
1928  gi18700171  Arabidopsis thaliana  AT5g20480/F7C8_70  86  39
1928  gi915207  Sus scrofa  gastric mucin  83  29






1928  gi532113  Caenorhabditis elegans  homeotic region most like HMPB_DROME: homeotic proboscipedia protein  79  27


1929  ABB12295  Homo sapiens  HYSE- Human secreted protein homologue, SEQ ID NO:2665.  135  59
1929  AAG04080  Homo sapiens  GEST Human secreted protein, SEQ ID NO: 8161.  78  38
1929  gi9279807  Drosophila melanogaster  cortactin  77  27


1930  AAV81204_aa1  Homo sapiens  GEHO Human CD7 cDNA.  872  73
1930  AAB36657  Homo sapiens  IMMV Human CD7 protein sequence SEQ ID NO:2.  872  73
1930  AAU02438  Homo sapiens  GEHO Human lymphocyte cell surface antigen CD7 polypeptide.  872  73


1931  gi2636248  Bacillus subtilis  similar to transaldolase (pentose phosphate)  73  29
1931  gi|21398633|ref|NP_654618.1  Bacillus anthracis A2012  Transaldolase, Transaldolase [Bacillus  74  29
1931  gi|16080764|ref|NP_391592.1  Bacillus subtilis  similar to transaldolase (pentose phosphate)  73  29


1932  AAB43545  Homo sapiens  HUMA- Human cancer associated protein sequence SEQ ID NO:990.  73  46
1932  AAM40234  Homo sapiens  HYSE- Human polypeptide SEQ ID NO 3379.  71  26


1934  gi3129962  Gallus gallus  B locus Lectin like Natural Killer cell surface protein  82  30
1934  AAB93791  Homo sapiens  HELI- Human protein sequence SEQ ID NO:13545.  77  38
1934  gi2541864  Drosophila melanogaster  DAD polypeptide  77  32


1935  gi|4959869|gb|AAD34536.1  Murine leukemia virus  polymerase  335  52
1935  gi|6524624|gb|AAF15098.1  Phascolarctos cinereus  pol protein  331  52
1935  gi|9630313|ref|NP_056790.1  Gibbon ape leukemia virus  pol polyprotein  328  52


1936  gi6562332  Arabidopsis thaliana  diaminopimelate decarboxylase  86  30
1936  gi7573355  Arabidopsis thaliana  diaminopimelate decarboxylase-like protein  86  30
1936  gi15146250  Arabidopsis thaliana  AT5g11880/F14F18_50  86  30


1939  AAU07442  Homo sapiens  GETH Human Wnt1 Upregulated protein 2 (WUP2).  300  100
1939  AAU07441  Homo sapiens  GETH Human Wnt1 Upregulated protein 1 (WUP1).  300  100
1939  AAB56802  Homo sapiens  ROSE/ Human prostate cancer antigen protein sequence SEQ ID NO:1380.  300  100






1940  gi5802814  Homo sapiens  Gag-Pro-Pol-Env protein  587  57
1940  gi4185939  Human endogenous retrovirus K  pol protein  586  57
1940  gi5802821  Homo sapiens  Gag-Pro-Pol protein  586  57


1941  AAU83088  Homo sapiens  ZYMO Novel secreted protein Z2812G3P.  586  100
1941  AAB20275  Homo sapiens  SCHE Human interleukin DNAX 80.  535  76
1941  AAB20277  Homo sapiens  SCHE Human interleukin DNAX 80 variant.  529  76


1942  AAM06866  Homo sapiens  HYSE- Human foetal protein, SEQ ID NO: 1074.  994  100
1942  gi17426446  Homo sapiens  bA351K23.5 (novel protein)  933  54
1942  gi15099951  Mus musculus  diacylglycerol acyltransferase 2  915  55


1943 AAM06596 Homo sapiensHYSE- Human foetal protein,406 98
SEQ ID


NO: 327.


1943 gi~15640499~Vibrio choleraeS-adenosylmethionine 67 51
synthase


ref~NP-2301


26.1 ~


1945 AAG75561 Homo SapiensHUMA- Human colon cancer327 100
antigen


protein SEQ ID NO:6325.


1945 g116416764Homo SapiensFI~SG16 327 100


1945 g113905212Mus musculusRIKEN cDNA 1200006F02 261 79
gene


1946 g1288174 Mus musculusOct2b 97 85


1946 g153490 Mus musculusOct2.5 transcription 97 85
factor


1946 g19937478Drosophila thyroid hormone receptor-associated72 39


melanogaster protein TRAP 170


1947 AAM66980 Homo SapiensMOLE- Human bone marrow 170 69


expressed probe encoded
protein SEQ


ID NO: 27286.


1947 AAM54574 Homo SapiensMOLE- Human brain expressed170 69
single


exon probe encoded protein
SEQ ID


NO: 26679.


1947 AAM75189 Homo SapiensMOLE- Human bone marrow 159 86


expressed probe encoded
protein SEQ


ID NO: 35495.


1948 AAY10874 Homo SapiensHUMA- Amino acid sequence100 100
of a


human secreted protein.


1949 AAA27155_Homo SapiensGENE- Human P2 DNA. 100 100


aal


1949 AAY94475 Homo SapiensGENE- Predicted translation100 100
product of


human P2 splice isoform,
P2-B.


1949 AAY94474 Homo SapiensGENE- Human P2 protein. 100 100


1950 19502082 Homo sapienstubby super-family protein80 40


1950 19502080 Mus musculustubby super-family protein77 41


1950 18118432 Oryza sativabeta-expansin 73 35


1951 g14808994walleye envelope polyprotein 69 46


epidermal


hyperplasia
virus


type 1


1951 gi~15642893~Thermotoga ribonucleotide reductase,66 46
B12-


ref~NP_2279maritima dependent


34.1


1952 AAB80264 Homo SapiensGETH Human PRO332 protein. 577 61






1952 AAB33425 Homo SapiensGETH Human PRO332 protein577 61


UNQ293 SEQ ID NO:57.


1952 AAY13396 Homo SapiensGETH Amino acid sequence577 61
of protein


PRO332.


1953 gi16648392Drosophila LD39243p 449 61


melanogaster


1953 AAG73684 Homo SapiensHUMA- Human colon cancer371 55
antigen


protein SEQ ID NO:4448.


1953 AAY48312 Homo SapiensMETA- Human prostate 371 55
cancer-


associated protein 9.


1954 AAU84348 Homo SapiensBARK/ Protein MMP2 differentially2068 94


expressed in breast cancer
tissue.


1954 ABB90738 Homo SapiensUYJO Human Tumour Endothelial2068 94


Marker polypeptide SEQ
ID NO 208.


1954 AAB84607 Homo SapiensPFIZ Amino acid sequence2068 94
of matrix


metalloproteinase gelatinase
A.


1955 gi16769680Drosophila LD46678p 245 35


melanogaster


1955 AAM66797 Homo SapiensMOLE- Human bone marrow 148 80


expressed probe encoded
protein SEQ


ID NO: 27103.


1955 AAM54396 Homo SapiensMOLE- Human brain expressed148 80
single


exon probe encoded protein
SEQ ID


NO: 26501.


1957 AAB80242 Homo SapiensGETH Human PRO236 protein. 648 97


1957 AAM93378 Homo SapiensHELI- Human polypeptide,648 97
SEQ ID


NO: 2955.


1957 AAB12157 Homo sapiensPROT- Hydrophobic domain648 97
protein


from clone HP03165 isolated
from KB


cells.


1958 AAM41696 Homo SapiensHYSE- Human polypeptide 234 47
SEQ ID


NO 6627.


1958 AAU17119 Homo SapiensHUMA- Novel signal transduction229 46


pathway protein, Seq
ID 684.


1958 gi16741621Homo SapiensSimilar to RAB37, member228 47
of RAS


oncogene family


1959 gi18025526cercopithecineLF3 140 30


herpesvirus
15


1959 gi3153821Mus musculusplenty-of prolines-101; 137 25
POP101; SH3-


philo-protein


1959 gi39255 Actinomycessialidase 129 28


viscosus


1960 ABB 12366Homo SapiensHYSE- Human bone marrow 400 90
expressed


protein SEQ ID NO: 120.


1960 AA012936 Homo SapiensHYSE- Human polypeptide 115 95
SEQ ID


NO 26828.


1960 AAM84898 Homo SapiensHUMA- Human 113 82


immune/haematopoietic
antigen SEQ


ID NO:12491.


1961 gi19110438Homo sapiens polycystin-1L1 190 94


1961 gi3115393Rana pipiensguanylate cyclase inhibitory80 35
protein


1961 gi3462887Rattus alpha-fodrin 68 31


norvegicus


1962 AAU83130 Homo Sapiens ZYMO Novel secreted 1076 100
protein






Z835892G6P.


1962 11890354 Brassica L-ascorbate peroxidase 80 33
napus


1962 g17529611Leishmania hypothetical protein 79 31
L787.06


major


1963 AAG78679 Homo sapiens BODE- Human thrombotic 467 86
protein 46.


1963 AAY87347 Homo SapiensINCY- Human signal peptide467 86


containing protein HSPP-124
SEQ ID


NO:124.


1963 AAB01431 Homo sapiens MILL- Human TANGO 224 467 86
(form 2).


1964 g13413504Rattus Bassoon 81 26


norvegicus


1964 g1330452 human DNA polymerase 79 28


herpesvirus
5


1964 AAV69717_Homo SapiensLUDW- Tumour rejection 73 33
antigen


aal precursor MAGE-C1 cDNA.


1965 gi~2323287~gmultiple polyprotein 286 64


b~AAB6652sclerosis


8.1~ associated


retrovirus


1965 gi~2351212~dFriend murinegag-pol polyprotein (precursor179 47
protein)


bj~BAA2206leukemia
virus


4.1~


1965 gi~9629516~rRauscher Pol 179 47
murine


ef~NP_0447leukemia
virus


38.1


1966 gi~2323287~gmultiple polyprotein 476 65


b~AAB6652sclerosis


8.1~ associated


retrovirus


1966 gi~2281588~gsynthetic Pol 323 51


b~AAB6416construct


0.1~


1966 gi~9626961~rMurine leukemiaPr180 323 51


ef~NP_0579virus


33.1


1967 12065210 Mus musculusPro-Pol-dUTPase polyprotein518 73


1967 AAM65715 Homo SapiensMOLE- Human bone marrow 464 69


expressed probe encoded
protein SEQ


ID NO: 26021.


1967 AAM53338 Homo SapiensMOLE- Human brain expressed464 69
single


exon probe encoded protein
SEQ ID


NO: 25443.


1968 AAG78149 Homo SapiensBODE- Human polypeptide-388 82


cytochrome b5-13.


1968 g13150438Human pol-env 345 55


endogenous


retrovirus
K


1968 g11469243Human pol/env 345 55


endogenous


retrovirus
K


1969 g121113108XanthomonasTonB-dependent receptor 78 31


campestris
pv.


campestris
str.


ATCC 33913






1969 gi476274 Homo SapiensR kappa B 77 23


1969 gi4206769Acanthamoebamyosin I heavy chain 76 27
kinase


castellanii


1970 gi~13310191~multiple recombinant envelope 244 77
protein


gb~AAK181sclerosis


89.1~AF331associated


1 retrovirus
500


_ element


1970 gi~8272468~gHomo Sapiensenvelope protein 219 81


b~AAF74215


.1 ~AF15696


3 1


1970 gi~21103962~Homo Sapiensenverin-2 219 77


gb~AAM331


41.1


1971 AAU83621 Homo SapiensGETH Human PRO protein, 320 100
Seq ID No


60.


1971 AA005826 Homo SapiensHYSE- Human polypeptide 295 93
SEQ ID


NO 19718.


1971 AAM39560 Homo SapiensHYSE- Human polypeptide 194 56
SEQ ID


NO 2705.


1972 gi6456112Mus musculusF-box protein FBX15 128 44


1972 gi21428946Drosophila GH22104p 74 31


melanogaster


1972 gi~6456112~gMus musculusF-box protein FBX15 128 44


b~AAF09139


.1~


1973 1148270 Escherichialambda-integrase 550 94
coli


1973 g11790244Escherichiasite-specific recombinase,550 94
coli acts on cer


K12 sequence of ColE1, effects


chromosome segregation
at cell


division


1973 g113364217Escherichiasite-specific recombinase544 92
coli XerC


O157:H7


1974 g11805552EscherichiaFORMATE HYDROGENLYASE 887 88
coli


TRANSCRIPTIONAL ACTIVATOR.


1974 11616960 EscherichiaHyfR 887 88
coli


1974 g17920396Salmonella formate hydrogenlyase 522 54
activator


typhimuriumprotein


1975 1409795 EscherichiaNo definition line found1175 99
coli


1975 g115074592SinorhizobiumHYPOTHETICAL 378 33


meliloti TRANSMEMBRANE PROTEIN


1975 g117740718AgrobacteriumNa+/Pi-cotransporter 372 34


tumefaciens
str.


C58 (U.


Washington)


1976 AAB82047 Homo SapiensIGAK- Human mast cell 163 23
surface


antigen.


1976 g112654783Homo SapiensSimilar to loss of heterozygosity,163 23
11,


chromosomal region 2,
gene A


1976 AAZ45690-Homo sapiensREGC cDNA sequence encoding108 25
the


aal human minor vault protein
193.


1977 ABB56523 Homo SapiensMERI Human NMDA receptor73 28
subunit


SEQ ID NO 44.






1977 AAW87504 Homo SapiensSIBI- Human N-methyl-D-aspartate73 28


receptor subunit encoded
by clone


NMDA24.


1978 AAG00471 Homo SapiensGEST Human secreted protein,285 93
SEQ ID


NO: 4552.


1978 gi298489 Papio hamadryasSP-10 133 34


1978 gi452582 Vulpes vulpesfox sperm acrosomal protein132 34
FSA-


Acr.1


1979 AAB87128 Homo SapiensMILL- Human secreted 490 86
protein


MANGO 349, SEQ ID NO:130.


1979 AAB87179 Homo SapiensMILL- Human secreted 488 85
protein


MANGO 349 I21K variant,
SEQ ID


NO:227.


1979 AAB87181 Homo SapiensMILL- Human secreted 487 85
protein


MANGO 349 E41D variant,
SEQ ID


NO:231.


1982 AAM75035 Homo SapiensMOLE- Human bone marrow 109 67


expressed probe encoded
protein SEQ


ID NO: 35341.


1982 AAM62231 Homo SapiensMOLE- Human brain expressed109 67
single


exon probe encoded protein
SEQ ID


NO: 34336.


1982 gi11967423Mus musculusvomeronasal receptor 105 76
V1RC5


1983 AAG89276 Homo sapiensGEST Human secreted protein,224 46
SEQ ID


NO: 396.


1983 AAB56565 Homo sapiensROSE/ Human prostate 99 40
cancer antigen


protein sequence SEQ
ID NO:1143.


1983 AAY44987 Homo sapiens INCY- Human epidermal 78 28
protein-4.


1984 AAB95089 Homo SapiensHELI- Human protein sequence498 97
SEQ


ID NO:17025.


1984 AAM06608 Homo SapiensHYSE- Human foetal protein,495 96
SEQ ID


NO: 339.


1984 gi497890 unidentifiedalpha subunit of dinitrogenase73 24


nitrogen-fixingreductase (Fe protein)


bacteria


1985 gi~17455728~Homo Sapienssimilar to Zinc-finger 71 37
protein ubi-d4


ref~XP_0635 (Requiem) (Apoptosis
response zinc


94.1 ~ finger protein)


1986 gi21428886Drosophila GH12469p 69 34


melanogaster


1987 17767529 Bos taurus cyclophilin I 364 75


1987 18699209 Canis familiariscyclophilin A 361 88


1987 111641132Sus scrofa cyclophilin 361 88


1988 g115073168SinorhizobiumPROBABLE TRANSLATION 81 37


meliloti INITIATION FACTOR IF-2


PROTEIN


1988 g11181352Paramecium Pro-rich protein; PIPG 78 25
(8X)


bursaria


Chlorella
virus 1


1988 g1493242 Feline Feline herpesvirus type 77 20
1 immediate


herpesvirusearly protein
1


1989 AAM65707 Homo SapiensMOLE- Human bone marrow 134 66


expressed probe encoded
protein SEQ


ID NO: 26013.






1989 AAM53330 Homo SapiensMOLE- Human brain expressed134 66
single


exon probe encoded protein
SEQ ID


NO: 25435.


1989 gi~20475216~Homo Sapienssimilar to synapsin 228 59
I


ref~XP-1148


02.1 ~


1990 AAM71181 Homo SapiensMOLE- Human bone marrow110 64


expressed probe encoded
protein SEQ


ID NO: 31487.


1990 AAM58674 Homo SapiensMOLE- Human brain expressed110 64
single


exon probe encoded protein
SEQ ID


NO: 30779.


1990 gi21323636CorynebacteriumSulfate permease and 75 26
related


glutamicum transporters (MFS superfamily)


ATCC 13032


1991 gi1932813Xenopus laevisdsRNA adenosine deaminase96 34


1991 AAE10203 Homo SapiensHYSE- Human bone marrow83 25
derived


contig protein, SEQ ID
NO: 68.


1991 gi3242649Rana catesbeianaalpha 1 type I collagen80 30


1992 gi1181423Paramecium PBCV-1 chitinase 71 41


bursaria


Chlorella
virus 1


1992 gi~21300897~Anopheles agCP14405 72 37


gb~EAA130gambiae str.


42.1 ~ PEST


1992 gi~9631828~rParamecium PBCV-1 chitinase 71 41


ef~NP_0486bursaria


13.1 Chlorella
virus 1


1994 gi8248755Plasmodium protein phosphatase 72 25


falciparum
3D7


1994 gi4104348CampylobacterS-layer-RTX protein 70 38


rectus


1994 gi~8248755~ePlasmodium protein phosphatase 72 25


mb~CAB628falciparum
3D7


78.2


1995 gi21324402CorynebacteriumUncharacterized ATPase 73 38
related to the


glutamicum helicase subunit of
the Holliday


ATCC 13032 junction resolvase


1995 gi~19552845~CorynebacteriumCOG2256:Uncharacterized73 38
ATPase


ref~NP_6008glutamicum related to the helicase
subunit of the


47.1 Holliday junction resolvase


1995 gi~17533213~CaenorhabditisF14ES.S.p 73 30


reflNP elegans
4957


77.1 ~


1996 11871223 Rickettsia crystalline surface 92 30
hi layer protein


1996 g16969926Rickettsia OmpB 79 25


aeschlimannii


1996 g114670347Rickettsia OmpB 78 25
felis


1997 gi~20548733~Homo Sapienssimilar to gag protein 256 58


re~XP-0556


41.2


1997 gi~9739120~gBovine leukemiagag 186 34


b~AAF97916virus


.l






1997 gi~9626226~rBovine leukemiaPr44 185 34


e~NP_0568virus


97.1


1998 AAM79834 Homo SapiensHYSE- Human protein SEQ 279 71
ID NO


3480.


1998 AAM78850 Homo SapiensHYSE- Human protein SEQ 279 71
ID NO


1512.


1998 AAM79204 Homo SapiensHYSE- Human protein SEQ 272 71
ID NO


1866.


1999 AAM73176 Homo SapiensMOLE- Human bone marrow 168 48


expressed probe encoded
protein SEQ


ID NO: 33482.


1999 AAM60521 Homo sapiensMOLE- Human brain expressed168 48
single


exon probe encoded protein
SEQ ID


NO: 32626.


1999 gi~13929148~Rattus cyclic nucleotide-gated 163 47
channel beta


ref~NP_1139norvegicus subunit 1


97.1 ~


2000 gi1869859human very large tegument protein73 30


herpesvirus
2


2000 gi7380253Neisseria 2-keto-4-hydroxyglutarate70 37
aldolase


meningitidis


Z2491


2000 gi7226633Neisseria 4-hydroxy-2-oxoglutarate70 37
aldolase/2-


meningitidisdehydro-3-deoxyphosphogluconate


MC58 aldolase


2001 gi17016969Mus musculusNUANCE 138 36


2001 gi6273778Homo Sapienstrabeculin-alpha 137 33


2001 gi1675222Mus musculusACF7 neural isoform 1 136 42


2002 AAM39256 Homo SapiensHYSE- Human polypeptide 81 29
SEQ ID


NO 2401.


2002 1840789 Homo sapiens binding regulatory factor 81 29


2002 g117028337Homo Sapiensregulatory factor X, 81 29
5 (influences HLA


class II expression)


2003 g12252814Mus musculusFOG 172 64


2003 AAR58815 Homo SapiensUSSH Human c-myc far 103 42
upstream


element (FUSE) binding
protein


(FBP)variant from HL60
clone 3-1.


2003 g13598974Rattus protein tyrosine phosphatase103 26
TD14


norvegicus


2004 g111994696Arabidopsiscontains similarity to 77 28
DNA repair


thaliana protein gene_id:K7M2.11


2004 17209527 Mus musculustestis-specific gene 73 24


2004 gi~17451912~Homo Sapienssimilar to DNA-binding 234 97
protein B


ref~XP_0710


83.1


2005 AAE12023 Homo sapiensINCY- Human G-protein 173 100
coupled


receptor, GCREC-2.


2005 AAG65832 Homo SapiensFARB Human G protein-coupled173 100


receptor (GPCR).


2005 AAG68126 Homo SapiensFARB Human 7TM-GPCR protein105 78


sequence SEQ ID NO:6.


2006 g120068811Homo SapiensRab-couplin protein 130 43


2006 g115822596Homo sapiensnRi 11 104 45






2006 gi13377897Homo SapiensRab11 interacting protein83 40
Rip11a


2007 gi~17539708)CaenorhabditisFO8B4.S.p 78 42


ref~NP-5014elegans


89.1


2008 AAE10350 Homo SapiensPFIZ Human ADAMTS-J1.4 504 97
variant


protein.


2008 AAE10349 Homo SapiensPFIZ Human ADAMTS-J1.3 504 97
variant


protein.


2008 AAE10347 Homo sapiensPFIZ Human ADAMTS-J1.1 504 97
variant


protein.


2009 AAV31720_Homo SapiensMOUN Nucleotide sequence87 29
of the


aal PUR-alpha gene.


2009 AAT99264_Homo SapiensMOUN Human PUR-alpha 87 29
gene.


aal


2009 AAQ44800_Homo SapiensMOUN Encodes single-stranded87 29
DNA


aal binding (PUR) protein.


2010 gi170444 Lycopersiconextensin (class II) 123 27


esculentum


2010 gi4662641Arabidopsisexpressed protein 116 30


thaliana


2010 gi188864 Homo sapiens mucin 115 28


2011 AAY93650 Homo SapiensHUMA- Amino acid sequence1677 100
of a


human prostacyclin-stimulating
factor-


2.


2011 AAS 15723_Homo SapiensCURA- DNA encoding insulin-like1673 99


aal growth factor family
related protein,


NOV3.


2011 AAE17599 Homo SapiensINCY- Human extracellular1673 99
messenger


(XMES)-1 protein.


2012 gi10440434Homo sapiens FLJ00052 protein 336 69


2012 gi20502870Mus musculusSDS3 333 68


2012 gi21430678Drosophila RE74901p 170 36


melanogaster


2013 AAH77293_Homo SapiensMILL- Human ion channel 214 93
protein


aal IC32391 cDNA coding region.


2013 AAE13278 Homo SapiensINCY- Human transporters214 93
and ion


channels (TRICH)-5.


2013 AAG77969 Homo SapiensMILL- Human ion channel 214 93
protein


IC32391.


2014 gi4894768Xenopus ephrin-B2 precursor 78 30
laevis


2015 AAU77498 Homo sapiensINCY- Human lipid metabolism1291 100


enzyme, LMM-6.


2015 ABB08205 Homo SapiensINCY- Human lipid metabolism1122 100


enzyme-5 (LME-5).


2015 ABB07493 Homo SapiensINCY- Human lipid metabolism864 75


molecule (LMM) polypeptide
(ID:


2965233 CD 1 ).


2016 gi~14769015~Homo Sapiensfibrillin3 68 36


ref~XP_0415


69.1 ~


2017 gi2313786Helicobacterchorismate synthase (aroC)78 33


pylori 26695


2017 gi4155160HelicobacterCHORISMATE SYNTHASE 72 32


pylori J99






2017 gi~15645287~Helicobacterchorismate synthase (aroC)78 33


ref~NP_2074pylori 26695


57.1


2018 gi15485622Homo sapiens Q9H4T4 like 1068 100


2018 ABB 14744Homo SapiensHUMA- Human nervous system694 98
related


polypeptide SEQ ID NO 3401.


2018 AAB95100 Homo SapiensHELI- Human protein sequence101 24
SEQ


ID NO:17064.


2019 18050556 Gorilla carboxyl-ester lipase 223 42
gorilla


2019 AAU09894 Homo SapiensMONS Bile Salt Stimulated217 39
Lipase


(BSSL).


2019 ABB04676 Homo SapiensMONS Human milk bile 217 39
salt-


stimulated lipase (BSSL)
protein SEQ


ID NO:2.


2020 12065210 Mus musculusPro-Pol-dUTPase polyprotein515 74


2020 gi~385615~gbMus Sp. fibulin gene homolog 300 75


~AAB26708.


1~


2020 gi~13194728~Gallus galluspol-like protein ENS-3 170 33


gb~AAK155


26.1 ~AF329


451 1


2021 AAM66980 Homo SapiensMOLE- Human bone marrow 170 75


expressed probe encoded
protein SEQ


ID NO: 27286.


2021 AAM54574 Homo sapiensMOLE- Human brain expressed170 75
single


exon probe encoded protein
SEQ ID


NO: 26679.


2021 AAM75189 Homo SapiensMOLE- Human bone marrow 159 86


expressed probe encoded
protein SEQ


ID NO: 35495.


2022 AAD29146_Homo sapiensZYMO Human Zcyto2l consensus649 83


aal cDNA.


2022 AAU83208 Homo SapiensZYMO Novel secreted protein649 83


Z908463G2P.


2022 AAE18311 Homo SapiensZYMO Human Zcyto2l consensus649 83


protein.


2024 g114336750Homo SapiensCe protein similar to 84 34
Dm Cys3His


finger protein


2024 AAB50363 Homo sapiens UYSL- Human SRCAP. 83 34


2024 AAB95541 Homo SapiensHELI- Human protein sequence83 34
SEQ


ID NO:18149.


2025 g118676682Homo SapiensFLJ00240 protein 470 45


2025 g114701866Dictyosteliumcarmil 221 29


discoideum


2025 g11881738Acanthamoebamyosin-I binding protein219 29
Acan125


castellanii


2026 ABB12490 Homo SapiensHYSE- Human bone marrow 212 78
expressed


protein SEQ ID NO: 329.


2027 AAU83147 Homo SapiensZYMO Novel secreted protein1153 100


Z846363G2P.


2027 gi~21287755~Anopheles ebiP4780 205 51


gb~EAA000gambiae
str.


76.1 ~ PEST






2027 gi~17552028~CaenorhabditisCOSD11.8.p 91 38


ref~NP-4984elegans


07.1 ~


2028 gi1510143Homo Sapienssimilar to C.elegans 323 57
protein encoded in


cosmid T20D3 (Z68220).


2028 gi3879942CaenorhabditisT20D3.11 124 27


elegans


2028 gi5869818Globodera NADH-ubiquinone oxidoreductase82 27


pallida subunit 6


2029 AAE13288 Homo SapiensINCY- Human transporters75 31
and ion


channels (TRICH)-15.


2029 gi3252893Thermotoga ABC transporter 74 37


neapolitana


2029 gi~18403965~Arabidopsisexpressed protein 70 29


ref~NP_5658thaliana


26.1


2030 AAB97908 Homo SapiensSHAN- Human GTP-binding79 27
protein


17 SEQ ID NO:2.


2030 AAM42129 Homo SapiensHYSE- Human polypeptide 79 27
SEQ ID


NO 7060.


2030 gi9971156Mus musculusGTP-binding like protein79 27
2


2031 gi~20864803~Mus musculusRIKEN cDNA 4930503K02 89 25


ref~XP_1308


00.1 ~


2031 gi~21262152~Oryza sativaSMC4 protein 77 28


emb~CAD32


690.1


2031 gi~1507705~gBorrelia outer surface protein 74 33


b~AAB0656burgdorferi


8.1~


2032 AAG65898 Homo SapiensSMIK Amino acid sequence481 100
of GSK


gene Id 18525.


2032 AAU83670 Homo sapiensGETH Human PRO protein, 471 97
Seq ID No


158.


2032 ABB84896 Homo SapiensGETH Human PRO1309 protein471 97


sequence SEQ ID NO:160.


2034 gi6723273Baboon gag-pol precursor polyprotein687 43


endogenous


virus strain
M7


2034 gi18448744Moloney Pr180 gag-pro-pol polyprotein685 42
murine


leukemia
virus


2034 gi2801471Moloney Pr180 682 42
murine


leukemia
virus


2035 gi~17554696~CaenorhabditisR148.7.p 68 32


ref~NP elegans
4976


70.1


2035 gi~16127996~Escherichiaaspartokinase I, homoserine68 43
coli


re~NP K12 dehydrogenase I
4145
~


43.1


2035 gi~19548975~Escherichiaaspartokinase I-homoserine.68 43
coli


gb~AAL908 dehydrogenase I


85.1~AF487


900 1


2036 gi13424459Caulobactermethyl-accepting chemotaxis 72 32
protein






crescentus Mc I
CB15


2036 gi~16877133~Homo sapienscarboxypeptidase, vitellogenic-like69 30


gb~AAH168


38.1 ~AAH16


838


2037 AAB67055 Homo SapiensINCY- Human immune response532 75


molecule (IMUN) protein
SEQ ID NO:


9.


2037 AA001862 Homo SapiensHYSE- Human polypeptide403 67
SEQ ID


NO 15754.


2037 gi~6753924~rMus musculusFriend virus susceptibility240 39
1


ef~NP
0343
_


74.1


2039 AAB38447 Homo SapiensHUMA- Fragment of human80 27
secreted


protein encoded by gene
20 clone


HLTFBY 15.


2039 111527799Mus musculusGTP-binding protein like 73 30
1


2039 g1695237 Equine tegument protein 73 33


herpesvirus
2


2040 gi~20544038~Homo Sapienssimilar to PER-HEXAMER 68 41
REPEAT


ref~XP PROTEIN 5
0896


12.4


2042 AAM77922 Homo SapiensMOLE- Human bone marrow642 85


expressed probe encoded
protein SEQ


ID NO: 38228.


2042 AAM65219 Homo SapiensMOLE- Human brain expressed642 85
single


exon probe encoded protein
SEQ ID


NO: 37324.


2042 gi~6723273~dBaboon gag-pol precursor polyprotein139 26


bj~BAA8965endogenous


9.1 virus strain
M7


2043 g148507 Wolinella formate dehydrogenase 80 27


succinogenes


2043 112381857Danio rerio c-Maf 78 42


2043 gi~18594822~Homo Sapienszinc finger protein 306 100
21 (KOX 14)


reflXP_0929


95.1


2044 13132272 Sus scrofa WT1 homologue 99 47


2044 AAG78446 Homo sapiensMASI Predicted WT1 Wilm's96 45
tumour


polypeptide of humans.


2044 AAG62154 Homo SapiensCORI- Human WT1/PSA 96 45
fusion


protein SEQ ID NO: 357.


2046 g121483222Drosophila AT16994p 86 33


melanogaster


2046 g121111736Xanthomonas cell division protein 79 30


campestris
pv.


campestris
str.


ATCC 33913


2046 112653493Homo SapiensSimilar to brain acid-soluble79 36
protein 1


2047 ABB 12490Homo SapiensHYSE- Human bone marrow200 83
expressed


protein SEQ ID NO: 329.


2047 gi~20837783~Mus musculussimilar to 40S ribosomal73 35
protein S11


ref~XP_1459


21.1






2047 gi~6002932~gStreptomycesglycosyl transferase 71 35


b~AAF00209fradiae


.1 CAF '
16496


0 5


2048 AAB59012 Homo SapiensHUMA- Breast and ovarian103 32
cancer


associated antigen protein
sequence


SEQ ID 720.


2048 gi2429362Santalum proline rich protein 99 31
album


2048 gi17945382Drosophila RE17165p 98 25


melanogaster


2051 gi15625542Hepatitis S antigen 71 31
B virus


2051 gi~4884886~gHepatitis surface antigen 68 30
B virus


b~AAD3185


7.1 CAF
1341


40 1


2052 AAB28764 Homo SapiensHUMA- Sequence homologous693 78
to


protein fragment encoded
by gene 21.


2052 gi2065210Mus musculusPro-Pol-dUTPase polyprotein693 78


2052 AAB73606 Homo SapiensSHAN- Human dUTP pyrophosphatase668 77


26.


2053 gi9945983Pseudomonastranscriptional regulator83 34
PcaQ


aeruginosa


2053 gi13874427Homo sapiens cerebral protein-5 76 35


2053 gi12803205Homo sapiens CAAX box 1 76 35


2054 gi21307831Aplysia CREB-binding protein 76 26


californica


2054 gi16755887Drosophila guanine nucleotide exchange76 26
factor


melanogaster


2054 gi~21307831~Aplysia CREB-binding protein 76 26


gb~AAL548californica


59.1)


2055 gi16588389Homo SapiensB lymphocyte activation-related437 71
protein


BC-1514


2055 AAB92981 Homo SapiensHELI- Human protein sequence407 68
SEQ


ID NO:11698.


2055 AAM48325 Homo SapiensSHAN- Human urine receptor398 74
21.23.


2056 gi~2072969~gHomo Sapiensp40 134 47


b~AACS
127


4.1~


2056 gi~7959889~gHomo SapiensPR02221 123 43


b~AAF71115


.1 CAF
11672


1 95


2056 gi~2072974~gHomo Sapiensp40 122 44


b~AACS
127


7.1


2057 gi19171178Homo Sapiensmetalloprotease disintegrin518 98
16 with


thrombospondin type I
motif


2057 gi19171150Homo sapiens ADAMTS18 protein 168 35


2057 AAM39212 Homo SapiensHYSE- Human polypeptide 128 76
SEQ ID


NO 2357.


2058 gi~4959869~gMurine leukemiapolymerase 336 50


b~AAD3453virus


6.1






2058 gi~9630313~rGibbon ape pol polyprotein 331 46


ef~NP_0567leukemia
virus


90.1


2058 gi~6723273~dBaboon gag-pol precursor polyprotein329 49


bj~BAA8965endogenous


9.1 ~ virus strain
M7


2059 gi~20546404~Homo Sapienssimilar to nuclear receptor179 91
coactivator


ref~XP_1164 4; RET-activating gene
ELE1


66.1


2060 gi~6731237~gHomo Sapiensmyoferlin 112 79


b~AAF27177


.1 CAF
18231


7 1


2060 gi~798799~gbMus musculusimmunoglobulin heavy 72 55
chain


~AAC37713.


1~


2060 gi~20819487~Mus musculussimilar to LYRIC 72 27


ref~XP_1453


57.1


2061 gi415738 Euglena PSII D1-polypeptide 75 27
gracilis


2061 gi11491 Euglena 32 kd protein 75 27
gracilis


2061 gi11488 Euglena 32-Kda thylakoid membrane75 27
gracilis protein


2062 gi21360549ArabidopsisAT3g01480/F4P13 3 79 29


thaliana


2062 gi3337366Arabidopsisnodulin-like protein 68 36


thaliana


2063 17959778 Homo sapiens PRO1546 121 42


2063 AAG02639 Homo SapiensGEST Human secreted protein,119 53
SEQ ID


NO: 6720.


2063 AAG02753 Homo SapiensGEST Human secreted protein,110 45
SEQ ID


NO: 6834.


2064 g115077406Antheraea fibroin 109 30


yamamai


2064 AAB82806 Homo SapiensBOST- Human low density 92 24
lipoprotein


binding protein 2 (LBP-2).


2064 AA001059 Homo SapiensHYSE- Human polypeptide 90 30
SEQ ID


NO 14951.


2065 g1200964 Mus musculusserine 2 ultra high sulfur80 30
protein


2065 1200962 Mus musculusserine 1 ultra high sulfur80 30
protein


2065 AAM99918 Homo SapiensHUMA- Human polypeptide75 28
SEQ ID


NO 34.


2066 g1544724 Cavia cholecystokinin A receptor;69 29
CCK-A


receptor


2066 g12541920Rattus cholecystokinin type-A 69 29
receptor


norvegicus


2066 12114152 Mus musculuscholecystokinin type-A 69 29
receptor


2067 g12828586Pongo pygmaeusBRCA1 73 22


2068 AAM40813 Homo SapiensHYSE- Human polypeptide 75 29
SEQ ID


NO 5744.


2068 AAM39027 Homo SapiensHYSE- Human polypeptide 75 29
SEQ ID


NO 2172.


2068 AAY25768 Homo SapiensHUMA- Human secreted 75 29
protein


encoded from gene 58.


2070 11334150 Mus musculusunidentified reading 169 28
frame (first ATG






at pos. 210)


2070 gi557822 Saccharomycesmal5, stay len: 1367, 133 20
CAI: 0.3,


cerevisiae AMYH_YEAST P08640


GLUCOAMYLASE S1 (EC 3.2.1.3)


2070 gi1304387Saccharomycesglucoamylase 133 20


cerevisiae
var.


diastaticus


2071 gi17983056Brucella BETA-HEXOSAMINIDASE A 88 29


melitensis


2071 gi1573917Haemophilus multidrug resistance 81 33
protein A (emrA)


influenzae
Rd


2071 gi17982813Brucella NITROGEN REGULATION 80 26


melitensis PROTEIN NTRB


2073 gi~17532255~Caenorhabditisankyrin and proline rich67 29
domains


ref~NP elegans
4964


31.1


2074 gi19919730Homo SapiensBTEBS 704 97


2074 gi13195441Homo sapiensBTE-binding protein 4 478 64


2074 114549656Mus musculusdopamine receptor regulating452 76
factor


2076 AAE17482 Homo SapiensZYMO Human leucine-rich 1326 100
repeat-7


(ZLRR7) protein.


2076 AAU83190 Homo SapiensZYMO Novel secreted protein1326 100


Z887300G2P.


2076 ABB 11242Homo SapiensHYSE- Human SLIT-2 homologue,568 99


SEQ ID NO:1612.


2077 g118893729Pyrococcus proteaseiv 74 34


furiosus
DSM


3638


2077 AAB94745 Homo SapiensHELI- Human protein sequence71 34
SEQ


ID NO:15792.


2077 g116413096Listeria lin0656 68 35
innocua


2078 g160675 Beet ringspotpolyprotein 75 37


virus


2078 gi~14743288~Homo Sapienssimilar to Alu subfamily92 58
J sequence


reflXP contamination warning
0471 entry


91.1


2078 gi~20260801~Beetringspotpolyprotein 75 37


ref~NP_6201virus


13.1


2079 g13834629Mus musculusdiaphanous-related formin;208 67
p134


mDia2


2079 AAG74400 Homo SapiensHUMA- Human colon cancer71 36
antigen


protein SEQ ID NO:5164.


2079 13171906 Homo SapiensDIA-156 protein 71 36


2080 g117298315Homo sapienscandidate tumor suppressor125 100
protein


2080 g17861733Homo Sapienslow density lipoprotein 125 100
receptor related


protein-deleted in tumor


2080 g18926243Mus musculuslow density lipoprotein 90 63
receptor related


protein LRP1B/LRP-DIT


2081 g14574224Fundulus multidrug resistance 343 55
transporter


heteroclitushomolog


2081 g116304396Pseudopleuronecmultidrug resistance 340 52
transporter-like


tes americanusprotein


2081 g13355757Gallus gallus ABC transporter protein 328 53






2082 gi7532975bacteriophageP10 67 27


phi-8




Table 3
SEQ ID NO:  Database entry ID  Description  *Results


1059 BL00349CTF/NF-I proteins. BL00349H 15.70 9.710e-09
8-45


1061 DM00215PROLINE-RICH PROTEIN DM00215 19.43 6.143e-10
3. 29-61


DM00215 19.43 8.322e-09
40-72


1062 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 6.092e-12
II 80-99


ORF2.


1063 PR00944COPPER ION BINDING PROTEINPR00944E 9.18 7.132e-09
33-46


SIGNATURE


1076 PD00078REPEAT PROTEIN ANK PD00078B 13.14 9.217e-09
23-35


NUCLEAR ANKYR.


1089 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 8.754e-10
16-25


SIGNATURE


1089 PR00456RIBOSOMAL PROTEIN P2 PR00456E 3.06 9.658e-09
16-30


SIGNATURE


1089 PR00341PRION PROTEIN SIGNATUREPR00341E 3.32 9.898e-09
24-43


1099 PR00886HIGH MOBILITY GROUP PR00886C 11.84 1.141e-12
28-46


(HMG1/HMG2) PROTEIN


SIGNATURE


1107 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.077e-09
PI 51-65


SIGNATURE


1118 BL00472Small cytokines BL00472A 7.45 5.655e-09
1-12


(intercrine/chemokine)
C-C


subfamily signatur.


1118 PR00655AUXIN BINDING PROTEIN PR00655E 8.06 9.000e-09
88-103


SIGNATURE


1119 BL00970Nuclear transition proteinBL00970C 14.80 8.183e-12
2 proteins. 99-136


1119 BL00826MARCKS family proteins. BL00826B 12.51 4.279e-09
92-143


1119 BL00348p53 tumor antigen proteins.BL00348F 23.19 5.881e-10
93-135


BL00348F 23.19 6.857e-09
91-133


1119 PD01457RIBOSOMAL PROTEIN 40S PD01457A 16.51 8.216e-09
ZINC- 73-117


FINGER METAL.


1119 BL00752XPA protein. BL00752B 19.17 7.866e-09
100-143


BL00752B 19.17 8.979e-09
63-106


1119 DM01269303 kw ACTIVATING RAN DM01269A 23.35 9.446e-09
109-136


GTPASE ISOZYME.


1124 DM01813EGG-LAYING HORMONE. DM01813A 15.31 5.215e-09
15-42


1127 BL00452Guanylate cyclases proteins.BL00452A 17.52 1.170e-09
6-27


1131 BL00113Adenylate kinase proteins.BL00113B 20.49 9.897e-09
157-200


1162 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.000e-35
ZINC- 24-62


FINGER METAL-BINDING
NU.


1163 BL00407Connexins proteins. BL00407B 14.23 9.775e-30
21-51


BL00407C 14.61 2.500e-24
52-79


1163 PR00206CONNEXIN SIGNATURE PR00206B 13.75 1.957e-24
33-55


PR00206A 11.35 6.559e-23
2-26


PR00206C 15.16 7.469e-20
58-78


1171 PD01066PROTEIN ZINC FINGER PD01066 19.43 8.500e-28
ZINC- 35-73


FINGER METAL-BINDING
NU.


1177 DM018031 HERPESVIRUS DM01803C 7.00 7.240e-09
46-55


GLYCOPROTEIN H.


1190 PR00774GUANYLIN PRECURSOR PR00774A 6.49 8.579e-10
69-81


SIGNATURE


1195 PD02059 CORE POLYPROTEIN PROTEIN PD02059C 21.58 8.031e-09
100-140


GAG CONTAINS: P.


1197 BL00472Small cytokines BL00472A 7.45 8.000e-14
1-12


(intercrine/chemokine)
C-C


subfamily signatur.


1213 PR00437 SMALL CXC CYTOKINE PR00437C 14.85 1.310e-16
33-51






FAMILY SIGNATURE


1213 BL00471Small cytokines BL00471 23.92 7.960e-10
6-53


(intercrine/chemokine)
C-x-C


subfamily signat.


1216 PR00308TYPE I ANTIFREEZE PROTEINPR00308C 3.83 5.208e-09
183-192


SIGNATURE


1222 PF00852Fucosyl transferase. PF00852F 15.97 1.409e-15
195-231


1224 BL00299 Ubiquitin domain proteins. BL00299 28.84 6.301e-11
47-98


1230 PR00540MUSCARINIC M3 RECEPTOR PR00540A 10.24 7.174e-09
134-153


SIGNATURE


1240 BL00290Immunoglobulins and BL00290A 20.89 7.480e-10
major 160-182


histocompatibility complex BL00290B 13.17 2.875e-09
proteins. 226-243


1258 PR00792 PEPSIN (A1) ASPARTIC PR00792A 11.54 5.500e-18
80-100


PROTEASE FAMILY SIGNATURE


1258 BL00141Eukaryotic and viral BL00141A 12.10 4.789e-15
aspartyl 87-102


proteases proteins. BL00141B 12.14 2.929e-10
228-239


1300 BL00616Histidine acid phosphatasesBL00616A 11.86 1.000e-09
136-143


phosphohistidine proteins.


1301 DM014176 kw INDUCING XPMC2 DM01417C 12.93 9.325e-12
361-372


MUSHROOM SPAC22G7.04. DM01417D 11.08 9.820e-12
400-415


1302 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 6.067e-11
324-338


SIGNATURE


1311 BL00926Lysyl oxidase copper-bindingBL00926B 13.84 7.453e-09
region 84-121


proteins.


1320 PR00830ENDOPEPTIDASE LA (LON) PR00830A 8.41 3.712e-09
29-48


SERINE PROTEASE (S16)


SIGNATURE


1325 BL00048Protamine P1 proteins. BL00048 6.39 4.671e-10
58-84


BL00048 6.39 4.908e-10
60-86


BL00048 6.39 2.913e-09
59-85


BL00048 6.39 5.950e-09
57-83


1345 PF00424REV protein (anti-repressionPF00424A 14.34 2.436e-09
184-215


transactivator protein).


1345 BL00048Protamine P1 proteins. BL00048 6.39 4.553e-10
178-204


BL00048 6.39 6.513e-09
179-205


1353 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 2.857e-15
II 82-101


ORF2.


1363 PF00850Histone deacetylase PF00850B 10.13 5.154e-14
family. 95-109


PF00850C 14.55 9.063e-11
132-148


1389 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.423e-09
PI 50-64


SIGNATURE


1389 PD00306 PROTEIN GLYCOPROTEIN PD00306B 5.57 7.000e-09
59-69


PRECURSOR RE.


1396 BL00427 Disintegrins proteins. BL00427 13.93 7.698e-17
260-314


1396 PR00289 DISINTEGRIN SIGNATURE PR00289A 13.62 5.667e-14
274-293


1416 BL00419Photosystem I psaA and BL00419B 22.23 9.489e-09
psaB 18-51


proteins.


1434 PF00075RNase H. PF00075I 16.21 7.375e-11
167-173


1440 BL00598Chromo domain proteins.BL00598 14.45 1.500e-15
112-133


1440 PR00504 CHROMODOMAIN SIGNATURE PR00504B 9.12 5.200e-13
106-120


PR00504C 11.19 6.510e-09
121-133


1450 PF00622Domain in SPla and the PF00622B 21.00 2.227e-09
RYanodine 93-114


Receptor.


1451 PD02935FATTY ACID PD02935C 16.62 4.375e-16
59-86


OXIDOREDUCTASE BIOSYNT.


1467 BL00479Phorbol esters / diacylglycerolBL00479A 19.86 3.000e-11
130-152






binding domain proteins.BL00479B 12.57 3.340e-10
156-171


1468 PF00992 Troponin. PF00992A 16.67 5.563e-10
139-173


1468 BL00795Involucrin proteins. BL00795C 17.06 3.600e-09
193-237


1468 PR00042FOS TRANSFORMING PROTEINPR00042D 8.97 7.554e-09
141-162


SIGNATURE


1474 BL00107Protein kinases ATP-bindingBL00107A 18.39 9.308e-12
region 62-92


proteins.


1474 PR00109TYROSINE KINASE CATALYTICPR00109B 12.27 1.563e-09
62-80


DOMAIN SIGNATURE


1474 BL00239Receptor tyrosine kinaseBL00239C 18.75 4.205e-09
class II 49-71


proteins.


1475 BL00456 Sodium:solute symporter BL00456C 24.55 4.886e-28
family 15-69


proteins.


1480 BL00983 Ly-6 / u-PAR domain BL00983C 12.69 1.346e-09
proteins. 36-51


1482 BL00979G-protein coupled receptorsBL00979A 19.66 9.633e-12
family 3 74-121


proteins.


1502 PD02561DETHIOBIOTIN SYNTHETASEPD02561B 12.71 9.308e-09
176-182


SYNTHASE.


1506 BL00297Heat shock hsp70 proteinsBL00297H 15.46 9.625e-23
family 302-355


proteins. BL00297D 11.95 6.063e-21
166-205


BL00297E 18.56 6.077e-21
226-269


BL00297C 9.51 9.667e-15
105-156


1506 PR0030170 KD HEAT SHOCK PROTEINPR00301I 12.76 3.208e-11
320-336


SIGNATURE


1513 PR00130DNASE I SIGNATURE PR00130E 14.66 5.046e-09
237-266


1515 DM012423 THREONINE--TRNA LIGASE.DM01242A 20.32 5.286e-20
163-206


1517 BL00983 Ly-6 / u-PAR domain BL00983B 8.19 5.935e-10
proteins. 40-49


1520 BL00415 Synapsins proteins. BL00415P 2.37 3.914e-10
138-173


1520 PR00049WILM'S TUMOUR PROTEIN PR00049D 0.00 3.746e-09
124-138


SIGNATURE PR00049D 0.00 1.000e-08
123-137


1530 PF00075RNase H. PF00075F 12.87 5.500e-10
127-137


1537 PR00463E-CLASS P450 GROUP I PR00463F 17.63 5.219e-13
288-306


SIGNATURE PR00463A 11.40 8.714e-12
52-71


PR00463B 17.50 5.041e-10
76-97


1537 PR00385P450 SUPERFAMILY PR00385C 16.94 6.318e-09
289-300


SIGNATURE


1538 PR00709AVIDIN SIGNATURE PR00709A 4.60 5.585e-09
19-37


1553 DM01354kw TRANSCRIPTASE REVERSEDM01354Y 10.69 6.423e-16
II 113-152


ORF2.


1558 PD01066PROTEIN ZINC FINGER PD01066 19.43 6.400e-25
ZINC- 70-108


FINGER METAL-BINDING
NU.


1564 PF00589Phage integrase family.PF00589B 16.17 1.621e-11
158-171


PF00589C 14.62 9.609e-10
183-194


1566 BL00908Mandelate racemase / BL00908B 37.71 6.455e-13
muconate 191-245


lactonizing enzyme family
signa.


1567 PR00702ACRIFLAVIN RESISTANCE PR00702A 14.92 2.421e-25
8-32


PROTEIN FAMILY SIGNATUREPR00702B 12.77 9.690e-18
36-54


1570 BL01047Heavy-metal-associated BL01047A 13.50 5.125e-17
domain 75-97


proteins.


1575 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 9.429e-15
II 80-99


ORF2.


1606 PF00642Zinc finger C-x8-C-x5-C-x3-HPF00642 11.59 2.575e-11
type 197-207


(and similar).


1610 DM01354kw TRANSCRIPTASE REVERSEDM01354I 15.55 7.702e-34
II 348-388


ORF2. DM01354G 11.57 3.625e-32
277-307


DM01354H 18.00 2.528e-23
308-347






DM01354F 14.56 4.088e-11
241-276


1616 PD02929 ADHESION GLYCOPROTEIN PD02929A 28.27 2.263e-25
32-85


PRECURSORI.


1627 PR00121 SODIUM/POTASSIUM- PR00121A 6.71 1.000e-08
15-29


TRANSPORTING ATPASE


SIGNATURE


1630 PR00824 HEPATIC LIPASE SIGNATUREPR00824A 7.81 7.214e-22
6-24


1640 BL00359 Ribosomal protein L11 BL00359C 22.18 1.155e-11
proteins. 93-126


1641 PR00080 ALCOHOL DEHYDROGENASE PR00080A 9.32 8.839e-10
134-145


SUPERFAMILY SIGNATURE


1641 PR00081 GLUCOSE/RIBITOL PR00081A 10.53 2.000e-12
45-62


DEHYDROGENASE FAMILY PR00081E 17.54 1.783e-10
238-255


SIGNATURE PR00081B 10.38 2.227e-09
134-145


1641 BL00061 Short-chain BL00061A 9.41 9.053e-10
134-144


dehydrogenases/reductasesBL00061B 25.79 6.860e-09
family 197-234


proteins.


1666 BL01257 Ribosomal protein L10e BL01257D 18.80 2.973e-15
proteins. 59-98


1667 BL01241 Link domain proteins. BL01241 35.81 8.579e-37
180-232


BL01241 35.81 7.835e-14
289-341


1667 BL00086 Cytochrome P450 cysteineBL00086 20.87 3.377e-09
heme- 283-314


iron ligand proteins.


1668 PR00671 INHIBIN BETA B CHAIN PR00671A 8.36 8.088e-09
4-22


SIGNATURE


1672 BL00674 AAA-protein family BL00674E 15.24 5.680e-15
proteins. 31-50


1682 PF00075 RNase H. PF00075A 14.44 4.400e-13
73-89


PF00075C 11.58 8.442e-09
152-163


1689 PD01066 PROTEIN ZINC FINGER PD01066 19.43 6.471e-27
ZINC- 268-306


FINGER METAL-BINDING
NU.


1689 PR00788 NITROPHORIN SIGNATURE PR00788A 9.79 6.108e-09
3-15


1692 BL00299 Ubiquitin domain proteins.BL00299 28.84 4.759e-10
32-83


1697 PR00423 CELL DIVISION PROTEIN PR00423E 7.36 4.038e-09
FTSZ 20-41


SIGNATURE


1706 BL00795 Involucrin proteins. BL00795C 17.06 5.395e-10
185-229


1709 BL00514 Fibrinogen beta and BL00514C 17.41 3.618e-25
gamma chains 68-104


C-terminal domain proteins.BL00514H 14.95 6.745e-16
230-254


BL00514G 15.98 6.566e-14
198-227


BL00514E 14.28 8.286e-14
128-144


BL00514D 15.35 2.915e-12
109-121


1714 PF00878 Cation-independent PF00878T 17.51 3.818e-09
mannose-6- 41-67


phosphate receptor
repeat proteins.


1715 PF01140 Matrix protein (MA), PF01140D 15.54 4.872e-09
p15. 123-157


1715 PF00992 Troponin. PF00992A 16.67 6.451e-10
109-143


PF00992A 16.67 3.724e-09
98-132


PF00992A 16.67 6.684e-09
96-130


1718 PD02474 SYNTHASE SMALL SUBUNITPD02474B 21.08 7.940e-10
92-130


ACETOLACT.


1725 BL00412 Neuromodulin (GAP-43) BL00412B 10.60 1.000e-10
proteins. 46-82


1725 PR00215 NEUROMODULIN SIGNATUREPR00215C 13.98 6.116e-10
54-74


1725 DM01688 2 POLY-IG RECEPTOR. DM01688G 16.45 3.160e-09
119-150


DM01688I 14.97 6.885e-09
107-154


1725 PD02870 RECEPTOR INTERLEUKIN-1PD02870B 18.83 8.564e-09
303-335


PRECURSOR.


1727 BL00107 Protein kinases ATP-bindingBL00107A 18.39 7.750e-21
region 185-215


proteins.


1727 PR00109 TYROSINE KINASE CATALYTICPR00109B 12.27 7.176e-12
185-203


DOMAIN SIGNATURE






1727 BL00239 Receptor tyrosine kinaseBL00239B 25.15 4.387e-09
class II 119-166


proteins.


1728 BL00415 Synapsins proteins. BL00415Q 2.23 8.115e-09
52-87


1734 PD01270 RECEPTOR FC PD01270B 22.18 5.567e-18
75-111


IMMUNOGLOBULIN AFFIN. PD01270C 19.54 1.167e-17
118-146


PD01270A 17.22 4.960e-14
21-60


PD01270D 24.66 4.284e-09
152-187


1736 PD02346 PHOTOSYSTEM II PROTEINPD02346A 9.24 8.851e-09
6-17


PRECURSOR PHOTOSYNTHESIS.


1741 BL00415 Synapsins proteins. BL00415Q 2.23 6.777e-09
317-352


1744 BL00479 Phorbol esters / diacylglycerolBL00479B 12.57 1.000e-08
33-48


binding domain proteins.


1750 PR00763 COAGULIN SIGNATURE PR00763B 8.39 6.457e-09
41-60


1754 PR00276 INSULIN A CHAIN SIGNATUREPR00276A 11.84 7.840e-09
46-55


1755 PR00042 FOS TRANSFORMING PROTEINPR00042D 8.97 2.565e-09
164-185


SIGNATURE


1755 PF00922 Vesiculovirus phosphoprotein. PF00922A 19.17 5.759e-09
99-132


1778 PR00245 OLFACTORY RECEPTOR PR00245A 18.03 9.836e-14
59-80


SIGNATURE PR00245C 7.84 1.540e-13
237-252


PR00245B 10.38 2.125e-13
176-190


1778 BL00237 G-protein coupled receptorsBL00237A 27.68 1.474e-12
proteins. 90-129


1778 PR00534 MELANOCORTIN RECEPTOR PR00534A 11.49 4.729e-09
51-63


FAMILY SIGNATURE


1778 PR00237 RHODOPSIN-LIKE GPCR PR00237A 11.48 3.613e-09
26-50


SUPERFAMILY SIGNATURE PR00237C 15.69 7.525e-09
104-126


1787 PR00007 COMPLEMENT C1Q DOMAIN PR00007B 14.16 5.114e-15
146-165


SIGNATURE PR00007A 19.33 7.052e-10
119-145


1787 PR00524 CHOLECYSTOKININ TYPE PR00524F 5.36 4.351e-09
A 70-83


RECEPTOR SIGNATURE


1787 DM00250 kw ANNEXIN ANTIGEN DM00250B 13.84 6.595e-09
82-105


PROLINE TUMOR.


1787 BL00415 Synapsins proteins. BL00415N 4.29 7.372e-09
62-105


1787 BL01113 C1q domain proteins. BL01113B 18.26 3.786e-23
125-160


BL01113A 17.99 7.968e-15
73-99


BL01113A 17.99 5.091e-14
70-96


BL01113A 17.99 5.295e-11
64-90


BL01113A 17.99 8.568e-11
79-105


BL01113A 17.99 8.977e-11
67-93


BL01113A 17.99 4.635e-09
82-108


BL01113A 17.99 6.192e-09
76-102


BL01113A 17.99 7.750e-09
61-87


1787 BL00420 Speract receptor repeat BL00420A 20.42 8.691e-11
proteins 73-101


domain proteins. BL00420A 20.42 9.673e-11
70-98


BL00420A 20.42 2.180e-10
55-83


BL00420A 20.42 8.062e-09
52-80


1789 DM01930 2 kw FINGER SMCX SMCY DM01930E 15.41 2.964e-33
45-89


YDR096W.


1795 DM01688 2 POLY-IG RECEPTOR. DM01688I 14.97 7.480e-10
107-154


DM01688J 14.69 4.455e-09
60-96


1796 PF00075 RNase H. PF00075J 15.78 4.115e-13
115-132


1802 PD00066 PROTEIN ZINC-FINGER PD00066 13.92 4.130e-11
METAL- 86-98


BINDI.


1802 BL00028 Zinc finger, C2H2 type,BL00028 16.07 1.600e-10
domain 110-126


proteins. BL00028 16.07 6.100e-10
70-86


1802 PR00048 C2H2-TYPE ZINC FINGER PR00048B 6.02 9.438e-10
83-92


SIGNATURE






1812 PD00078REPEAT PROTEIN ANK PD00078B 13.14 4.130e-09
157-169


NUCLEAR ANKYR.


1824 PF00628PHD-finger. PF00628 15.84 5.500e-13
78-92


1833 PF00075RNase H. PF00075B 12.56 4.732e-10
156-166


1833 PR00939C2HC-TYPE ZINC-FINGER PR00939A 8.95 3.045e-09
137-146


SIGNATURE


1842 PR00833POLLEN ALLERGEN POA PR00833H 2.30 3.192e-09
PI 244-258


SIGNATURE


1844 BL00972Ubiquitin carboxyl-terminalBL00972D 22.55 3.348e-11
168-192


hydrolases family 2
proteins.


1857 PF00424REV protein (anti-repressionPF00424A 14.34 8.085e-09
71-102


transactivator protein).


1860 PR00221CAULIMOVIRUS COAT PROTEINPR00221H 12.82 2.410e-09
184-197


SIGNATURE


1864 BL01282 BIR repeat proteins. BL01282B 30.49 1.136e-10
214-252


1866 BL00155Cutinase, serine proteins.BL00155D 26.87 5.337e-09
19-67


1895 PF00075RNase H. PF00075F 12.87 7.353e-10
93-103


1911 BL00983 Ly-6 / u-PAR domain BL00983C 12.69 6.365e-09
proteins. 101-116


1911 BL00272 Snake toxins proteins. BL00272C 8.27 1.000e-08
105-116


1925 PR00308TYPE I ANTIFREEZE PROTEINPR00308A 5.90 6.795e-11
64-78


SIGNATURE PR00308C 3.83 2.385e-10
67-76


1925 PR00456RIBOSOMAL PROTEIN P2 PR00456E 3.06 9.438e-10
57-71


SIGNATURE


1925 PR00833POLLEN ALLERGEN POA PR00833H 2.30 6.654e-09
PI 59-73


SIGNATURE


1930 DM00179 kw KINASE ALPHA ADHESION DM00179 13.97 5.263e-10
T- 107-116


CELL.


1935 PF00075RNase H. PF00075J 15.78 2.309e-12
81-98


1940 PF00075RNase H. PF00075F 12.87 3.864e-09
74-84


1952 PR00019LEUCINE-RICH REPEAT PR00019B 11.36 3.250e-10
184-197


SIGNATURE PR00019A 11.19 5.667e-09
187-200


1954 BL00546Matrixins cysteine switch.BL00546A 19.62 8.105e-30
77-106


1954 BL00023 Type II fibronectin BL00023 24.31 4.682e-35
collagen-binding 340-376


domain proteins. BL00023 24.31 2.969e-28
282-318


BL00023 24.31 9.526e-24
224-260


1954 PR00138MATRIXIN SIGNATURE PR00138B 15.82 5.500e-18
144-159


PR00138A 15.14 8.773e-16
97-110


1954 BL00024Hemopexin domain proteins.BL00024B 21.53 9.591e-33
118-151


BL00024A 11.49 2.800e-13
97-107


BL00024C 22.98 7.796e-11
164-212


1954 PR00013FIBRONECTIN TYPE II PR00013C 12.29 1.000e-20
REPEAT 372-387


SIGNATURE PR00013C 12.29 3.571e-15
314-329


PR00013C 12.29 7.800e-14
256-271


PR00013A 12.26 5.500e-13
344-353


PR00013B 14.75 1.237e-11
355-367


PR00013B 14.75 4.000e-09
297-309


PR00013A 12.26 5.333e-09
286-295


PR00013A 12.26 7.833e-09
228-237


1957 BL01182Glycosyl hydrolases BL01182A 21.39 3.357e-34
family 35 77-119


proteins.


1957 PR00742GLYCOSYL HYDROLASE PR00742B 15.52 2.653e-14
78-96


FAMILY 35 SIGNATURE PR00742A 13.75 6.914e-10
57-74


1958 PR00449TRANSFORMING PROTEIN PR00449A 13.20 8.200e-15
P21 214-235


RAS SIGNATURE


1964 PR00727BACTERIAL LEADER PR00727A 12.93 7.000e-09
9-25


PEPTIDASE 1 (S26) FAMILY






SIGNATURE


1965 PF00075RNase H. PF00075D 10.71 7.188e-09
71-81


1966 PF00075RNase H. PF00075C 11.58 9.786e-11
110-121


PF00075B 12.56 1.878e-10
78-88


1968 DM008923 RETROVIRAL PROTEINASE. DM00892C 23.55 4.082e-11
314-347


1970 PF00075RNase H. PF00075J 15.78 8.571e-10
335-352


1973 PF00589 Phage integrase family. PF00589B 16.17 1.450e-14
101-114


1974 BL00675Sigma-54 interaction BL00675B 24.07 1.000e-24
domain 118-172


proteins ATP-binding BL00675C 13.51 6.400e-24
region A 183-210


proteins. BL00675D 12.03 1.750e-09
245-254


1987 PR00153 CYCLOPHILIN PEPTIDYL- PR00153B 11.57 1.500e-17
52-64


PROLYL CIS-TRANS PR00153A 12.98 4.255e-10
23-38


ISOMERASE SIGNATURE


1987 BL00170Cyclophilin-type peptidyl-prolylBL00170B 20.97 6.250e-33
cis- 47-86


trans isomerase signatur.BL00170A 17.08 2.309e-09
17-43


1998 PD01066PROTEIN ZINC FINGER PD01066 19.43 7.750e-37
ZINC- 27-65


FINGER METAL-BINDING PD01066 19.43 8.863e-11
NU. 68-106


1999 PF00992 Troponin. PF00992A 16.67 3.487e-09
108-142


1999 BL00224Clathrin light chain BL00224B 16.94 7.055e-09
proteins. 96-148


1999 BL00422Granins proteins. BL00422C 16.18 8.059e-09
117-144


2001 BL00019Actinin-type actin-bindingBL00019B 13.34 7.158e-14
domain 261-283


proteins.


2001 DM01354kw TRANSCRIPTASE REVERSEDM01354U 12.24 3.500e-13
II 345-364


ORF2.


2008 PD01719PRECURSOR GLYCOPROTEIN PD01719A 12.89 3.483e-16
63-90


SIGNAL RE.


2011 BL00282Kazal serine protease BL00282 16.88 6.577e-10
inhibitors 127-149


family proteins.


2011 BL00222Insulin-like growth BL00222B 11.09 6.940e-10
factor binding 74-89


proteins.


2011 BL00621Tissue factor proteins.BL00621A 8.69 6.473e-09
5-22


2012 PD02563PROTEIN NONSTRUCTURAL PD02563C 13.51 9.634e-10
C 74-128


VP18.


2013 PR00124ATP SYNTHASE C SUBUNIT PR00124A 8.81 5.655e-09
58-77


SIGNATURE


2013 PR00783MAJOR INTRINSIC PROTEINPR00783C 13.54 8.981e-09
48-67


FAMILY SIGNATURE


2034 PF00075RNase H. PF00075F 12.87 6.523e-09
183-193


2037 BL00326 Tropomyosins proteins. BL00326D 8.76 9.327e-09
115-155


2048 PR00671 INHIBIN BETA B CHAIN PR00671B 4.29 8.767e-10
138-157


SIGNATURE


2052 PD02455ELEMENT TRANSPOSABLE PD02455C 29.23 5.230e-09
225-276


INSERTION PROTEIN


TRANSPOSITION DNA.


2058 PF00075RNase H. PF00075J 15.78 9.000e-10
81-98


2074 PD00066 PROTEIN ZINC-FINGER PD00066 13.92 4.000e-13
METAL- 62-74


BINDI.


2074 PR00048C2H2-TYPE ZINC FINGER PR00048B 6.02 4.462e-11
59-68


SIGNATURE PR00048B 6.02 1.000e-10
89-98


PR00048A 10.52 9.609e-10
101-114


2074 BL00028Zinc finger, C2H2 type,BL00028 16.07 9.100e-13
domain 104-120


proteins. BL00028 16.07 1.000e-08
46-62


2076 PR00019LEUCINE-RICH REPEAT PR00019A 11.19 1.900e-11
106-119


SIGNATURE




* Results include, in order: Accession No., subtype, e-value, and amino acid position of the signature in the corresponding polypeptide.
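
The field order given in the footnote can be illustrated with a small parser. This is a minimal sketch in Python and not part of the original filing: the names `SignatureHit` and `parse_result` are hypothetical, and the pattern assumes an accession of two letters plus five digits with an optional one-letter subtype, as in the rows of Table 3. The rows also carry a number between the subtype and the e-value that the footnote does not name; the sketch keeps it as `unnamed_value` without interpreting it.

```python
import re
from typing import NamedTuple, Optional


class SignatureHit(NamedTuple):
    accession: str        # e.g. "BL00349"
    subtype: str          # e.g. "H"; empty string when no subtype letter is printed
    unnamed_value: float  # number printed before the e-value; not named in the footnote
    e_value: float
    start: int            # first amino acid position of the signature
    end: int              # last amino acid position of the signature


# Assumed layout: <accession><subtype> <number> <e-value> <start>-<end>
_RESULT = re.compile(
    r"^(?P<acc>[A-Z]{2}\d{5})(?P<sub>[A-Z]?)\s+"
    r"(?P<num>\d+(?:\.\d+)?)\s+"
    r"(?P<evalue>\d+(?:\.\d+)?e[-+]?\d+)\s+"
    r"(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_result(text: str) -> Optional[SignatureHit]:
    """Parse one Results entry; return None if the text does not fit the layout."""
    match = _RESULT.match(text.strip())
    if match is None:
        return None
    return SignatureHit(
        accession=match["acc"],
        subtype=match["sub"],
        unnamed_value=float(match["num"]),
        e_value=float(match["evalue"]),
        start=int(match["start"]),
        end=int(match["end"]),
    )


if __name__ == "__main__":
    # Entry taken from the row for SEQ ID NO: 1059, with the wrapped position rejoined.
    print(parse_result("BL00349H 15.70 9.710e-09 8-45"))
```

Description-only continuation lines such as "SIGNATURE" do not match the pattern and return `None`, so a caller can skip them.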


Table 4
SEQ ID NO: | Pfam Model | Description | E-value | Score | No. of Pfam Domains | Position of the Domain


1050 FAA_hydrolaseFumarylacetoacetate 0.64 -89.1 1 22-143
(FAA) hydrolase
fam


1066 rubredoxin Rubredoxin 7.2 -11.1 1 4-37


1076 ank Ankyrin repeat 0.01 22.5 1 25-57


1076 sodfe_C Iron/manganese superoxide 3.9 -67.9 1 38-124
dismutases,
C-term


1076 DUF232 Putative transcriptional 8.1 -29.1 1 134-254
regulator


1099 HMG box HMG (high mobility group) box 8 -22.4 1 17-61


1109 UPAR LY6 u-PAR/Ly-6 domain 0.21 -6.2 1 34-112


1110 ldl_recept Low-density lipoprotein 8.8e-07 36.0 1 196-240
a receptor
domain


1110 CUB CUB domain 0.38 -27.8 1 52-161


1118 rvt Reverse transcriptase 0.95 -46.1 1 38-207


1125 adenylatekinaseAdenylate kinase 0.00037 -77.6 1 13-103


1162 KRAB KRAB box 1.1e-23 92.1 1 22-62


1163 connexin Connexin 3.1e-23 90.6 1 1-130


1171 KRAB KRAB box 6.6e-22 86.2 1 33-73


1193 MHC_I Class I Histocompatibility 2e-06 1.1 1 29-205
antigen,
domains


1209 DOMON DOMON domain 1.9e-12 54.8 1 102-215


1213 IL8 Small cytokines (intecrine/chemokine), 0.59 -7.8 1 18-55
inter


1218 cys rich_FGFR Cysteine rich repeat 4.4 -11.0 1 28-76


1222 Glyco transf Glycosyltransferase 6.6e-06 -54.1 1 1-322
family 10


1240 ig Immunoglobulin domain 1.6e-06 35.1 2 41-
124:156-
230


1258 asp Eukaryotic aspartyl 8e-06 -110.8 1 19-241
protease


1280 DOMON DOMON domain 8.9 -16.6 1 35-117


1288 PDZ PDZ domain (Also known 1.1 0.4 1 7-73
as DHR or
GLGF)


1301 ExonucleaseExonuclease 3.4e-33 123.7 1 322-479


1311 Gemini_mov Geminivirus putative 5.7 -40.5 1 15-79
movement
protein


1341 fn3 Fibronectin type III 6.6e-36 132.7 2 109-
domain 200:212-
301


1345 Collagen Collagen triple helix 7.3 -65.8 1 185-243
repeat (20 copies)


1365 Amidase Amidase 0.017 -178.9 1 68-276


1375 Galactosyl Galactosyltransferase 7.1e-44 159.2 1 113-309
T


1375 Glyco transfGlycosyltransferase 3 -77.1 1 146-293
25 family 25


1381 GRAM GRAM domain 6.6e-14 59.6 1 65-116


1396 Pep M12B propep Reprolysin family propeptide 1.4e-27 105.1 1 75-191


1396 disintegrin Disintegrin 2.6e-10 47.7 1 243-318


1398 SK_channel Calcium-activated SK 1.8e-06 34.9 1 1-57
potassium
channel


1413 ig Immunoglobulin domain 5.4 9.1 1 29-88


1416 dUTPase dUTPase 0.00044 9.6 1 111-237


1420 Folate rec Folate receptor family 1.7 -111.2 1 14-175


1434 lectin c Lectin C-type domain 1.5e-05 28.0 1 233-319


1440 chromo 'chromo' (CHRromatin 4.6e-11 50.2 1 92-133
Organization
Modifier)


1449 PMSR Peptide methionine sulfoxide 0.0089 -65.8 1 4-79
reductase


1450 SPRY SPRY domain 9e-26 99.0 1 109-240






1451 MaoC dehydratas MaoC like domain 2.1e-15 64.6 1 31-152


1463 NTP transf Nucleotidyltransferase 2.6e-12 54.3 1 121-234
2 domain


1467 DAG_PE-bind Phorbol esters/diacylglycerol 8.7e-05 27.4 1 130-180
binding
dom


1467 DC1 DC1 domain 0.66 11.2 1 141-172


1470 jmjC jmjC domain 0.46 -18.2 1 166-262


1474 pkinase Protein kinase domain 0.0019 -85.7 1 2-187


1475 SSF Sodium:solute symporter 0.13 -177.1 1 1-311
family


1478 dUTPase dUTPase 7.6 -37.5 1 2-98


1479 fn3 Fibronectin type III 1.1e-19 78.9 1 14-100
domain


1485 rnaseH RNase H 0.36 -28.0 1 59-175


1488 NTR NTR/C345C module 0.044 -6.1 1 293-398


1506 HSP70 Hsp70 protein 1.6e-13 38.3 1 61-424


1517 UPAR LY6 u-PAR/Ly-6 domain 0.33 -8.2 1 44-106


1530 rnaseH RNase H 0.011 -11.7 1 64-155


1537 p450 Cytochrome P450 2.1 -176.6 1 31-316


1537 DNA ligase NAD-dependent DNA ligase 9.2 -42.9 1 200-256
OB OB-fold
domain


1558 KRAB KRAB box 1.8e-18 74.8 1 68-108


1564 Phage integrase Phage integrase family 1.2e-09 45.5 1 39-204


1566 MR_MLE Mandelate racemase / 0.00079 -24.5 1 153-352
muconate
lactonizing en


1570 HMA Heavy-metal-associated 6.6e-13 56.3 1 71-131
domain


1580 i Immunoglobulin domain 0.99 15.2 1 23-131


1601 WD40 WD domain, G-beta repeat 2e-08 41.5 3 39-
75:83-
118:126-
162


1606 zf CCCH Zinc finger C-x8-C-x5-C-x3-H 0.094 19.3 3 105-
type 129:141-
173:183-
209


1612 zf CCHC Zinc knuckle 2.1e-05 31.4 2 167-
184:202-
219


1618 rnaseH RNase H 6.3e-14 59.7 1 24-144


1618 Integrase Zn Integrase Zinc binding domain 3.8e-07 37.2 1 146-185


1618 DUF224 Domain of unknown function (DUF224) 9.3 -7.0 1 104-186


1641 adh short short chain dehydrogenase 4.6e-32 119.9 1 42-309


1667 Xlink Extracellular link domain 2.9e-83 290.0 2 162-
267:273-
364


1667 ig Immunoglobulin domain 0.0015 25.2 1 61-145


1682 rvt Reverse transcriptase 3.1e-31 117.2 1 56-238


1683 Gag p30 Gag P30 core shell protein 2.9e-33 124.0 1 8-197


1689 KRAB KRAB box 4.9e-22 86.6 1 266-306


1692 ubiquitin Ubiquitin family 0.00061 26.5 1 17-91


1709 fibrinogen_CFibrinogen beta and 7.9e-85 295.2 1 37-255
gamma chains, C-
term


1713 HOK GEF Hok/gef family 2.4 -7.8 1 7-54


1716 Gag p30 Gag P30 core shell protein 0.0036 -49.7 1 64-229


1721 rnaseH RNase H 0.011 -11.7 1 207-350


1722 dUTPase dUTPase 0.37 -22.9 1 93-217






1725 ig Immunoglobulin domain 4.2e-13 57.0 2 80-
141:259-
320


1725 IQ IQ calmodulin-binding 4.3e-05 30.4 1 49-69
motif


1727 pkinase Protein kinase domain 3e-21 84.0 1 71-267


1728 Fringe Fringe-like 5.9 -112.6 1 165-370


1734 ig Immunoglobulin domain 0.014 22.0 1 117-170


1737 PP2C Protein phosphatase 2C 0.0067 -50.5 1 37-273


1738 SH3 SH3 domain 1.7e-05 31.7 1 102-159


1740 rnaseH RNase H 0.0042 -7.3 1 126-270


1744 DAG_PE-bind Phorbol esters/diacylglycerol 2.9 -11.1 1 26-55
binding
dom


1744 PHD PHD-finger 3.3 -14.7 1 9-61


1760 GARS_N Phosphoribosylglycinamide 8.2 -62.0 1 35-95
synthetase,
N


1760 Armadillo Armadillo/beta-catenin-like 9.1 8.7 2 44-
seg repeat 84:131-
171


1778 7tm 1 7 transmembrane receptor 1e-12 55.7 1 41-276
(rhodopsin
family)


1778 YCF9 YCF9 3.1 -18.5 1 203-258


1787 C1q C1q domain 1e-05 13.2 1 111-230


1787 Collagen Collagen triple helix 0.0043 -3.0 1 50-107
repeat (20 copies)


1789 jmjC jmjC domain 0.00078 12.0 1 52-241


1795 ig Immunoglobulin domain 0.0037 23.9 1 64-141


1796 rve Integrase core domain 2.6e-28 107.5 1 20-174


1802 zf C2H2 Zinc finger, C2H2 type 6e-15 63.1 2 68-
90:108-
130


1806 Filamin Filamin/ABP280 repeat 0.00054 18.6 1 26-131


1812 ank Ankyrin repeat 3.6e-23 90.4 3 159-
191:205-
237:244-
276


1824 PHD PHD-finger 1.1e-12 55.6 1 62-110


1826 PAP assoc PAP/25A associated domain 1.5e-06 35.2 1 101-155


1827 ig Immunoglobulin domain 1.6 13.4 1 29-102


1830 RhoGEF RhoGEF domain 3.3e-06 24.0 1 110-280


1830 PH PH domain 2.8 6.7 1 356-451


1833 zf CCHC Zinc knuckle 2.1e-06 34.7 1 137-154


1833 rvt Reverse transcriptase 7.7e-06 25.9 1 84-277


1844 UCH-2 Ubiquitin carboxyl-terminal 0.15 -8.5 1 165-238
hydrolase
family


1846 Armadillo Armadillo/beta-catenin-like 0.28 17.7 2 50-
seg repeat 91:92-
132


1860 zf CCHC Zinc knuckle 3.2e-05 30.8 1 179-196


1864 zf C3HC4 Zinc finger, C3HC4 type 0.0022 23.3 1 218-256
(RING
finger)


1887 ig Immunoglobulin domain 4e-08 40.4 1 35-112


1889 LRR Leucine Rich Repeat 0.051 20.1 1 62-85


1895 rnaseH RNase H 3.4e-06 25.8 1 47-177


1899 Brevenin Brevenin/esculentin/gaegurin/rugosin 7.5 -2.9 1 1-51
family


1911 UPAR LY6 u-PAR/Ly-6 domain 1.3e-06 35.4 1 44-117






1911 toxin Snake toxin 3 -19.5 1 66-117


1911 Activin Activin types I and II receptor 9.5 -14.0 1 30-118
rec domain


1912 Retroviral aspartyl protease 7 -26.3 1 42-142


1913 SAM SAM domain (Sterile alpha 3.9e-13 57.1 2 105-
motif) 170:183-
247


1916 Sema Sema domain 1.4e-14 54.6 1 51-434


1926 PAP2 PAP2 superfamily 2.9e-07 37.6 1 48-142


1930 ig Immunoglobulin domain 2.7e-07 37.6 1 41-116


1935 rve Integrase core domain 2.5e-13 57.7 1 1-138


1940 rnaseH RNase H 1.1e-26 102.0 1 24-153


1940 Integrase Integrase Zinc binding 4.7e-12 53.5 1 155-194
Zn domain


1952 LRRNT Leucine rich repeat N-terminal 0.0027 24.4 1 67-95
domain


1953 UQ con Ubiquitin-conjugating 2.8e-08 40.9 1 78-219
enzyme


1954 Peptidase Matrixin 6.7e-86 298.8 1 53-212
M10


1954 fn2 Fibronectin type II domain 1e-79 278.2 3 231-
272:289-
330:347-
388


1958 ras Ras family 1.9 -132.0 1 215-284


1963 tsp 1 Thrombospondin type 1 0.083 8.0 1 20-63
domain


1966 rvt Reverse transcriptase 1.5e-05 21.9 1 2-196


1968 G-patch G-patch domain 0.3 6.0 1 307-352


1968 Retroviral aspartyl protease 1.4 -19.9 1 274-385


1970 rve Integrase core domain 0.78 -16.8 1 265-395


1973 Phage integrase Phage integrase family 5.7e-08 39.9 1 1-153


1974 Sigma54 Sigma-54 interaction 3.1e-37 137.2 1 63-253
activat domain


1975 Na Pi cotrans Na+/Pi-cotransporter 0.0085 -99.2 1 1-146


1975 signal His Kinase A (phosphoacceptor) 7 -7.7 1 85-147
domain


1978 UPAR LY6 u-PAR/Ly-6 domain 1.8 -16.0 1 21-96


1978 Zn_clus Fungal Zn(2)-Cys(6) binuclear 5.1 -5.7 1 21-60
cluster
domain


1987 pro isomerase Cyclophilin type peptidyl- 1.2e-18 75.4 1 4-171
prolyl cis-tr


1997 zf CCHC Zinc knuckle 1.9e-05 31.5 2 181-198:204-220


1997 TFIID-31 Transcription initiation 7.9 -63.3 1 75-187
factor IID,
31kD su


1997 Gag p12 Gag polyprotein, inner 8.9 -9.5 1 155-229
coat protein 12


1998 KRAB KRAB box 2e-23 91.2 1 27-65


2001 CH Calponin homology (CH) 0.019 10.8 1 230-330
domain


2001 SAM SAM domain (Sterile alpha 0.9 6.5 1 248-311
motif)


2008 tsp 1 Thrombospondin type 1 domain 0.013 15.1 1 64-98


2011 ig Immunoglobulin domain 1.7e-05 31.7 1 186-255


2011 kazal Kazal-type serine protease 0.00028 27.6 1 121-168
inhibitor
domain


2011 IGFBP Insulin-like growth factor 0.17 2.5 1 53-113
binding
protein


2011 zf UBR1 Putative zinc finger in 8.3 -24.0 1 54-112
N-recognin


2015 PH PH domain 0.0002 28.1 1 174-281


2015 efhand EF hand 0.00031 27.5 1 339-367


2018 RPEL RPEL repeat 1.3 11.8 1 25-50


2034 rnaseH RNase H 4e-27 103.6 1 122-267




2038 granulin Granulin 7.7 -17.8 1 62-91
2052 rve Integrase core domain 2.6e-24 94.2 1 160-314
2057 Pep M12B propep Reprolysin family propeptide 0.44 -29.3 1 179-263
2058 rve Integrase core domain 8.7e-14 59.2 1 1-140
2074 zf C2H2 Zinc finger, C2H2 type 5.5e-22 86.5 3 42-66:72-96:102-124
2074 zf BED BED zinc finger 0.94 1.8 1 91-129
2074 TP1 Nuclear transition protein 7.5 2.2 1 21-76
2076 LRR Leucine Rich Repeat 3.2e-20 80.6 5 57-80:81-104:105-128:129-152:153-176
2076 LRRNT Leucine rich repeat N-terminal domain 0.00013 28.8 1 27-55
2076 LRRCT Leucine rich repeat C-terminal domain 0.047 18.0 1 186-234




CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
232
z
° ~ ° ° o ° °O~r
Q, " ~. ~. ~-, o o ~ r0
p N 'zS n n ,n .Q b
a ~ a a a a
o ~ ~ o ~ o, o.
H
J N N ~t .p N
~1 O Cn O~ N
W ~O \O ~O ~ ~~ W
N CAD f~D C~D CAD N N ~ b
i ~ ~ i i n ~
O ,-. ~. ~. ,-. ~ O ~ r..
~1 N V~ W ~. ~. O~
O O O O p O
.p ~ ~. ON O Oo ~
~p c~~i
O .O O .O O O v, b
n
N ~ ~ N ,-'P. O
~O O ~O lp (~D I~
:-' m H
O
r~
d
He
~o ~~ ~~~ y~~c~
' yo r"°'oo 0 0~
~°x ~~~ c ~y ~y
o ~ ~ H ~ '~ H trJ n t=i n o
m No ~ n~ n
O
m ~~m ~n ~n a.
yH yH
a ~ m ~ ~ ~ tHI'J ~ tHrJ
ra err c~~~0 ~ ~~ x~~xx~~
ax~~o~ r c~ c~~~ ~~ ~dx~~d~d~CH~
y~~~~~ ~o~o~~'~ ~o od~ood~o
H°°o°~ ~~m c~~ °~ ~~~~a~~a
.-,~~x~mz v~r~r" t~~ Hy v~~~..~~v~~Hv~ d
maOmm~Om
~~N~ro z~~' ~b r~ ~'y
O~ O~~ r~Zn ~ ~p ~~ ~ ~ o
~,
x ~ ~ '~ ~~ ~~ ~~ o
xr
r~ m m
~ ~ 0 0
r r


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
233
0 0 ~0~~
~d
a a ~ ~ O
H
CNn ~ ~ ~ N
.p N ? s0
:p W .p °° coo b~ b
N ;D ;P i P~ m
'O p ~ O O ~ ,~~. ""
i
0 o O w
N W
~~h
O O O O n
i~
N J
d
~a ~~ ~~~ ~~ c~~ra
r ~~C x~~ ~.~ ~ b
xr
r.;d ZO~ro
x a~Z
x~ ~~ ~ r~ 0 0
~z ~z
x
a
d
o r~
a m ~ ~ ~.
r~~°b~~~°o°z~oo° ~~~~~~~o~~o
~ tzi ~ trl ''d ~ "-~ ~ H a H H k~ ~ ''d ° ~-3 m d
r~~~°~~O~~Z~o~'~~~p~~~'~o~H~'bH~' b
r° ~zo~~-~~- xr~~ x~o~ ~ Nox~ox d
~~~~xb~~~ ar~c~ x~b ~ z~ z~
°r°~~ro~r~~~d~ ~ara~~a~ o
c~~ya~~9~
x °r~r~r~~o~~,x~ ~~~d~~, ~ o
~V N'~~~ ~aN~'~~o ~
Nboo ~ o ~o ~ o W ~ z° z
N ~ ~, ~, ~ H x~


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
234
° ° o~~'
~o ~o ~ ~ a\ a~ o,
-. .-. ~ ~. ~ ~ ro
w ~ ,fl ~ ~ ~ C
o ~~ w ~' ~ ~° o ~ td
a a a x
o~, o~, o~, o~, owo ~ ~ H
H
.a
0 0 0 o w o
0 0 0 0 ~ o ~ b~ ro
°m°o °w .o~ 0 0 0
a\ ~ ~ ~ ~ ~ rt
O O O O
O
.'~P J N
~M
0 0 0 0 0 ~ ro
tn ~-. i-. N
O 0o O v' N c~D
0o IJ
Owo N
O N
r ~
d
b
r ~~ ~~ ~~ r~ ~
o ro ro
~r o
H ~ ~ ~ x n
H ~ ~ o
Na ~
o ~ ~ n ~~ ~~
x
C7 y
a a~
~l C7 trJ ~ v~ H C7 ttJ ~] v~ v~ x~ C~ trJ f'l H C~ ''d H ''d a f~ H
HH~y~-Cr O~O7~~7~~W~O~
H ~ ~ ~ f~ H ~ ~ ~ 7~ c7 H m z ~-d z O O 'T' tn t'-1
z O n ~ z O O ~ z ~ ~ O O ~ ~-3 t" H ~ , tH=i 9 H ' ~'
~ ~p z ~ 7~ ~ O z ~ ~ ~ ~ O z ~-p3 tri ~ ''~ ~ '~ H h~ ~ ';' d
r3 ~ O H '-'3 O ~1 "'' ~ '-' ~ ~y' ~ a ~ ''~' b~
~ ,"dNzt~~'J~ C=1N~~'~ ~'x~'z''Hbz ~C
~t~-~~ ~~'m''~~ t-r~t-~~0~~ p~pOt-~O ~~~c~ a
~'~o ~~~0 HpH~o ~dm~~~ ~~r'o 0
a '~
o~ z°zo~~ °z~°~~ ~o~ ~~ z°~m
z ~ z~ ~z~ r 9~ ~~~r


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
235
0 0 ~ ~°o
N N
UNR ~O
a a a
~-. oo J v~
ov'o ~ ~ ~ ~ H
H
N N O~ ~ ~ ~rJ
01
p\ W W W O
,p N N N O
O N N P
p pp ~. w 00 n-
r-. i-' p O ~ C
.OP
i r ~. O O m FtJ
O ~ O ~ O
~ H
~ O ~
r
d
bH ~bH ~~~~x~d da~~~d
O~ aH~ ~ ~x~~a~o~~~a o~00
trJ ,~Z, t~r1 H ~ ~ ~~ ~ t~ ~ ~, ~ ~ ~ ~ CJ ro w c~
N ~ '~' ~ H ~ ~ G7 trJ H '~'' lzJ
U',~, ~dp'~.~'' _~d~''~'~CG~~r~O~;'34~~ ~'_'j~~ o
O~ O~ ~aHw(~Ox"H~~~O~-''G' ~C~%btrl "o
t~J ~-C O ~ o
w
,~~° ~o ~~~~o "a x
~t~d~
~~ozm . C
a d
aa~~H~aa
o ~' r
-~3 ~ 9 C7 ~ txrJ H H ~ O
~~1 . ~~ 'z7 n7 nH
~o b
bbx~~'~bb
H H3 H ~~"' ~ m ~ H
tii t~ ~ H ~ ~ tii trJ
d
n ~' n
~-3 d H


CA 02456955 2004-02-09
WO 03/080795 PCT/US02/25485
236

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 253
NOTE: For additional volumes, please contact the Canadian Patent Office.
FILE NAME:
VOLUME NOTE:

Representative Drawing

Sorry, the representative drawing for patent document number 2456955 was not found.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History should be consulted.

Administrative Status

Title                        Date
Forecasted Issue Date        Unavailable
(86) PCT Filing Date         2002-08-09
(87) PCT Publication Date    2003-10-02
(85) National Entry          2004-02-09
Dead Application             2008-08-11

Abandonment History

Abandonment Date    Reason                                        Reinstatement Date
2007-08-09          FAILURE TO REQUEST EXAMINATION
2008-08-11          FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type                                   Anniversary Year   Due Date     Amount Paid   Paid Date
Registration of a document - section 124                                   $100.00       2004-02-09
Application Fee                                                            $400.00       2004-02-09
Maintenance Fee - Application - New Act    2                  2004-08-09   $100.00       2004-06-17
Registration of a document - section 124                                   $100.00       2005-03-11
Registration of a document - section 124                                   $100.00       2005-03-11
Registration of a document - section 124                                   $100.00       2005-03-11
Maintenance Fee - Application - New Act    3                  2005-08-09   $100.00       2005-06-15
Maintenance Fee - Application - New Act    4                  2006-08-09   $100.00       2006-06-14
Maintenance Fee - Application - New Act    5                  2007-08-09   $200.00       2007-06-19
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
NUVELO, INC.
Past Owners on Record
HYSEQ, INC.
MA, YUNQING
TANG, Y. TOM
WANG, ZHIWEI
WENG, GEZHI
YANG, YONGHONG
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description     Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
Abstract                 2004-02-09           1                  51
Claims                   2004-02-09           4                  123
Description              2004-02-09           315                13,841
Description              2004-02-09           255                15,218
Cover Page               2004-05-17           1                  26
Assignment               2004-02-09           2                  93
PCT                      2004-02-09           2                  93
Prosecution-Amendment    2004-02-09           1                  18
Correspondence           2004-05-19           1                  24
Prosecution-Amendment    2004-04-15           1                  40
Assignment               2005-03-11           13                 582
Correspondence           2005-04-07           1                  13

Biological Sequence Listings

Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.
